T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning Paper • 2603.03790 • Published 3 days ago • 105
DFlash Collection Block Diffusion for Flash Speculative Decoding • 8 items • Updated about 13 hours ago • 20
FlashSVD: Memory-Efficient Inference with Streaming for Low-Rank Models Paper • 2508.01506 • Published Aug 2, 2025 • 2
SADA: Stability-guided Adaptive Diffusion Acceleration Paper • 2507.17135 • Published Jul 23, 2025 • 2
KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems Paper • 2510.12872 • Published Oct 14, 2025 • 4
HippoMM: Hippocampal-inspired Multimodal Memory for Long Audiovisual Event Understanding Paper • 2504.10739 • Published Apr 14, 2025 • 2
IoT-MCP: Bridging LLMs and IoT Systems Through Model Context Protocol Paper • 2510.01260 • Published Sep 25, 2025 • 3
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Paper • 2509.25541 • Published Sep 29, 2025 • 140
CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models Paper • 2505.19235 • Published May 25, 2025 • 4