In-Video Instructions: Visual Signals as Generative Control Paper • 2511.19401 • Published Nov 24, 2025 • 30
Introducing Visual Perception Token into Multimodal Large Language Model Paper • 2502.17425 • Published Feb 24, 2025 • 16
CoT-Valve: Length-Compressible Chain-of-Thought Tuning Paper • 2502.09601 • Published Feb 13, 2025 • 14
Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient Paper • 2411.17787 • Published Nov 26, 2024 • 12
AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising Paper • 2406.06911 • Published Jun 11, 2024 • 12