deepseek-ai/DeepSeek-V3.2-Exp Text Generation • 685B • Updated Nov 18, 2025 • 74.6k • 930
Qwen/Qwen3-VL-235B-A22B-Instruct Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 230k • 348
CoMP: Continual Multimodal Pre-training for Vision Foundation Models Paper • 2503.18931 • Published Mar 24, 2025 • 30
Inst-IT: Boosting Multimodal Instance Understanding via Explicit Visual Prompt Instruction Tuning Paper • 2412.03565 • Published Dec 4, 2024 • 10
SliMM Collection A Simple LMM baseline with Dynamic Visual Resolution • 5 items • Updated Dec 15, 2024