EVA: Efficient Reinforcement Learning for End-to-End Video Agent Paper • 2603.22918 • Published 2 days ago • 31
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published 10 days ago • 179
NEO-unify: Building Native Multimodal Unified Models End to End Article • 21 days ago • 103
SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning Paper • 2512.24330 • Published Dec 30, 2025 • 36
sensenova/SenseNova-SI-1.2-InternVL3-8B Image-Text-to-Text • 8B • Updated Dec 10, 2025 • 1.53k • 10
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published Dec 22, 2025 • 67
Scaling Spatial Intelligence with Multimodal Foundation Models Paper • 2511.13719 • Published Nov 17, 2025 • 48
NEO1_0 Collection From Pixels to Words -- Towards Native Vision-Language Primitives at Scale • 7 items • Updated Jan 27 • 9
SenseNova-SI Collection Scaling Spatial Intelligence with Multimodal Foundation Models • 12 items • Updated about 5 hours ago • 16
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published Nov 14, 2025 • 194
The Smol Training Playbook 📚 The secrets to building world-class LLMs • Featured • Running on CPU Upgrade • 3.06k
Vlaser: Vision-Language-Action Model with Synergistic Embodied Reasoning Paper • 2510.11027 • Published Oct 13, 2025 • 23
VR-Thinker: Boosting Video Reward Models through Thinking-with-Image Reasoning Paper • 2510.10518 • Published Oct 12, 2025 • 19