Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding Paper • 2604.05015 • Published 3 days ago • 197
Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents Paper • 2604.06132 • Published 2 days ago • 95
MiMo-Embodied: X-Embodied Foundation Model Technical Report Paper • 2511.16518 • Published Nov 20, 2025 • 26
VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding? Paper • 2404.05955 • Published Apr 9, 2024
The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism Paper • 2407.10457 • Published Jul 15, 2024 • 24
AgentBank: Towards Generalized LLM Agents via Fine-Tuning on 50000+ Interaction Trajectories Paper • 2410.07706 • Published Oct 10, 2024
MiMo: Unlocking the Reasoning Potential of Language Model -- From Pretraining to Posttraining Paper • 2505.07608 • Published May 12, 2025 • 82
PiTe: Pixel-Temporal Alignment for Large Video-Language Model Paper • 2409.07239 • Published Sep 11, 2024 • 15