π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows Paper • 2605.14678 • Published 9 days ago • 102
Many-Shot CoT-ICL: Making In-Context Learning Truly Learn Paper • 2605.13511 • Published 15 days ago • 32
δ-mem: Efficient Online Memory for Large Language Models Paper • 2605.12357 • Published 16 days ago • 123
OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis Paper • 2603.20278 • Published Mar 17 • 98
Beyond Semantic Similarity: Rethinking Retrieval for Agentic Search via Direct Corpus Interaction Paper • 2605.05242 • Published 25 days ago • 116
Listwise Policy Optimization: Group-based RLVR as Target-Projection on the LLM Response Simplex Paper • 2605.06139 • Published 21 days ago • 66
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding Paper • 2502.08946 • Published Feb 13, 2025 • 193
HumanNet: Scaling Human-centric Video Learning to One Million Hours Paper • 2605.06747 • Published 21 days ago • 52
OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories Paper • 2605.04036 • Published 23 days ago • 68
MiA-Signature: Approximating Global Activation for Long-Context Understanding Paper • 2605.06416 • Published 21 days ago • 55
Query-focused and Memory-aware Reranker for Long Context Processing Paper • 2602.12192 • Published Feb 12 • 58
Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling Paper • 2512.23959 • Published Dec 30, 2025 • 111
ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning Paper • 2512.05111 • Published Dec 4, 2025 • 50
Figure It Out: Improving the Frontier of Reasoning with Active Visual Thinking Paper • 2512.24297 • Published Dec 30, 2025 • 6
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation Paper • 2512.23576 • Published Dec 29, 2025 • 66