PhyGDPO: Physics-Aware Groupwise Direct Preference Optimization for Physically Consistent Text-to-Video Generation • Paper • 2512.24551 • Published 2 days ago • 10
SVBench: Evaluation of Video Generation Models on Social Reasoning • Paper • 2512.21507 • Published 8 days ago • 7
GaMO: Geometry-aware Multi-view Diffusion Outpainting for Sparse-View 3D Reconstruction • Paper • 2512.25073 • Published 1 day ago • 21
VL-LN Bench: Towards Long-horizon Goal-oriented Navigation with Active Dialogs • Paper • 2512.22342 • Published 6 days ago • 8
YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection • Paper • 2512.23273 • Published 4 days ago • 11
OmniAgent: Audio-Guided Active Perception Agent for Omnimodal Audio-Video Understanding • Paper • 2512.23646 • Published 3 days ago • 13
SurgWorld: Learning Surgical Robot Policies from Videos via World Modeling • Paper • 2512.23162 • Published 4 days ago • 8
An Information Theoretic Perspective on Agentic System Design • Paper • 2512.21720 • Published 7 days ago • 6
Act2Goal: From World Model To General Goal-conditioned Policy • Paper • 2512.23541 • Published 3 days ago • 21
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation • Paper • 2512.23705 • Published 3 days ago • 37
MAI-UI Technical Report: Real-World Centric Foundation GUI Agents • Paper • 2512.22047 • Published 6 days ago • 25
VA-π: Variational Policy Alignment for Pixel-Aware Autoregressive Generation • Paper • 2512.19680 • Published 10 days ago • 10
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models • Paper • 2512.19995 • Published 10 days ago • 14
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning • Paper • 2512.20605 • Published 9 days ago • 59