Graph-Based Chain-of-Thought Pruning for Reducing Redundant Reflections in Reasoning LLMs Paper • 2604.05643 • Published 8 days ago • 12
Combee: Scaling Prompt Learning for Self-Improving Language Model Agents Paper • 2604.04247 • Published 10 days ago • 29
FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling Paper • 2604.06916 • Published 7 days ago • 31
MARS: Enabling Autoregressive Models Multi-Token Generation Paper • 2604.07023 • Published 7 days ago • 36
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published 6 days ago • 273
Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability Paper • 2604.06628 • Published 7 days ago • 308
Demystifying When Pruning Works via Representation Hierarchies Paper • 2603.24652 • Published 9 days ago • 19
ACES: Who Tests the Tests? Leave-One-Out AUC Consistency for Code Generation Paper • 2604.03922 • Published 10 days ago • 53
ThinkTwice: Jointly Optimizing Large Language Models for Reasoning and Self-Refinement Paper • 2604.01591 • Published 13 days ago • 40
How Well Do Agentic Skills Work in the Wild: Benchmarking LLM Skill Usage in Realistic Settings Paper • 2604.04323 • Published 9 days ago • 39
SkillX: Automatically Constructing Skill Knowledge Bases for Agents Paper • 2604.04804 • Published 9 days ago • 31
ClawArena: Benchmarking AI Agents in Evolving Information Environments Paper • 2604.04202 • Published 10 days ago • 36
LightThinker++: From Reasoning Compression to Memory Management Paper • 2604.03679 • Published 11 days ago • 33
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression Paper • 2604.04921 • Published 9 days ago • 106
Revision or Re-Solving? Decomposing Second-Pass Gains in Multi-LLM Pipelines Paper • 2604.01029 • Published 13 days ago • 7