- Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs
  Paper • 2605.09063 • Published • 80
- NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation
  Paper • 2605.10813 • Published • 16
- AutoMedBench: Towards Medical AutoResearch with Agentic AI Models
  Paper • 2606.01961 • Published • 25
Song Dingjie
songdj
AI & ML interests
None yet
Recent Activity
authored a paper about 18 hours ago
OpenSkill: Open-World Self-Evolution for LLM Agents
upvoted a paper 1 day ago
OpenSkill: Open-World Self-Evolution for LLM Agents
updated a collection 4 days ago
AI Scientist
Organizations
None yet