Shaobai Jiang
shaobaij
AI & ML interests
None yet
Recent Activity
upvoted a paper 5 minutes ago
Does Your Reasoning Model Implicitly Know When to Stop Thinking?
upvoted a paper about 3 hours ago
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training
upvoted a paper about 17 hours ago
Benchmarks Saturate When The Model Gets Smarter Than The Judge
Organizations
None yet