- starriver030515/hapo_data
  Viewer • Updated • 1.59k • 44
- starriver030515/Qwen2.5-Math-1.5B-16k
  Text Generation • 2B • Updated • 1
- starriver030515/Qwen2.5-Math-7B-32k
  Text Generation • 8B • Updated • 5
- From Uniform to Heterogeneous: Tailoring Policy Optimization to Every Token's Nature
  Paper • 2509.16591 • Published • 2
Zheng Liu
starriver030515
Recent Activity
Updated a dataset about 16 hours ago: OpenDataArena/MMFineReason-Full-2.3M-Qwen3-VL-235B-Thinking
Liked a dataset about 22 hours ago: OpenDataArena/MMFineReason-SFT-123K-Qwen3-VL-235B-Thinking
Authored a paper 5 days ago: MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods