AI & ML interests
Large Language Models; Reasoning; Reinforcement Learning
TianHongZXY/CHIMERA-4B-SFT
4B • Updated • 60 • 2
TianHongZXY/CHIMERA-4B-RL
4B • Updated • 39 • 3
4B • Updated • 1
TianHongZXY/Qwen2.5-Math-7B-GRPO
8B • Updated • 3
TianHongZXY/OpenR1-Math-46k-8192-Qwen2.5-7B-Instruct-GRPO-clip_0.28
Updated
TianHongZXY/Qwen2.5-Math-7B-W-REINFORCE
8B • Updated • 2 • 1
TianHongZXY/Qwen3-4B-GRPO
4B • Updated • 6
4B • Updated • 1
4B • Updated • 1 • 1
TianHongZXY/Qwen2.5-Math-7B-PPO
8B • Updated • 3
TianHongZXY/Qwen2.5-Math-7B-PSR
8B • Updated • 3
TianHongZXY/Qwen2.5-Math-7B-NSR
8B • Updated • 3 • 2