liyaxuan

lllyx

AI & ML interests

None yet

Recent Activity

upvoted a paper about 7 hours ago

Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL

upvoted a paper about 7 hours ago

MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction

upvoted a paper about 7 hours ago

Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning

Organizations

None yet

Collections 1

Papers 2

arxiv:2604.13016

arxiv:2510.08483

spaces 1

ML Patch

Submit data for inference and view results

models 2

lllyx/Qwen3-1.7B-SFT

Text Generation • 2B • Updated 7 days ago • 696 • 1

lllyx/Qwen3-4B-Base-GRPO

Text Generation • 4B • Updated 7 days ago • 142 • 1

datasets 0

None public yet

Collections 1

lllyx/Qwen3-1.7B-SFT

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

lllyx/Qwen3-4B-Base-GRPO
