This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip
liyaxuan
lllyx
AI & ML interests
None yet
Recent Activity
upvoted a paper about 7 hours ago
Beyond SFT-to-RL: Pre-alignment via Black-Box On-Policy Distillation for Multimodal RL
upvoted a paper about 7 hours ago
MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction
upvoted a paper about 7 hours ago
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning
Organizations
None yet