TeichAI/Qwen3-14B-Claude-4.5-Opus-High-Reasoning-Distill-GGUF Text Generation • 15B • Updated 13 days ago • 89.2k • 283
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28, 2025 • 131