HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing Paper • 2602.03560 • Published 25 days ago • 45
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2, 2025 • 188
iFairy: the First 2-bit Complex LLM with All Parameters in {±1, ±i} Paper • 2508.05571 • Published Aug 7, 2025 • 2
Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 30 items • Updated 4 days ago • 123
LLM For Smartphone Collection These are some of the best LLMs that can run on a smartphone. These models go toe-to-toe with much larger models and are great for use on the go. • 4 items • Updated Dec 31, 2025 • 24
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models Paper • 2512.08829 • Published Dec 9, 2025 • 21
Flow-GRPO: Training Flow Matching Models via Online RL Paper • 2505.05470 • Published May 8, 2025 • 88
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published Feb 10, 2025 • 152
Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions Paper • 2412.08737 • Published Dec 11, 2024 • 54