
Inflammable1230

AI & ML interests

None yet

Recent Activity

liked a model 3 days ago

Qwen/Qwen3.5-35B-A3B

liked a Space 4 days ago

webml-community/microgpt-playground

reacted to robtacconelli's post with 🔥 4 days ago

🏆 Nacrith: a 135M model that out-compresses everything on natural language

What if a tiny LM could compress English text better than _every_ compressor out there, classical or neural, small or large? Nacrith pairs SmolLM2-135M with an ensemble of online predictors and high-precision arithmetic coding.

What's inside

The standard LLM+arithmetic coding approach wastes ~75% of CDF precision on large vocabularies. Our CDF-24 fix alone recovers 0.5 bpb. On top: a token N-gram that skips the GPU on predictable tokens, an adaptive bias head, llama.cpp backend (7× faster than PyTorch), multi-GPU parallel compression, and a binary file format (NC06), the first LLM-based binary compressor we know of.

Runs on a GTX 1050 Ti. ~500 MB weights, ~1.2 GB VRAM per worker.

💻 Code: https://github.com/robtacconelli/Nacrith-GPU
⭐ Space: https://huggingface.co/spaces/robtacconelli/Nacrith-GPU
📄 Paper: https://huggingface.co/papers/2602.19626

Try it, break it, share your results. All feedback welcome. ⭐ on the repo appreciated!

Results across all systems we tested:
- alice29.txt → 0.918 bpb (−44% vs CMIX, −20% vs ts_zip), below the 2nd-order Shannon entropy bound
- enwik8 (100 MB) → 0.9389 bpb (−8% vs FineZip/LLMZip's 8B model, −15% vs ts_zip)
- Unseen text → 0.723 bpb on a doc published after training cutoff (no memorization), 26% better than FineZip/LLMZip on the same model

SmolLM2-135M by https://huggingface.co/HuggingFaceTB
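
The post reports its results in bits per byte (bpb). As a rough illustration of what that metric means for an LLM paired with an arithmetic coder, the sketch below computes the ideal code length from per-token probabilities. It is not taken from the Nacrith code: the function name `bits_per_byte` and the example probabilities are hypothetical, and a real coder adds a small overhead on top of this ideal figure.

```python
import math


def bits_per_byte(token_probs, num_bytes):
    """Ideal arithmetic-coding cost, in bits per input byte.

    token_probs: probability the model assigned to each token that
                 actually occurred (hypothetical values in this sketch).
    num_bytes:   size of the original text in bytes.
    """
    if num_bytes <= 0:
        raise ValueError("num_bytes must be positive")
    # An ideal entropy coder spends -log2(p) bits on a symbol of probability p.
    total_bits = -sum(math.log2(p) for p in token_probs)
    return total_bits / num_bytes


# Hypothetical example: four tokens that together cover 16 bytes of text.
probs = [0.5, 0.25, 0.5, 0.25]
bpb = bits_per_byte(probs, 16)
print(f"{bpb:.3f} bpb")
```

Under these assumptions, better predictions (probabilities closer to 1 on the tokens that occur) translate directly into a lower bpb, which is why a small model with good online adaptation can compete with larger ones.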


Organizations

None yet

upvoted a collection 12 days ago

Qwen3.5

Collection

13 items • Updated about 9 hours ago • 482

upvoted a paper 16 days ago

HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing

Paper • 2602.03560 • Published 25 days ago • 45

upvoted a paper about 1 month ago

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2, 2025 • 188

upvoted a changelog about 1 month ago

Changelog

Sort Models by Parameter Size

Jan 22 • 34

upvoted a paper about 1 month ago

iFairy: the First 2-bit Complex LLM with All Parameters in {±1, ±i}

Paper • 2508.05571 • Published Aug 7, 2025 • 2

upvoted 2 collections about 2 months ago

Cerebras REAP

Collection

Sparse MoE models compressed using REAP (Router-weighted Expert Activation Pruning) method • 30 items • Updated 4 days ago • 123

LLM For Smartphone

Collection

These are some of the best LLMs that can run on a smartphone. These models go toe-to-toe with much larger models and are great for use on the go. • 4 items • Updated Dec 31, 2025 • 24

upvoted a paper 3 months ago

InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models

Paper • 2512.08829 • Published Dec 9, 2025 • 21

upvoted a paper 10 months ago

Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 88

upvoted an article 10 months ago

Article

Bamba-9B-v2 - Fast and powerful!

Apr 29, 2025


upvoted 2 papers about 1 year ago

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10, 2025 • 152

Euclid: Supercharging Multimodal LLMs with Synthetic High-Fidelity Visual Descriptions

Paper • 2412.08737 • Published Dec 11, 2024 • 54
