Open Source just got a lot bigger. 🚀

SKT-SURYA-H (2.544T) is officially out!
✅ Heterogeneous MoE
✅ 131K Context
✅ 3.76TB Weights (898 shards)
Massive respect to the team for keeping it open for the community! 🤝🇮🇳

Link: sKT-Ai-Labs/SKT-SURYA-H

#OpenSource #MachineLearning #AITools #Bharat
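
For a sense of scale, the figures quoted in the post can be turned into a quick back-of-the-envelope estimate. This is only a sketch: it assumes "TB" means decimal terabytes (10^12 bytes) and that the shards are roughly equal in size, neither of which the post states.

```python
# Figures taken from the post; the unit interpretation is an assumption.
PARAMS = 2.544e12          # 2.544T parameters
WEIGHT_BYTES = 3.76e12     # 3.76 TB, assumed decimal terabytes
SHARDS = 898

# Average storage per parameter across the whole checkpoint.
bytes_per_param = WEIGHT_BYTES / PARAMS

# Average shard size in gigabytes, assuming equally sized shards.
gb_per_shard = WEIGHT_BYTES / SHARDS / 1e9

print(f"{bytes_per_param:.2f} bytes per parameter on average")
print(f"{gb_per_shard:.2f} GB per shard on average")
```

Taken at face value, that works out to roughly 1.48 bytes per parameter and about 4.19 GB per shard.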

liked a model 7 days ago

sKT-Ai-Labs/SKT-SURYA-H

2.6T • Updated 5 days ago • 843 • 21

upvoted 2 collections 8 days ago

Qwen3.5

Collection

21 items • Updated Mar 9 • 1.57k

Qwen3.6

Collection

4 items • Updated 3 days ago • 224

reacted to victor's post with 🔥 10 days ago

Want to share my enthusiasm for zai-org/GLM-5.1 here too 🔥

I think we have it: our open-source Claude Code = GLM-5.1 + Pi (https://pi.dev/). I built a Three.js racing game to evaluate it, and it's extremely impressive. Thoughts:

- One-shot car physics with real drift mechanics (this is hard)

- My fav part: awesome at self-iterating (with no vision!). It created 20+ Bun.WebView debugging tools to drive the car programmatically and read game state, and proved a winding bug with vector math without ever seeing the screen

- 531-line racing AI in a single write: 4 personalities, curvature map, racing lines, tactical drifting. It also built telemetry tools to compare player vs AI speed curves and tuned parameters from that data

- All assets from scratch: 3D models, procedural textures, sky shader, engine sounds, spatial AI audio!

- Can do hard math: proved road normals pointed DOWN via vector cross products, computed track curvature normalized by arc length to tune AI cornering speed
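
The two checks in the last bullet can be sketched in a few lines. This is not the code the model wrote: the function names are hypothetical, a Y-up coordinate system is assumed (the Three.js default), and `corner_speed` uses the standard circular-motion relation v = sqrt(a / |k|) as a stand-in for whatever tuning rule the racing AI actually used.

```python
import math

def triangle_normal(p0, p1, p2):
    """Unnormalized face normal from the cross product of two triangle edges."""
    ax, ay, az = (p1[i] - p0[i] for i in range(3))
    bx, by, bz = (p2[i] - p0[i] for i in range(3))
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

def faces_up(p0, p1, p2):
    """True if the face normal has a positive Y component (Y-up assumed)."""
    return triangle_normal(p0, p1, p2)[1] > 0

def polyline_curvature(points):
    """Signed turning angle per unit arc length at each interior 2D vertex."""
    result = []
    for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:]):
        ax, ay = x1 - x0, y1 - y0
        bx, by = x2 - x1, y2 - y1
        turn = math.atan2(ax * by - ay * bx, ax * bx + ay * by)
        arc = 0.5 * (math.hypot(ax, ay) + math.hypot(bx, by))
        result.append(turn / arc)
    return result

def corner_speed(curvature, max_lateral_accel):
    """Speed at which lateral acceleration v^2 * |k| reaches the given limit."""
    if curvature == 0:
        return math.inf
    return math.sqrt(max_lateral_accel / abs(curvature))

# A flat road triangle whose vertex order makes the normal point down,
# and the same triangle with two vertices swapped to flip the winding.
down = faces_up((0, 0, 0), (1, 0, 0), (0, 0, 1))
up = faces_up((0, 0, 0), (0, 0, 1), (1, 0, 0))
print(down, up)
```

Swapping two vertices flips the sign of the cross product, which is why a winding bug shows up as normals pointing the wrong way. On a circle of radius R sampled evenly, `polyline_curvature` returns values close to 1/R, so tighter corners give larger curvature and a lower `corner_speed`.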

You are going to hear about this model a lot in the coming months - open source, let's go - and thanks z-ai🚀🚀

New activity in unsloth/MiniMax-M2.7-GGUF 10 days ago

PPL and KLD graphs for each quant

👍 7

#7 opened 11 days ago by 4cast

reacted to SeaWolf-AI's post with 🔥🤯 12 days ago

🧬 Darwin-27B-Opus: 86.9% on GPQA Diamond — World #5, Zero Training
We are excited to share Darwin-27B-Opus, a 27B model that achieved 86.9% on GPQA Diamond — ranking #5 globally on the HuggingFace leaderboard — without a single gradient update.

How? Darwin breeds pretrained models through evolutionary FFN crossbreeding. The father (Qwen3.5-27B) provides the reasoning architecture; the mother (Claude 4.6 Opus Reasoning Distilled) contributes structured chain-of-thought knowledge. CMA-ES automatically discovers optimal per-layer blending ratios — no human tuning required.
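
To make "per-layer blending ratios" concrete, here is a minimal sketch of one way such a merge could look. It is an illustration under stated assumptions, not Darwin's actual code: the post does not give the blending formula, so plain linear interpolation is assumed, attention weights are simply copied from the father, the `blend_ffn` name and the toy layer layout are hypothetical, and the optimizer that would propose the ratios (CMA-ES in the post) is left out.

```python
def blend_ffn(father, mother, ratios):
    """Build a child model by linearly interpolating FFN weights per layer.

    father, mother: lists of layers, each {"attn": [...], "ffn": [...]}.
    ratios: one value in [0, 1] per layer, the mother's share of the FFN.
    Attention weights are copied from the father (an assumption).
    """
    if not (len(father) == len(mother) == len(ratios)):
        raise ValueError("father, mother and ratios must have the same length")
    child = []
    for f_layer, m_layer, r in zip(father, mother, ratios):
        ffn = [(1.0 - r) * fw + r * mw
               for fw, mw in zip(f_layer["ffn"], m_layer["ffn"])]
        child.append({"attn": list(f_layer["attn"]), "ffn": ffn})
    return child

# Toy two-layer "models" with made-up weights.
father = [{"attn": [1.0], "ffn": [0.0, 2.0]}, {"attn": [3.0], "ffn": [4.0, 4.0]}]
mother = [{"attn": [9.0], "ffn": [1.0, 0.0]}, {"attn": [9.0], "ffn": [0.0, 8.0]}]

child = blend_ffn(father, mother, ratios=[0.5, 0.25])
print(child)
```

In a setup like the one described, an optimizer would propose a `ratios` vector, score the resulting child on a benchmark, and iterate. No gradient step is involved, which matches the "zero training" framing of the post.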

The result surpasses the original Qwen3.5-27B (85.5%), GLM-5.1 (744B, 86.2%), and Qwen3.5-122B (86.6%). A 27B model outperforming 744B — with zero training, zero data, one GPU, ~2 hours.

We also confirmed hybrid vigor on Korean benchmarks: Darwin-27B-KR (2nd generation offspring) surpassed both parents on CLIcK, winning 7 out of 11 categories. The evolutionary optimizer independently assigned 93% of FFN from the Korean-specialized mother while preserving 93% of attention from the reasoning-specialized father — autonomously validating our core principle: FFN carries knowledge, Attention carries reasoning.

📊 Public release: 10 days → 300+ community derivatives, 120K+ downloads.

🔗 Links:
Darwin-27B-Opus: FINAL-Bench/Darwin-27B-Opus
article: https://huggingface.co/blog/FINAL-Bench/darwin-gpqa
Darwin Family Collection: https://huggingface.co/collections/FINAL-Bench/darwin-family

If foundation models are raw ore, Darwin is the forge. We are just getting started. 🔥