Pre-training Dataset Samples Collection A collection of pre-training dataset samples of sizes 10M, 100M and 1B tokens. Ideal for quick experimentation and ablations. • 19 items • Updated 1 day ago • 16
Article A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons Feb 4 • 28
AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions Paper • 2509.13523 • Published Sep 16 • 7
Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 2 days ago • 81
SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation Paper • 2504.14396 • Published Apr 19 • 27
Article From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub +2 Feb 12 • 81
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published Apr 8 • 182
Running • The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • 3.6k