🌟 T1 Series Collection Instruction-tuned Gemma-3 models optimized for agentic workflows in Traditional Chinese. • 2 items • Updated 1 day ago • 3
🧠Traditional Chinese Reasoning Datasets Collection A curated collection of datasets designed to evaluate and train reasoning capabilities in Traditional Chinese across various domains. • 3 items • Updated Oct 13, 2025 • 9
view article Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks Aug 11, 2025 • 75
view article Article What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware Aug 8, 2025 • 29
view article Article How To Build a News Agent with GPT-OSS, Hugging Face Inference & Gradio Aug 14, 2025 • 25
view article Article Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training +3 Aug 8, 2025 • 90
Common Pile v0.1 Collection All resources related to Common Pile v0.1, an 8TB dataset of public domain and openly licensed text • 4 items • Updated Jun 6, 2025 • 39
view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 Mar 12, 2025 • 480
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated 14 days ago • 96
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 7 days ago • 672
view article Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets Jun 4, 2024 • 79