In a Training Loop 🔄

minyi

minyichen

min-yi-chen-68b6ab130

AI & ML interests

None yet

Recent Activity

liked a dataset about 23 hours ago

sujet-ai/Sujet-Finance-QA-Vision-100k

liked a dataset about 23 hours ago

sujet-ai/Sujet-Finance-Vision-10k

liked a dataset about 23 hours ago

Jackrong/DeepSeek-V4-Distill-8000x

upvoted 2 collections 4 months ago

🌟 T1 Series

Instruction-tuned Gemma-3 models optimized for agentic workflows in Traditional Chinese. • 5 items • Updated Mar 2 • 3

🧠 Traditional Chinese Reasoning Datasets

A curated collection of datasets designed to evaluate and train reasoning capabilities in Traditional Chinese across various domains. • 3 items • Updated Oct 13, 2025 • 9

upvoted 2 articles 8 months ago

Article

NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks

Aug 11, 2025

•

76

Article

What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware

Aug 8, 2025

•

34

upvoted 2 articles 9 months ago

Article

How To Build a News Agent with GPT-OSS, Hugging Face Inference & Gradio

Aug 14, 2025

•

25

Article

Accelerate ND-Parallel: A guide to Efficient Multi-GPU Training

Aug 8, 2025

•

97

upvoted 2 articles 10 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

Jul 8, 2025

•

770

Article

The Large Language Model Course

Jan 16, 2025

•

228

upvoted 2 articles 11 months ago

Article

Vision Language Models Explained

Apr 11, 2024

•

530

Article

Vision Language Models (Better, faster, stronger)

May 12, 2025

•

609

upvoted a collection 11 months ago

Common Pile v0.1

All resources related to Common Pile v0.1, an 8TB dataset of public domain and openly licensed text • 4 items • Updated Jun 6, 2025 • 40

upvoted an article about 1 year ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

Mar 12, 2025

•

495

upvoted 2 collections over 1 year ago

Tulu 3 Datasets

All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated Mar 2 • 97

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 717

upvoted 5 articles over 1 year ago

Article

Merge Large Language Models with mergekit

Jan 9, 2024

•

155

Article

Uncensor any LLM with abliteration

Jun 13, 2024

•

845

Article

MTEB Leaderboard : User guide and best practices

Mar 13, 2024

•

9

Article

🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets

Jun 4, 2024

•

79

Article

🔥 Argilla 2.0: the data-centric tool for AI makers 🤗

Jul 30, 2024

•

39