AI & ML interests

Collection of JS libraries to interact with the Hugging Face Hub

Recent Activity

pcuenqΒ 
posted an update about 18 hours ago
view post
Post
762
πŸ‘‰ What happened in AI in 2025? πŸ‘ˆ

We prepared the 2025 version of the HF AI Timeline Grid, highlighting open vs API-based model releases, and allowing you to browse and filter by access, modality, and release type!

Play with it here:
2025-ai-timeline/2025-ai-timeline

Here's my personal quarterly TL;DR:

1️⃣ Q1 β€” Learning to Reason
Deepseek not only releases a top-notch reasoning model, but shows how to train them and compete with closed frontier models. OpenAI debuts Deep Research.

Significant milestones: DeepSeek R1 & R1-Zero, Qwen 2.5 VL, OpenAI Deep Research, Gemini 2.5 Pro (experimental)

2️⃣ Q2 β€” Multimodality and Coding
More LLMs embrace multimodality by default, and there's a surge in coding agents. Strong vision, audio, and generative models emerge.

Significant milestones: Llama 4, Qwen 3, Imagen 4, OpenAI Codex, Google Jules, Claude 4

3️⃣ Q3 β€” "Gold" rush, OpenAI opens up, the community goes bananas
Flagship models get gold in Math olympiads and hard benchmarks. OpenAI releases strong open source models and Google releases the much anticipated nano-banana for image generation and editing. Agentic workflows become commonplace.

Significant milestones: Gemini and OpenAI IMO Gold, gpt-oss, Gemini 2.5 Flash Image, Grok 4, Claude Sonnet 4.5

4️⃣ Q4 β€” Mistral returns, leaderboard hill-climbing
Mistral is back with updated model families. All labs release impressive models to wrap up the year!

Significant milestones: Claude Opus 4.5, DeepSeek Math V2, FLUX 2, GPT 5.1, Kimi K2 Thinking, Nano Banana Pro, GLM 4.7, Gemini 3, Mistral 3, MiniMax M2.1 🀯

Credits
πŸ™ NHLOCAL for the source data https://github.com/NHLOCAL/AiTimeline

🫑 @reach-vb for the original idea, design and recipe

πŸ™Œ @ariG23498 and yours truly for compiling and verifying the 2025 edition

πŸ₯³ Here's to 2026, wishing it becomes the best year ever for open releases and on-device-first use-cases! πŸ₯‚
coyotte508Β 
updated a model about 2 months ago
coyotte508Β 
published a model about 2 months ago
merveΒ 
posted an update 3 months ago
view post
Post
7547
deepseek-ai/DeepSeek-OCR is out! πŸ”₯ my take ‡️
> pretty insane it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very efficient per vision tokens/performance ratio
> covers 100 languages
Β·

Upload 99 files

#5 opened 3 months ago by
fellybikush

Delete index.html

#3 opened 3 months ago by
fellybikush
multimodalartΒ 
posted an update 3 months ago
view post
Post
8406
Want to iterate on a Hugging Face Space with an LLM?

Now you can easily convert any HF entire repo (Model, Dataset or Space) to a text file and feed it to a language model!

multimodalart/repo2txt

update jinja

#2 opened 3 months ago by
Xenova
merveΒ 
posted an update 4 months ago
view post
Post
6812
large AI labs open-sourced a ton of models last week πŸ”₯
here's few picks, find even more here merve/sep-16-releases-68d13ea4c547f02f95842f05 🀝
> IBM released a new Docling model with 258M params based on Granite (A2.0) πŸ“ ibm-granite/granite-docling-258M
> Xiaomi released 7B audio LM with base and instruct variants (MIT) XiaomiMiMo/mimo-audio-68cc7202692c27dae881cce0
> DecartAI released Lucy Edit, open Nano Banana 🍌 (NC) decart-ai/Lucy-Edit-Dev
> OpenGVLab released a family of agentic computer use models (3B/7B/32B) with the dataset πŸ’» OpenGVLab/scalecua-68c912cf56f7ff4c8e034003
> Meituan Longcat released thinking version of LongCat-Flash πŸ’­ meituan-longcat/LongCat-Flash-Thinking
  • 2 replies
Β·
merveΒ 
posted an update 4 months ago
view post
Post
3393
IBM just released small swiss army knife for the document models: granite-docling-258M on Hugging Face πŸ”₯

> not only a document converter but also can do document question answering, understand multiple languages 🀯
> best part: released with Apache 2.0 license πŸ‘ use it with your commercial projects!
> it supports transformers, vLLM and MLX from the get-go! πŸ€—
> built on SigLIP2 & granite-165M

model: ibm-granite/granite-docling-258M
demo: ibm-granite/granite-docling-258m-demo πŸ’—