view article Article Welcome Llama 4 Maverick & Scout on Hugging Face +5 burtenshaw, reach-vb, pcuenq, clem, rajatarya, jsulz, lysandre • Apr 5, 2025 • 149
view article Article SmolVLM2: Bringing Video Understanding to Every Device +5 orrzohar, mfarre, andito, merve, pcuenq, cyrilzakka, Xenova • Feb 20, 2025 • 343
view article Article SigLIP 2: A better multilingual vision language encoder +1 ariG23498, merve, qubvel-hf • Feb 21, 2025 • 217
view article Article Open-source DeepResearch – Freeing our search agents +3 m-ric, albertvillanova, merve, thomwolf, clefourrier • Feb 4, 2025 • 1.32k
view article Article Introducing smolagents: simple agents that write actions in code. +1 m-ric, merve, thomwolf • Dec 31, 2024 • 1.2k
view article Article Open-R1: a fully open reproduction of DeepSeek-R1 +1 eliebak, lvwerra, lewtun • Jan 28, 2025 • 889
view article Article Superposition in Transformers: A Novel Way of Building Mixture of Experts BenChaliah • Jan 4, 2025 • 13
Scaling Test-Time Compute with Open Models Collection Models and datasets used in our blog post: https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute • 10 items • Updated Jan 6, 2025 • 31
MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages Paper • 2410.01036 • Published Oct 1, 2024 • 16
view article Article Llama can now see and run on your device - welcome Llama 3.2 +5 merve, philschmid, osanseviero, reach-vb, lewtun, ariG23498, pcuenq • Sep 25, 2024 • 191
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 16 items • Updated Dec 24, 2025 • 244
view article Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy +4 medmekk, marcsun13, lvwerra, pcuenq, osanseviero, thomwolf • Sep 18, 2024 • 281
view article Article Scaling robotics datasets with video encoding +1 aliberts, cadene, mfarre • Aug 27, 2024 • 41
view article Article Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth mlabonne • Jul 29, 2024 • 373
view article Article Memory-efficient Diffusion Transformers with Quanto and Diffusers sayakpaul, dacorvo • Jul 30, 2024 • 68
TextGrad: Automatic "Differentiation" via Text Paper • 2406.07496 • Published Jun 11, 2024 • 31