Space: Distilling 100B+ Models 40x Faster with TRL • TRL distillation for 100B+ teachers, 40x faster • 52
Article: Welcome Gemma 4: Frontier multimodal intelligence on device • 14 days ago • 850
Article: Introducing smolagents: simple agents that write actions in code. • Dec 31, 2024 • 1.19k
Space: The Synthetic Data Playbook: Generating Trillions of the Finest Tokens • Explore synthetic data experiments on a virtual bookshelf • 220
Space: QED-Nano: Teaching a Tiny Model to Prove Hard Theorems • Who needs 1T parameters? Olympiad proofs with a 4B model • 71
Article: Transformers v5: Simple model definitions powering the AI ecosystem • Dec 1, 2025 • 309
Space: Open LLM Leaderboard • Track, rank and evaluate open LLMs and chatbots • 13.9k
Collection: NVIDIA Nemotron v3 • Open, Production-ready Enterprise Models • 15 items • Updated about 17 hours ago • 267
Space: The Smol Training Playbook • The secrets to building world-class LLMs • 3.11k