Distilling 100B+ Models 40x Faster with TRL (52 likes): TRL distillation for 100B+ teachers, 40x faster
Open VLM Leaderboard (1.01k likes): VLMEvalKit Evaluation Results Collection
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens (220 likes): Explore synthetic data experiments on a virtual bookshelf
QED-Nano: Teaching a Tiny Model to Prove Hard Theorems (71 likes): Who needs 1T parameters? Olympiad proofs with a 4B model
Open LLM Leaderboard (13.9k likes): Track, rank and evaluate open LLMs and chatbots
The Smol Training Playbook (3.11k likes): The secrets to building world-class LLMs
FineVision: Open Data is All You Need (221 likes): A new open-source dataset for training VLMs
FineWeb: decanting the web for the finest text data at scale (1.33k likes): Read a detailed overview of the FineWeb web-scale text dataset
The Ultra-Scale Playbook (3.78k likes): The ultimate guide to training LLMs on large GPU clusters