view article Article Harness, Scaffold, and the AI Agent Terms Worth Getting Right sergiopaniego, ariG23498 • May 25 • 121
view article Article Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler +3 ariG23498, sayakpaul, sergiopaniego, ror, pcuenq • 28 days ago • 127
Running 193 The ultimate guide to RL environments: building and scaling them in the LLM era 📝 193 Building and scaling RL environments for LLM training
view article Article Unlocking asynchronicity in continuous batching +1 ror, pcuenq, ariG23498 • May 14 • 61
Working Notes on Late Interaction Dynamics: Analyzing Targeted Behaviors of Late Interaction Models Paper • 2603.26259 • Published Mar 27 • 8
view article Article DeepSeek-V4: a million-token context that agents can actually use burtenshaw • Apr 24 • 50
Running on CPU Upgrade 262 The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 262 Visualize synthetic‑data experiments as an interactive bookshelf
view article Article Mixture of Experts (MoEs) in Transformers +5 ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 169
view article Article LateOn-Code & ColGrep: LightOn unveils state-of-the-art code retrieval models and code search tooling lightonai • Feb 12 • 57
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 311
view article Article Continuous batching from first principles +1 ror, ArthurZ, mcpotato • Nov 25, 2025 • 411
Surfer 2: The Next Generation of Cross-Platform Computer Use Agents Paper • 2510.19949 • Published Oct 22, 2025 • 38
Surfer 2: The Next Generation of Cross-Platform Computer Use Agents Paper • 2510.19949 • Published Oct 22, 2025 • 38
Running on CPU Upgrade Featured 3.22k The Smol Training Playbook 📚 3.22k The secrets to building world-class LLMs