Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries • 3 days ago
QED-Nano: Teaching a Tiny Model to Prove Hard Theorems • Who needs 1T parameters? Olympiad proofs with a 4B model
From Golden Gate Bridge to Broken JSON: Why Anthropic's SAE Steering Fails for Structured Output • Feb 7