Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL (Paper 2602.03773, published 13 days ago)
QED-Nano: Teaching a Tiny Model to Prove Hard Theorems. Who needs 1T parameters? Olympiad proofs with a 4B model