QED-Nano: Teaching a Tiny Model to Prove Hard Theorems. Who needs 1T parameters? Olympiad proofs with a 4B model
Prefill and Decode for Concurrent Requests - Optimizing LLM Performance (tngtech, Apr 16, 2025)
Continuous batching from first principles (ror, ArthurZ, mcpotato +1, Nov 25, 2025)