view article Article Continuous batching from first principles +1 ror, ArthurZ, mcpotato • Nov 25, 2025 • 389
The End of Manual Decoding: Towards Truly End-to-End Language Models Paper • 2510.26697 • Published Oct 30, 2025 • 120
view article Article Mixture of Experts Explained +4 osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k
view article Article You could have designed state of the art positional encoding FL33TW00D-HF • Nov 25, 2024 • 480
view article Article 🪆 Introduction to Matryoshka Embedding Models +1 tomaarsen, Xenova, osanseviero • Feb 23, 2024 • 208
view article Article seemore: Implement a Vision Language Model from Scratch AviSoori1x • Jun 23, 2024 • 109
view article Article Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval +1 aamirshakir, tomaarsen, SeanLee97 • Mar 22, 2024 • 134
view article Article Personal Copilot: Train Your Own Coding Assistant smangrul, sayakpaul • Oct 27, 2023 • 79