Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering Paper • 2604.08224 • Published Apr 9 • 51
Article Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval +1 aamirshakir, tomaarsen, SeanLee97 • Mar 22, 2024 • 134
Article Continuous batching from first principles +1 ror, ArthurZ, mcpotato • Nov 25, 2025 • 391
The Smol Training Playbook 📚 Space • The secrets to building world-class LLMs • 3.18k
The Ultra-Scale Playbook 🌌 Space • The ultimate guide to training LLM on large GPU Clusters • 3.85k
LLM Embeddings Explained: A Visual and Intuitive Guide 🚀 Space • How Language Models Turn Text into Meaning, From Traditional • 343
Article Gemma 3n fully available in the open-source ecosystem! +6 ariG23498, pcuenq, sergiopaniego, reach-vb, FL33TW00D-HF, Xenova, Steveeeeeeen, kashif • Jun 26, 2025 • 121
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5, 2025 • 233
Article Introducing smolagents: simple agents that write actions in code. +1 m-ric, merve, thomwolf • Dec 31, 2024 • 1.2k
AI Engineering Collection A collection of arXiv papers from Chip Huyen's AI Engineering organized by chapter and ordered by when each appears in the book. • 239 items • Updated Mar 29, 2025 • 26