view article Article ⚡ nano-vLLM: Lightweight, Low-Latency LLM Inference from Scratch zamal • Jun 28, 2025 • 41
view article Article Continuous batching from first principles +1 ror, ArthurZ, mcpotato • Nov 25, 2025 • 385
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 311
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 41 items • Updated Mar 2 • 152
view article Article A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes ybelkada, timdettmers • Aug 17, 2022 • 132