Article The Open Source Community is backing OpenEnv for Agentic RL +17 burtenshaw, spisakjo, lysandre, darktex, willcb, qjoy, pawalt, cwing-nv, danielhanchen, andrewzhou, thegovind, shimmyshimmer, Hamid-Nazeri, Sanyam, zkwentz, emre0, lewtun, sergiopaniego, banghua • 21 days ago • 99
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published Apr 29 • 112
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-Base-BF16 Text Generation • 124B • Updated Mar 14 • 28.1k • 32
Article Profiling in PyTorch (Part 2): From nn.Linear to a Fused MLP +3 ariG23498, ror, sergiopaniego, pcuenq, sayakpaul • 18 days ago • 50
Article Unlocking asynchronicity in continuous batching +1 ror, pcuenq, ariG23498 • May 14 • 61
Article KV Cache from scratch in nanoVLM +3 ariG23498, kashif, lusxvr, andito, pcuenq • Jun 4, 2025 • 120
Article Continuous batching from first principles +1 ror, ArthurZ, mcpotato • Nov 25, 2025 • 411
Article KV Caching Explained: Optimizing Transformer Inference Efficiency not-lain • Jan 30, 2025 • 355
Article Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler +3 ariG23498, sayakpaul, sergiopaniego, ror, pcuenq • May 29 • 129
Efficient Memory Management for Large Language Model Serving with PagedAttention Paper • 2309.06180 • Published Sep 12, 2023 • 60
Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention Paper • 2605.22791 • Published May 21 • 33