view article Article makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch May 7, 2024 • 117
view article Article Prefill and Decode for Concurrent Requests - Optimizing LLM Performance Apr 16, 2025 • 61
view article Article Building the Open Agent Ecosystem Together: Introducing OpenEnv +8 Oct 23, 2025 • 148
view article Article How to Choose the Best Open Source LLM for Your Project in 2025 Sep 9, 2025 • 75
view article Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 Sep 11, 2025 • 179
view article Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels Aug 18, 2025 • 91
view article Article Welcome EmbeddingGemma, Google's new efficient embedding model +4 Sep 4, 2025 • 273