How We Built a Semantic Highlight Model To Save Token Cost for RAG Article • zilliz • Jan 15 • 67
I Built a RAG System That Listens to Live BBC News and Answers Questions About "What Happened 10 Minutes Ago" Article • RakshitAralimatti • Dec 9, 2025 • 14
Continuous batching from first principles Article • ror, ArthurZ, mcpotato • Nov 25, 2025 • 396
Supercharge your OCR Pipelines with Open Models Article • merve, ariG23498, davanstrien, hynky, andito, reach-vb, pcuenq • Oct 21, 2025 • 314
Deploy NVIDIA Riva ASR on Kubernetes GPU Cluster in Just 5 Minutes Article • RakshitAralimatti • Sep 5, 2025 • 1
What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware Article • RakshitAralimatti • Aug 8, 2025 • 35
Tiny Agents in Python: a MCP-powered agent in ~70 lines of code Article • celinah, julien-c, Wauplin, evalstate • May 23, 2025 • 172
SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments Paper • 2410.11331 • Published Oct 15, 2024 • 8
Fine-Tuning Small Language Models for Domain-Specific AI: An Edge AI Perspective Paper • 2503.01933 • Published Mar 3, 2025 • 13