Moshi: a speech-text foundation model for real-time dialogue Paper • 2410.00037 • Published Sep 17, 2024 • 11
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published Apr 7, 2025 • 205
view article Article Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks +2 Nov 21, 2025 • 24
VibeVoice Collection Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/ • 9 items • Updated 9 days ago • 205
view article Article Building the Open Agent Ecosystem Together: Introducing OpenEnv +8 Oct 23, 2025 • 148
view article Article KV Caching Explained: Optimizing Transformer Inference Efficiency Jan 30, 2025 • 224
view article Article Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face Feb 11, 2025 • 97
view article Article Assisted Generation: a new direction toward low-latency text generation May 11, 2023 • 74
A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published Jul 17, 2025 • 260
view article Article A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes Aug 17, 2022 • 123