MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head • Paper • 2601.07832 • Published 4 days ago • 43
DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs • Paper • 2601.03559 • Published 10 days ago • 12
nvidia/nemotron-speech-streaming-en-0.6b • Automatic Speech Recognition • Updated 11 days ago • 4.94k • 389
💧 LFM2 • Collection • LFM2 is a new generation of hybrid models, designed for on-device deployment. • 27 items • Updated 4 days ago • 136