Article What's New in Mellea 0.4.0 + Granite Libraries Release ibm-granite • Mar 20 • 8
Article A New Framework for Evaluating Voice Agents (EVA) ServiceNow-AI • Mar 24 • 92
Article Mixture of Experts (MoEs) in Transformers ariG23498, pcuenq, merve, IlyasMoutawwakil, ArthurZ, sergiopaniego, Molbap • Feb 26 • 159
NVIDIA Nemotron V2 Collection Open, Production-ready Enterprise Models. Nvidia Open Model license. • 9 items • Updated 3 days ago • 104
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published Oct 6, 2025 • 514
Hugging Face Changelog Repositories total file size is now displayed Sep 18, 2025 • 175
Article Vision Language Model Alignment in TRL ⚡️ sergiopaniego, merve, qgallouedec, kashif, ariG23498 • Aug 7, 2025 • 111
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2, 2025 • 159
Article AI Policy @🤗: Response to the 2025 National AI R&D Strategic Plan evijit • Jun 2, 2025 • 14
Article How to generate text: using different decoding methods for language generation with Transformers patrickvonplaten • Mar 1, 2020 • 297
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 10 items • Updated Mar 2 • 562
BRAVE: Broadening the visual encoding of vision-language models Paper • 2404.07204 • Published Apr 10, 2024 • 20
Efficient Streaming Language Models with Attention Sinks Paper • 2309.17453 • Published Sep 29, 2023 • 14