Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries • 14 days ago • 71
Nemotron-Cascade 2 Collection Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation • 4 items • Updated 7 minutes ago • 31
Article Introducing SPEED-Bench: A Unified and Diverse Benchmark for Speculative Decoding • 4 days ago • 40
Mistral Small 4 Collection A state-of-the-art model, open-weight, with a granular Mixture-of-Experts architecture that fuses instruct, reasoning and agentic skills. • 3 items • Updated 7 days ago • 60
Bielik-11B-v3.0 Collection A collection of models based on Bielik-11B-v3.0 - instruct and quantized versions. • 5 items • Updated 5 days ago • 8
Hugging Face Changelog Introducing Buckets: S3-like storage on the Hub • 13 days ago • 181
Heterogeneous Agent Collaborative Reinforcement Learning Paper • 2603.02604 • Published 21 days ago • 188
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders Paper • 2603.06569 • Published 17 days ago • 114
Helios Collection Helios: 14B Real-Time Long Video Generation Model can be Cheaper, Faster but Keep Stronger than 1.3B ones • 7 items • Updated 8 days ago • 24
DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference Paper • 2602.21548 • Published 27 days ago • 47