Nemotron-Cascade 2 Collection Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation • 4 items • Updated about 2 hours ago • 31
Mamba-3: Improved Sequence Modeling using State Space Principles Paper • 2603.15569 • Published 7 days ago • 6
Mistral Small 4 Collection A state-of-the-art model, open-weight, with a granular Mixture-of-Experts architecture that fuses instruct, reasoning and agentic skills. • 3 items • Updated 7 days ago • 60
ECoLAD: Deployment-Oriented Evaluation for Automotive Time-Series Anomaly Detection Paper • 2603.10926 • Published 12 days ago • 1
Surprised by Attention: Predictable Query Dynamics for Time Series Anomaly Detection Paper • 2603.12916 • Published 11 days ago • 3
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 11 days ago • 63
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 15 items • Updated about 2 hours ago • 235
Hugging Face Changelog Introducing Buckets: S3-like storage on the Hub 13 days ago • 181
Test-Time Training with KV Binding Is Secretly Linear Attention Paper • 2602.21204 • Published 27 days ago • 30
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI Feb 20 • 489
REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents Paper • 2602.14234 • Published Feb 15 • 26