GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 23 days ago • 211
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 26, 2025 • 121
Nemotron-Flash: Towards Latency-Optimal Hybrid Small Language Models Paper • 2511.18890 • Published Nov 24, 2025 • 34
view article Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks Aug 11, 2025 • 75
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation Paper • 2410.01680 • Published Oct 2, 2024 • 34
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models Paper • 2409.17481 • Published Sep 26, 2024 • 47
LLM Pruning and Distillation in Practice: The Minitron Approach Paper • 2408.11796 • Published Aug 21, 2024 • 58
Compact Language Models via Pruning and Knowledge Distillation Paper • 2407.14679 • Published Jul 19, 2024 • 39
RADIO Collection A collection of Foundation Vision Models that combine multiple models (CLIP, DINOv2, SAM, etc.). • 18 items • Updated 2 days ago • 29
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 2 days ago • 62