Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients Paper • 2606.18216 • Published 9 days ago • 60
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published 29 days ago • 93
view article Article Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models nvidia • May 23 • 34
view article Article Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models nvidia • May 23 • 34
Nemotron-Labs-Diffusion Collection A Tri-Mode Language Model Family Unifying Autoregressive, Diffusion, and Self-Speculation Decoding • 7 items • Updated 13 days ago • 49