GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published 26 days ago • 217
view article Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 Sep 11, 2025 • 178
view article Article Efficient LLM Pretraining: Packed Sequences and Masked Attention Oct 7, 2024 • 64
view article Article Nemotron 3 Nano \- A new Standard for Efficient, Open, and Intelligent Agentic Models Dec 15, 2025 • 106
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 295
view article Article Aligning to What? Rethinking Agent Generalization in MiniMax M2 Oct 30, 2025 • 42
view article Article nanoVLM: The simplest repository to train your VLM in pure PyTorch +5 May 21, 2025 • 251
view article Article Welcome GPT OSS, the new open-source model family from OpenAI! +10 Aug 5, 2025 • 510