Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries 5 days ago • 50
Teaching Models to Teach Themselves: Reasoning at the Edge of Learnability Paper • 2601.18778 • Published Jan 26 • 41
Aya Datasets Collection The Aya Collection is a massive multilingual collection for over 100 languages consisting of 513 million instances of prompts and completions. • 5 items • Updated Jul 31, 2025 • 27
Inference Optimized Checkpoints (with Model Optimizer) Collection A collection of generative models quantized and optimized for inference with Model Optimizer. • 54 items • Updated 3 days ago • 118
ECHO-2: A Large-Scale Distributed Rollout Framework for Cost-Efficient Reinforcement Learning Paper • 2602.02192 • Published Feb 2 • 12
Surprisal Guided Selection Collection Training at test-time for kernel optimization • 2 items • Updated about 1 month ago • 1
OpenSec: Incident Response Agent Calibration Collection OpenSec is a dual-control RL environment, dataset, and evaluation suite that measures agent calibration on incident response tasks. • 4 items • Updated about 1 month ago • 1
Surprisal-Guided Selection: Compute-Optimal Test-Time Strategies for Execution-Grounded Code Generation Paper • 2602.07670 • Published Feb 7 • 1
Article Where should test-time compute go? Surprisal-guided selection in verifiable environments Feb 7 • 1
Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents Paper • 2601.18217 • Published Jan 26 • 13
OpenSec: Measuring Incident Response Agent Calibration Under Adversarial Evidence Paper • 2601.21083 • Published Jan 28 • 1
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano and Super v3. • 26 items • Updated 4 days ago • 90
PhysRVG: Physics-Aware Unified Reinforcement Learning for Video Generative Models Paper • 2601.11087 • Published Jan 16 • 11