The ultimate guide to RL environments: building and scaling them in the LLM era • Building and scaling RL environments for LLM training • Running • 152
nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 Any-to-Any • 33B • Updated 5 days ago • 203k • 280
Distilling 100B+ Models 40x Faster with TRL • TRL distillation for 100B+ teachers, 40x faster • Running • Featured • 81