Article: EMO: Pretraining mixture of experts for emergent modularity • allenai • 5 days ago • 30
The ultimate guide to RL environments: building and scaling them in the LLM era • Building and scaling RL environments for LLM training • Running • 153
Article: Building a Fast Multilingual OCR Model with Synthetic Data • nvidia • 26 days ago • 33
Collection: DFlash • Block Diffusion for Flash Speculative Decoding • 21 items • Updated 4 days ago • 110
nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 Any-to-Any • 33B • Updated 5 days ago • 203k • 281