view article Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 909
Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models Paper • 2602.07026 • Published Feb 2 • 140
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano, Super, and Ultra v3. • 50 items • Updated 11 days ago • 163
Running 3.9k The Ultra-Scale Playbook 🌌 3.9k The ultimate guide to training LLM on large GPU Clusters
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 Text Generation • 253B • Updated Oct 15, 2025 • 3.65k • • 352