From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 9 days ago • 151
Heterogeneous Scientific Foundation Model Collaboration Paper • 2604.27351 • Published 12 days ago • 210
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published 13 days ago • 104
Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence Paper • 2604.24954 • Published 15 days ago • 21
Article DeepSeek-V4: a million-token context that agents can actually use burtenshaw • 18 days ago • 43
LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories Paper • 2604.15311 • Published 26 days ago • 13
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 27 days ago • 157
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 28 days ago • 91
Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs nielsr • Apr 7 • 61
Article Welcome Gemma 4: Frontier multimodal intelligence on device merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift +5 • Apr 2 • 890
GPT-1900 Collection Pre-1900 LLMs for physics reasoning. RL models are physics-only; use the SFT model for general chat. Recommended temperature: 0.6–0.7. • 11 items • Updated Apr 2 • 9
Nemotron-Post-Training-v3 Collection Collection of datasets used in the post-training phase of Nemotron Nano and Super v3. • 28 items • Updated 3 days ago • 138
Article Nemotron 3 Nano 4B: A Compact Hybrid Model for Efficient Local AI nvidia • Mar 17 • 63