Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling Paper • 2605.13301 • Published May 13 • 165
GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents Paper • 2604.26752 • Published Apr 29 • 112
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published Apr 14 • 113
Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision Paper • 2604.12002 • Published Apr 13 • 12
Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification Paper • 2603.26648 • Published Mar 27 • 46
view article Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 910
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published Feb 11 • 201
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization Paper • 2601.05242 • Published Jan 8 • 233
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published Dec 31, 2025 • 155
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published Dec 18, 2025 • 97
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published Dec 19, 2025 • 29
CaptionQA: Is Your Caption as Useful as the Image Itself? Paper • 2511.21025 • Published Nov 26, 2025 • 29
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning Paper • 2511.22570 • Published Nov 27, 2025 • 96