Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published 24 days ago • 90
Self-Distillation Zero: Self-Revision Turns Binary Rewards into Dense Supervision Paper • 2604.12002 • Published 24 days ago • 10
Vision2Web: A Hierarchical Benchmark for Visual Website Development with Agent Verification Paper • 2603.26648 • Published Mar 27 • 42