G-Zero: Self-Play for Open-Ended Generation from Zero Data • Paper • 2605.09959 • Published 2 days ago • 13
Training Data Efficiency in Multimodal Process Reward Models • Paper • 2602.04145 • Published Feb 4 • 79