UDM-GRPO: Stable and Efficient Group Relative Policy Optimization for Uniform Discrete Diffusion Models Paper • 2604.18518 • Published 22 days ago