Breaking Entropy Bounds: Accelerating RL Training via MTP with Rejection Sampling Paper • 2606.12370 • Published 2 days ago • 16 • 3