Offline Preference Optimization for Rectified Flow with Noise-Tracked Pairs • Paper • 2605.09433 • Published May 10 • 6
Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation • Paper • 2512.04678 • Published Dec 4, 2025 • 42
UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning • Paper • 2503.21620 • Published Mar 27, 2025 • 62