Taming Preference Mode Collapse via Directional Decoupling Alignment in Diffusion Reinforcement Learning • Paper 2512.24146
DINOv3 Collection: foundation models producing excellent dense features, outperforming SotA without fine-tuning • https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21, 2025