Unveiling Implicit Advantage Symmetry: Why GRPO Struggles with Exploration and Difficulty Adaptation Paper • 2602.05548 • Published Feb 5 • 12
OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention Paper • 2602.05847 • Published Feb 5 • 12
OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention Paper • 2602.05847 • Published Feb 5 • 12
EvoFSM: Controllable Self-Evolution for Deep Research with Finite State Machines Paper • 2601.09465 • Published Jan 14 • 41
OneThinker: All-in-one Reasoning Model for Image and Video Paper • 2512.03043 • Published Dec 2, 2025 • 33
VisMem: Latent Vision Memory Unlocks Potential of Vision-Language Models Paper • 2511.11007 • Published Nov 14, 2025 • 15
VisRL: Intention-Driven Visual Perception via Reinforced Reasoning Paper • 2503.07523 • Published Mar 10, 2025 • 1