Breaking Failure Cascades: Step-Aware Reinforcement Learning for Medical Multimodal Reasoning
Paper • 2606.31825 • Published • 14
This collection hosts the MRPO series introduced in the paper "Breaking Failure Cascades: Step-Aware Reinforcement Learning for Medical Multimodal Reasoning".