Nazzaroth2's Collections: VLM RL Reasoning
OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement
Paper
• 2503.17352
• Published • 24
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation
Paper
• 2503.16660
• Published • 72
CoMP: Continual Multimodal Pre-training for Vision Foundation Models
Paper
• 2503.18931
• Published • 30
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
Paper
• 2503.13964
• Published • 20
Qwen2.5-Omni Technical Report
Paper
• 2503.20215
• Published • 172
ViLBench: A Suite for Vision-Language Process Reward Modeling
Paper
• 2503.20271
• Published • 7
Video-R1: Reinforcing Video Reasoning in MLLMs
Paper
• 2503.21776
• Published • 79
Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
Paper
• 2504.02587
• Published • 32
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement
Paper
• 2504.07934
• Published • 21
Efficient Medical VIE via Reinforcement Learning
Paper
• 2506.13363
• Published • 31
Aha Moment Revisited: Are VLMs Truly Capable of Self Verification in Inference-time Scaling?
Paper
• 2506.17417
• Published • 11