Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
Paper
• 2502.14768
• Published
• 47
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
Paper
• 2502.12853
• Published
• 29
Diverse Inference and Verification for Advanced Reasoning
Paper
• 2502.09955
• Published
• 18
Distillation Scaling Laws
Paper
• 2502.08606
• Published
• 47
Small Models Struggle to Learn from Strong Reasoners
Paper
• 2502.12143
• Published
• 39
OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
Paper
• 2502.11271
• Published
• 18
CRANE: Reasoning with constrained LLM generation
Paper
• 2502.09061
• Published
• 21
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper
• 2501.12948
• Published
• 441
Visual-RFT: Visual Reinforcement Fine-Tuning
Paper
• 2503.01785
• Published
• 86
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL
Paper
• 2503.07536
• Published
• 88
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models
Paper
• 2503.06749
• Published
• 31
Unified Reward Model for Multimodal Understanding and Generation
Paper
• 2503.05236
• Published
• 124
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model
Paper
• 2503.05132
• Published
• 57
START: Self-taught Reasoner with Tools
Paper
• 2503.04625
• Published
• 113
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper
• 2503.12937
• Published
• 30