-
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper • 2407.20798 • Published • 24 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38 -
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper • 2501.03262 • Published • 104 -
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75
Collections
Discover the best community collections!
Collections including paper arxiv:2601.02256
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 102 -
Robot Learning from a Physical World Model
Paper • 2511.07416 • Published • 30 -
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning
Paper • 2511.06805 • Published • 12 -
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms
Paper • 2511.17592 • Published • 118
-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 199 -
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training
Paper • 2508.00414 • Published • 93 -
Continuous Autoregressive Language Models
Paper • 2510.27688 • Published • 72 -
MiMo-Embodied: X-Embodied Foundation Model Technical Report
Paper • 2511.16518 • Published • 25
-
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning
Paper • 2512.02835 • Published • 9 -
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Paper • 2512.05044 • Published • 16 -
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning
Paper • 2512.05591 • Published • 16 -
SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
Paper • 2512.05343 • Published • 24
-
Hyperspherical Latents Improve Continuous-Token Autoregressive Generation
Paper • 2509.24335 • Published • 8 -
NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation
Paper • 2601.02204 • Published • 56 -
VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation
Paper • 2601.02256 • Published • 32
-
Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Paper • 2407.20798 • Published • 24 -
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper • 2412.16145 • Published • 38 -
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper • 2501.03262 • Published • 104 -
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper • 2502.18449 • Published • 75
-
ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning
Paper • 2512.02835 • Published • 9 -
Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
Paper • 2512.05044 • Published • 16 -
Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning
Paper • 2512.05591 • Published • 16 -
SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
Paper • 2512.05343 • Published • 24
-
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 102 -
Robot Learning from a Physical World Model
Paper • 2511.07416 • Published • 30 -
MathSE: Improving Multimodal Mathematical Reasoning via Self-Evolving Iterative Reflection and Reward-Guided Fine-Tuning
Paper • 2511.06805 • Published • 12 -
GigaEvo: An Open Source Optimization Framework Powered By LLMs And Evolution Algorithms
Paper • 2511.17592 • Published • 118
-
Hyperspherical Latents Improve Continuous-Token Autoregressive Generation
Paper • 2509.24335 • Published • 8 -
NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation
Paper • 2601.02204 • Published • 56 -
VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation
Paper • 2601.02256 • Published • 32
-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 199 -
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training
Paper • 2508.00414 • Published • 93 -
Continuous Autoregressive Language Models
Paper • 2510.27688 • Published • 72 -
MiMo-Embodied: X-Embodied Foundation Model Technical Report
Paper • 2511.16518 • Published • 25