Group Sequence Policy Optimization
Paper • 2507.18071 • Published • 316
LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization
Paper • 2507.15758 • Published • 35
Hierarchical Budget Policy Optimization for Adaptive Reasoning
Paper • 2507.15844 • Published • 16
Semi-off-Policy Reinforcement Learning for Vision-Language Slow-thinking Reasoning
Paper • 2507.16814 • Published • 21
Paipile
AI & ML interests
None yet
Recent Activity
submitted a paper about 21 hours ago
Can Textual Reasoning Improve the Performance of MLLMs on Fine-grained Visual Classification?
authored a paper about 22 hours ago
Can Textual Reasoning Improve the Performance of MLLMs on Fine-grained Visual Classification?
authored a paper about 22 hours ago
A Quality-Guided Mixture of Score-Fusion Experts Framework for Human Recognition
Organizations
None yet