arxiv:2606.11025
Xiangxin Zhou
zhouxiangxin
AI & ML interests
None yet
Recent Activity
authored a paper 13 days ago
Rethinking the Divergence Regularization in LLM RL
authored a paper 13 days ago
Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models
authored a paper 13 days ago
Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning