Xiangxin Zhou

zhouxiangxin

https://zhouxiangxin1998.github.io/

Recent Activity

authored a paper 18 days ago

Rethinking the Divergence Regularization in LLM RL

authored a paper 18 days ago

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

authored a paper 18 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

authored 3 papers 18 days ago

submitted a paper to Daily Papers 18 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 20 days ago • 42

upvoted a paper 18 days ago

Reinforcing Few-step Generators via Reward-Tilted Distribution Matching

Paper • 2605.26108 • Published May 25 • 7

upvoted 2 papers 19 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 20 days ago • 42

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

Paper • 2606.11025 • Published 20 days ago • 41

commented on 2 papers 19 days ago

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 21 days ago • 33

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 21 days ago • 33

upvoted a paper 19 days ago

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 21 days ago • 33

submitted a paper to Daily Papers 19 days ago

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 21 days ago • 33

upvoted a collection 21 days ago

RTDMD

Collection

Reinforcing Few-step Generators via Reward-Tilted Distribution Matching • 5 items • Updated 26 days ago • 3

authored a paper 5 months ago

Rethinking the Trust Region in LLM Reinforcement Learning

Paper • 2602.04879 • Published Feb 4 • 38

upvoted a paper 5 months ago

Rethinking the Trust Region in LLM Reinforcement Learning

Paper • 2602.04879 • Published Feb 4 • 38

liked a model 7 months ago

GSAI-ML/LLaDA-8B-Base

Text Generation • 8B • Updated Oct 21, 2025 • 111k • 100

upvoted a paper 8 months ago

Diffusion Language Models are Super Data Learners

Paper • 2511.03276 • Published Nov 5, 2025 • 132

authored a paper 8 months ago

Defeating the Training-Inference Mismatch via FP16

Paper • 2510.26788 • Published Oct 30, 2025 • 32

upvoted a paper 8 months ago

Defeating the Training-Inference Mismatch via FP16

Paper • 2510.26788 • Published Oct 30, 2025 • 32

authored a paper 9 months ago

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 92

upvoted a paper 9 months ago

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 92
