ezetimibe

company

Recent Activity

zhouxiangxin authored a paper 15 days ago

Rethinking the Divergence Regularization in LLM RL

zhouxiangxin authored a paper 15 days ago

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

zhouxiangxin authored a paper 15 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

authored 3 papers 15 days ago

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 19 days ago • 33

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

Paper • 2606.11025 • Published 18 days ago • 41

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 18 days ago • 42

submitted 2 papers to Daily Papers 16 days ago

Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper • 2606.10968 • Published 18 days ago • 42

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 19 days ago • 33

authored a paper 5 months ago

Rethinking the Trust Region in LLM Reinforcement Learning

Paper • 2602.04879 • Published Feb 4 • 38

authored a paper 8 months ago

Defeating the Training-Inference Mismatch via FP16

Paper • 2510.26788 • Published Oct 30, 2025 • 32

authored 2 papers 9 months ago

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 92

Variational Reasoning for Language Models

Paper • 2509.22637 • Published Sep 26, 2025 • 70

authored a paper about 1 year ago

Reinforcing General Reasoning without Verifiers

Paper • 2505.21493 • Published May 27, 2025 • 27

authored a paper almost 2 years ago

ProteinBench: A Holistic Evaluation of Protein Foundation Models

Paper • 2409.06744 • Published Sep 10, 2024 • 8