Zhenghao Xu

zhenghaoxu

AI & ML interests

None yet

Recent Activity

upvoted an article about 2 months ago

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

commented on a paper 2 months ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

updated a dataset 2 months ago

zhenghaoxu/aime-beyond

Organizations

upvoted an article about 2 months ago

Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries

Article • Mar 10 • 143

commented on a paper 2 months ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 220

updated 2 datasets 2 months ago

zhenghaoxu/aime-beyond

Viewer • Updated Feb 22 • 100 • 5

zhenghaoxu/aime-amc23

Viewer • Updated Feb 22 • 40 • 8

published a dataset 2 months ago

zhenghaoxu/aime-amc23

Viewer • Updated Feb 22 • 40 • 8

updated 4 datasets 2 months ago

updated a dataset 3 months ago

zhenghaoxu/dapo-math-17k

Viewer • Updated Feb 13 • 17.4k • 8

published 6 datasets 3 months ago

zhenghaoxu/dapo-math-17k

Viewer • Updated Feb 13 • 17.4k • 8

zhenghaoxu/aime-beyond

Viewer • Updated Feb 22 • 100 • 5

zhenghaoxu/aime-2026

Viewer • Updated Feb 22 • 30 • 10

zhenghaoxu/aime-2025

Viewer • Updated Feb 22 • 30 • 6

zhenghaoxu/aime-2024

Viewer • Updated Feb 22 • 30 • 3

zhenghaoxu/math-aime-eval

Viewer • Updated Feb 22 • 230 • 11

upvoted 2 papers 3 months ago

Approximation of Log-Partition Function in Policy Mirror Descent Induces Implicit Regularization for LLM Post-Training

Paper • 2602.05933 • Published Feb 5 • 6

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

Paper • 2602.01058 • Published Feb 1 • 44

liked 2 models 5 months ago

inclusionAI/LLaDA2.0-flash

Text Generation • 103B • Updated Dec 19, 2025 • 573 • 69

inclusionAI/LLaDA2.0-mini

Text Generation • 16B • Updated 25 days ago • 126k • 66
