Shenzhi Yang

Shenzhi

AI & ML interests

None yet

Organizations

None yet

upvoted a collection about 6 hours ago

WTF GENIUS PAPERS

Collection

Papers that made me appreciate my major and my life a little more. obs=Observation, innov=Innovation. Most papers are about improving tiny models. • 167 items • Updated about 14 hours ago • 33

upvoted a paper 1 day ago

OPRD: On-Policy Representation Distillation

Paper • 2606.06021 • Published 2 days ago • 6

submitted a paper to Daily Papers 1 day ago

OPRD: On-Policy Representation Distillation

Paper • 2606.06021 • Published 2 days ago • 6

liked a dataset 16 days ago

Keven16/G-OPD-Training-Data

Viewer • Updated Feb 17 • 134k • 795 • 2

upvoted an article about 1 month ago

From GRPO to DAPO and GSPO: What, Why, and How

Article • NormalUhr • Aug 9, 2025 • 121

commented on 3 papers about 2 months ago

upvoted a paper about 2 months ago

Can LLMs Learn to Reason Robustly under Noisy Supervision?

Paper • 2604.03993 • Published Apr 5 • 43

authored a paper about 2 months ago

Can LLMs Learn to Reason Robustly under Noisy Supervision?

Paper • 2604.03993 • Published Apr 5 • 43

submitted a paper to Daily Papers 2 months ago

Can LLMs Learn to Reason Robustly under Noisy Supervision?

Paper • 2604.03993 • Published Apr 5 • 43

commented on a paper 3 months ago

Seeing What Matters: Visual Preference Policy Optimization for Visual Generation

Paper • 2511.18719 • Published Nov 24, 2025 • 1

upvoted a paper 5 months ago

Your Group-Relative Advantage Is Biased

Paper • 2601.08521 • Published Jan 13 • 158

submitted a paper to Daily Papers 6 months ago

TraPO: A Semi-Supervised Reinforcement Learning Framework for Boosting LLM Reasoning

Paper • 2512.13106 • Published Dec 15, 2025 • 4

authored a paper 6 months ago

TraPO: A Semi-Supervised Reinforcement Learning Framework for Boosting LLM Reasoning

Paper • 2512.13106 • Published Dec 15, 2025 • 4

upvoted a paper 6 months ago

TraPO: A Semi-Supervised Reinforcement Learning Framework for Boosting LLM Reasoning

Paper • 2512.13106 • Published Dec 15, 2025 • 4

liked a dataset about 1 year ago

DMindAI/DMind_Benchmark

Viewer • Updated Feb 8 • 3.15k • 1.2k • 82
