Hao Zhuoyuan 郝卓远

larry2210

https://github.com/hhh2210

hzy2210

AI & ML interests

None yet

Organizations

None yet

Recent Activity

authored a paper about 5 hours ago

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Paper • 2606.04923 • Published 1 day ago • 34

upvoted a paper about 10 hours ago

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Paper • 2606.04923 • Published 1 day ago • 34

submitted a paper to Daily Papers about 11 hours ago

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Paper • 2606.04923 • Published 1 day ago • 34

New activity in MiniMaxAI/role-play-bench 14 days ago

What is the prompt used when using LLM-as-a-judge?

#2 opened 4 months ago by

submitted a paper to Daily Papers 4 months ago

Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning

Paper • 2602.06600 • Published Feb 6 • 3

authored a paper 4 months ago

Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning

Paper • 2602.06600 • Published Feb 6 • 3

upvoted a paper 4 months ago

Echoes as Anchors: Probabilistic Costs and Attention Refocusing in LLM Reasoning

Paper • 2602.06600 • Published Feb 6 • 3