Hao Peng

Wesleythu

h-peng17

AI & ML interests

None yet

Recent Activity

upvoted a paper 12 days ago

EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery

Paper • 2606.13662 • Published 13 days ago • 27

upvoted a paper 20 days ago

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Paper • 2606.04923 • Published 21 days ago • 40

upvoted a paper 23 days ago

LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards

Paper • 2605.31584 • Published 26 days ago • 42

upvoted a paper 3 months ago

IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse

Paper • 2603.12201 • Published Mar 12 • 60

liked a dataset 4 months ago

Lossfunk/ISO-Bench

Viewer • Updated Feb 26 • 54 • 21 • 2

updated a collection 4 months ago

WildReward

Learning Reward Models from In-the-Wild Interactions • 4 items • Updated Mar 2 • 2

updated 2 models 4 months ago

THU-KEG/WildReward-8B

Text Classification • 8B • Updated Feb 26 • 9 • 3

THU-KEG/WildReward-4B

Text Classification • 4B • Updated Feb 26 • 15 • 4

liked a dataset 4 months ago

THU-KEG/WildFB

Updated Feb 26 • 35 • 3

updated a dataset 4 months ago

THU-KEG/WildFB

Updated Feb 26 • 35 • 3

published a dataset 4 months ago

THU-KEG/WildFB

Updated Feb 26 • 35 • 3

upvoted a paper 4 months ago

WildReward: Learning Reward Models from In-the-Wild Human Interactions

Paper • 2602.08829 • Published Feb 9 • 3

submitted a paper to Daily Papers 4 months ago

WildReward: Learning Reward Models from In-the-Wild Human Interactions

Paper • 2602.08829 • Published Feb 9 • 3

upvoted a collection 5 months ago

WildReward

Learning Reward Models from In-the-Wild Interactions • 4 items • Updated Mar 2 • 2

liked 2 models 5 months ago

THU-KEG/WildReward-8B

Text Classification • 8B • Updated Feb 26 • 9 • 3

THU-KEG/WildReward-4B

Text Classification • 4B • Updated Feb 26 • 15 • 4

updated a collection 5 months ago

WildReward

Learning Reward Models from In-the-Wild Interactions • 4 items • Updated Mar 2 • 2