haoyu wang

haoyuw

AI & ML interests

None yet

Recent Activity

authored a paper 25 days ago

RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains

Paper • 2605.29156 • Published about 1 month ago • 14

liked a dataset 27 days ago

OpenRubrics/RubricARROW-Judge-SFT

Viewer • Updated 28 days ago • 119k • 251 • 4

upvoted a paper 28 days ago

RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains

Paper • 2605.29156 • Published about 1 month ago • 14

authored a paper 5 months ago

Alternating Reinforcement Learning for Rubric-Based Reward Modeling in Non-Verifiable LLM Post-Training

Paper • 2602.01511 • Published Feb 2 • 15

upvoted a paper 5 months ago

Alternating Reinforcement Learning for Rubric-Based Reward Modeling in Non-Verifiable LLM Post-Training

Paper • 2602.01511 • Published Feb 2 • 15

upvoted a paper 7 months ago

Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning

Paper • 2511.19900 • Published Nov 25, 2025 • 49

upvoted a collection 9 months ago

RubricRM

Collection

Reward Models trained with OpenRubric. • 4 items • Updated Mar 2 • 2

authored a paper 9 months ago

OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment

Paper • 2510.07743 • Published Oct 9, 2025 • 14

upvoted 2 papers 9 months ago

OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment

Paper • 2510.07743 • Published Oct 9, 2025 • 14

AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play

Paper • 2509.24193 • Published Sep 29, 2025 • 7

liked a model 11 months ago

rubricreward/R3-Qwen3-14B-4k

Text Generation • 15B • Updated May 21, 2025 • 8 • 5

updated a dataset 12 months ago

haoyuw/cn_math_2024

Viewer • Updated Jun 30, 2025 • 30 • 7

published a dataset 12 months ago

haoyuw/cn_math_2024

Viewer • Updated Jun 30, 2025 • 30 • 7

updated a dataset about 1 year ago

haoyuw/aime

Viewer • Updated May 22, 2025 • 30 • 7

published a dataset about 1 year ago

haoyuw/aime

Viewer • Updated May 22, 2025 • 30 • 7

updated a dataset about 1 year ago

haoyuw/minerva

Viewer • Updated May 7, 2025 • 272 • 5

published a dataset about 1 year ago

haoyuw/minerva

Viewer • Updated May 7, 2025 • 272 • 5

updated a dataset about 1 year ago

haoyuw/olympiad_bench

Viewer • Updated May 7, 2025 • 675 • 4

published a dataset about 1 year ago

haoyuw/olympiad_bench

Viewer • Updated May 7, 2025 • 675 • 4

updated a dataset over 1 year ago

haoyuw/minervamath_latex

Viewer • Updated Mar 24, 2025 • 272 • 14
