haoyu wang

haoyuw

AI & ML interests

None yet

Recent Activity

authored a paper about 1 month ago

RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains

liked a dataset about 1 month ago

OpenRubrics/RubricARROW-Judge-SFT

upvoted a paper about 1 month ago

RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains

Paper • 2605.29156 • Published May 27 • 14

upvoted a paper 5 months ago

Alternating Reinforcement Learning for Rubric-Based Reward Modeling in Non-Verifiable LLM Post-Training

Paper • 2602.01511 • Published Feb 2 • 15

upvoted a paper 7 months ago

Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning

Paper • 2511.19900 • Published Nov 25, 2025 • 49

upvoted a collection 9 months ago

RubricRM

Reward Models trained with OpenRubric. • 4 items • Updated Mar 2 • 2

upvoted 2 papers 9 months ago

OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment

Paper • 2510.07743 • Published Oct 9, 2025 • 14

AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play

Paper • 2509.24193 • Published Sep 29, 2025 • 7

upvoted 2 papers over 1 year ago

RoseRAG: Robust Retrieval-augmented Generation with Small-scale LLMs via Margin-aware Preference Optimization

Paper • 2502.10993 • Published Feb 16, 2025 • 1

Mitigating Heterogeneous Token Overfitting in LLM Knowledge Editing

Paper • 2502.00602 • Published Feb 2, 2025 • 2

upvoted a paper about 2 years ago

RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning

Paper • 2406.10777 • Published Jun 16, 2024 • 2