Xiaobo Wang

Yofuria

https://yofuria.github.io/

AI & ML interests

Reward Modeling, Agent Memory, LLM Alignment

Recent Activity

upvoted a paper 16 days ago

PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives

Paper • 2505.19558 • Published May 26, 2025 • 1

updated 4 collections 16 days ago

ICE

In-Context Editing: Learning Knowledge from Self-Induced Distributions • 2 items • Updated 16 days ago

PoliCon

PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives • 2 items • Updated 16 days ago

UAPO

Adaptive Preference Optimization with Uncertainty-aware Utility Anchor • 4 items • Updated 16 days ago

SAVE

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement • 4 items • Updated 16 days ago

updated a collection 22 days ago

UAPO

Adaptive Preference Optimization with Uncertainty-aware Utility Anchor • 4 items • Updated 16 days ago

authored a paper 22 days ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Paper • 2605.30888 • Published 26 days ago • 10

updated 2 collections 23 days ago

ICE

In-Context Editing: Learning Knowledge from Self-Induced Distributions • 2 items • Updated 16 days ago

SAVE

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement • 4 items • Updated 16 days ago

upvoted a paper 23 days ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Paper • 2605.30888 • Published 26 days ago • 10

submitted a paper to Daily Papers 23 days ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

Paper • 2605.30888 • Published 26 days ago • 10

updated a dataset about 1 month ago

Yofuria/UltraFeedback-binarized-ms-swift-1024

Viewer • Updated May 12 • 38.9k • 20

published a dataset about 1 month ago

Yofuria/UltraFeedback-binarized-ms-swift-1024

Viewer • Updated May 12 • 38.9k • 20

updated a dataset about 2 months ago

Yofuria/UltraFeedback-ms-swift-1024

Viewer • Updated Apr 27 • 41k • 58

published a dataset about 2 months ago

Yofuria/UltraFeedback-ms-swift-1024

Viewer • Updated Apr 27 • 41k • 58

updated a collection 2 months ago

PoliCon

PoliCon: Evaluating LLMs on Achieving Diverse Political Consensus Objectives • 2 items • Updated 16 days ago