1t4chi (YigeYuan)

liked a dataset 11 months ago

open-r1/DAPO-Math-17k-Processed

Viewer • Updated Nov 10, 2025 • 34.8k • 7.14k • 70

liked a Space over 1 year ago

Reward Bench Leaderboard

📐

430

Explore and compare model scores on RewardBench benchmarks

liked 4 models over 1 year ago

liked 3 datasets over 1 year ago

PKU-Alignment/PKU-SafeRLHF

Viewer • Updated Oct 18, 2024 • 164k • 13.9k • 187

PKU-Alignment/PKU-SafeRLHF-10K

Viewer • Updated Jul 20, 2023 • 10k • 1.61k • 62

unalignment/toxic-dpo-v0.2

Viewer • Updated Jan 9, 2024 • 541 • 334 • 141

liked a model over 1 year ago

HelpingAI/HelpingAI-9B

Text Generation • 9B • Updated Oct 31, 2024 • 84 • 26

liked 2 datasets over 1 year ago

rngusry/UltraFeedback-honesty-preferences

Viewer • Updated Aug 3, 2024 • 251k • 27 • 1

rngusry/UltraFeedback-truthfulness-preferences

Viewer • Updated Jul 25, 2024 • 217k • 28 • 1

liked 2 models over 1 year ago

jointpreferences/mistral_7b_sft_helpful

Text Generation • 7B • Updated Apr 2, 2024 • 2 • 1

GraySwanAI/Mistral-7B-Instruct-RR

Text Generation • 7B • Updated Jul 9, 2024 • 270 • 5

RLHFlow/RewardModel-Mistral-7B-for-DPA-v1

allenai/tulu-v2.5-dpo-13b-hh-rlhf

allenai/tulu-2-dpo-13b

PKU-Alignment/beaver-7b-v1.0
