human preference dataset
stanfordnlp/SHP-2 • Updated Jan 11, 2024 • 4.07M • 1.61k • 17
Anthropic/hh-rlhf • Updated May 26, 2023 • 169k • 28.5k • 1.81k
OpenMOSS-Team/hh-rlhf-strength-cleaned • Updated Jan 31, 2024 • 168k • 51 • 23
heegyu/hh-rlhf-vicuna-format • Updated Sep 6, 2023 • 169k • 5 • 4
RM
OpenAssistant/reward-model-deberta-v3-large-v2 • Text Classification • Updated Feb 1, 2023 • 10.3k • 247