How to use kanishkez/Reward-Model with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="kanishkez/Reward-Model")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("kanishkez/Reward-Model")
model = AutoModelForSequenceClassification.from_pretrained("kanishkez/Reward-Model")

This is a 3B reward model fine-tuned from Qwen 2.5 3B using Anthropic HH-RLHF data.
It is designed to score model outputs for alignment and quality, and can be used with RewardBench for evaluation.
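
As a rough illustration of what scoring a model output could look like, the sketch below wraps the tokenizer and model in a small helper. Two things here are assumptions rather than facts from this card: that inputs should be rendered in the HH-RLHF transcript style (`Human:` / `Assistant:` turns), and that the classification head emits a single scalar logit per sequence. The helper names `format_hh_transcript`, `score_response`, and `demo` are hypothetical.

```python
def format_hh_transcript(prompt: str, response: str) -> str:
    """Render one exchange in the HH-RLHF transcript style (assumed input format)."""
    return f"\n\nHuman: {prompt}\n\nAssistant: {response}"


def score_response(model, tokenizer, prompt: str, response: str) -> float:
    """Return the reward model's scalar score for one prompt/response pair."""
    import torch  # imported lazily so the formatting helper works without torch

    text = format_hh_transcript(prompt, response)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Assumption: the head produces one logit per sequence, read here as the reward.
    return logits[0, 0].item()


def demo() -> None:
    """Download the checkpoint and compare two candidate answers to one prompt."""
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    name = "kanishkez/Reward-Model"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name)
    model.eval()

    prompt = "How do I boil an egg?"
    first = score_response(model, tokenizer, prompt, "Simmer it for about ten minutes.")
    second = score_response(model, tokenizer, prompt, "I don't know.")
    print(f"first: {first:.3f}  second: {second:.3f}")
```

Calling `demo()` downloads the full checkpoint, so it is left uninvoked here. If the config turns out to use more than one label, the indexing in `score_response` would need to change.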
| Category | Score |
|---|---|
| Chat | 83.5% |
| Chat Hard | 53.2% |
| Safety | 72.2% |
| Reasoning | 73.4% |
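
The percentages above read like pairwise accuracies in the RewardBench style: the share of (chosen, rejected) pairs where the reward model scores the chosen response higher. That interpretation is an assumption, since the card does not say how the numbers were computed. Under that assumption, the metric can be sketched as follows; `pairwise_accuracy` is a hypothetical helper and the scores in `example` are made up for illustration, not produced by this model.

```python
from typing import Iterable, Tuple


def pairwise_accuracy(score_pairs: Iterable[Tuple[float, float]]) -> float:
    """Fraction of (chosen, rejected) score pairs where chosen is strictly higher."""
    pairs = list(score_pairs)
    if not pairs:
        raise ValueError("need at least one (chosen, rejected) pair")
    wins = sum(1 for chosen, rejected in pairs if chosen > rejected)
    return wins / len(pairs)


# Illustrative scores only: three of the four pairs are ranked correctly.
example = [(1.2, -0.3), (0.1, 0.4), (2.0, 1.9), (-0.5, -0.7)]
print(f"{pairwise_accuracy(example):.1%}")  # prints 75.0%
```

Ties count as misses in this sketch; a different tie-handling rule would shift the result slightly.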
Base model: Qwen/Qwen2.5-3B