Trained on the openai/webgpt_comparisons dataset.
How to use theblackcat102/roberta-base-webgpt-rm with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="theblackcat102/roberta-base-webgpt-rm")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("theblackcat102/roberta-base-webgpt-rm")
model = AutoModelForSequenceClassification.from_pretrained("theblackcat102/roberta-base-webgpt-rm")
```

Reward model fine-tuned from an existing pretrained model.
Things that aligned with the original papers:
- Overfits easily with rank loss
- Small learning rate
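The rank loss mentioned above is presumably the standard pairwise comparison loss used for reward models, `-log σ(r_chosen − r_rejected)`; a minimal sketch in plain Python (the function name is mine, not from this repo):

```python
import math

def pairwise_rank_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise rank loss: -log(sigmoid(r_chosen - r_rejected)).

    Small when the chosen answer's reward is well above the
    rejected one's; equals ln(2) when the two rewards are tied.
    """
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the chosen answer's reward pulls ahead:
print(pairwise_rank_loss(2.0, 0.0))   # small loss, correct ordering
print(pairwise_rank_loss(0.0, 0.0))   # ln(2) at a tie
print(pairwise_rank_loss(-1.0, 1.0))  # large loss, wrong ordering
```

Because each training pair contributes a full gradient step toward separating two scores, a small model can memorize pair orderings quickly, which is consistent with the overfitting noted above.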
Differences from the papers:
- The small model performs poorly due to its lack of world knowledge: validation accuracy doesn't even reach 60%. OpenAI's reward model had 6B parameters.
- Trained with an 80/20 train-validation split under torch AMP (automatic mixed precision).
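The 80/20 split can be sketched as follows (a hypothetical helper; the actual split code isn't published here, and the pair fields are illustrative):

```python
import random

def train_val_split(pairs, val_frac=0.2, seed=42):
    """Shuffle comparison pairs and split them into train/validation sets."""
    idx = list(range(len(pairs)))
    random.Random(seed).shuffle(idx)  # fixed seed for a reproducible split
    cut = int(len(pairs) * (1 - val_frac))
    train = [pairs[i] for i in idx[:cut]]
    val = [pairs[i] for i in idx[cut:]]
    return train, val

# Illustrative comparison pairs: (question, chosen answer, rejected answer)
pairs = [(f"question {i}", f"chosen {i}", f"rejected {i}") for i in range(100)]
train, val = train_val_split(pairs)
print(len(train), len(val))  # 80 20
```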
Other models I tried:
- bloomz-560m: the multilingual embedding matrix isn't worth the training cost, since this dataset contains only English prompts
- gpt2-large: training was not stable
- gpt2-base: training was not stable
| model | val acc (%) | val loss (rank loss) |
|---|---|---|
| roberta-base | 56.21 | 0.71 |
| roberta-large | 57.89 | 0.67 |
| electra-base | 57.02 | 0.70 |
| electra-large | 58.75 | 0.69 |
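For context on the table: validation accuracy for a pairwise reward model is typically the fraction of comparison pairs where the chosen answer receives the strictly higher reward. A minimal sketch (the function name and scores are illustrative, not from this repo):

```python
def pairwise_accuracy(reward_pairs):
    """Fraction of pairs where the chosen answer scored strictly
    higher than the rejected one. 50% is chance level, which is why
    accuracies below 60% indicate a weak reward model."""
    correct = sum(1 for chosen, rejected in reward_pairs if chosen > rejected)
    return correct / len(reward_pairs)

# Hypothetical reward outputs for four comparison pairs:
scores = [(1.2, 0.3), (0.1, 0.9), (2.0, 1.5), (0.4, -0.2)]
print(pairwise_accuracy(scores))  # 0.75
```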
TensorBoard logs are located under `runs/`.