---
base_model: Qwen/Qwen3-8B
library_name: peft
license: apache-2.0
tags:
- reward-model
- rlhf
- humor
- qwen
- lora
---
# JokeGPT - Reward Model

This is the reward model for JokeGPT. It is trained to evaluate the humor quality of a given text and outputs a scalar score.
## Model Details
- Base Model: Qwen/Qwen3-8B (initialized from SFT weights)
- Training Method: Reward Modeling (LoRA)
- Task: Sequence Classification (Score Prediction)
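Reward models with a single-score head are commonly trained on preference pairs with a Bradley-Terry style ranking loss, which pushes the chosen (funnier) completion's score above the rejected one's. The card does not specify the exact training objective, so the sketch below shows this standard formulation only as an illustration; the function name is hypothetical.

```python
import math

def pairwise_reward_loss(score_chosen: float, score_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss shrinks as the margin between the chosen and rejected
    completions' scalar scores grows.
    """
    margin = score_chosen - score_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A larger margin yields a smaller loss
print(pairwise_reward_loss(2.0, 0.0) < pairwise_reward_loss(0.5, 0.0))  # True
```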
## Purpose
This model is used during the PPO (Proximal Policy Optimization) phase to provide feedback to the generation model, guiding it towards more humorous outputs.
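In typical RLHF pipelines, the raw reward-model score is combined with a KL penalty against the frozen SFT reference policy so the generator does not drift into degenerate text while chasing reward. This card does not specify the exact shaping used; the sketch below shows the common formulation, with the function name and `beta` value purely illustrative.

```python
def shaped_reward(rm_score: float, logprob_policy: float,
                  logprob_ref: float, beta: float = 0.1) -> float:
    """Common RLHF reward shaping: reward-model score minus a KL penalty term.

    The penalty grows when the policy assigns its generated tokens much
    higher log-probability than the reference (SFT) model does.
    """
    return rm_score - beta * (logprob_policy - logprob_ref)

# When policy and reference agree, the shaped reward equals the raw score
print(shaped_reward(1.5, -2.0, -2.0))  # 1.5
```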
## Usage
```python
import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the base model with a single-score classification head
model = AutoModelForSequenceClassification.from_pretrained(
    "Qwen/Qwen3-8B",
    num_labels=1,
    device_map="auto",
)
# Attach the trained LoRA reward-model adapter
model = PeftModel.from_pretrained(model, "JokeGPT-Model/reward_model_final")

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

# Score a candidate joke: higher values mean the model rates it funnier
inputs = tokenizer("Why did the chicken cross the road?", return_tensors="pt").to(model.device)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
```