---
base_model: JokeGPT-Model/sft_final
library_name: peft
license: apache-2.0
tags:
  - ppo
  - rlhf
  - humor
  - qwen
  - lora
---

# JokeGPT - PPO Model

This is the final PPO-aligned version of JokeGPT. It has been optimized using Reinforcement Learning from Human Feedback (RLHF) to maximize humor scores provided by the Reward Model.

## Model Details

- **Base Model:** JokeGPT SFT Model (`JokeGPT-Model/sft_final`)
- **Training Method:** PPO (Proximal Policy Optimization) with LoRA
- **Objective:** Maximize humor reward while constraining KL divergence from the SFT policy.
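The objective above is the standard RLHF shaping: the reward model's score minus a penalty proportional to the policy's divergence from the frozen SFT reference. A minimal sketch of that shaped reward (function and parameter names are illustrative assumptions, not the repo's actual training code):

```python
def kl_shaped_reward(reward_model_score, policy_logprobs, ref_logprobs, beta=0.1):
    """Illustrative RLHF reward shaping: reward model score minus a
    KL penalty against the frozen SFT reference policy.

    policy_logprobs / ref_logprobs are per-token log-probabilities of the
    sampled completion under the current policy and the SFT reference.
    """
    # Per-token KL estimate summed over the completion:
    # log pi_theta(a_t | s_t) - log pi_ref(a_t | s_t)
    kl_estimate = sum(p - r for p, r in zip(policy_logprobs, ref_logprobs))
    return reward_model_score - beta * kl_estimate
```

A larger `beta` keeps generations closer to the SFT policy at the cost of lower raw reward; in practice libraries such as TRL tune or schedule this coefficient.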

## Performance

This model is optimized to produce jokes that the Reward Model consistently rates as more humorous than those of the SFT baseline.
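One simple way to quantify that comparison is a head-to-head win rate under the reward model: score both models' completions for the same prompts and count how often the PPO model wins. A minimal sketch (the score lists are hypothetical inputs, not results shipped with this repo):

```python
def win_rate(ppo_scores, sft_scores):
    """Fraction of prompts where the PPO model's joke scores higher
    than the SFT baseline's joke under the reward model.

    Both lists are paired: index i holds the two models' scores for
    the same prompt.
    """
    wins = sum(p > s for p, s in zip(ppo_scores, sft_scores))
    return wins / len(ppo_scores)
```

Reward-model win rate is a proxy; reward hacking can inflate it, so human evaluation is still the ground truth for humor.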

## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach the PPO-trained LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", device_map="auto")
model = PeftModel.from_pretrained(model, "JokeGPT-Model/ppo_model")
```