---
base_model: Qwen/Qwen3-8B
library_name: peft
license: apache-2.0
tags:
- sft
- humor
- qwen
- lora
---
# JokeGPT - SFT Model
This is the Supervised Fine-Tuned (SFT) version of JokeGPT. It serves as the foundation for the RLHF pipeline.
## Model Details
- Base Model: Qwen/Qwen3-8B
- Training Method: LoRA (Low-Rank Adaptation)
- Task: Causal Language Modeling (Joke Generation)
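LoRA avoids updating the full weight matrices of the base model: each adapted layer learns two small matrices whose product forms a low-rank update. A minimal NumPy sketch of the idea (dimensions and names here are illustrative, not taken from the actual training configuration):

```python
import numpy as np

# LoRA idea: instead of updating the full frozen weight W (d_out x d_in),
# train two small factors B (d_out x r) and A (r x d_in) with r << d_out, d_in.
d_out, d_in, r = 8, 8, 2
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))  # frozen base weight
A = rng.standard_normal((r, d_in))      # trainable rank-r factor
B = np.zeros((d_out, r))                # zero-initialized so training starts at no-op

alpha = 16                              # LoRA scaling hyperparameter (illustrative)
delta_W = (alpha / r) * (B @ A)         # effective low-rank weight update

x = rng.standard_normal(d_in)
y = W @ x + delta_W @ x                 # adapted forward pass

# Trainable parameters drop from d_out*d_in to r*(d_out + d_in).
full_params = d_out * d_in
lora_params = r * (d_out + d_in)
```

Because `B` starts at zero, the adapter initially leaves the base model's behavior unchanged; only the small factors are trained, which is why the adapter checkpoint loaded below is tiny compared to the 8B base model.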
## Training Data
The model was fine-tuned on a curated collection of joke datasets, including:
- Reddit Jokes
- Ruozhiba (Weak Intellect Bar) dataset
- Custom humor datasets
## Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and base model, then attach the SFT LoRA adapter
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", device_map="auto")
model = PeftModel.from_pretrained(model, "JokeGPT-Model/sft_final")
```