---
base_model: Qwen/Qwen3-8B
library_name: peft
license: apache-2.0
tags:
- sft
- humor
- qwen
- lora
---
# JokeGPT - SFT Model

This is the Supervised Fine-Tuned (SFT) version of JokeGPT. It serves as the foundation for the RLHF pipeline.
## Model Details

- **Base Model**: [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B)
- **Training Method**: LoRA (Low-Rank Adaptation); see the configuration sketch after this list
- **Task**: Causal Language Modeling (Joke Generation)
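
The exact LoRA hyperparameters are not recorded on this card. The sketch below shows a typical `peft` setup for SFT on Qwen3-8B; the rank, alpha, dropout, and target modules are illustrative assumptions, not the values used to train this adapter.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Hypothetical LoRA setup: r, lora_alpha, lora_dropout, and target_modules
# are illustrative defaults, not the recorded training configuration.
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", device_map="auto")
lora_config = LoraConfig(
    r=16,            # adapter rank (assumed)
    lora_alpha=32,   # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections (assumed)
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```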
## Training Data

The model was fine-tuned on a curated dataset of jokes (a possible record layout is sketched after the list), including:

- Reddit Jokes
- Ruozhiba (Weak Intellect Bar) dataset
- Custom humor datasets
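
The card does not document the actual record schema. A common layout for chat-style SFT data looks like the sketch below; the field names and prompt text are assumptions for illustration only.

```python
from transformers import AutoTokenizer

# Hypothetical SFT record: the "messages" schema and contents are assumed,
# not taken from the actual training set.
example = {
    "messages": [
        {"role": "user", "content": "Tell me a joke about programmers."},
        {"role": "assistant", "content": "Why do programmers prefer dark mode? Because light attracts bugs."},
    ]
}

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
# The chat template renders each record into a single training string.
text = tokenizer.apply_chat_template(example["messages"], tokenize=False)
```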
## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the SFT LoRA adapter on top of it.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B", device_map="auto")
model = PeftModel.from_pretrained(model, "JokeGPT-Model/sft_final")
```