---
license: mit
language: en
tags:
- text-generation
- fine-tuning
- qlora
- phi-2
- humor
- semeval
datasets:
- custom-humor-tsv
base_model: microsoft/phi-2
pipeline_tag: text-generation
---

# Phi-2 Humor Generator (SemEval 2026 System)

This model is a fine-tuned version of Microsoft's 2.7-billion-parameter Phi-2, adapted for structured **humor generation** in the SemEval-202X shared task.

It was trained with **QLoRA (Quantized Low-Rank Adaptation)** on a structured dataset to generate creative, context-aware jokes from an input prompt.

## Model Details

* **Base Model:** [microsoft/phi-2](https://huggingface.co/microsoft/phi-2)
* **Architecture:** Transformer (causal language model)
* **Fine-tuning Method:** QLoRA (4-bit quantization, rank `r=64`, alpha `lora_alpha=16`)
* **Framework:** Hugging Face `transformers` and `peft` libraries

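
To make the adapter settings concrete, the sketch below works out the standard LoRA scaling factor `lora_alpha / r` and the number of trainable parameters a rank-64 adapter adds to one weight matrix. The 2560 x 2560 shape matches Phi-2's hidden size, but this card does not state which projections were adapted, so the per-matrix figure is illustrative only.

```python
# Illustrative arithmetic for the LoRA settings listed above.
# Assumption: a single adapted weight matrix of shape (2560, 2560).


def lora_scaling(r: int, lora_alpha: int) -> float:
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return lora_alpha / r


def lora_params(d_out: int, d_in: int, r: int) -> int:
    """Trainable parameters in A (r x d_in) plus B (d_out x r)."""
    return r * (d_in + d_out)


R, ALPHA, HIDDEN = 64, 16, 2560

scaling = lora_scaling(R, ALPHA)
added = lora_params(HIDDEN, HIDDEN, R)
full = HIDDEN * HIDDEN

print(f"scaling factor: {scaling}")                  # 0.25
print(f"adapter parameters per matrix: {added:,}")   # 327,680
print(f"share of the frozen matrix: {added / full:.1%}")  # 5.0%
```

With `r=64` and `lora_alpha=16`, the low-rank update is scaled down by a factor of four before it is added to the frozen weights.
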
## Training Data

The model was fine-tuned on a custom dataset compiled for the SemEval task. Each example was formatted with a strict instruction template to guide the structure of the model's output.

* **Dataset Source:** Custom-created and cleaned data, available on GitHub.
* **Data Format:** Tab-separated values (TSV) file.
* **Dataset Link:** [https://github.com/insaabbas/Humor-generation-colab-notebook/blob/main/humor%20dataset%20final.tsv](https://github.com/insaabbas/Humor-generation-colab-notebook/blob/main/humor%20dataset%20final.tsv)

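
As a rough illustration of how TSV rows could be turned into instruction-formatted training text, here is a minimal sketch. The column names `topic`, `constraints`, and `joke` are hypothetical (the actual header of the dataset file is not documented here), the sample row is a made-up placeholder, and the template only mirrors the field layout of the example prompt in the usage section.

```python
import csv
import io

# Assumed template, modelled on the "### Input: / ### Output:" example prompt.
PROMPT_TEMPLATE = "### Input:\nTopic: {topic}\nConstraints: {constraints}\n### Output:\n"


def build_prompt(topic: str, constraints: str) -> str:
    """Fill the instruction template with one topic and its constraints."""
    return PROMPT_TEMPLATE.format(topic=topic.strip(), constraints=constraints.strip())


def rows_to_examples(tsv_file):
    """Yield one training string (prompt + target joke) per TSV row."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    for row in reader:
        # Hypothetical column names; adjust to the real TSV header.
        yield build_prompt(row["topic"], row["constraints"]) + row["joke"].strip()


# In-memory stand-in for the real file, with a placeholder row.
sample = io.StringIO(
    "topic\tconstraints\tjoke\n"
    "Mondays\tMust be a one-liner.\t<joke text>\n"
)
examples = list(rows_to_examples(sample))
print(examples[0])
```

To use it on a real file, pass an open file handle (for example `open(path, newline="", encoding="utf-8")`) instead of the `io.StringIO` stand-in.
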
## Training Procedure

The base Phi-2 model was fine-tuned for **2 epochs** using the QLoRA technique.

| Hyperparameter | Value |
| :--- | :--- |
| **Quantization** | 4-bit NormalFloat (NF4) |
| **QLoRA Rank ($r$)** | 64 |
| **Learning Rate** | $2 \times 10^{-4}$ |
| **Max Context Length** | 1024 tokens |

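
To show why 4-bit quantization matters for a model of this size, the following back-of-the-envelope estimate compares weight memory at 16 bits and 4 bits per parameter. It uses the 2.7 billion parameter count quoted in this card and ignores quantization constants, activations, adapter weights, and optimizer state, so the numbers are a rough lower bound rather than a measured footprint.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 2.7e9  # parameter count stated in this card

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
nf4_gb = weight_memory_gb(NUM_PARAMS, 4)

print(f"fp16 weights: ~{fp16_gb:.2f} GB")  # ~5.40 GB
print(f"NF4 weights:  ~{nf4_gb:.2f} GB")   # ~1.35 GB
```
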
## How to Use

The model can be loaded with the Hugging Face `transformers` library. Depending on the installed version, Phi-2 may need custom modeling code from the Hub, so set `trust_remote_code=True`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged model and its tokenizer
model_id = "insaabbas/phi2_humor_merged_model"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    device_map="auto",
)

# Example prompt; use the same structured format as in fine-tuning
prompt = """
### Input:
Topic: The difference between a politician and a normal person.
Constraints: Must be a one-liner.
### Output:
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate the humor
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,  # Phi-2 defines no pad token, so reuse EOS
)

# The decoded text contains the prompt followed by the generated joke
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

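
Because a causal language model returns the prompt together with its continuation, it can help to isolate the generated joke. The helper below is a small sketch that assumes the `### Input:` / `### Output:` markers from the example prompt; `extract_joke` is a hypothetical name, not part of any library.

```python
OUTPUT_MARKER = "### Output:"
INPUT_MARKER = "### Input:"


def extract_joke(decoded: str) -> str:
    """Return the text after the first output marker, trimmed.

    If the model keeps going and starts a new "### Input:" block,
    everything from that block onward is dropped. If no output marker
    is present, the whole decoded string is used.
    """
    _, found, tail = decoded.partition(OUTPUT_MARKER)
    text = tail if found else decoded
    return text.split(INPUT_MARKER)[0].strip()


# Placeholder decoded text standing in for real model output.
decoded = (
    "\n### Input:\nTopic: Mondays\nConstraints: Must be a one-liner.\n"
    "### Output:\n<generated joke>\n### Input:\nTopic: Tuesdays\n"
)
print(extract_joke(decoded))  # <generated joke>
```
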
## Evaluation Results

* **Task:** SemEval 202X Humor Generation
* **Official Metric:** [State the official metric, e.g., Human Evaluation Score, BERTScore, etc.]
* **Performance:** [State your final competition score or a relevant validation metric]

## Limitations and Ethical Considerations

The model's output quality depends on the style and structure of the training data.

* It may struggle to adhere to complex or contradictory constraints.
* As an LLM, it may occasionally generate offensive or insensitive content, reflecting biases in its pre-training or fine-tuning data. **Use caution and human review for all outputs.**