Humor Generation Model – SemEval 2026

Model Description

This model was developed for the SemEval 2026 Humor Generation Task.
It is fine-tuned to generate creative, logically coherent jokes that comply with the task instructions and follow a structured output format.

The system emphasizes:

  • High logical coherence
  • Creative and clever humor
  • Strict instruction adherence
  • Consistent formatting compliance

All outputs strictly follow the required format:

JOKE: <generated joke text>
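Downstream code can validate and strip this prefix before using the joke text. A minimal sketch (the `parse_joke` helper is hypothetical, not part of the model):

```python
import re
from typing import Optional

# Hypothetical helper: extract the joke body from a model output
# that follows the "JOKE: <generated joke text>" format above.
def parse_joke(output: str) -> Optional[str]:
    match = re.match(r"^JOKE:\s*(.+)", output.strip(), re.DOTALL)
    return match.group(1).strip() if match else None
```

Outputs lacking the prefix return `None`, which makes non-compliant generations easy to filter out.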

Intended Use

Primary Use

This model is intended for:

  • Humor generation research
  • Controlled joke generation
  • Instruction-following text generation experiments
  • Academic evaluation and benchmarking

Out-of-Scope Use

  • Harmful or offensive content generation
  • Misinformation or deceptive content
  • Automated large-scale content spam

Users are responsible for ensuring ethical and appropriate usage.


Training Details

Training Setup

  • Epochs: 4
  • Final Training Loss: ~1.65
  • Dataset Format: TSV
  • Output Prefix Enforcement: JOKE:
  • Instruction Compliance: 100%

The model was fine-tuned to balance creativity and coherence while maintaining formatting reliability.


Dataset

The training dataset consists of cleaned and structured humorous text samples.
Preprocessing steps included:

  • Removal of malformed entries
  • Structural normalization
  • Prefix enforcement
  • Formatting consistency validation
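The steps above can be sketched roughly as follows, assuming the TSV holds two columns, (prompt, joke); the exact column layout is an assumption for illustration:

```python
import csv

# Illustrative sketch of the preprocessing pipeline described above.
def preprocess(lines):
    cleaned = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 2:                          # drop malformed entries
            continue
        prompt, joke = (f.strip() for f in row)    # structural normalization
        if not prompt or not joke:
            continue
        if not joke.startswith("JOKE:"):           # prefix enforcement
            joke = "JOKE: " + joke
        cleaned.append((prompt, joke))
    return cleaned
```

Because `csv.reader` accepts any iterable of lines, the same function works on an open file handle or an in-memory list.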

The dataset was prepared specifically for the SemEval 2026 Humor Generation task.


Evaluation

Performance Summary

Metric                  Result
------                  ------
Training Loss           ~1.65
Joke Quality            Creative / clever
Instruction Following   100%
Logic / Coherence       High
Formatting Compliance   Correct JOKE: prefix

Evaluation focused on:

  • Creativity
  • Logical consistency
  • Instruction adherence
  • Structural compliance
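For the structural-compliance criterion, a simple check is the fraction of outputs that begin with the required prefix; a hedged sketch (the `compliance_rate` helper is illustrative, not the task's official scorer):

```python
# Fraction of generated outputs that start with the required "JOKE:" prefix.
def compliance_rate(outputs):
    if not outputs:
        return 0.0
    ok = sum(1 for o in outputs if o.strip().startswith("JOKE:"))
    return ok / len(outputs)
```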

Example Usage

from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "your-username/your-model-name"

# Load the fine-tuned tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Generate a clever joke about programming."

inputs = tokenizer(prompt, return_tensors="pt")
# max_new_tokens bounds only the generated continuation,
# regardless of prompt length
outputs = model.generate(**inputs, max_new_tokens=100)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Ethical Considerations

Humor generation may occasionally produce unintended or sensitive content depending on input prompts.

Users should:

  • Apply moderation filters if deploying publicly
  • Monitor outputs in real-world applications
  • Ensure compliance with ethical AI guidelines

Limitations

  • Performance depends on prompt clarity
  • May struggle with highly niche or domain-specific humor
  • Creativity is bounded by training data diversity

Citation

If you use this model in academic work, please cite:

@misc{abbas2026humor,
  title={Humor Generation Model for SemEval 2026},
  author={Insa Abbas},
  year={2026},
  note={SemEval 2026 Submission},
  url={https://huggingface.co/your-username/your-model-name}
}

Author

Insa Abbas
Email: insaabbas675@gmail.com


Acknowledgment

This model was developed as part of participation in the SemEval 2026 Humor Generation Task.

Model Format

3B parameters, stored as F16 Safetensors.