Norm Macdonald Style Adapter

LoRA adapter trained to write in Norm Macdonald's comedic style.

Training

  • Base model: Llama 3.1 8B Instruct
  • Method: DPO + SFT with adapter merge (1.0×DPO + 0.25×SFT)
  • Data: 511 preference pairs generated synthetically from transcripts
  • Judge: Llama 3.1 70B (AWQ INT4)

Usage

Evaluation

Variant Train Rubric Eval Rubric
base 2.22 1.86
merged 4.92 5.00

Scored by 70B judge on 50 prompts across 4 dimensions (2 train + 2 held-out).

Project

See held-out-style for the full pipeline.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for samiamjidkhan/norm-macdonald-style

Adapter
(74)
this model