Norm Macdonald Style Adapter
LoRA adapter trained to write in Norm Macdonald's comedic style.
Training
- Base model: Llama 3.1 8B Instruct
- Method: DPO + SFT with adapter merge (1.0×DPO + 0.25×SFT)
- Data: 511 preference pairs generated synthetically from transcripts
- Judge: Llama 3.1 70B (AWQ INT4)
Usage
Evaluation
| Variant | Train Rubric | Eval Rubric |
|---|---|---|
| base | 2.22 | 1.86 |
| merged | 4.92 | 5.00 |
Scored by 70B judge on 50 prompts across 4 dimensions (2 train + 2 held-out).
Project
See held-out-style for the full pipeline.
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
🙋
Ask for provider support
Model tree for samiamjidkhan/norm-macdonald-style
Base model
NousResearch/Meta-Llama-3.1-8B-Instruct