samiamjidkhan
/

norm-macdonald-style

character-training

Model card Files Files and versions

Norm Macdonald Style Adapter

LoRA adapter trained to write in Norm Macdonald's comedic style.

Training

Base model: Llama 3.1 8B Instruct
Method: DPO + SFT with adapter merge (1.0×DPO + 0.25×SFT)
Data: 511 preference pairs generated synthetically from transcripts
Judge: Llama 3.1 70B (AWQ INT4)

Usage

Evaluation

Variant	Train Rubric	Eval Rubric
base	2.22	1.86
merged	4.92	5.00

Scored by 70B judge on 50 prompts across 4 dimensions (2 train + 2 held-out).

Project

See held-out-style for the full pipeline.

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for samiamjidkhan/norm-macdonald-style

Base model

NousResearch/Meta-Llama-3.1-8B-Instruct

Adapter

(80)

this model