Tenacious Judge LoRA

Path B SimPO LoRA adapter for Tenacious-Bench sales-output preference scoring.

Backbone: unsloth/Qwen2.5-0.5B-Instruct
Backbone note: Operational text-only fallback. Qwen3.5-0.8B is currently multimodal, which breaks TRL CPO tokenization on text-only preference pairs.
Objective: SimPO via TRL CPOTrainer
Seed: 3407
Training pairs: 81
Eval pairs: 10
Intended use: rejection-sampling or critique layer for Tenacious-style B2B sales outreach drafts
Known limitation: dual-control / premature-booking examples are underrepresented in the current preference set
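
For orientation, the SimPO objective named above can be sketched in plain Python. This is a minimal illustration of the published loss form, not the training code behind this adapter: the function name `simpo_loss` and the `beta` and `gamma` defaults are assumptions, since this card does not state the hyperparameters used.

```python
import math


def _softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def simpo_loss(chosen_logps, rejected_logps, beta=2.0, gamma=1.0):
    """SimPO loss for one preference pair (illustrative sketch).

    chosen_logps / rejected_logps: per-token log-probabilities of the
    chosen and rejected responses under the policy model.
    beta, gamma: hypothetical values; the card does not list the real ones.
    """
    # SimPO uses the length-normalized (average) log-probability as the
    # implicit reward, so no reference model is required.
    avg_chosen = sum(chosen_logps) / len(chosen_logps)
    avg_rejected = sum(rejected_logps) / len(rejected_logps)

    # Scaled reward difference minus the target margin gamma.
    margin = beta * (avg_chosen - avg_rejected) - gamma

    # -log(sigmoid(margin)) == softplus(-margin)
    return _softplus(-margin)


if __name__ == "__main__":
    good = simpo_loss([-0.1, -0.2], [-2.0, -1.5, -1.8])
    bad = simpo_loss([-2.0, -1.5, -1.8], [-0.1, -0.2])
    print(f"chosen clearly preferred: {good:.4f}")
    print(f"rejected clearly preferred: {bad:.4f}")
```

The loss shrinks as the chosen draft's average log-probability pulls ahead of the rejected draft's by more than the margin, which is the behavior a rejection-sampling or critique layer relies on.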
