Tenacious-Qwen-DPO-Stable π
This is a LoRA adapter for Qwen-2.5-1.5B-Instruct, fine-tuned to solve the "Honesty Gap" in B2B sales agents. It ensures that sales agents correctly calibrate their confidence and never hallucinate engineering bench capacity.
Model Details
- Developed by: Meseret Bolled
- Model type: LoRA Adapter (PEFT)
- Language(s): English
- License: CC-BY-4.0
- Finetuned from model: Qwen/Qwen2.5-1.5B-Instruct
Training Details
- Training Data: Tenacious-Bench v0.1 (119 preference-aligned tasks)
- Training Algorithm: Supervised Fine-Tuning (SFT) / DPO
- Hyperparameters:
- Learning Rate: 2e-5
- LoRA Rank (r): 16
- LoRA Alpha: 32
- Max Steps: 150
- Optimizer: AdamW
Evaluation Results
The model was evaluated on the Tenacious-Bench Held-Out (52 tasks).
| Metric | Base Model (Qwen 1.5B) | Tenacious-Stable (Trained) | Delta |
|---|---|---|---|
| Weighted Score | 0.24 | 0.82 | +0.58 |
| Pass Rate | 23.1% | 82.7% | +59.6pp |
| BCH Violations | 53.8% | 5.8% | -48.0pp |
Intended Use
This model is intended for B2B sales outreach automation where strict adherence to supply-side capacity (Bench Capacity Honesty) and brand voice is required.
Limitations
The model is optimized for the Tenacious Intelligence Corp sales workflow. It may require further fine-tuning for other B2B domains with different ICP (Ideal Customer Profile) definitions.
How to Get Started
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
model = PeftModel.from_pretrained(base_model, "meseretbolled/Tenacious-Qwen-DPO-Stable")
Inference Providers NEW
This model isn't deployed by any Inference Provider. π Ask for provider support