# Mistral 7B JSON Support Ticket Classifier (QLoRA Adapter)
A QLoRA fine-tuned adapter for Mistral 7B Instruct v0.3 that converts free-text customer support messages into structured JSON with intent classification, priority assignment, entity extraction, and clarification detection.
## What It Does
Given a customer message like:
"Hi, I want a refund because my wireless earbuds are defective. Order id: ORD-39256"
The model outputs:

```json
{
  "intent": "refund",
  "priority": "high",
  "entities": {
    "order_id": "ORD-39256",
    "product": "wireless earbuds"
  },
  "needs_clarification": false,
  "clarifying_question": null
}
```
When information is missing, it knows to ask:

```json
{
  "intent": "shipping",
  "priority": "medium",
  "entities": {
    "order_id": null,
    "product": null
  },
  "needs_clarification": true,
  "clarifying_question": "Can you share your order ID and the delivery address ZIP code so I can check the shipment status?"
}
```
## Output Schema

| Field | Type | Description |
|---|---|---|
| `intent` | string | One of: `refund`, `cancel`, `shipping`, `exchange`, `complaint`, `inquiry` |
| `priority` | string | `low`, `medium`, or `high` |
| `entities.order_id` | string \| null | Extracted order ID if present |
| `entities.product` | string \| null | Extracted product name if present |
| `needs_clarification` | boolean | Whether the model needs more info to proceed |
| `clarifying_question` | string \| null | Follow-up question if clarification is needed |
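Downstream code should not trust the generation blindly. The sketch below (the `validate_ticket` helper is illustrative, not part of this repo) checks a parsed response against the schema above:

```python
# Hypothetical helper: validate a parsed model response against the
# schema described in this model card.
VALID_INTENTS = {"refund", "cancel", "shipping", "exchange", "complaint", "inquiry"}
VALID_PRIORITIES = {"low", "medium", "high"}
EXPECTED_KEYS = {"intent", "priority", "entities", "needs_clarification", "clarifying_question"}

def validate_ticket(obj: dict) -> bool:
    """Return True if obj matches the output schema in this card."""
    if not isinstance(obj, dict) or set(obj) != EXPECTED_KEYS:
        return False
    if obj["intent"] not in VALID_INTENTS or obj["priority"] not in VALID_PRIORITIES:
        return False
    entities = obj["entities"]
    if not isinstance(entities, dict) or set(entities) != {"order_id", "product"}:
        return False
    if not all(v is None or isinstance(v, str) for v in entities.values()):
        return False
    if not isinstance(obj["needs_clarification"], bool):
        return False
    question = obj["clarifying_question"]
    return question is None or isinstance(question, str)
```

A ticket that fails validation can be retried or routed to a human agent instead of being fed into automation.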
## Usage

### Load and Run Inference
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load base model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")
tokenizer.pad_token = tokenizer.eos_token

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "aashnakunk/mistral-7b-json-support")
model.eval()
model.config.use_cache = True  # re-enable the KV cache (disabled during training)

# Build prompt
system = """You are a support automation assistant.
Return ONLY a single JSON object that matches this schema exactly, with these keys in this order:
1) intent
2) priority
3) entities (with keys: order_id, product)
4) needs_clarification
5) clarifying_question"""

user_message = "Hi, I want a refund because my wireless earbuds are defective. Order id: ORD-39256"
prompt = f"<s>[INST] {system}\n\n{user_message} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,
        repetition_penalty=1.2,
        pad_token_id=tokenizer.eos_token_id,
    )
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
### Important: Inference Settings

- Set `model.config.use_cache = True` for generation (it's disabled during training)
- Call `model.eval()` to disable dropout
- Use `do_sample=False` for deterministic JSON output
- `repetition_penalty=1.2` helps prevent degenerate repetition
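Even with greedy decoding, models can wrap the JSON in extra text or a code fence, so a defensive parse is worthwhile. Below is a minimal sketch; the `extract_json` helper is an assumption for illustration, not part of this repo:

```python
import json

def extract_json(text: str) -> dict:
    """Parse the first top-level JSON object found in a generated string.

    Scans for a balanced {...} span so leading chatter or markdown
    fences around the object do not break parsing.
    """
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object in model output")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start : i + 1])
    raise ValueError("unbalanced JSON object in model output")
```

Note that this simple brace scan does not account for braces inside string values, which the fields in this schema are unlikely to contain; for adversarial input, use a streaming JSON decoder instead.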
## Training Details
| Parameter | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Method | QLoRA (4-bit quantization + LoRA adapters) |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Trainable parameters | 13.6M / 3.77B (0.36%) |
| Training examples | 6,000 |
| Epochs | 1 |
| Batch size | 1 (with gradient accumulation = 8, effective batch = 8) |
| Learning rate | 2e-4 |
| Optimizer | paged_adamw_8bit |
| Precision | fp16 mixed precision |
| Hardware | NVIDIA Tesla T4 (15 GB) |
| Training time | ~2.75 hours |
| Final training loss | 0.109 |
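The hyperparameters above correspond to a standard PEFT QLoRA setup. A minimal sketch of the adapter configuration, reconstructed from the table (the actual training script may differ), might look like:

```python
from peft import LoraConfig

# Reconstructed from the training table above; illustrative only.
lora_config = LoraConfig(
    r=16,                     # LoRA rank
    lora_alpha=32,            # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)
```

With only the attention projections targeted, roughly 13.6M parameters are trainable, which is what keeps the adapter small enough to train on a single T4.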
## Loss Curve
Training converged smoothly over 750 steps:
- Step 10: 1.088 (learning JSON structure)
- Step 30: 0.203 (rapid improvement)
- Step 100: 0.127 (stabilizing)
- Step 750: 0.109 (converged)
## Adapter Size

~50 MB: only the LoRA adapter weights are stored, not the full 7B model.
## Limitations

- Designed specifically for customer support ticket classification; may not generalize to other JSON extraction tasks without further fine-tuning
- Relies on the system prompt format shown above for best results
- Entity extraction is limited to the `order_id` and `product` fields
- Trained on synthetic support data; real-world edge cases may need additional examples
## Framework Versions
- Transformers: 4.x
- PEFT: latest
- PyTorch: 2.9.0+cu128
- BitsAndBytes: latest
- TRL: latest