Mistral 7B: JSON Support Ticket Classifier (QLoRA Adapter)

A QLoRA fine-tuned adapter for Mistral 7B Instruct v0.3 that converts free-text customer support messages into structured JSON with intent classification, priority assignment, entity extraction, and clarification detection.

What It Does

Given a customer message like:

"Hi, I want a refund because my wireless earbuds are defective. Order id: ORD-39256"

The model outputs:

{
  "intent": "refund",
  "priority": "high",
  "entities": {
    "order_id": "ORD-39256",
    "product": "wireless earbuds"
  },
  "needs_clarification": false,
  "clarifying_question": null
}

When required information is missing, the model asks a clarifying question:

{
  "intent": "shipping",
  "priority": "medium",
  "entities": {
    "order_id": null,
    "product": null
  },
  "needs_clarification": true,
  "clarifying_question": "Can you share your order ID and the delivery address ZIP code so I can check the shipment status?"
}
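
A downstream handler can branch on `needs_clarification` to decide whether to reply to the customer or route the ticket. The sketch below uses a hypothetical `route_ticket` helper (not part of the adapter) to illustrate one way to do this:

```python
import json

def route_ticket(raw_json: str) -> str:
    """Decide the next action from the model's JSON output.

    Returns either the model's follow-up question (to send back to the
    customer) or a queue name for a human agent. This helper is a
    hypothetical example, not shipped with the adapter.
    """
    ticket = json.loads(raw_json)
    if ticket["needs_clarification"]:
        # Missing info: relay the model's clarifying question
        return ticket["clarifying_question"]
    # Complete ticket: route by intent and priority
    return f'{ticket["intent"]}-queue-{ticket["priority"]}'

complete = ('{"intent": "refund", "priority": "high", '
            '"entities": {"order_id": "ORD-39256", "product": "wireless earbuds"}, '
            '"needs_clarification": false, "clarifying_question": null}')
print(route_ticket(complete))  # refund-queue-high
```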

Output Schema

| Field | Type | Description |
|---|---|---|
| intent | string | One of: refund, cancel, shipping, exchange, complaint, inquiry |
| priority | string | low, medium, or high |
| entities.order_id | string \| null | Extracted order ID if present |
| entities.product | string \| null | Extracted product name if present |
| needs_clarification | boolean | Whether the model needs more info to proceed |
| clarifying_question | string \| null | Follow-up question if clarification is needed |
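
Since model output should never be trusted blindly, a minimal validator like the one below (an illustrative sketch, not shipped with the adapter) can enforce the schema before a ticket enters your pipeline:

```python
import json

INTENTS = {"refund", "cancel", "shipping", "exchange", "complaint", "inquiry"}
PRIORITIES = {"low", "medium", "high"}

def validate_ticket(raw: str) -> dict:
    """Parse one model response and check it against the schema above.

    Raises ValueError if any field is missing or out of range.
    """
    t = json.loads(raw)
    if t.get("intent") not in INTENTS:
        raise ValueError(f"bad intent: {t.get('intent')!r}")
    if t.get("priority") not in PRIORITIES:
        raise ValueError(f"bad priority: {t.get('priority')!r}")
    entities = t.get("entities")
    if not isinstance(entities, dict) or set(entities) != {"order_id", "product"}:
        raise ValueError("entities must contain exactly order_id and product")
    if not isinstance(t.get("needs_clarification"), bool):
        raise ValueError("needs_clarification must be a boolean")
    if t["needs_clarification"] and not t.get("clarifying_question"):
        raise ValueError("clarifying_question required when clarification is needed")
    return t
```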

Usage

Load and Run Inference

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load base model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=bnb_config,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")
tokenizer.pad_token = tokenizer.eos_token

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "aashnakunk/mistral-7b-json-support")
model.eval()

# Build prompt
system = """You are a support automation assistant.
Return ONLY a single JSON object that matches this schema exactly, with these keys in this order:
1) intent
2) priority
3) entities (with keys: order_id, product)
4) needs_clarification
5) clarifying_question"""

user_message = "Hi, I want a refund because my wireless earbuds are defective. Order id: ORD-39256"

# Note: tokenizer() prepends <s> (BOS) itself, so don't write it in the string
prompt = f"[INST] {system}\n\n{user_message} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,
        repetition_penalty=1.2,
        pad_token_id=tokenizer.eos_token_id,
    )

response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)

Important: Inference Settings

  • Set model.config.use_cache = True for generation (it's disabled during training)
  • Call model.eval() to disable dropout
  • Use do_sample=False for deterministic JSON output
  • repetition_penalty=1.2 helps prevent degenerate repetition
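
With these settings the adapter normally emits clean JSON, but if stray tokens ever appear around the object, a small balanced-brace scan (a hypothetical helper, not part of the adapter) can recover it:

```python
import json

def extract_json(text: str) -> dict:
    """Return the first balanced JSON object found in `text`.

    Naive brace counting: it does not account for braces inside JSON
    string values, which is acceptable for this schema's short fields.
    """
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start:i + 1])
    raise ValueError("unbalanced JSON object")
```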

Training Details

| Parameter | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Method | QLoRA (4-bit quantization + LoRA adapters) |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Trainable parameters | 13.6M / 3.77B (0.36%) |
| Training examples | 6,000 |
| Epochs | 1 |
| Batch size | 1 (gradient accumulation = 8, effective batch = 8) |
| Learning rate | 2e-4 |
| Optimizer | paged_adamw_8bit |
| Precision | fp16 mixed precision |
| Hardware | NVIDIA Tesla T4 (15 GB) |
| Training time | ~2.75 hours |
| Final training loss | 0.109 |
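
The hyperparameters above correspond roughly to a PEFT configuration like the following. This is a reconstruction from the listed values, not the original training script; `bias` and `task_type` are assumed defaults for causal-LM LoRA fine-tuning:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                    # LoRA rank
    lora_alpha=32,           # LoRA scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",             # assumed default
    task_type="CAUSAL_LM",   # assumed default
)
```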

Loss Curve

Training converged smoothly over 750 steps (6,000 examples / effective batch size 8):

  • Step 10: 1.088 (learning JSON structure)
  • Step 30: 0.203 (rapid improvement)
  • Step 100: 0.127 (stabilizing)
  • Step 750: 0.109 (converged)

Adapter Size

~50 MB: only the LoRA adapter weights are stored, not the full 7B model.
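
The trainable-parameter count and adapter size are mutually consistent. Assuming Mistral 7B's architecture (32 layers, hidden size 4096, and 8 KV heads of dimension 128, so k_proj/v_proj map 4096 to 1024), the arithmetic works out as follows:

```python
r = 16
layers = 32
hidden = 4096   # q_proj and o_proj are 4096 -> 4096
kv_dim = 1024   # k_proj and v_proj are 4096 -> 1024 (grouped-query attention)

# Each LoRA pair (A: d_in x r, B: r x d_out) adds r * (d_in + d_out) parameters
per_layer = 2 * r * (hidden + hidden) + 2 * r * (hidden + kv_dim)
total = per_layer * layers
print(total)                 # 13,631,488 -> the reported 13.6M trainable parameters

# Stored in fp32, that is about 52 MiB, matching the ~50 MB adapter file
print(total * 4 / 2**20)
```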

Limitations

  • Designed specifically for customer support ticket classification; may not generalize to other JSON extraction tasks without further fine-tuning
  • Relies on the system prompt format shown above for best results
  • Entity extraction is limited to order_id and product fields
  • Trained on synthetic support data; real-world edge cases may need additional examples

Framework Versions

  • Transformers: 4.x
  • PEFT: latest
  • PyTorch: 2.9.0+cu128
  • BitsAndBytes: latest
  • TRL: latest