Mistral 7B: JSON Support Ticket Classifier (QLoRA Adapter)

A QLoRA fine-tuned adapter for Mistral 7B Instruct v0.3 that converts free-text customer support messages into structured JSON with intent classification, priority assignment, entity extraction, and clarification detection.

What It Does

Given a customer message like:

"Hi, I want a refund because my wireless earbuds are defective. Order id: ORD-39256"

The model outputs:

{
  "intent": "refund",
  "priority": "high",
  "entities": {
    "order_id": "ORD-39256",
    "product": "wireless earbuds"
  },
  "needs_clarification": false,
  "clarifying_question": null
}

When required information is missing, the model asks a clarifying question:

{
  "intent": "shipping",
  "priority": "medium",
  "entities": {
    "order_id": null,
    "product": null
  },
  "needs_clarification": true,
  "clarifying_question": "Can you share your order ID and the delivery address ZIP code so I can check the shipment status?"
}
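
A downstream handler can branch on `needs_clarification` to decide whether to reply to the customer or route the ticket. The sketch below uses a hypothetical `route_ticket` helper (not part of the adapter) to illustrate one way to do this:

```python
import json

def route_ticket(raw_json: str) -> str:
    """Decide the next action from the model's JSON output.

    Returns either the model's follow-up question (to send back to the
    customer) or a queue name for a human agent. This helper is a
    hypothetical example, not shipped with the adapter.
    """
    ticket = json.loads(raw_json)
    if ticket["needs_clarification"]:
        # Missing info: relay the model's clarifying question
        return ticket["clarifying_question"]
    # Complete ticket: route by intent and priority
    return f'{ticket["intent"]}-queue-{ticket["priority"]}'

complete = ('{"intent": "refund", "priority": "high", '
            '"entities": {"order_id": "ORD-39256", "product": "wireless earbuds"}, '
            '"needs_clarification": false, "clarifying_question": null}')
print(route_ticket(complete))  # refund-queue-high
```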

Output Schema

| Field | Type | Description |
|---|---|---|
| intent | string | One of: refund, cancel, shipping, exchange, complaint, inquiry |
| priority | string | low, medium, or high |
| entities.order_id | string \| null | Extracted order ID if present |
| entities.product | string \| null | Extracted product name if present |
| needs_clarification | boolean | Whether the model needs more info to proceed |
| clarifying_question | string \| null | Follow-up question if clarification is needed |
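
Since model output should never be trusted blindly, a minimal validator like the one below (an illustrative sketch, not shipped with the adapter) can enforce the schema before a ticket enters your pipeline:

```python
import json

INTENTS = {"refund", "cancel", "shipping", "exchange", "complaint", "inquiry"}
PRIORITIES = {"low", "medium", "high"}

def validate_ticket(raw: str) -> dict:
    """Parse one model response and check it against the schema above.

    Raises ValueError if any field is missing or out of range.
    """
    t = json.loads(raw)
    if t.get("intent") not in INTENTS:
        raise ValueError(f"bad intent: {t.get('intent')!r}")
    if t.get("priority") not in PRIORITIES:
        raise ValueError(f"bad priority: {t.get('priority')!r}")
    entities = t.get("entities")
    if not isinstance(entities, dict) or set(entities) != {"order_id", "product"}:
        raise ValueError("entities must contain exactly order_id and product")
    if not isinstance(t.get("needs_clarification"), bool):
        raise ValueError("needs_clarification must be a boolean")
    if t["needs_clarification"] and not t.get("clarifying_question"):
        raise ValueError("clarifying_question required when clarification is needed")
    return t
```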

Usage

Load and Run Inference

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load base model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=bnb_config,
    device_map="auto",
)

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")
tokenizer.pad_token = tokenizer.eos_token

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "aashnakunk/mistral-7b-json-support")
model.eval()

# Build prompt
system = """You are a support automation assistant.
Return ONLY a single JSON object that matches this schema exactly, with these keys in this order:
1) intent
2) priority
3) entities (with keys: order_id, product)
4) needs_clarification
5) clarifying_question"""

user_message = "Hi, I want a refund because my wireless earbuds are defective. Order id: ORD-39256"

# Note: tokenizer() prepends <s> (BOS) itself, so don't write it in the string
prompt = f"[INST] {system}\n\n{user_message} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,
        repetition_penalty=1.2,
        pad_token_id=tokenizer.eos_token_id,
    )

response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)

Important: Inference Settings

  • Set model.config.use_cache = True for generation (it's disabled during training)
  • Call model.eval() to disable dropout
  • Use do_sample=False for deterministic JSON output
  • repetition_penalty=1.2 helps prevent degenerate repetition
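
With these settings the adapter normally emits clean JSON, but if stray tokens ever appear around the object, a small balanced-brace scan (a hypothetical helper, not part of the adapter) can recover it:

```python
import json

def extract_json(text: str) -> dict:
    """Return the first balanced JSON object found in `text`.

    Naive brace counting: it does not account for braces inside JSON
    string values, which is acceptable for this schema's short fields.
    """
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start:i + 1])
    raise ValueError("unbalanced JSON object")
```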

Training Details

| Parameter | Value |
|---|---|
| Base model | mistralai/Mistral-7B-Instruct-v0.3 |
| Method | QLoRA (4-bit quantization + LoRA adapters) |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj |
| Trainable parameters | 13.6M / 3.77B (0.36%) |
| Training examples | 6,000 |
| Epochs | 1 |
| Batch size | 1 (gradient accumulation = 8, effective batch = 8) |
| Learning rate | 2e-4 |
| Optimizer | paged_adamw_8bit |
| Precision | fp16 mixed precision |
| Hardware | NVIDIA Tesla T4 (15 GB) |
| Training time | ~2.75 hours |
| Final training loss | 0.109 |
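
The hyperparameters above correspond roughly to a PEFT configuration like the following. This is a reconstruction from the listed values, not the original training script; `bias` and `task_type` are assumed defaults for causal-LM LoRA fine-tuning:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                    # LoRA rank
    lora_alpha=32,           # LoRA scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",             # assumed default
    task_type="CAUSAL_LM",   # assumed default
)
```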

Loss Curve

Training converged smoothly over 750 steps (6,000 examples / effective batch size 8):

  • Step 10: 1.088 (learning JSON structure)
  • Step 30: 0.203 (rapid improvement)
  • Step 100: 0.127 (stabilizing)
  • Step 750: 0.109 (converged)

Adapter Size

~50 MB: only the LoRA adapter weights are stored, not the full 7B model.
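
The trainable-parameter count and adapter size are mutually consistent. Assuming Mistral 7B's architecture (32 layers, hidden size 4096, and 8 KV heads of dimension 128, so k_proj/v_proj map 4096 to 1024), the arithmetic works out as follows:

```python
r = 16
layers = 32
hidden = 4096   # q_proj and o_proj are 4096 -> 4096
kv_dim = 1024   # k_proj and v_proj are 4096 -> 1024 (grouped-query attention)

# Each LoRA pair (A: d_in x r, B: r x d_out) adds r * (d_in + d_out) parameters
per_layer = 2 * r * (hidden + hidden) + 2 * r * (hidden + kv_dim)
total = per_layer * layers
print(total)                 # 13,631,488 -> the reported 13.6M trainable parameters

# Stored in fp32, that is about 52 MiB, matching the ~50 MB adapter file
print(total * 4 / 2**20)
```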

Limitations

  • Designed specifically for customer support ticket classification; may not generalize to other JSON extraction tasks without further fine-tuning
  • Relies on the system prompt format shown above for best results
  • Entity extraction is limited to order_id and product fields
  • Trained on synthetic support data; real-world edge cases may need additional examples

Framework Versions

  • Transformers: 4.x
  • PEFT: latest
  • PyTorch: 2.9.0+cu128
  • BitsAndBytes: latest
  • TRL: latest