IT Helpdesk AI — QLoRA Adapter v3

Model Description

A LoRA adapter for Mistral-7B-v0.1, fine-tuned with QLoRA on 1141 IT helpdesk tickets. This is v3, the most accurate of the three releases by validation loss (see Version History). Its output includes a normalized ticket title intended to map directly onto IT helpdesk database tables.

Developed by: mdk615661
Model type: QLoRA Fine-tuned Adapter (PEFT)
Language: English
License: Apache 2.0
Finetuned from: mistralai/Mistral-7B-v0.1
Full merged model: mdk615661/it-helpdesk-merged-v3

What It Does

Given the text of any IT support ticket, the model returns structured output with five fields:

Normalized — standardized ticket title matching DB table
Category — Hardware / Software / Incident / Others / Procurement
Subcategory — specific issue type
Insight — AI analysis of the problem
Recommendation — actionable step for IT team
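
The five fields above can be held in a small typed record before anything is written to a database. The sketch below is illustrative only: the class name `TicketClassification`, the `ALLOWED_CATEGORIES` set, and the validation rule are assumptions and are not part of the released adapter. Only the field names and the five category values come from the list above.

```python
from dataclasses import dataclass

# Category values as listed in this model card.
ALLOWED_CATEGORIES = {"Hardware", "Software", "Incident", "Others", "Procurement"}


@dataclass(frozen=True)
class TicketClassification:
    """Hypothetical container for one model response (not shipped with the adapter)."""

    normalized: str
    category: str
    subcategory: str
    insight: str
    recommendation: str

    def __post_init__(self):
        # Reject categories the model card does not list, so a malformed
        # generation is caught before it reaches the database.
        if self.category not in ALLOWED_CATEGORIES:
            raise ValueError(f"unexpected category: {self.category!r}")


example = TicketClassification(
    normalized="wifi connectivity issue",
    category="Hardware",
    subcategory="Hardware - Laptop",
    insight="WiFi adapter driver may be outdated or misconfigured",
    recommendation="Update WiFi driver and check network adapter settings",
)
print(example.category)
```

Validating the category at construction time is one possible design. A deployment could instead route unexpected values to manual review.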

How To Use

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

model_name = "mistralai/Mistral-7B-v0.1"
adapter_name = "mdk615661/it-helpdesk-qlora-v3"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",  # match the NF4 quantization listed under Training Details
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto"
)

model = PeftModel.from_pretrained(model, adapter_name)
tokenizer = AutoTokenizer.from_pretrained(adapter_name)

def classify_ticket(ticket):
    prompt = f"""### Instruction:
Normalize and classify this IT ticket

### Input:
{ticket}

### Output:
"""
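    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=200,
        temperature=0.1,  # low temperature keeps sampling close to greedy decoding
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
    # Decode only the newly generated tokens, so a ticket that itself
    # contains "### Output:" cannot confuse the extraction.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()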

print(classify_ticket("My laptop is not connecting to WiFi"))

Example Output

Input: My laptop is not connecting to WiFi

Output:

Normalized: wifi connectivity issue
Category: Hardware
Subcategory: Hardware - Laptop
Insight: WiFi adapter driver may be outdated or misconfigured
Recommendation: Update WiFi driver and check network adapter settings
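
A response in this format can be split into its labeled fields before being stored. The following is a minimal sketch, not part of the adapter: the helper name `parse_output` and the `FIELDS` list are assumptions made for illustration. It works whether the fields arrive on one line or on separate lines, but it will mis-split if a field value itself contains one of the labels followed by a colon.

```python
import re

# Field labels as they appear in the example output.
FIELDS = ["Normalized", "Category", "Subcategory", "Insight", "Recommendation"]

# Match "Label:" for any known field. The leading \b stops "Category:"
# from matching inside "Subcategory:".
_LABEL = re.compile(r"\b(" + "|".join(FIELDS) + r"):")


def parse_output(text):
    """Split a model response into a {label: value} dict (hypothetical helper)."""
    # With one capturing group, re.split returns
    # [preamble, label1, value1, label2, value2, ...].
    parts = _LABEL.split(text)
    return {label: value.strip() for label, value in zip(parts[1::2], parts[2::2])}


sample = (
    "Normalized: wifi connectivity issue Category: Hardware "
    "Subcategory: Hardware - Laptop "
    "Insight: WiFi adapter driver may be outdated or misconfigured "
    "Recommendation: Update WiFi driver and check network adapter settings"
)
parsed = parse_output(sample)
print(parsed["Category"])
print(parsed["Subcategory"])
```

A response that is missing a label simply yields a dict without that key, so callers can check `len(parsed) == len(FIELDS)` to detect incomplete generations.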

Training Details

Base Model: mistralai/Mistral-7B-v0.1
Method: QLoRA (4-bit NF4 quantization + LoRA)
Dataset Size: 1141 IT helpdesk tickets
Epochs: 3
Batch Size: 4
Gradient Accumulation: 4
Learning Rate: 2e-4
LoRA Rank (r): 16
LoRA Alpha: 32
Max Length: 512
Training Loss: 0.264
Validation Loss: 0.344
Training Platform: Google Colab T4 GPU
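
Two derived quantities follow from the hyperparameters above: the effective batch size and the LoRA scaling factor. The sketch below only does that arithmetic. The class name `TrainingSetup` and its methods are hypothetical, a single GPU is assumed (consistent with a Colab T4 session), and the step estimate takes the number of training examples as a parameter because the train/validation split is not stated here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingSetup:
    """Hypothetical holder for the hyperparameters listed in Training Details."""

    batch_size: int = 4
    gradient_accumulation: int = 4
    lora_r: int = 16
    lora_alpha: int = 32
    epochs: int = 3

    @property
    def effective_batch_size(self) -> int:
        # Examples contributing to each optimizer step (single GPU assumed).
        return self.batch_size * self.gradient_accumulation

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update by alpha / r.
        return self.lora_alpha / self.lora_r

    def approx_optimizer_steps(self, n_train_examples: int) -> int:
        # Rough estimate; assumes a final partial batch still counts as a step.
        steps_per_epoch = math.ceil(n_train_examples / self.effective_batch_size)
        return self.epochs * steps_per_epoch


setup = TrainingSetup()
print(setup.effective_batch_size)  # 16
print(setup.lora_scaling)          # 2.0
```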

Version History

Version	Samples	Train Loss	Val Loss
v1 (it-helpdesk-qlora)	301	0.595	0.831
v2 (it-helpdesk-qlora-v2)	451	0.558	0.763
v3 (it-helpdesk-qlora-v3)	1141	0.264	0.344
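
The table can be summarized with two simple figures: the gap between validation and training loss for each version, and the relative drop in validation loss from v1 to v3. The snippet below is a small sketch of that arithmetic using only the numbers in the table. The comparison is indicative at best, since the versions were trained on different sample counts and it is not stated whether they share a validation set.

```python
# Figures copied from the Version History table.
versions = {
    "v1": {"samples": 301, "train": 0.595, "val": 0.831},
    "v2": {"samples": 451, "train": 0.558, "val": 0.763},
    "v3": {"samples": 1141, "train": 0.264, "val": 0.344},
}


def loss_gap(entry):
    """Validation loss minus training loss, rounded to three decimals."""
    return round(entry["val"] - entry["train"], 3)


def relative_drop(old, new):
    """Fractional reduction from old to new."""
    return (old - new) / old


gaps = {name: loss_gap(entry) for name, entry in versions.items()}
val_drop_v1_to_v3 = relative_drop(versions["v1"]["val"], versions["v3"]["val"])

print(gaps)
print(f"{val_drop_v1_to_v3:.1%}")
```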

Limitations

Supports English-language tickets only
Best suited to corporate IT helpdesk scenarios
Outputs should be reviewed by IT staff before any action is taken
Performance is expected to improve with more organization-specific training data
