Model Card for Qwen3-1.7B Customer Support Agent
This model is a fine-tuned version of Qwen/Qwen3-1.7B on the Bitext Customer Support dataset. It is designed to act as a helpful customer support agent.
Model Details
- Base Model: Qwen/Qwen3-1.7B
- Architecture: Qwen3 (1.7B parameters)
- Training Method: QLoRA (4-bit quantization with LoRA adapters)
- Dataset: Bitext-customer-support-llm-chatbot-training-dataset
- Language: English
- Code Repository: omid511/qwen3-1.7b-finetune-customer-support
Usage
Loading the Adapter (Default)
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
base_model_id = "Qwen/Qwen3-1.7B"
adapter_model_id = "omid5/Qwen3-1.7b-cusomer-support-agent"
# 1. Load Base Model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)
# 2. Load Adapters
model = PeftModel.from_pretrained(base_model, adapter_model_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
# 3. Inference
messages = [
    {"role": "user", "content": "I received a defective item, what should I do?"}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move inputs to wherever device_map="auto" placed the model instead of hard-coding "cuda"
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Loading the Merged Model
The full merged model is available on the merged branch of this repository.
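from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# The merged checkpoint already contains the LoRA weights, so PEFT is not needed here
model = AutoModelForCausalLM.from_pretrained(
    "omid5/Qwen3-1.7b-cusomer-support-agent",
    revision="merged",
    torch_dtype=torch.float16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("omid5/Qwen3-1.7b-cusomer-support-agent", revision="merged")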
Training Configuration
The model was trained using accelerate and DeepSpeed with the following hyperparameters:
- Epochs: 2
- Learning Rate: 2e-4 (Cosine Schedule)
- Batch Size: 8 (per device train) / 16 (per device eval)
- Gradient Accumulation: 2
- Max Sequence Length: 400
- LoRA Config:
  - r: 16
  - alpha: 32
  - dropout: 0.05
  - target_modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
- Quantization: 4-bit (nf4) via bitsandbytes
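As a rough illustration, the LoRA and quantization settings listed above might map onto PEFT and Transformers configuration objects as follows. This is a minimal sketch, not the training script from the code repository: the `task_type` value and the float16 compute dtype are assumptions (float16 is a guess based on the T4 GPUs and the inference examples), and the variable names are illustrative only.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization for the frozen base model (the "Q" in QLoRA).
# Assumption: float16 compute dtype; the card does not state it.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter settings as listed in this card.
# Assumption: task_type is causal language modeling.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```

In a typical QLoRA setup, `bnb_config` would be passed as `quantization_config` to `AutoModelForCausalLM.from_pretrained`, and `lora_config` to `peft.get_peft_model`. Constructing the quantization config may require `bitsandbytes` to be installed.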
Training Metrics
| Metric | Value |
|---|---|
| Validation Loss | 0.5842 |
| Validation Token Acc. | 81.00% |
| Training Loss | 0.6846 |
| Training Runtime | 9282s (~2.6h) |
| Samples/Second | 5.21 |
| Total Global Steps | 1512 |
Hardware
- GPUs: 2x Tesla T4
- Platform: Kaggle / Cloud
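The reported hyperparameters and metrics can be cross-checked with a little arithmetic. The sketch below assumes that both T4 GPUs were used in data-parallel mode and that the Samples/Second figure is training throughput averaged over the whole run; neither is stated explicitly in this card.

```python
# Values reported in this card.
per_device_train_batch = 8
gradient_accumulation = 2
num_gpus = 2                # assumption: both T4s ran data-parallel
epochs = 2
total_steps = 1512
runtime_s = 9282
samples_per_second = 5.21   # assumption: training throughput over the full run

# Samples consumed per optimizer step.
effective_batch = per_device_train_batch * num_gpus * gradient_accumulation

# Optimizer steps per epoch, and the most samples those steps could cover.
steps_per_epoch = total_steps // epochs
max_samples_per_epoch = steps_per_epoch * effective_batch

# Independent estimate of samples per epoch from throughput and runtime.
throughput_samples_per_epoch = samples_per_second * runtime_s / epochs

print(f"effective batch size:         {effective_batch}")
print(f"steps per epoch:              {steps_per_epoch}")
print(f"samples/epoch (from steps):   {max_samples_per_epoch}")
print(f"samples/epoch (from runtime): {throughput_samples_per_epoch:.0f}")
```

Under these assumptions the effective batch size is 32 and each epoch takes 756 optimizer steps. Both estimates put the training split at roughly 24,000 examples, so the reported figures are mutually consistent.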
License
Apache-2.0