# Banking Intent Classifier — Gemma-2-9B LoRA Adapter

A LoRA adapter fine-tuned from `unsloth/gemma-2-9b-bnb-4bit` using QLoRA (4-bit quantization + Low-Rank Adaptation) via Unsloth. It classifies Vietnamese customer-service queries into six intent categories for banking chatbots.
## Model Details
| Field | Value |
|---|---|
| Base model | unsloth/gemma-2-9b-bnb-4bit |
| Architecture | Gemma2ForCausalLM + PEFT LoRA |
| Fine-tuning method | QLoRA (4-bit NF4, LoRA r=16, α=16) |
| PEFT version | 0.18.1 |
| Language | Vietnamese (vi) |
| Task | Multi-class intent classification |
| License | MIT |
## Intent Labels
| Label | Description |
|---|---|
| `balance_inquiry` | User asks about account balance |
| `transfer` | User wants to transfer money |
| `card_block` | User reports a lost or stolen card |
| `loan_inquiry` | User asks about loan products |
| `complaint` | User files a service complaint |
| `unknown` | Query doesn't match any known intent |
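
The adapter emits the label as free text, so downstream code should normalize the generation to one of these six strings and fall back to `unknown` on anything unexpected. A minimal sketch (the `normalize_intent` helper is ours, not part of the released code):

```python
VALID_INTENTS = {
    "balance_inquiry", "transfer", "card_block",
    "loan_inquiry", "complaint", "unknown",
}

def normalize_intent(generated: str) -> str:
    """Map raw model output onto one of the six known labels."""
    tokens = generated.strip().lower().split()
    label = tokens[0] if tokens else ""
    return label if label in VALID_INTENTS else "unknown"
```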
## Usage
### Load with PEFT + Transformers
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit quantization config (NF4, matching the base model)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Load the quantized base model
base_model_id = "unsloth/gemma-2-9b-bnb-4bit"
base = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter
model = PeftModel.from_pretrained(base, "your-hf-username/banking-intent-gemma2-unsloth")
model.eval()

# Tokenizer is shared with the base model
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
```
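
Since the adapter was trained with Unsloth, it can also be loaded through Unsloth's `FastLanguageModel`, which resolves the 4-bit base model from the adapter config and enables Unsloth's faster inference path. A minimal sketch, assuming `unsloth` is installed:

```python
from unsloth import FastLanguageModel

# Load the adapter repo directly; Unsloth pulls in the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="your-hf-username/banking-intent-gemma2-unsloth",
    max_seq_length=8192,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference mode
```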
### Inference
```python
# Prompt (Vietnamese): "Classify intent: 'I want to transfer money to another account'"
PROMPT = (
    "<start_of_turn>user\n"
    "Phân loại intent: 'Tôi muốn chuyển tiền sang tài khoản khác'\n"
    "Intent: <end_of_turn>\n"
    "<start_of_turn>model\n"
)

inputs = tokenizer(PROMPT, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens so the prompt is not echoed back
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
# → transfer
```
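
For classifying many queries, the prompt construction and decoding above can be wrapped in a small helper. A minimal sketch, reusing `model` and `tokenizer` from the loading section and the hypothetical `normalize_intent` helper from the Intent Labels section; greedy decoding keeps the predicted label deterministic:

```python
def classify(query: str) -> str:
    """Classify one Vietnamese banking query into an intent label."""
    prompt = (
        "<start_of_turn>user\n"
        f"Phân loại intent: '{query}'\n"
        "Intent: <end_of_turn>\n"
        "<start_of_turn>model\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # A single label needs only a few tokens; greedy decoding is reproducible
    outputs = model.generate(**inputs, max_new_tokens=8, do_sample=False)
    generated = tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    return normalize_intent(generated)

# Vietnamese: "How much is my account balance?" / "My card is lost"
for q in ["Số dư tài khoản của tôi là bao nhiêu?", "Thẻ của tôi bị mất"]:
    print(q, "->", classify(q))
```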
## Training Details
- Framework: Unsloth + TRL (SFTTrainer)
- Quantization: 4-bit NF4 via BitsAndBytes
- LoRA rank (r): 16
- LoRA alpha: 16
- LoRA dropout: 0
- Target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
- Max sequence length: 8192 (Gemma-2 context window)
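
For reference, a minimal sketch of the corresponding Unsloth LoRA setup (hyperparameters mirror the list above; the actual training script may differ):

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-2-9b-bnb-4bit",
    max_seq_length=8192,
    load_in_4bit=True,
)

# Wrap it with LoRA adapters on the attention and MLP projections
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    bias="none",
)
```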
Full training logs and scripts are available in the parent repository.
## Limitations
- Trained on synthetically generated banking queries; may not cover all real-world phrasings.
- Vietnamese-language only; performance on other languages is not guaranteed.
- Intent classification is text-only; does not handle multi-modal inputs.
## Citation
```bibtex
@misc{banking-intent-gemma2-unsloth,
  title     = {Banking Intent Classifier — Gemma-2-9B LoRA},
  author    = {Your Name},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/your-username/banking-intent-gemma2-unsloth}
}
```
## License
MIT