---
base_model: unsloth/qwen2.5-7b-unsloth-bnb-4bit
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:unsloth/qwen2.5-7b-unsloth-bnb-4bit
- lora
- sft
- transformers
- trl
- unsloth
- intent-classification
- banking77
---

# Qwen2.5-7B Banking Intent Classification

This is a LoRA adapter fine-tuned on the **BANKING77** dataset for fine-grained intent classification in the banking domain. It is built on the 4-bit base model `unsloth/qwen2.5-7b-unsloth-bnb-4bit` and was trained with the [Unsloth](https://github.com/unslothai/unsloth) library.

## Model Details

- **Model Type:** Causal Language Model with LoRA adapter
- **Developer:** ngbaoan
- **Base Model:** `unsloth/qwen2.5-7b-unsloth-bnb-4bit`
- **Language:** English
- **Task:** Intent Classification
- **Dataset:** [BANKING77](https://huggingface.co/datasets/banking77) (77 distinct banking-related intents)

## Performance

The model was evaluated on the test set and achieved the following results:
- **Accuracy:** **92.29%** (0.9229)
- **Macro F1-Score:** 0.85
- **Weighted F1-Score:** 0.92

*(Note: Some labels in the evaluated subset may have zero support, which pulls the macro average below the weighted average. For intents with support, per-label F1 ranges from 0.80 to 1.00.)*
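
To illustrate why zero-support labels lower the macro average but leave the weighted average untouched, here is a minimal pure-Python sketch. It is not the evaluation script used for this model: the toy predictions are made up, and it assumes that a label with no true examples and no predictions is scored as F1 = 0.

```python
from collections import Counter

def per_label_f1(y_true, y_pred, labels):
    """Return {label: (f1, support)} for two parallel lists of labels."""
    support = Counter(y_true)
    predicted = Counter(y_pred)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    scores = {}
    for label in labels:
        tp = correct[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / support[label] if support[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0  # assumption: undefined F1 counts as 0
        scores[label] = (f1, support[label])
    return scores

def macro_f1(scores):
    # Every label counts equally, including labels with zero support.
    return sum(f1 for f1, _ in scores.values()) / len(scores)

def weighted_f1(scores):
    # Each label is weighted by its support, so zero-support labels drop out.
    total = sum(n for _, n in scores.values())
    return sum(f1 * n for f1, n in scores.values()) / total

# Toy data (hypothetical): "exchange_rate" never appears in the evaluated subset.
labels = ["card_arrival", "top_up_failed", "exchange_rate"]
y_true = ["card_arrival", "card_arrival", "top_up_failed", "top_up_failed"]
y_pred = ["card_arrival", "card_arrival", "top_up_failed", "top_up_failed"]

scores = per_label_f1(y_true, y_pred, labels)
print(round(macro_f1(scores), 3))  # 0.667
print(weighted_f1(scores))         # 1.0
```

Even with every supported example classified correctly, the unsupported label drags the macro score down, which is the same kind of gap seen between the macro and weighted F1 reported above.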

## Intended Use

This model classifies user queries about banking operations (e.g., card activation, lost cards, top-up failures, exchange rates) into one of 77 specific intents.

**Example Input:**
> "I tried to top up my account using a card but it failed, what should I do?"

**Example Output:**
> `top_up_failed`

## Training Details

The model was fine-tuned with Unsloth using 4-bit quantization and LoRA.

### Training Hyperparameters
- **LoRA Rank (r):** 64
- **LoRA Alpha:** 64
- **Batch Size:** 2 (per device)
- **Gradient Accumulation Steps:** 4
- **Learning Rate:** 5.0e-5
- **Optimizer:** `adamw_8bit`
- **LR Scheduler:** `cosine`
- **Warmup Steps:** 20
- **Weight Decay:** 0.01
- **Epochs:** 6
- **Max Sequence Length:** 512
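
A few derived quantities follow from the values above. The sketch below is only arithmetic under stated assumptions: a single GPU (`num_devices = 1` is not stated in this card), the standard LoRA scaling factor `alpha / r` (that is, no rank-stabilized variant), and a hypothetical training-set size passed to `optimizer_steps`, since the exact number of training examples is not given here.

```python
import math

# Values taken from the hyperparameter list above.
per_device_batch_size = 2
gradient_accumulation_steps = 4
epochs = 6
lora_r = 64
lora_alpha = 64

# Assumption: training ran on one device.
num_devices = 1

# Number of examples contributing to each optimizer update.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices

# Standard LoRA scales the low-rank update by alpha / r (assumed here).
lora_scaling = lora_alpha / lora_r

def optimizer_steps(num_examples, num_epochs=epochs):
    """Approximate total optimizer updates for a dataset of the given size."""
    steps_per_epoch = math.ceil(num_examples / effective_batch_size)
    return steps_per_epoch * num_epochs

print(effective_batch_size)   # 8
print(lora_scaling)           # 1.0
print(optimizer_steps(8000))  # 6000 (8000 is a hypothetical dataset size)
```

With alpha equal to the rank, the adapter update is applied at scale 1.0, and the 20 warmup steps cover only a small fraction of training for any dataset of a few thousand examples.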

## How to Get Started with the Model

Because this is a LoRA adapter, the base model must be loaded first and the PEFT weights applied on top of it. The example below uses the `unsloth` library; standard `transformers` with `peft` can be used as well.

```python
from unsloth import FastLanguageModel
import torch

max_seq_length = 512

# 1. Load the model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "ngbaoan/intent-banking", # This adapter's Hugging Face repo
    max_seq_length = max_seq_length,
    dtype = None,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)

# 2. Format your prompt
prompt = """Instruct: Classify the following banking query into the correct intent.
Query: I lost my card yesterday and I need a replacement.
Intent: """

inputs = tokenizer([prompt], return_tensors = "pt").to("cuda")

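# 3. Generate, then decode only the newly generated tokens (the predicted intent)
outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
new_tokens = outputs[:, inputs["input_ids"].shape[1]:]
print(tokenizer.batch_decode(new_tokens, skip_special_tokens = True)[0].strip())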
```
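
For repeated use, the prompt format shown above can be wrapped in two small helpers. This is a minimal sketch rather than part of the released code: `build_prompt` and `extract_intent` are hypothetical names, and the parsing assumes the model writes the intent label on the first non-empty line after the final `Intent:` marker.

```python
PROMPT_TEMPLATE = (
    "Instruct: Classify the following banking query into the correct intent.\n"
    "Query: {query}\n"
    "Intent: "
)

def build_prompt(query: str) -> str:
    """Fill the prompt format from the example above with a user query."""
    return PROMPT_TEMPLATE.format(query=query.strip())

def extract_intent(decoded: str) -> str:
    """Return the first non-empty line after the last 'Intent:' marker.

    Works whether `decoded` is the full sequence (prompt + completion)
    or only the completion.
    """
    completion = decoded.rsplit("Intent:", 1)[-1]
    for line in completion.splitlines():
        if line.strip():
            return line.strip()
    return ""

prompt = build_prompt("I lost my card yesterday and I need a replacement.")

# Hypothetical decoded text, standing in for real model output.
decoded = prompt + "lost_or_stolen_card\n"
print(extract_intent(decoded))  # lost_or_stolen_card
```

Taking only the first line guards against the model continuing past the label when `max_new_tokens` is larger than the label itself.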

## Framework Versions
- PEFT 0.18.1
- Transformers
- Unsloth
- TRL