---
base_model: unsloth/qwen2.5-7b-unsloth-bnb-4bit
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:unsloth/qwen2.5-7b-unsloth-bnb-4bit
- lora
- sft
- transformers
- trl
- unsloth
- intent-classification
- banking77
---

# Qwen2.5-7B Banking Intent Classification

This is a LoRA adapter fine-tuned on the **BANKING77** dataset for fine-grained intent classification in the banking domain. It is built on the 4-bit base model `unsloth/qwen2.5-7b-unsloth-bnb-4bit` and was trained with the [Unsloth](https://github.com/unslothai/unsloth) library for efficient training.

## Model Details

- **Model Type:** Causal Language Model with LoRA adapter
- **Developer:** ngbaoan
- **Base Model:** `unsloth/qwen2.5-7b-unsloth-bnb-4bit`
- **Language:** English
- **Task:** Intent Classification
- **Dataset:** [BANKING77](https://huggingface.co/datasets/banking77) (77 distinct banking-related intents)

## Performance

The model was evaluated on the test set and achieved the following results:

- **Accuracy:** **92.29%** (0.9229)
- **Macro F1-Score:** 0.85
- **Weighted F1-Score:** 0.92

*(Note: Some labels in the dataset subset might have 0 support, which lowers the macro average. For supported intents, the F1 score ranges from 0.80 to 1.00.)*

## Intended Use

This model is designed to classify user queries about banking operations (e.g., card activation, lost cards, top-up failures, exchange rates) into one of 77 specific intents.

**Example Input:**
> "I tried to top up my account using a card but it failed, what should I do?"

**Example Output:**
> `top_up_failed`

## Training Details

The model was fine-tuned efficiently using Unsloth with 4-bit quantization and LoRA.
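
To give a feel for why LoRA keeps fine-tuning cheap, here is a minimal back-of-the-envelope sketch. It assumes standard LoRA scaling (`alpha / r`, not the rank-stabilized variant) and uses a hypothetical 4096 x 4096 projection matrix purely for illustration; the helper names `lora_scale` and `param_counts` are made up for this example and are not part of PEFT or Unsloth.

```python
def lora_scale(alpha: float, r: int) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return alpha / r


def param_counts(d_in: int, d_out: int, r: int) -> tuple[int, int]:
    """Return (full_matrix_params, lora_params) for one linear layer.

    LoRA trains A (r x d_in) and B (d_out x r) instead of the full
    d_out x d_in weight matrix, which stays frozen.
    """
    full = d_in * d_out
    lora = r * (d_in + d_out)
    return full, lora


# Rank and alpha as listed in this card's hyperparameters.
print(lora_scale(alpha=64, r=64))

# Hypothetical layer size, chosen only to illustrate the ratio.
full, lora = param_counts(d_in=4096, d_out=4096, r=64)
print(full, lora)
```

With rank and alpha both set to 64, the low-rank update is added to the frozen weights without extra rescaling, and the trainable parameters per layer are a small fraction of the full matrix.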

### Training Hyperparameters

- **LoRA Rank (r):** 64
- **LoRA Alpha:** 64
- **Batch Size:** 2 (per device)
- **Gradient Accumulation Steps:** 4
- **Learning Rate:** 5.0e-5
- **Optimizer:** `adamw_8bit`
- **LR Scheduler:** `cosine`
- **Warmup Steps:** 20
- **Weight Decay:** 0.01
- **Epochs:** 6
- **Max Sequence Length:** 512

## How to Get Started with the Model

Because this is a LoRA adapter, the base model must be loaded first and the PEFT weights applied on top of it. The easiest way is to use the `unsloth` library (shown below) or standard `transformers` with `peft`.

```python
from unsloth import FastLanguageModel
import torch

max_seq_length = 512

# 1. Load the model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "ngbaoan/intent-banking",  # LoRA adapter repo on the Hugging Face Hub
    max_seq_length = max_seq_length,
    dtype = None,          # let Unsloth pick the dtype for the GPU
    load_in_4bit = True,   # matches the 4-bit base model
)
FastLanguageModel.for_inference(model)  # switch to inference mode

# 2. Format your prompt
prompt = """Instruct: Classify the following banking query into the correct intent.
Query: I lost my card yesterday and I need a replacement.
Intent: """

inputs = tokenizer([prompt], return_tensors = "pt").to("cuda")

# 3. Generate the response
outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)

# Decoding the full sequence prints the prompt followed by the generated intent
print(tokenizer.batch_decode(outputs, skip_special_tokens = True)[0])
```

## Framework Versions

- PEFT 0.18.1
- Transformers
- Unsloth
- TRL
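
## Post-processing the Output

The generation example in "How to Get Started with the Model" decodes the whole sequence, so the printed text contains the prompt as well as the predicted label. The following is a minimal sketch of one way to pull out just the label. It assumes the `Instruct / Query / Intent:` prompt layout shown in that example and that the model writes the label on the first non-empty line after `Intent:`; `extract_intent` is a hypothetical helper, not part of Unsloth or Transformers.

```python
def extract_intent(decoded: str, marker: str = "Intent:") -> str:
    """Return the first non-empty line following the first `marker`.

    If `marker` is absent (for example, only the newly generated tokens
    were decoded), the whole string is treated as the completion.
    Returns an empty string when no label text is found.
    """
    _, found, tail = decoded.partition(marker)
    completion = tail if found else decoded
    for line in completion.splitlines():
        line = line.strip()
        if line:
            return line
    return ""


# Illustrative decoded text, not real model output.
sample = (
    "Instruct: Classify the following banking query into the correct intent.\n"
    "Query: I tried to top up my account using a card but it failed.\n"
    "Intent: top_up_failed\n"
)
print(extract_intent(sample))
```

In practice it may also be worth checking the returned string against the list of 77 BANKING77 labels and treating anything else as an unrecognized prediction.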