# RoBERTa-Base Quantized Model for Intent Classification in Banking Systems
|
|
This repository contains a quantized, fine-tuned RoBERTa-Base model for **intent classification** on the **Banking77** dataset. The model identifies user intent from natural-language queries in the context of banking services.
|
|
| ## Model Details |
|
|
| - **Model Architecture:** RoBERTa Base |
| - **Task:** Intent Classification |
| - **Dataset:** Banking77 |
| - **Use Case:** Detecting user intents in banking conversations |
| - **Fine-tuning Framework:** Hugging Face Transformers |
|
|
| ## Usage |
|
|
| ### Installation |
|
|
| ```bash |
| pip install transformers torch datasets |
| ``` |
|
|
| ### Loading the Model |
|
|
| ```python |
from transformers import RobertaTokenizerFast, RobertaForSequenceClassification
from datasets import load_dataset
import torch

# Load tokenizer and model; the repository ships its own tokenizer files,
# so both can be loaded from the fine-tuned model directory
model_path = "path_to_your_fine_tuned_model"
tokenizer = RobertaTokenizerFast.from_pretrained(model_path)
model = RobertaForSequenceClassification.from_pretrained(model_path)
model.eval()

# Sample input
text = "I am still waiting on my card?"

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = torch.argmax(outputs.logits, dim=1).item()

# Map the predicted class id back to its intent name via the dataset's label feature
label_map = load_dataset("PolyAI/banking77")["train"].features["label"].int2str
predicted_label = label_map(predicted_class)

print(f"Predicted Intent: {predicted_label}")
| ``` |
|
|
| ## Performance Metrics |
|
|
| - **Accuracy:** 0.927922 |
| - **Precision:** 0.931764 |
| - **Recall:** 0.927922 |
| - **F1 Score:** 0.927976 |
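
These values are consistent with weighted averaging over the 77 classes (with weighted averaging, recall equals accuracy, as it does here). A sketch of how they could be recomputed on the Banking77 test split, reusing the `tokenizer` and `model` loaded above (requires `scikit-learn`, which is not in the install line above; the batch size of 32 is arbitrary):

```python
from datasets import load_dataset
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
import torch

test = load_dataset("PolyAI/banking77", split="test")

y_true, y_pred = [], []
for start in range(0, len(test), 32):
    batch = test[start:start + 32]  # slicing a Dataset yields a dict of lists
    enc = tokenizer(batch["text"], return_tensors="pt",
                    truncation=True, padding=True)
    with torch.no_grad():
        logits = model(**enc).logits
    y_pred.extend(logits.argmax(dim=1).tolist())
    y_true.extend(batch["label"])

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted"
)
print(f"accuracy={accuracy:.6f} precision={precision:.6f} "
      f"recall={recall:.6f} f1={f1:.6f}")
```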
|
|
| ## Fine-Tuning Details |
|
|
| ### Dataset |
|
|
| The Banking77 dataset contains 13,083 labeled queries across 77 banking-related intents, including tasks like checking balances, transferring money, and reporting fraud. |
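
If you want to inspect the label space directly, the dataset can be loaded from the Hub (a quick sketch using the `datasets` package installed above):

```python
from datasets import load_dataset

banking77 = load_dataset("PolyAI/banking77")
label_feature = banking77["train"].features["label"]

print(banking77)                  # train/test splits and their sizes
print(label_feature.num_classes)  # 77
print(label_feature.names[:3])    # first few intent names
```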
|
|
| ### Training Configuration |
|
|
| - Number of epochs: 5 |
| - Batch size: 16 |
| - Evaluation strategy: epoch |
| - Learning rate: 2e-5 |
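
These settings map directly onto Hugging Face `TrainingArguments`; a minimal sketch of a matching configuration (the output directory is a placeholder, and any argument not listed above is left at its default):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./roberta-banking77",  # placeholder output path
    num_train_epochs=5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    evaluation_strategy="epoch",  # renamed "eval_strategy" in newer transformers
    learning_rate=2e-5,
)
```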
|
|
| ## Repository Structure |
|
|
| ``` |
| . |
├── config.json
├── tokenizer_config.json
├── special_tokens_map.json
├── tokenizer.json
├── model.safetensors        # Fine-tuned RoBERTa model
└── README.md                # Documentation
| ``` |
|
|
| ## Limitations |
| |
- The model may not generalize well to domains outside the fine-tuning dataset.
- Quantization may result in minor accuracy degradation compared to full-precision models.
| |
| ## Contributing |
| |
| Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements. |
|
|