# RoBERTa-Base Quantized Model for Intent Classification for Banking Systems
This repository contains a fine-tuned RoBERTa-Base model for **intent classification** on the **Banking77** dataset. The model identifies user intent from natural language queries in the context of banking services.
## Model Details
- **Model Architecture:** RoBERTa Base
- **Task:** Intent Classification
- **Dataset:** Banking77
- **Use Case:** Detecting user intents in banking conversations
- **Fine-tuning Framework:** Hugging Face Transformers
## Usage
### Installation
```bash
pip install transformers torch datasets
```
### Loading the Model
```python
from transformers import RobertaTokenizerFast, RobertaForSequenceClassification
import torch
from datasets import load_dataset
# Load tokenizer and model
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("path_to_your_fine_tuned_model")
model.eval()
# Sample input
text = "I am still waiting on my card?"
# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = torch.argmax(outputs.logits, dim=1).item()
# Load label mapping from dataset
label_map = load_dataset("PolyAI/banking77")["train"].features["label"].int2str
predicted_label = label_map(predicted_class)
print(f"Predicted Intent: {predicted_label}")
```
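The snippet above returns only the single highest-scoring class. When a ranked shortlist with scores is more useful, the logits can be turned into probabilities with a softmax. The following is a minimal sketch in plain Python: the `top_k_intents` helper is hypothetical, the logit values are made up rather than produced by this model, and the label names are placeholders in the style of Banking77.

```python
import math

def top_k_intents(logits, labels, k=3):
    """Return the k most probable (label, probability) pairs."""
    # Subtract the max logit for numerical stability before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical logits for four intents (not real model output).
labels = ["card_arrival", "card_delivery_estimate", "lost_or_stolen_card", "top_up_failed"]
logits = [4.2, 2.1, 0.3, -1.0]

for label, prob in top_k_intents(logits, labels, k=2):
    print(f"{label}: {prob:.3f}")
```

With the real model, `outputs.logits[0].tolist()` would likely supply the logits, and the dataset's label names would replace the placeholder list.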
## Performance Metrics
- **Accuracy:** 0.927922
- **Precision:** 0.931764
- **Recall:** 0.927922
- **F1 Score:** 0.927976
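
The averaging method for precision, recall, and F1 across the 77 classes is not stated here. Recall matching accuracy to six decimals is what support-weighted averaging produces, because weighted recall is algebraically equal to accuracy. The sketch below illustrates this in plain Python under that assumption; `weighted_metrics` is a hypothetical helper and the labels are toy data, not Banking77.

```python
from collections import Counter

def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true examples per class
    predicted = Counter(y_pred)    # predictions per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for cls, count in support.items():
        p = hits[cls] / predicted[cls] if predicted[cls] else 0.0
        r = hits[cls] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n         # class share of all true examples
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(hits.values()) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels (not Banking77 data), three classes.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```

On any input, the `recall` and `accuracy` entries agree up to floating-point rounding, while precision and F1 can differ from them.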
## Fine-Tuning Details
### Dataset
The Banking77 dataset contains 13,083 labeled customer queries spanning 77 banking-related intents, such as checking balances, transferring money, and reporting fraud.
### Training Configuration
- Number of epochs: 5
- Batch size: 16
- Evaluation strategy: epoch
- Learning rate: 2e-5
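
For a sense of scale, the configuration above implies a few thousand optimizer steps. The sketch below is a back-of-the-envelope estimate, assuming training used the standard Banking77 train split of 10,003 examples (the other 3,080 of the 13,083 queries form the test split), a single device, and no gradient accumulation. None of these assumptions is confirmed in this README.

```python
import math

TRAIN_EXAMPLES = 10_003   # assumption: standard Banking77 train split
BATCH_SIZE = 16           # from the training configuration
EPOCHS = 5                # from the training configuration

# The last batch of each epoch is partial, so round up.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

print(f"steps per epoch: {steps_per_epoch}")
print(f"total optimizer steps: {total_steps}")
```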
## Repository Structure
```
.
├── config.json
├── tokenizer_config.json
├── special_tokens_map.json
├── tokenizer.json
├── model.safetensors   # Fine-tuned RoBERTa model
└── README.md           # Documentation
```
## Limitations
- The model may not generalize well to domains outside the fine-tuning dataset.
- Quantization may result in minor accuracy degradation compared to full-precision models.
## Contributing
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.