# RoBERTa-Base Quantized Model for Intent Classification for Banking Systems
This repository contains a RoBERTa-Base model, fine-tuned and quantized for **intent classification** on the **Banking77** dataset. The model identifies the user's intent from natural-language queries about banking services.
## Model Details
- **Model Architecture:** RoBERTa Base
- **Task:** Intent Classification
- **Dataset:** Banking77
- **Use Case:** Detecting user intents in banking conversations
- **Fine-tuning Framework:** Hugging Face Transformers
## Usage
### Installation
```bash
pip install transformers torch datasets
```
### Loading the Model
```python
import torch
from datasets import load_dataset
from transformers import RobertaForSequenceClassification, RobertaTokenizerFast

# Load the tokenizer and the fine-tuned model
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("path_to_your_fine_tuned_model")
model.eval()  # disable dropout for inference

# Sample input
text = "I am still waiting on my card?"

# Tokenize and predict without tracking gradients
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = torch.argmax(outputs.logits, dim=1).item()

# Map the class index to its intent name using the dataset's label feature
label_map = load_dataset("PolyAI/banking77")["train"].features["label"].int2str
predicted_label = label_map(predicted_class)
print(f"Predicted Intent: {predicted_label}")
```
## Performance Metrics
- **Accuracy:** 0.927922
- **Precision:** 0.931764
- **Recall:** 0.927922
- **F1 Score:** 0.927976
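The README does not state how precision, recall, and F1 were averaged over the 77 classes. The reported recall is identical to the accuracy, which is what support-weighted averaging always produces, so the sketch below assumes weighted averaging. The function name `weighted_metrics` and the toy labels are illustrative only and are not taken from the original evaluation code.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall, and F1."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    support = Counter(y_true)    # number of true examples per class
    predicted = Counter(y_pred)  # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)

    precision = recall = f1 = 0.0
    for label, n in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n / total  # each class counts in proportion to its support
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(correct.values()) / total
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example with three classes (hypothetical labels, not Banking77 data)
metrics = weighted_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0])
print(metrics)
```

With weighted averaging, each class's recall is scaled by its share of the examples, so the sum collapses to the overall fraction of correct predictions. That is why recall and accuracy coincide.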
## Fine-Tuning Details
### Dataset
The Banking77 dataset contains 13,083 labeled customer queries covering 77 banking-related intents, such as checking balances, transferring money, and reporting fraud.
### Training Configuration
- Number of epochs: 5
- Batch size: 16
- Evaluation strategy: epoch
- Learning rate: 2e-5
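To give a sense of scale for these settings, the sketch below collects them in a dictionary and works out the number of optimizer steps. The keys are chosen to mirror Hugging Face `TrainingArguments` parameter names, but the dictionary is illustrative and is not the original training script. The calculation assumes the standard Banking77 train split of 10,003 examples, a single device, and no gradient accumulation, none of which the README confirms.

```python
import math

# Hyperparameters as listed above.
config = {
    "num_train_epochs": 5,
    "per_device_train_batch_size": 16,
    "learning_rate": 2e-5,
    # Called `evaluation_strategy` in older transformers releases.
    "eval_strategy": "epoch",
}

# Assumption: the standard Banking77 train split size.
TRAIN_EXAMPLES = 10_003


def training_steps(num_examples, batch_size, epochs):
    """Return (steps per epoch, total steps), counting the last partial batch."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


steps_per_epoch, total_steps = training_steps(
    TRAIN_EXAMPLES,
    config["per_device_train_batch_size"],
    config["num_train_epochs"],
)
print(steps_per_epoch, total_steps)  # 626 3130
```

Under these assumptions the model is evaluated five times, once after each block of 626 steps.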
## Repository Structure
```
.
├── config.json
├── tokenizer_config.json
├── special_tokens_map.json
├── tokenizer.json
├── model.safetensors        # Fine-tuned RoBERTa model
└── README.md                # Documentation
```
## Limitations
- The model may not generalize well to domains outside the fine-tuning dataset.
- Quantization may result in minor accuracy degradation compared to full-precision models.
## Contributing
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.