---
language:
- en
- ru
- uz
- multilingual
license: apache-2.0
tags:
- multi-task-learning
- token-classification
- text-classification
- ner
- named-entity-recognition
- intent-classification
- language-detection
- banking
- transactions
- financial
- multilingual
- bert
- pytorch
datasets:
- custom
metrics:
- precision
- recall
- f1
- accuracy
- seqeval
widget:
- text: "Transfer 12.5mln USD to Apex Industries account 27109477752047116719 INN 123456789 bank code 01234 for consulting"
example_title: "English Transaction"
- text: "Отправить 150тыс рублей на счет ООО Ромашка 40817810099910004312 ИНН 987654321 за услуги"
example_title: "Russian Transaction"
- text: "44380583609046995897 ҳисобга 170190.66 UZS ўтказиш Голден Стар ИНН 485232484"
example_title: "Uzbek Cyrillic Transaction"
- text: "Show completed transactions from 01.12.2024 to 15.12.2024"
example_title: "Query Request"
library_name: transformers
pipeline_tag: token-classification
---
# Intentity AIBA - Multi-Task Banking Model 🏦🤖
## Model Description
**Intentity AIBA** is a multi-task model that simultaneously performs three tasks over a single shared encoder:
1. 🌐 **Language Detection** - Identifies the language of the input text
2. 🎯 **Intent Classification** - Determines the user's intent
3. 📋 **Named Entity Recognition** - Extracts key entities from banking transactions
Built on `google-bert/bert-base-multilingual-cased` with a shared encoder and three specialized output heads, this model provides comprehensive understanding of banking and financial transaction texts in multiple languages.
## 🎯 Capabilities
### Language Detection
Classifies the input into one of 5 language labels (including `mixed` for code-switched text):
- `en`
- `mixed`
- `ru`
- `uz_cyrl`
- `uz_latn`
### Intent Classification
Recognizes 4 intent types:
- `create_transaction`
- `help`
- `list_transaction`
- `unknown`
### Named Entity Recognition
Extracts 6 entity types:
- `amount`
- `currency`
- `description`
- `receiver_hr`
- `receiver_inn`
- `receiver_name`
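Token-level NER predictions are typically emitted as BIO tags and then grouped into the entity fields above. A minimal, model-agnostic sketch of that decoding step (the tag names follow the entity list above; the tokens are illustrative):

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into an {entity_type: text} mapping."""
    entities = {}
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current_type:  # flush the previous entity
                entities[current_type] = " ".join(current_tokens)
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:  # "O" tag or a type mismatch ends the current entity
            if current_type:
                entities[current_type] = " ".join(current_tokens)
            current_type, current_tokens = None, []
    if current_type:
        entities[current_type] = " ".join(current_tokens)
    return entities

tokens = ["Transfer", "12.5mln", "USD", "to", "Apex", "Industries"]
tags = ["O", "B-amount", "B-currency", "O", "B-receiver_name", "I-receiver_name"]
print(decode_bio(tokens, tags))
# {'amount': '12.5mln', 'currency': 'USD', 'receiver_name': 'Apex Industries'}
```

In practice the same grouping is applied over wordpiece-aligned tags from the NER head before mapping spans back to the original text.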
## 📊 Model Performance
| Task | Metric | Score |
|------|--------|-------|
| **NER** | F1 Score | 0.9891 |
| **NER** | Precision | 0.9891 |
| **Intent** | F1 Score | 0.9999 |
| **Intent** | Accuracy | 0.9999 |
| **Language** | Accuracy | 0.9648 |
| **Overall** | Average F1 | 0.9945 |
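The NER scores above are entity-level (seqeval-style): an entity counts as correct only when both its type and its exact span match the gold annotation. A pure-Python illustration of that metric on toy data (not the model's actual evaluation code):

```python
def entity_f1(gold, pred):
    """Entity-level F1 over sets of (entity_type, start, end) spans."""
    tp = len(gold & pred)  # exact type + span matches
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = {("amount", 1, 1), ("currency", 2, 2), ("receiver_name", 4, 5)}
pred = {("amount", 1, 1), ("currency", 2, 2), ("receiver_name", 4, 4)}
print(round(entity_f1(gold, pred), 3))  # 0.667 - the truncated span counts as an error
```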
## 🚀 Quick Start
### Installation
```bash
pip install transformers torch
```
### Basic Usage
```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load model and tokenizer
model_name = "primel/intentity-aiba"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Note: this is a custom multi-task model. AutoModel loads the shared
# encoder only; use the inference code below for full predictions.
```
### Complete Inference Code
```python
import torch
from transformers import AutoTokenizer, AutoModel


class IntentityAIBA:
    def __init__(self, model_name="primel/intentity-aiba"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModel.from_pretrained(model_name)
        # Load NER label mappings from the model config where available
        self.id2tag = getattr(self.model.config, "id2label", {})
        # Note: intent and language label mappings should be loaded from the model files
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.model.to(self.device)
        self.model.eval()

    def predict(self, text):
        """Predict language, intent, and entities for the input text."""
        inputs = self.tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
        inputs = {k: v.to(self.device) for k, v in inputs.items()}
        with torch.no_grad():
            outputs = self.model(**inputs)
        # Extract predictions from the custom model heads here
        # (the exact logic depends on the model architecture)
        return {
            "language": "detected_language",
            "intent": "detected_intent",
            "entities": {},
        }


# Initialize
model = IntentityAIBA()

# Predict
text = "Transfer 12.5mln USD to Apex Industries account 27109477752047116719"
result = model.predict(text)
print(result)
```
## 📝 Example Outputs
### Example 1: English Transaction
**Input**: `"Transfer 12.5mln USD to Apex Industries account 27109477752047116719 INN 123456789 bank code 01234 for consulting"`
**Output**:
```python
{
"language": "en",
"intent": "create_transaction",
"entities": {
"amount": "12.5mln",
"currency": "USD",
"receiver_name": "Apex Industries",
"receiver_hr": "27109477752047116719",
"receiver_inn": "123456789",
"bank_code": "01234",
"description": "consulting"
}
}
```
### Example 2: Russian Transaction
**Input**: `"Отправить 150тыс рублей на счет ООО Ромашка 40817810099910004312 ИНН 987654321"`
**Output**:
```python
{
"language": "ru",
"intent": "create_transaction",
"entities": {
"amount": "150тыс",
"currency": "рублей",
"receiver_name": "ООО Ромашка",
"receiver_hr": "40817810099910004312",
"receiver_inn": "987654321"
}
}
```
### Example 3: Query Request
**Input**: `"Show completed transactions from 01.12.2024 to 15.12.2024"`
**Output**:
```python
{
"language": "en",
"intent": "list_transaction",
"entities": {
"start_date": "01.12.2024",
"end_date": "15.12.2024"
}
}
```
## 🏗️ Model Architecture
- **Base Model**: `google-bert/bert-base-multilingual-cased`
- **Architecture**: Multi-task learning with shared encoder
- Shared BERT encoder (110M parameters)
- NER head: Token-level classifier
- Intent head: Sequence-level classifier
- Language head: Sequence-level classifier
- **Total Parameters**: ~178M
- **Loss Function**: Weighted combination (0.4 × NER + 0.3 × Intent + 0.3 × Language)
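The head layout and loss weighting above can be sketched as follows. This is an illustrative PyTorch snippet, not the released checkpoint's code: the hidden size, the [CLS] pooling, and the tag count (6 entity types × B/I + O = 13, assuming a BIO scheme) are assumptions for the sketch.

```python
import torch
import torch.nn as nn


class MultiTaskHeads(nn.Module):
    """Illustrative three-head layout over a shared encoder's hidden states."""

    def __init__(self, hidden=768, n_tags=13, n_intents=4, n_langs=5):
        super().__init__()
        self.ner_head = nn.Linear(hidden, n_tags)        # per-token logits
        self.intent_head = nn.Linear(hidden, n_intents)  # sequence-level logits
        self.language_head = nn.Linear(hidden, n_langs)  # sequence-level logits

    def forward(self, hidden_states):
        pooled = hidden_states[:, 0]  # assumed [CLS] pooling
        return (
            self.ner_head(hidden_states),
            self.intent_head(pooled),
            self.language_head(pooled),
        )


def combined_loss(ner_loss, intent_loss, language_loss):
    # Weighted objective from the card: 0.4 × NER + 0.3 × Intent + 0.3 × Language
    return 0.4 * ner_loss + 0.3 * intent_loss + 0.3 * language_loss


h = torch.randn(2, 16, 768)  # (batch, seq_len, hidden)
ner, intent, lang = MultiTaskHeads()(h)
print(ner.shape, intent.shape, lang.shape)
# torch.Size([2, 16, 13]) torch.Size([2, 4]) torch.Size([2, 5])
```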
## 🎓 Training Details
- **Training Samples**: 340,986
- **Validation Samples**: 60,175
- **Epochs**: 6
- **Batch Size**: 16 (per device)
- **Learning Rate**: 3e-5
- **Warmup Ratio**: 0.15
- **Optimizer**: AdamW with weight decay
- **LR Scheduler**: Linear with warmup
- **Framework**: Transformers + PyTorch
- **Hardware**: Trained on Tesla T4 GPU
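With a linear-warmup/linear-decay schedule and warmup ratio 0.15, the learning-rate multiplier as a function of training progress looks like this (a pure-Python illustration of the schedule's shape, not the training script; step counts are arbitrary):

```python
def lr_multiplier(step, total_steps, warmup_ratio=0.15):
    """Linear warmup to 1.0 over the first warmup_ratio of steps, then linear decay to 0.0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

total = 1000  # warmup covers the first 150 steps
print(lr_multiplier(75, total))   # 0.5 - halfway through warmup
print(lr_multiplier(150, total))  # 1.0 - peak learning rate
print(lr_multiplier(575, total))  # 0.5 - halfway through decay
```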
## 💡 Use Cases
- **Banking Applications**: Transaction processing and validation
- **Chatbots**: Intent-aware financial assistants
- **Document Processing**: Automated extraction from transaction documents
- **Compliance**: KYC/AML data extraction
- **Analytics**: Transaction categorization and analysis
- **Multi-language Support**: Cross-border banking operations
## ⚠️ Limitations
- Designed for banking/financial domain - may not generalize to other domains
- Performance may vary on formats significantly different from training data
- Mixed language texts may have lower accuracy
- Best results with transaction-style texts similar to training distribution
- Requires fine-tuning for specific banking systems or regional variations
## 📚 Citation
```bibtex
@misc{intentity-aiba-2025,
author = {Primel},
title = {Intentity AIBA: Multi-Task Banking Language Model},
year = {2025},
publisher = {Hugging Face},
journal = {Hugging Face Model Hub},
howpublished = {\url{https://huggingface.co/primel/intentity-aiba}}
}
```
## 📄 License
Apache 2.0
## 🤝 Contact
For questions, issues, or collaboration opportunities, please open an issue on the model repository.
---
**Model Card Authors**: Primel
**Last Updated**: 2025
**Model Version**: 1.0