---
language:
- en
- ru
- uz
- multilingual
license: apache-2.0
tags:
- multi-task-learning
- token-classification
- text-classification
- ner
- named-entity-recognition
- intent-classification
- language-detection
- banking
- transactions
- financial
- multilingual
- bert
- pytorch
datasets:
- custom
metrics:
- precision
- recall
- f1
- accuracy
- seqeval
widget:
- text: "Transfer 12.5mln USD to Apex Industries account 27109477752047116719 INN 123456789 bank code 01234 for consulting"
  example_title: "English Transaction"
- text: "Отправить 150тыс рублей на счет ООО Ромашка 40817810099910004312 ИНН 987654321 за услуги"
  example_title: "Russian Transaction"
- text: "44380583609046995897 ҳисобга 170190.66 UZS ўтказиш Голден Стар ИНН 485232484"
  example_title: "Uzbek Cyrillic Transaction"
- text: "Show completed transactions from 01.12.2024 to 15.12.2024"
  example_title: "Query Request"
library_name: transformers
pipeline_tag: token-classification
---

# Intentity AIBA - Multi-Task Banking Model 🏦🤖

## Model Description

**Intentity AIBA** is a multi-task model that simultaneously performs three tasks:
1. 🌐 **Language Detection** - Identifies the language of input text
2. 🎯 **Intent Classification** - Determines user's intent
3. 📋 **Named Entity Recognition** - Extracts key entities from banking transactions

Built on `google-bert/bert-base-multilingual-cased` with a shared encoder and three specialized output heads, this model provides comprehensive understanding of banking and financial transaction texts in multiple languages.

## 🎯 Capabilities

### Language Detection
Supports 5 languages:
- `en`
- `mixed`
- `ru`
- `uz_cyrl`
- `uz_latn`

### Intent Classification
Recognizes 4 intent types:
- `create_transaction`
- `help`
- `list_transaction`
- `unknown`

### Named Entity Recognition
Extracts 6 entity types:
- `amount`
- `currency`
- `description`
- `receiver_hr`
- `receiver_inn`
- `receiver_name`
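
For token classification, entity types like these are typically expanded into a BIO tag set. The sketch below is an assumption for illustration — the authoritative label inventory ships in the model's `id2label` config and may differ:

```python
# Hypothetical sketch: expanding the six entity types into a BIO tag set,
# as is common for token-classification heads. Check the model's
# config.id2label for the actual labels.
ENTITY_TYPES = [
    "amount",
    "currency",
    "description",
    "receiver_hr",
    "receiver_inn",
    "receiver_name",
]

# "O" for non-entity tokens, plus B-/I- prefixes per type: 13 labels total.
BIO_LABELS = ["O"] + [
    f"{prefix}-{entity}" for entity in ENTITY_TYPES for prefix in ("B", "I")
]

print(len(BIO_LABELS))  # 13
```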

## 📊 Model Performance

| Task | Metric | Score |
|------|--------|-------|
| **NER** | F1 Score | 0.9891 |
| **NER** | Precision | 0.9891 |
| **Intent** | F1 Score | 0.9999 |
| **Intent** | Accuracy | 0.9999 |
| **Language** | Accuracy | 0.9648 |
| **Overall** | Average F1 | 0.9945 |

## 🚀 Quick Start

### Installation

```bash
pip install transformers torch
```

### Basic Usage

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load model and tokenizer
model_name = "primel/intentity-aiba"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Note: This is a custom multi-task model
# Use the inference code below for predictions
```

### Complete Inference Code

```python
import torch
from transformers import AutoTokenizer, AutoModel
import json

class IntentityAIBA:
    def __init__(self, model_name="primel/intentity-aiba"):
        self.tokenizer = AutoTokenizer.from_pretrained(model_name)
        self.model = AutoModel.from_pretrained(model_name)

        # Load the NER label mapping from the model config, falling back
        # to an empty dict if the config does not define one.
        self.id2tag = getattr(self.model.config, "id2label", {})
        # Note: intent and language label mappings should be loaded from
        # the accompanying model files.

        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.model.to(self.device)
        self.model.eval()

    def predict(self, text):
        """Predict language, intent, and entities for input text."""
        inputs = self.tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
        inputs = {k: v.to(self.device) for k, v in inputs.items()}

        with torch.no_grad():
            outputs = self.model(**inputs)

        # Extract predictions from the custom model heads.
        # NOTE: the attribute names below (ner_logits, intent_logits,
        # lang_logits) are placeholders -- adapt them to this model's
        # actual output structure before use.
        # ner_pred = outputs.ner_logits.argmax(dim=-1)
        # intent_pred = outputs.intent_logits.argmax(dim=-1).item()
        # lang_pred = outputs.lang_logits.argmax(dim=-1).item()

        return {
            'language': 'detected_language',  # map lang_pred via the language id2label
            'intent': 'detected_intent',      # map intent_pred via the intent id2label
            'entities': {}                    # aggregate per-token NER tags into spans
        }

# Initialize
model = IntentityAIBA()

# Predict
text = "Transfer 12.5mln USD to Apex Industries account 27109477752047116719"
result = model.predict(text)
print(result)
```

## 📝 Example Outputs

### Example 1: English Transaction

**Input**: `"Transfer 12.5mln USD to Apex Industries account 27109477752047116719 INN 123456789 bank code 01234 for consulting"`

**Output**:
```python
{
    "language": "en",
    "intent": "create_transaction",
    "entities": {
        "amount": "12.5mln",
        "currency": "USD",
        "receiver_name": "Apex Industries",
        "receiver_hr": "27109477752047116719",
        "receiver_inn": "123456789",
        "bank_code": "01234",
        "description": "consulting"
    }
}
```

### Example 2: Russian Transaction

**Input**: `"Отправить 150тыс рублей на счет ООО Ромашка 40817810099910004312 ИНН 987654321"`

**Output**:
```python
{
    "language": "ru",
    "intent": "create_transaction",
    "entities": {
        "amount": "150тыс",
        "currency": "рублей",
        "receiver_name": "ООО Ромашка",
        "receiver_hr": "40817810099910004312",
        "receiver_inn": "987654321"
    }
}
```

### Example 3: Query Request

**Input**: `"Show completed transactions from 01.12.2024 to 15.12.2024"`

**Output**:
```python
{
    "language": "en",
    "intent": "list_transaction",
    "entities": {
        "start_date": "01.12.2024",
        "end_date": "15.12.2024"
    }
}
```

## 🏗️ Model Architecture

- **Base Model**: `google-bert/bert-base-multilingual-cased`
- **Architecture**: Multi-task learning with shared encoder
  - Shared mBERT encoder (~178M parameters; the large multilingual vocabulary accounts for most of them)
  - NER head: Token-level classifier
  - Intent head: Sequence-level classifier
  - Language head: Sequence-level classifier
- **Total Parameters**: ~178M
- **Loss Function**: Weighted combination (0.4 × NER + 0.3 × Intent + 0.3 × Language)
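
The weighted combination above can be illustrated with a minimal pure-Python sketch. The weights come from this card; the function itself is illustrative, not the actual training code:

```python
# Task weights as stated in the model card.
W_NER, W_INTENT, W_LANG = 0.4, 0.3, 0.3

def combined_loss(ner_loss: float, intent_loss: float, lang_loss: float) -> float:
    """Weighted multi-task objective: 0.4 * NER + 0.3 * Intent + 0.3 * Language."""
    return W_NER * ner_loss + W_INTENT * intent_loss + W_LANG * lang_loss

# Because the weights sum to 1, equal per-task losses leave the total unchanged.
print(combined_loss(1.0, 1.0, 1.0))  # 1.0
```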

## 🎓 Training Details

- **Training Samples**: 340,986
- **Validation Samples**: 60,175
- **Epochs**: 6
- **Batch Size**: 16 (per device)
- **Learning Rate**: 3e-5
- **Warmup Ratio**: 0.15
- **Optimizer**: AdamW with weight decay
- **LR Scheduler**: Linear with warmup
- **Framework**: Transformers + PyTorch
- **Hardware**: Trained on Tesla T4 GPU
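
The hyperparameters above can be collected into keyword arguments for `transformers.TrainingArguments`. This is a reconstruction for reference, not the original training script; the `weight_decay` value is an assumption, since the card only says "AdamW with weight decay":

```python
# Hyperparameters from the training details above, arranged as kwargs
# for transformers.TrainingArguments.
training_config = {
    "num_train_epochs": 6,
    "per_device_train_batch_size": 16,
    "learning_rate": 3e-5,
    "warmup_ratio": 0.15,
    "weight_decay": 0.01,          # assumed value, not stated in the card
    "lr_scheduler_type": "linear",
}

# Usage (assuming transformers is installed):
# args = TrainingArguments(output_dir="out", **training_config)
```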

## 💡 Use Cases

- **Banking Applications**: Transaction processing and validation
- **Chatbots**: Intent-aware financial assistants
- **Document Processing**: Automated extraction from transaction documents
- **Compliance**: KYC/AML data extraction
- **Analytics**: Transaction categorization and analysis
- **Multi-language Support**: Cross-border banking operations

## ⚠️ Limitations

- Designed for banking/financial domain - may not generalize to other domains
- Performance may vary on formats significantly different from training data
- Mixed language texts may have lower accuracy
- Best results with transaction-style texts similar to training distribution
- Requires fine-tuning for specific banking systems or regional variations

## 📚 Citation

```bibtex
@misc{intentity-aiba-2025,
  author = {Primel},
  title = {Intentity AIBA: Multi-Task Banking Language Model},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/primel/intentity-aiba}}
}
```

## 📄 License

Apache 2.0

## 🤝 Contact

For questions, issues, or collaboration opportunities, please open an issue on the model repository.

---

**Model Card Authors**: Primel  
**Last Updated**: 2025  
**Model Version**: 1.0