---

license: cc-by-4.0
language:
  - en
tags:
  - onnx
  - ner
  - transaction-extraction
  - sms-parsing
  - gliner2
  - deberta
  - on-device
  - mobile
library_name: onnxruntime
pipeline_tag: token-classification
---


# Model Card: fintext-extractor

fintext-extractor is a GLiNER2-based two-stage NER model that extracts structured transaction data from bank SMS and push notifications. It is designed for on-device inference on mobile and desktop, with ONNX Runtime as the inference backend.

## Architecture

fintext-extractor uses a **two-stage pipeline** to maximize both speed and accuracy:

1. **Stage 1 -- Classification:** A DeBERTa-v3-large binary classifier determines whether an incoming message is a completed transaction (`is_transaction: yes/no`). Non-transaction messages (OTPs, promotional alerts, balance reminders) are filtered out early, keeping latency low.

2. **Stage 2 -- Extraction:** A GLiNER2-large extraction model with a LoRA adapter runs only on messages classified as transactions. It extracts structured fields: amount, date, transaction type, description, and masked account digits.

This two-stage design means the heavier extraction model is invoked only when needed, reducing average inference cost on mixed message streams.
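The dispatch logic above can be sketched in a few lines. The function and stub stages here are purely illustrative (not part of the `fintext` library); the real pipeline would plug the classifier and extractor models into the two callables:

```python
def process_message(text, classify, extract):
    """Two-stage pipeline: classify first, run extraction only if needed."""
    if not classify(text):
        # Non-transaction messages (OTPs, promos) short-circuit here.
        return {"is_transaction": False}
    result = extract(text)
    result["is_transaction"] = True
    return result

# Stub stages standing in for the real models:
out = process_message(
    "Rs.5,000 debited from a/c XX1234",
    classify=lambda t: "debited" in t,
    extract=lambda t: {"transaction_amount": 5000.0},
)
```

Because `extract` never runs on filtered messages, the average cost over a mixed stream is dominated by the cheaper classification stage.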

## Extracted Fields

| Field | Type | Description |
|-------|------|-------------|
| `is_transaction` | bool | Whether the message is a completed transaction |
| `transaction_amount` | float | Numeric amount (e.g., 5000.00) |
| `transaction_type` | str | DEBIT or CREDIT |
| `transaction_date` | str | Date in DD-MM-YYYY format |
| `transaction_description` | str | Merchant or person name |
| `masked_account_digits` | str | Last 4 digits of card/account |

## Model Files

| File | Size | Description |
|------|------|-------------|
| `onnx/deberta_classifier_fp16.onnx` + `.data` | ~830 MB | Classification model (FP16) |
| `onnx/deberta_classifier_fp32.onnx` + `.data` | ~1.66 GB | Classification model (FP32) |
| `onnx/extraction_full_fp16.onnx` + `.data` | ~930 MB | Extraction model (FP16) |
| `onnx/extraction_full_fp32.onnx` + `.data` | ~1.9 GB | Extraction model (FP32) |
| `tokenizer/` | ~11 MB | Classification tokenizer |
| `tokenizer_extraction/` | ~11 MB | Extraction tokenizer |

FP16 variants are recommended for most use cases. FP32 variants are provided for environments that do not support half-precision.
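A small helper can make the precision choice explicit in application code. This helper is illustrative only (not part of any library); the returned path is passed to `ort.InferenceSession` as in the examples below:

```python
def classifier_path(use_fp16: bool = True) -> str:
    """Return the classifier model path for the chosen precision.

    FP16 is the recommended default; FP32 is a drop-in fallback for
    runtimes or hardware without half-precision support.
    """
    variant = "fp16" if use_fp16 else "fp32"
    return f"onnx/deberta_classifier_{variant}.onnx"
```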

## Quick Start (Python)

```python
from fintext import FintextExtractor

extractor = FintextExtractor.from_pretrained("Sowrabhm/fintext-extractor")
result = extractor.extract("Rs.5,000 debited from a/c XX1234 for Amazon Pay on 08-Mar-26")
print(result)
# {'is_transaction': True, 'transaction_amount': 5000.0, 'transaction_type': 'DEBIT',
#  'transaction_date': '08-03-2026', 'transaction_description': 'Amazon Pay',
#  'masked_account_digits': '1234'}
```

## Direct ONNX Runtime Usage

If you prefer not to install the `fintext` library, you can run the ONNX models directly:

```python
import numpy as np
import onnxruntime as ort
from tokenizers import Tokenizer

# Load classification model and tokenizer
cls_session = ort.InferenceSession("onnx/deberta_classifier_fp16.onnx")
tokenizer = Tokenizer.from_file("tokenizer/tokenizer.json")

# Tokenize input
text = "Rs.5,000 debited from a/c XX1234 for Amazon Pay on 08-Mar-26"
encoding = tokenizer.encode(text)
input_ids = np.array([encoding.ids], dtype=np.int64)
attention_mask = np.array([encoding.attention_mask], dtype=np.int64)

# Run classification
cls_output = cls_session.run(None, {
    "input_ids": input_ids,
    "attention_mask": attention_mask,
})
is_transaction = np.argmax(cls_output[0], axis=-1)[0] == 1

# If classified as a transaction, run extraction
if is_transaction:
    ext_session = ort.InferenceSession("onnx/extraction_full_fp16.onnx")
    ext_tokenizer = Tokenizer.from_file("tokenizer_extraction/tokenizer.json")
    # ... tokenize and run extraction session
```

## Training

The models were fine-tuned from the following base checkpoints:

- **Classifier:** [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) with LoRA (r=16, alpha=32)
- **Extractor:** [fastino/gliner2-large-v1](https://huggingface.co/fastino/gliner2-large-v1) with LoRA extraction adapter

Training used the GLiNER2 multi-task schema, combining binary classification (`is_transaction`) with structured extraction (`transaction_info`) in a single training loop. LoRA adapters keep the trainable parameter count low, enabling fine-tuning on consumer GPUs.
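The LoRA hyperparameters named above (r=16, alpha=32) would look roughly like this as a Hugging Face PEFT config. This is a hedged sketch, not the actual training configuration: the target modules, dropout, and task type shown here are assumptions, as the card does not specify them:

```python
from peft import LoraConfig

# Sketch of a LoRA setup matching the stated r=16, alpha=32.
# target_modules, lora_dropout, and task_type are assumptions.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["query_proj", "value_proj"],
    lora_dropout=0.1,
    task_type="SEQ_CLS",
)
```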

## Metrics

| Metric | Value |
|--------|-------|
| Classification accuracy | 0.80 |
| Amount extraction accuracy | 1.00 |
| Type extraction accuracy | 1.00 |
| Digits extraction accuracy | 1.00 |
| Avg latency (FP16, CPU) | 47 ms |

Metrics were evaluated on a held-out test split. Latency was measured with a single-threaded ONNX Runtime CPU session.
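Latency figures like the one above can be reproduced with a simple timing harness. The sketch below times any callable and is illustrative, not the harness actually used; for a single-threaded ONNX Runtime session, set `sess_options.intra_op_num_threads = 1` when creating the session and pass `lambda: session.run(...)` as the callable:

```python
import time
import statistics

def mean_latency_ms(run_fn, warmup: int = 5, iters: int = 50) -> float:
    """Average wall-clock latency of run_fn in milliseconds."""
    # Warm-up runs absorb one-time allocation and graph-optimization costs.
    for _ in range(warmup):
        run_fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        run_fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    return statistics.mean(samples)
```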

## Limitations

- **Regional focus:** Primarily trained on Indian bank SMS formats (Rs., INR, currency symbols common in India). Performance on other regional formats has not been evaluated.
- **English only:** The model supports English language messages only.
- **Span extraction, not generation:** Field values must exist verbatim in the input text. The model extracts spans rather than generating new text.
- **Synthetic evaluation data:** The evaluation metrics above were computed on synthetic data. Real-world accuracy may differ.

## Use Cases

- Personal finance apps
- Expense tracking and categorization
- Transaction monitoring and alerting
- Bank statement reconciliation from SMS/notifications

## License

This model is released under the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/) license.

## Links

- **GitHub:** [https://github.com/sowrabhmv/fintext-extractor](https://github.com/sowrabhmv/fintext-extractor)
- **Notebooks:** See the GitHub repo for cookbook examples and training notebooks