---
license: cc-by-4.0
language:
- en
tags:
- onnx
- ner
- transaction-extraction
- sms-parsing
- gliner2
- deberta
- on-device
- mobile
library_name: onnxruntime
pipeline_tag: token-classification
---
# Model Card: fintext-extractor
GLiNER2-based two-stage NER model that extracts structured transaction data from bank SMS and push notifications. Designed for on-device inference on mobile and desktop, with ONNX Runtime as the inference backend.
## Architecture
fintext-extractor uses a **two-stage pipeline** to maximize both speed and accuracy:
1. **Stage 1 -- Classification:** A DeBERTa-v3-large binary classifier determines whether an incoming message is a completed transaction (`is_transaction: yes/no`). Non-transaction messages (OTPs, promotional alerts, balance reminders) are filtered out early, keeping latency low.
2. **Stage 2 -- Extraction:** A GLiNER2-large extraction model with a LoRA adapter runs only on messages classified as transactions. It extracts structured fields: amount, date, transaction type, description, and masked account digits.
This two-stage design means the heavier extraction model is invoked only when needed, reducing average inference cost on mixed message streams.
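The gating logic can be sketched as follows. `classify` and `extract` are hypothetical stand-ins for the two ONNX sessions; the stubs in the example exist only to show the control flow:

```python
from typing import Callable, Optional


def two_stage_pipeline(
    message: str,
    classify: Callable[[str], bool],   # Stage 1: is this a completed transaction?
    extract: Callable[[str], dict],    # Stage 2: structured field extraction
) -> Optional[dict]:
    """Invoke the heavier extractor only on messages the classifier accepts."""
    if not classify(message):
        return None  # OTPs, promos, and balance reminders are filtered here
    return extract(message)


# Stub stages standing in for the DeBERTa classifier and GLiNER2 extractor:
result = two_stage_pipeline(
    "Rs.5,000 debited from a/c XX1234",
    classify=lambda m: "debited" in m or "credited" in m,
    extract=lambda m: {"is_transaction": True},
)
```

On a mixed stream, average cost approaches Stage 1 alone whenever most messages are non-transactions.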
## Extracted Fields
| Field | Type | Description |
|-------|------|-------------|
| `is_transaction` | bool | Whether the message is a completed transaction |
| `transaction_amount` | float | Numeric amount (e.g., 5000.00) |
| `transaction_type` | str | DEBIT or CREDIT |
| `transaction_date` | str | Date in DD-MM-YYYY format |
| `transaction_description` | str | Merchant or person name |
| `masked_account_digits` | str | Last 4 digits of card/account |
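Since the model extracts spans verbatim (see Limitations), downstream code typically normalizes raw spans into the types above. The helpers below are an illustrative sketch, not part of the model or the `fintext` library:

```python
import re
from datetime import datetime


def normalize_amount(raw: str) -> float:
    """Pull the numeric value out of a span like 'Rs.5,000.00' -> 5000.0."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        raise ValueError(f"no amount found in {raw!r}")
    return float(match.group().replace(",", ""))


def normalize_date(raw: str) -> str:
    """Convert a 'DD-Mon-YY' span (e.g. '08-Mar-26') to DD-MM-YYYY."""
    return datetime.strptime(raw, "%d-%b-%y").strftime("%d-%m-%Y")
```

Real SMS date formats vary widely, so production code would try several `strptime` patterns rather than one.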
## Model Files
| File | Size | Description |
|------|------|-------------|
| `onnx/deberta_classifier_fp16.onnx` + `.data` | ~830 MB | Classification model (FP16) |
| `onnx/deberta_classifier_fp32.onnx` + `.data` | ~1.66 GB | Classification model (FP32) |
| `onnx/extraction_full_fp16.onnx` + `.data` | ~930 MB | Extraction model (FP16) |
| `onnx/extraction_full_fp32.onnx` + `.data` | ~1.9 GB | Extraction model (FP32) |
| `tokenizer/` | ~11 MB | Classification tokenizer |
| `tokenizer_extraction/` | ~11 MB | Extraction tokenizer |
FP16 variants are recommended for most use cases. FP32 variants are provided for environments that do not support half-precision.
## Quick Start (Python)
```python
from fintext import FintextExtractor
extractor = FintextExtractor.from_pretrained("Sowrabhm/fintext-extractor")
result = extractor.extract("Rs.5,000 debited from a/c XX1234 for Amazon Pay on 08-Mar-26")
print(result)
# {'is_transaction': True, 'transaction_amount': 5000.0, 'transaction_type': 'DEBIT',
# 'transaction_date': '08-03-2026', 'transaction_description': 'Amazon Pay',
# 'masked_account_digits': '1234'}
```
## Direct ONNX Runtime Usage
If you prefer not to install the `fintext` library, you can run the ONNX models directly:
```python
import numpy as np
import onnxruntime as ort
from tokenizers import Tokenizer
# Load classification model and tokenizer
cls_session = ort.InferenceSession("onnx/deberta_classifier_fp16.onnx")
tokenizer = Tokenizer.from_file("tokenizer/tokenizer.json")
# Tokenize input
text = "Rs.5,000 debited from a/c XX1234 for Amazon Pay on 08-Mar-26"
encoding = tokenizer.encode(text)
input_ids = np.array([encoding.ids], dtype=np.int64)
attention_mask = np.array([encoding.attention_mask], dtype=np.int64)
# Run classification
cls_output = cls_session.run(None, {
    "input_ids": input_ids,
    "attention_mask": attention_mask,
})
is_transaction = np.argmax(cls_output[0], axis=-1)[0] == 1
# If classified as a transaction, run extraction
if is_transaction:
    ext_session = ort.InferenceSession("onnx/extraction_full_fp16.onnx")
    ext_tokenizer = Tokenizer.from_file("tokenizer_extraction/tokenizer.json")
    # ... tokenize and run extraction session
```
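The extraction model scores candidate spans per field; the exact output tensor names depend on the ONNX export, so they are not shown here. Whatever the head emits, decoding ultimately reduces to mapping character offsets back onto the source text, which can be sketched as:

```python
# Hypothetical post-processing step: `spans` maps each field name to
# (start, end) character offsets produced by the extraction head.
def decode_spans(text: str, spans: dict) -> dict:
    """Recover verbatim field values from character offsets into `text`."""
    return {field: text[start:end] for field, (start, end) in spans.items()}


text = "Rs.5,000 debited from a/c XX1234 for Amazon Pay on 08-Mar-26"
fields = decode_spans(text, {
    "transaction_description": (37, 47),  # "Amazon Pay"
    "masked_account_digits": (28, 32),    # "1234"
})
```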
## Training
The models were fine-tuned from the following base checkpoints:
- **Classifier:** [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) with LoRA (r=16, alpha=32)
- **Extractor:** [fastino/gliner2-large-v1](https://huggingface.co/fastino/gliner2-large-v1) with LoRA extraction adapter
Training used the GLiNER2 multi-task schema, combining binary classification (`is_transaction`) with structured extraction (`transaction_info`) in a single training loop. LoRA adapters keep the trainable parameter count low, enabling fine-tuning on consumer GPUs.
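As a rough illustration of why LoRA keeps training cheap: a rank-r adapter replaces a full d_in x d_out weight update with two small factors. Using r=16 from the classifier config and DeBERTa-v3-large's 1024 hidden size (a single square projection, simplified for illustration):

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters of a LoRA adapter: A (d_in x r) plus B (r x d_out)."""
    return d_in * r + r * d_out


full = 1024 * 1024                    # one full projection matrix
lora = lora_params(1024, 1024, r=16)  # 32768 params, ~3.1% of the full matrix
```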
## Metrics
| Metric | Value |
|--------|-------|
| Classification accuracy | 0.80 |
| Amount extraction accuracy | 1.00 |
| Type extraction accuracy | 1.00 |
| Digits extraction accuracy | 1.00 |
| Avg latency (FP16, CPU) | 47 ms |
Metrics were evaluated on a held-out test split. Latency measured on a single-threaded ONNX Runtime CPU session.
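A latency figure like the one above can be reproduced with a simple timing harness. The sketch below uses a stub workload; for a real measurement you would pass `lambda: session.run(None, feeds)` and create the session with `ort.SessionOptions()` configured for one intra-op thread:

```python
import time


def mean_latency_ms(fn, warmup: int = 3, runs: int = 20) -> float:
    """Average wall-clock latency of fn() in milliseconds, after warmup runs."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) * 1000 / runs


# Stub workload standing in for an ONNX Runtime session.run call:
latency = mean_latency_ms(lambda: sum(range(1000)))
```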
## Limitations
- **Regional focus:** Primarily trained on Indian bank SMS formats (Rs., INR, currency symbols common in India). Performance on other regional formats has not been evaluated.
- **English only:** The model supports English language messages only.
- **Span extraction, not generation:** Field values must exist verbatim in the input text. The model extracts spans rather than generating new text.
- **Synthetic evaluation data:** The evaluation metrics above were computed on synthetic data. Real-world accuracy may differ.
## Use Cases
- Personal finance apps
- Expense tracking and categorization
- Transaction monitoring and alerting
- Bank statement reconciliation from SMS/notifications
## License
This model is released under the [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/) license.
## Links
- **GitHub:** [https://github.com/sowrabhmv/fintext-extractor](https://github.com/sowrabhmv/fintext-extractor)
- **Notebooks:** See the GitHub repo for cookbook examples and training notebooks