---
license: apache-2.0
tags:
  - onnx
  - int8
  - quantized
  - finance
  - embeddings
  - justembed
base_model: ProsusAI/finbert
library_name: onnxruntime
pipeline_tag: feature-extraction
---

# FinBERT INT8 — ONNX Quantized

ONNX INT8 quantized version of [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert) for efficient financial text embeddings.

## Model Details

| Property | Value |
|----------|-------|
| Base Model | [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert) |
| Format | ONNX |
| Quantization | INT8 (dynamic quantization) |
| Embedding Dimension | 768 |
| Quantized by | [JustEmbed](https://pypi.org/project/justembed/) |

## What is this?

This is a quantized ONNX export of FinBERT, a BERT model further pre-trained on financial text by Prosus AI. INT8 quantization stores the weights as 8-bit integers instead of 32-bit floats, which reduces model size and improves inference speed while keeping the embeddings close to those of the FP32 model.
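
To make the size reduction concrete, here is a minimal sketch of INT8 weight quantization on a random matrix. It uses a simplified symmetric, per-tensor scheme chosen for illustration; it is an assumption, not necessarily the exact scheme ONNX Runtime applied to this model, and the names `quantize_int8` and `dequantize_int8` are hypothetical helpers.

```python
import numpy as np


def quantize_int8(weights: np.ndarray):
    """Map float weights to int8 using one scale for the whole tensor."""
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from int8 values and the scale."""
    return q.astype(np.float32) * scale


# Random stand-in for a 768 x 768 weight matrix (not real FinBERT weights).
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=(768, 768)).astype(np.float32)

q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

size_ratio = weights.nbytes / q.nbytes          # 4 bytes vs 1 byte per weight
max_error = float(np.max(np.abs(weights - restored)))
print(f"size ratio: {size_ratio}, max abs error: {max_error:.2e}")
```

Each weight shrinks from 4 bytes to 1, and the rounding error per weight is bounded by half the scale.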

## Use Cases

- Financial document search and retrieval
- Banking text analysis
- Financial sentiment embeddings
- SEC filing analysis
- Financial news similarity
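
For the search and similarity use cases, embeddings are usually compared with cosine similarity. The sketch below ranks documents against a query; the 3-dimensional vectors are toy values standing in for real 768-dimensional embeddings, and `cosine_rank` is a hypothetical helper name.

```python
import numpy as np


def cosine_rank(query: np.ndarray, docs: np.ndarray) -> np.ndarray:
    """Return document indices sorted from most to least similar to the query."""
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = d @ q                      # cosine similarity per document
    return np.argsort(-scores)          # descending order


# Toy vectors for illustration only (real embeddings have 768 dimensions).
query = np.array([1.0, 0.0, 1.0])
docs = np.array([
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [2.0, 0.0, 2.0],   # same direction as the query
    [1.0, 1.0, 0.0],   # partially aligned
])

order = cosine_rank(query, docs)
print(order.tolist())  # [1, 2, 0]
```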

## Files

- `model_quantized.onnx` — INT8 quantized ONNX model
- `tokenizer.json` — Fast tokenizer
- `vocab.txt` — Vocabulary file
- `config.json` — Model configuration

## Usage with JustEmbed

```python
from justembed import Embedder

embedder = Embedder("finbert-int8")
vectors = embedder.embed(["quarterly earnings exceeded expectations"])
```

## Usage with ONNX Runtime

```python
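import onnxruntime as ort
from transformers import AutoTokenizer

# Run from the directory that contains the files listed above.
tokenizer = AutoTokenizer.from_pretrained(".")
session = ort.InferenceSession("model_quantized.onnx")

# Tokenize to NumPy arrays and pass only the inputs the ONNX graph declares.
inputs = tokenizer("quarterly earnings exceeded expectations", return_tensors="np")
expected = {i.name for i in session.get_inputs()}
outputs = session.run(None, {k: v for k, v in inputs.items() if k in expected})
```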
```
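
The raw ONNX Runtime output is per-token, so a pooling step is needed to get one vector per text. The sketch below uses attention-mask-weighted mean pooling, a common choice. It assumes the first model output is the last hidden state with shape `(batch, seq_len, 768)`, which this card does not confirm, and `mean_pool` is a hypothetical helper name.

```python
import numpy as np


def mean_pool(last_hidden_state: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sequence, ignoring padded positions."""
    mask = attention_mask[..., np.newaxis].astype(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts


# Toy input: one sequence, three tokens, the last one is padding.
hidden = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]], dtype=np.float32)
mask = np.array([[1, 1, 0]], dtype=np.int64)

pooled = mean_pool(hidden, mask)
print(pooled.tolist())  # [[2.0, 3.0]]
```

With the session output, this would be called as `mean_pool(outputs[0], inputs["attention_mask"])`, provided the assumption about the output layout holds.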

## Quantization Details

- Method: Dynamic INT8 quantization via ONNX Runtime
- Source: Original PyTorch weights converted to ONNX, then quantized
- Speed: ~2-3x faster inference than FP32
- Size: ~4x smaller than FP32
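
A comparable quantization step can be sketched with ONNX Runtime's `quantize_dynamic`. This is an illustration under assumptions, not the exact pipeline used for this model: the file names in the usage note are hypothetical, and the settings JustEmbed actually used are not stated here.

```python
from pathlib import Path


def quantize_to_int8(fp32_path: str, int8_path: str) -> float:
    """Dynamically quantize an FP32 ONNX model and return the size ratio."""
    # Imported inside the function so this module loads without onnxruntime.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(
        model_input=fp32_path,
        model_output=int8_path,
        weight_type=QuantType.QInt8,  # assumed setting; weights stored as signed 8-bit
    )
    # Ratio of FP32 file size to INT8 file size on disk.
    return Path(fp32_path).stat().st_size / Path(int8_path).stat().st_size
```

A call such as `quantize_to_int8("model.onnx", "model_quantized.onnx")` would write the quantized file and report how much smaller it is, assuming an FP32 export named `model.onnx` exists locally.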

## License

This model is a derivative work of [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert).

The original model is licensed under **Apache License 2.0**. This quantized version is distributed under the same license. See the [LICENSE](LICENSE) file for the full text.

## Citation

```bibtex
@article{araci2019finbert,
  title={FinBERT: Financial Sentiment Analysis with Pre-Trained Language Models},
  author={Araci, Dogu},
  journal={arXiv preprint arXiv:1908.10063},
  year={2019}
}
```

## Acknowledgments

- Original model by [Prosus AI](https://github.com/ProsusAI/finBERT)
- Quantization and packaging by [JustEmbed](https://pypi.org/project/justembed/)