---
license: apache-2.0
tags:
- onnx
- int8
- quantized
- finance
- embeddings
- justembed
base_model: ProsusAI/finbert
library_name: onnxruntime
pipeline_tag: feature-extraction
---
# FinBERT INT8 - ONNX Quantized
ONNX INT8 quantized version of [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert) for efficient financial text embeddings.
## Model Details
| Property | Value |
|----------|-------|
| Base Model | [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert) |
| Format | ONNX |
| Quantization | INT8 (dynamic quantization) |
| Embedding Dimension | 768 |
| Quantized by | [JustEmbed](https://pypi.org/project/justembed/) |
## What is this?
This is a quantized ONNX export of FinBERT, a BERT model further pre-trained on financial text by Prosus AI. INT8 quantization stores the weights as 8-bit integers rather than 32-bit floats, which shrinks the model and speeds up inference while aiming to preserve embedding quality on financial text.
## Use Cases
- Financial document search and retrieval
- Banking text analysis
- Financial sentiment embeddings
- SEC filing analysis
- Financial news similarity
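For the search and similarity use cases, embeddings are typically compared with cosine similarity. The following is a minimal sketch using NumPy only; `rank_by_cosine` is a hypothetical helper name, and the toy 3-dimensional vectors stand in for real 768-dimensional embeddings from this model.

```python
import numpy as np

def rank_by_cosine(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar, plus the scores."""
    q = np.asarray(query_vec, dtype=np.float32)
    d = np.asarray(doc_vecs, dtype=np.float32)
    # L2-normalize so that the dot product equals cosine similarity.
    q = q / np.linalg.norm(q)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)  # descending by similarity
    return order, scores

# Toy vectors (assumption: real inputs would be 768-dimensional embeddings).
docs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]])
query = np.array([1.0, 0.1, 0.0])
order, scores = rank_by_cosine(query, docs)
print(order.tolist())  # [0, 2, 1]
```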
## Files
- `model_quantized.onnx`: INT8 quantized ONNX model
- `tokenizer.json`: Fast tokenizer
- `vocab.txt`: Vocabulary file
- `config.json`: Model configuration
## Usage with JustEmbed
```python
from justembed import Embedder
embedder = Embedder("finbert-int8")
vectors = embedder.embed(["quarterly earnings exceeded expectations"])
```
## Usage with ONNX Runtime
```python
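import onnxruntime as ort
from transformers import AutoTokenizer

# Load the tokenizer from the current directory (a local copy of this repo).
tokenizer = AutoTokenizer.from_pretrained(".")
session = ort.InferenceSession("model_quantized.onnx")

inputs = tokenizer("quarterly earnings exceeded expectations", return_tensors="np")
# Feed only the inputs that the exported graph declares.
input_names = {i.name for i in session.get_inputs()}
outputs = session.run(None, {k: v for k, v in inputs.items() if k in input_names})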
```
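The `outputs` list returned by `session.run` holds raw model outputs, not a single vector per sentence. A common way to get one embedding per input is attention-masked mean pooling, sketched below. This assumes that `outputs[0]` is the last hidden state with shape `(batch, sequence, 768)`; this card does not state which pooling strategy JustEmbed uses, so treat `mean_pool` as a hypothetical helper rather than the packaged behavior.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sequence, ignoring padded positions."""
    mask = attention_mask[..., None].astype(np.float32)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)        # (batch, hidden)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)        # avoid division by zero
    return summed / counts

# Synthetic stand-in for a model output: 1 sequence, 3 tokens, hidden size 2.
# The third token is padding and must not affect the result.
tokens = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]], dtype=np.float32)
mask = np.array([[1, 1, 0]])
pooled = mean_pool(tokens, mask)
print(pooled)  # [[2. 3.]]
```

With the ONNX Runtime snippet above, the call would be `mean_pool(outputs[0], inputs["attention_mask"])`, under the same assumption about the output layout.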
## Quantization Details
- Method: Dynamic INT8 quantization via ONNX Runtime
- Source: Original PyTorch weights converted to ONNX, then quantized
- Speed: ~2-3x faster inference than FP32
- Size: ~4x smaller than FP32
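The size figure follows from element width: an FP32 weight takes 4 bytes and an INT8 weight takes 1. The sketch below illustrates the idea with a simplified symmetric, per-tensor scheme in NumPy. It is not ONNX Runtime's actual implementation, and `quantize_int8` and `dequantize` are hypothetical names; the random matrix stands in for a single 768x768 weight tensor.

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor quantization: float32 -> int8 plus one float scale."""
    scale = float(np.abs(weights).max()) / 127.0
    if scale == 0.0:
        scale = 1.0  # all-zero tensor: any scale works
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to approximate float32 weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(768, 768)).astype(np.float32)  # illustrative weights
q, scale = quantize_int8(w)

ratio = w.nbytes / q.nbytes                       # storage ratio FP32 : INT8
max_err = float(np.abs(w - dequantize(q, scale)).max())
print(ratio)                                      # 4.0
print(max_err <= scale / 2 + 1e-6)                # rounding error is at most half a step
```

In dynamic quantization the weights are quantized ahead of time in this spirit, while activation scales are computed at inference time, so no calibration dataset is needed.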
## License
This model is a derivative work of [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert).
The original model is licensed under **Apache License 2.0**. This quantized version is distributed under the same license. See the [LICENSE](LICENSE) file for the full text.
## Citation
```bibtex
@article{araci2019finbert,
title={FinBERT: Financial Sentiment Analysis with Pre-Trained Language Models},
author={Araci, Dogu},
journal={arXiv preprint arXiv:1908.10063},
year={2019}
}
```
## Acknowledgments
- Original model by [Prosus AI](https://github.com/ProsusAI/finBERT)
- Quantization and packaging by [JustEmbed](https://pypi.org/project/justembed/)