---
license: apache-2.0
tags:
- onnx
- int8
- quantized
- finance
- embeddings
- justembed
base_model: ProsusAI/finbert
library_name: onnxruntime
pipeline_tag: feature-extraction
---

# FinBERT INT8: ONNX Quantized

An INT8-quantized ONNX version of [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert) for efficient financial text embeddings.

## Model Details

| Property | Value |
|----------|-------|
| Base Model | [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert) |
| Format | ONNX |
| Quantization | INT8 (dynamic quantization) |
| Embedding Dimension | 768 |
| Quantized by | [JustEmbed](https://pypi.org/project/justembed/) |

## What is this?

This is a quantized ONNX export of FinBERT, a BERT model further pre-trained on financial text by Prosus AI. INT8 quantization reduces model size and speeds up inference, with the goal of keeping embedding quality for financial text close to that of the FP32 original.

## Use Cases

- Financial document search and retrieval
- Banking text analysis
- Financial sentiment embeddings
- SEC filing analysis
- Financial news similarity
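
Most of these use cases come down to comparing embedding vectors. A minimal sketch of similarity ranking with NumPy, assuming each text has already been embedded into a 768-dimensional vector. The helper name `rank_by_cosine` and the random stand-in vectors are illustrative and not part of this model or of JustEmbed:

```python
import numpy as np

def rank_by_cosine(query, docs):
    """Return document indices sorted by cosine similarity to the query, plus the scores."""
    query = np.asarray(query, dtype=np.float32)
    docs = np.asarray(docs, dtype=np.float32)
    # Normalize to unit length so the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)  # highest similarity first
    return order, scores

# Stand-in vectors (assumption); in practice these would come from the model.
rng = np.random.default_rng(0)
doc_vectors = rng.normal(size=(3, 768))
query_vector = doc_vectors[1] + 0.01 * rng.normal(size=768)

order, scores = rank_by_cosine(query_vector, doc_vectors)
print(order[0], scores)
```

Normalizing once and storing unit-length document vectors keeps each later query down to a single matrix-vector product.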

## Files

- `model_quantized.onnx`: INT8 quantized ONNX model
- `tokenizer.json`: Fast tokenizer
- `vocab.txt`: Vocabulary file
- `config.json`: Model configuration

## Usage with JustEmbed

```python
from justembed import Embedder

embedder = Embedder("finbert-int8")
vectors = embedder.embed(["quarterly earnings exceeded expectations"])
```
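
## Usage with ONNX Runtime

```python
import onnxruntime as ort
from transformers import AutoTokenizer

# Load the tokenizer files and the quantized model from the current directory.
tokenizer = AutoTokenizer.from_pretrained(".")
session = ort.InferenceSession("model_quantized.onnx")

# Tokenize to NumPy arrays and keep only the inputs the ONNX graph declares.
inputs = tokenizer("quarterly earnings exceeded expectations", return_tensors="np")
input_names = {i.name for i in session.get_inputs()}
feed = {k: v for k, v in inputs.items() if k in input_names}

# Passing None as the first argument returns every model output as a list of arrays.
outputs = session.run(None, feed)
```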
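
The session returns raw arrays rather than one vector per text. A common way to obtain a sentence embedding is to mean-pool the token vectors under the attention mask. The sketch below assumes the first output is the last hidden state with shape `(batch, seq_len, 768)`, which this card does not confirm. `mean_pool` is an illustrative helper, and synthetic arrays stand in for real model output:

```python
import numpy as np

def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors, ignoring padding positions (mask == 0)."""
    mask = attention_mask[..., None].astype(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts

# Synthetic stand-ins (assumption): 2 texts, 4 token positions, 768 dimensions.
hidden = np.ones((2, 4, 768), dtype=np.float32)
hidden[1, 2:] = 100.0  # padding positions that must not affect the result
mask = np.array([[1, 1, 1, 1], [1, 1, 0, 0]], dtype=np.int64)

embeddings = mean_pool(hidden, mask)
print(embeddings.shape)  # (2, 768)
```

With a real session, this would be called as `mean_pool(outputs[0], inputs["attention_mask"])`, under the same assumption about the output layout.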

## Quantization Details

- Method: Dynamic INT8 quantization via ONNX Runtime
- Source: Original PyTorch weights converted to ONNX, then quantized
- Speed: roughly 2-3x faster inference than FP32 (varies with hardware and sequence length)
- Size: roughly 4x smaller than FP32
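
The size figure follows from storage width: FP32 uses 4 bytes per weight and INT8 uses 1. The sketch below illustrates the idea with a simplified symmetric per-tensor scheme in NumPy. It is not the exact scheme ONNX Runtime applies, and the function names and the random matrix are illustrative:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to int8 using one scale."""
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(768, 768)).astype(np.float32)  # stand-in weight matrix

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

size_ratio = w.nbytes / q.nbytes     # 4 bytes vs 1 byte per weight
max_error = np.abs(w - w_hat).max()  # about half a quantization step at most
print(size_ratio)  # 4.0
print(max_error, scale / 2)
```

The rounding error per weight stays within about half of `scale`, which is why a single large outlier in a tensor can cost precision for every other value in it.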

## License

This model is a derivative work of [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert).

The original model is licensed under **Apache License 2.0**. This quantized version is distributed under the same license. See the [LICENSE](LICENSE) file for the full text.

## Citation

```bibtex
@article{araci2019finbert,
  title={FinBERT: Financial Sentiment Analysis with Pre-Trained Language Models},
  author={Araci, Dogu},
  journal={arXiv preprint arXiv:1908.10063},
  year={2019}
}
```

## Acknowledgments

- Original model by [Prosus AI](https://github.com/ProsusAI/finBERT)
- Quantization and packaging by [JustEmbed](https://pypi.org/project/justembed/)