metadata
license: apache-2.0
tags:
  - onnx
  - int8
  - quantized
  - finance
  - embeddings
  - justembed
base_model: ProsusAI/finbert
library_name: onnxruntime
pipeline_tag: feature-extraction

FinBERT INT8 — ONNX Quantized

ONNX INT8 quantized version of ProsusAI/finbert for efficient financial text embeddings.

Model Details

| Property | Value |
|---|---|
| Base Model | ProsusAI/finbert |
| Format | ONNX |
| Quantization | INT8 (dynamic quantization) |
| Embedding Dimension | 768 |
| Quantized by | JustEmbed |

What is this?

This is a quantized ONNX export of FinBERT, a BERT model further pre-trained on financial text by Prosus AI. INT8 quantization reduces model size and speeds up inference (approximate figures are listed under Quantization Details) while aiming to keep the embeddings close to those of the FP32 model for financial-domain text.

Use Cases

  • Financial document search and retrieval
  • Banking text analysis
  • Financial sentiment embeddings
  • SEC filing analysis
  • Financial news similarity
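
For the search and similarity use cases above, embeddings are usually compared with cosine similarity. The sketch below is a minimal illustration and not part of this repository: `cosine_similarity` and `rank_by_similarity` are hypothetical helper names, and the short toy vectors stand in for real 768-dimensional embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy 3-dimensional vectors standing in for real embeddings.
query = [1.0, 0.0, 1.0]
docs = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
ranking = rank_by_similarity(query, docs)
```

In a real pipeline the query and document vectors would come from the embedding calls shown in the usage sections.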

Files

  • model_quantized.onnx — INT8 quantized ONNX model
  • tokenizer.json — Fast tokenizer
  • vocab.txt — Vocabulary file
  • config.json — Model configuration

Usage with JustEmbed

from justembed import Embedder

embedder = Embedder("finbert-int8")
vectors = embedder.embed(["quarterly earnings exceeded expectations"])

Usage with ONNX Runtime

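import onnxruntime as ort
from transformers import AutoTokenizer

# Load the tokenizer files and the quantized model from the current directory.
tokenizer = AutoTokenizer.from_pretrained(".")
session = ort.InferenceSession("model_quantized.onnx", providers=["CPUExecutionProvider"])

# Tokenize to NumPy arrays and pass only the inputs the ONNX graph declares.
inputs = tokenizer("quarterly earnings exceeded expectations", return_tensors="np")
input_names = {i.name for i in session.get_inputs()}
outputs = session.run(None, {k: v for k, v in inputs.items() if k in input_names})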
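
The `outputs` list above holds raw model outputs rather than one vector per text. The following is a minimal sketch of reducing token-level hidden states to a single embedding per input. It assumes the first output is the last hidden state with shape (batch, tokens, 768) and that mean pooling over non-padding tokens is acceptable; this model card does not state which pooling JustEmbed applies. `mean_pool` and `l2_normalize` are hypothetical helpers, and the random array only stands in for `outputs[0]`.

```python
import numpy as np

def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors, ignoring padding positions (mask == 0)."""
    mask = attention_mask[..., np.newaxis].astype(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts

def l2_normalize(vectors):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

# Stand-in for outputs[0]: 2 texts, 4 token positions, 768 dimensions.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(2, 4, 768)).astype(np.float32)
mask = np.array([[1, 1, 1, 1], [1, 1, 0, 0]], dtype=np.int64)

embeddings = l2_normalize(mean_pool(hidden, mask))
```

Under the same assumptions, the call against the real session would be `mean_pool(outputs[0], inputs["attention_mask"])`.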

Quantization Details

  • Method: Dynamic INT8 quantization via ONNX Runtime
  • Source: Original PyTorch weights converted to ONNX, then quantized
  • Speed: ~2-3x faster inference than FP32
  • Size: ~4x smaller than FP32
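
To make the size figure concrete, here is a simplified illustration of symmetric INT8 quantization of a list of weights: each FP32 value (4 bytes) is stored as one signed 8-bit integer (1 byte) plus a shared scale, which is where the roughly 4x reduction comes from. This is a teaching sketch with hypothetical helper names, not ONNX Runtime's actual implementation, which additionally computes activation quantization parameters at run time.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize_int8(quantized, scale):
    """Recover approximate float values from the integers."""
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Storage estimate: 4 bytes per FP32 weight vs 1 byte per INT8 weight.
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)
```

The restored values differ from the originals by at most half a quantization step, which is the approximation error the quantized model carries.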

License

This model is a derivative work of ProsusAI/finbert.

The original model is licensed under Apache License 2.0. This quantized version is distributed under the same license. See the LICENSE file for the full text.

Citation

@article{araci2019finbert,
  title={FinBERT: Financial Sentiment Analysis with Pre-Trained Language Models},
  author={Araci, Dogu},
  journal={arXiv preprint arXiv:1908.10063},
  year={2019}
}

Acknowledgments