BGE-M3 INT8 ONNX

INT8-quantized ONNX version of BGE-M3, intended for faster inference and lower memory usage than the original full-precision model while retaining its multilingual embedding capability.

Model Details

  • Base Model: BAAI/bge-m3
  • Format: ONNX
  • Quantization: INT8
  • Embedding Size: 1024 dimensions
  • Max Sequence Length: 8192 tokens
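
As a rough sketch of how embeddings of this size are typically compared, the snippet below computes cosine similarity in plain Python. The vectors are placeholders rather than real model output, and the helper names (l2_normalize, cosine_similarity) are hypothetical, not part of this repository.

```python
import math

EMBEDDING_SIZE = 1024  # taken from the Model Details list


def l2_normalize(vec):
    """Scale a vector to unit length; return a copy unchanged if it is all zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two embeddings of the expected size."""
    if len(a) != EMBEDDING_SIZE or len(b) != EMBEDDING_SIZE:
        raise ValueError(f"expected vectors of length {EMBEDDING_SIZE}")
    a_unit = l2_normalize(a)
    b_unit = l2_normalize(b)
    return sum(x * y for x, y in zip(a_unit, b_unit))


# Placeholder vectors standing in for real embeddings (assumption: illustrative only).
query = [1.0] * EMBEDDING_SIZE
doc_parallel = [2.0] * EMBEDDING_SIZE  # same direction as the query
doc_orthogonal = [1.0 if i % 2 == 0 else -1.0 for i in range(EMBEDDING_SIZE)]

sim_parallel = cosine_similarity(query, doc_parallel)
sim_orthogonal = cosine_similarity(query, doc_orthogonal)
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score close to 0, regardless of their magnitudes.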

This model was converted from the original BGE-M3 model to ONNX and quantized to INT8 for more efficient deployment on CPUs and edge devices.
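
To illustrate the trade-off behind INT8 quantization, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is not the procedure used to produce this model, which is not documented here; the function names and sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # The scale maps the largest magnitude onto 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.50, -1.27, 0.031, 0.0, 1.0]

codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# The rounding error per value is bounded by half the scale.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each code fits in one byte instead of the four bytes of a float32 value, at the cost of a rounding error of at most half the scale per value.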
