vectorizer.banana

This model is a vectorizer developed by Sinequa. It produces an embedding vector given a passage or a query. The passage vectors are stored in our vector index and the query vector is used at query time to look up relevant passages in the index.
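The indexing and lookup flow described above can be sketched as follows. Note that `embed` here is a hypothetical stand-in that returns normalized random vectors; in production the vectorizer model produces the actual embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(texts, dim=256):
    # Hypothetical placeholder for the vectorizer: returns one
    # L2-normalized vector per input text instead of real model output.
    vecs = rng.normal(size=(len(texts), dim))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

passages = ["doc one", "doc two", "doc three"]
index = embed(passages)               # passage vectors, stored at indexing time

query_vec = embed(["some query"])[0]  # query vector, computed at query time
scores = index @ query_vec            # dot product == cosine sim (unit vectors)
ranking = np.argsort(-scores)         # passages ordered by relevance
```

Because both passage and query vectors are unit-normalized, a plain dot product against the stored index yields cosine similarities, and sorting those scores gives the relevance ranking.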

Model name: vectorizer.banana

Supported Languages

Since this model is a distilled version of the BGE-M3 model, it can theoretically handle 100+ languages.

Scores

We measured the performance difference with respect to the original BGE-M3 on MS MARCO EN. Scores on popular benchmarks (BEIR, MIRACL, MTEB, etc.) can be found directly in the BGE-M3 model card, in the "Dense" row. For other datasets, we expect the performance drop to scale similarly to what we observed on MS MARCO EN.

Model Performance Relative to BGE-M3
Output dimensions   Performance relative to BGE-M3
1024                99.3%
768                 98.8%
512                 98.0%
256*                95.7%

* The default dimension within Sinequa
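Thanks to Matryoshka Representation Learning, the lower-dimensional variants are simply prefixes of the full 1024-dimensional vector. A minimal sketch of how such a cutoff is applied (the function name is illustrative; re-normalizing after truncation keeps dot products interpretable as cosine similarities):

```python
import numpy as np

def apply_mrl_cutoff(vec, cutoff=256):
    # Matryoshka embeddings: the first `cutoff` dimensions already form a
    # valid lower-dimensional embedding. Re-normalize after truncation so
    # dot products remain cosine similarities.
    truncated = vec[..., :cutoff]
    return truncated / np.linalg.norm(truncated, axis=-1, keepdims=True)

full = np.random.default_rng(1).normal(size=1024)
full /= np.linalg.norm(full)             # full 1024-dim unit vector
small = apply_mrl_cutoff(full, cutoff=256)  # 256-dim unit vector
```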

Inference Times

GPU          Quantization type   Batch size 1   Batch size 32
NVIDIA A10   FP16                4.5 ms         43 ms
NVIDIA T4    FP16                2.5 ms         35 ms

GPU Memory Usage

Quantization type   Memory
FP16                1450 MiB

Note that the GPU memory figure covers only the memory the model itself consumes, measured on an NVIDIA T4 GPU with a batch size of 32. It does not include the fixed amount of memory that the ONNX Runtime consumes upon initialization, which can be around 0.5 to 1 GiB depending on the GPU used.

Requirements

Model Details

Configuration

Note that this model is packaged with a default MRL cutoff of 256 dimensions. To use the full 1024 dimensions, or any other value, set the mrl-cutoff parameter accordingly.

Training

This model uses BGE-M3, a compact and high-quality multilingual embedding model, as the backbone for distillation.

The original model has 24 layers, which were reduced to 5 in the distilled version. To obtain a low-dimensional output space (256 dimensions compared to the original 1024), Matryoshka Representation Learning (MRL) was used at training time.
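The exact training objective is not published here, but the MRL idea can be sketched as follows: the same distillation loss is applied to nested prefixes of the student embedding, so every prefix (256, 512, 768, and 1024 dimensions) remains usable on its own. The loss function below is an illustrative toy, not the actual objective.

```python
import numpy as np

# Nested prefix sizes trained jointly under MRL (matching the sizes
# reported in the performance table above).
MRL_DIMS = [256, 512, 768, 1024]

def cosine_distill_loss(student, teacher):
    # Toy distillation loss: 1 - cosine similarity, averaged over the batch.
    s = student / np.linalg.norm(student, axis=1, keepdims=True)
    t = teacher / np.linalg.norm(teacher, axis=1, keepdims=True)
    return float(np.mean(1.0 - np.sum(s * t, axis=1)))

def mrl_loss(student, teacher):
    # Total loss = sum of the per-prefix losses, so short prefixes are
    # optimized to be good embeddings in their own right.
    return sum(cosine_distill_loss(student[:, :d], teacher[:, :d])
               for d in MRL_DIMS)
```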

Model size: 0.3B parameters (Safetensors, tensor type F32)
