Add exported onnx model 'model_qint8_avx512_vnni.onnx'

by xtlalert - opened Apr 8, 2025

base: refs/heads/main

←

from: refs/pr/9

Discussion Files changed

-0

xtlalert

Apr 8, 2025

⚙️

sentence-transformers/backend-export

Hello!

This pull request has been automatically generated from the export_dynamic_quantized_onnx_model function from the Sentence Transformers library.

Config

QuantizationConfig(
    is_static=False,
    format=<QuantFormat.QOperator: 0>,
    mode=<QuantizationMode.IntegerOps: 0>,
    activations_dtype=<QuantType.QUInt8: 1>,
    activations_symmetric=False,
    weights_dtype=<QuantType.QInt8: 0>,
    weights_symmetric=True,
    per_channel=True,
    reduce_range=False,
    nodes_to_quantize=[],
    nodes_to_exclude=[],
    operators_to_quantize=['Conv',
    'MatMul',
    'Attention',
    'LSTM',
    'Gather',
    'Transpose',
    'EmbedLayerNormalization'],
    qdq_add_pair_to_weight=False,
    qdq_dedicated_pair=False,
    qdq_op_type_per_channel_support_to_axis={'MatMul': 1}
)

Tip:

Consider testing this pull request before merging by loading the model from this PR with the revision argument:

from sentence_transformers import SentenceTransformer

# TODO: Fill in the PR number
pr_number = 2
model = SentenceTransformer(
    "cross-encoder/ms-marco-MiniLM-L6-v2",
    revision=f"refs/pr/{pr_number}",
    backend="onnx",
    model_kwargs={"file_name": "model_qint8_avx512_vnni.onnx"},
)

# Verify that everything works as expected
embeddings = model.encode(["The weather is lovely today.", "It's so sunny outside!", "He drove to the stadium."])
print(embeddings.shape)

similarities = model.similarity(embeddings, embeddings)
print(similarities)

Add exported onnx model 'model_qint8_avx512_vnni.onnx'98419557

xtlalert changed pull request status to closed Apr 9, 2025

tomaarsen

Sentence Transformers - Cross-Encoders org Apr 9, 2025

I'm working on ONNX/OpenVINO support for these rerankers, implementation is ready, still finishing benchmarks and docs.
Stay tuned!

Tom Aarsen

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment