ONNX Quantized versions of google/embeddinggemma-300m

This repository contains an ONNX export and multiple quantized versions of google/embeddinggemma-300m-qat-q8_0-unquantized.

Usage

from sentence_transformers import SentenceTransformer

# Load the Q4 quantized ONNX model
model = SentenceTransformer(
    "jrc2139/embeddinggemma-300m",
    backend="onnx",
    model_kwargs={"file_name": "onnx/model_q4.onnx"},
    trust_remote_code=True
)
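Once loaded, `model.encode()` returns dense embedding vectors, and texts are typically compared by cosine similarity. A minimal sketch of that comparison, using toy vectors rather than real model outputs (real embeddinggemma-300m embeddings are much higher-dimensional):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity: dot product of the vectors divided by
    # the product of their Euclidean norms.
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for query/document embeddings from model.encode([...]).
query_vec = [0.1, 0.3, 0.5]
doc_vec = [0.4, 0.2, 0.1]
print(cosine_similarity(query_vec, doc_vec))
```

With the real model, replace the toy vectors with `model.encode(["a query"])` and `model.encode(["a document"])`; SentenceTransformers also provides `model.similarity()` as a built-in alternative.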
