# ONNX quantized versions of google/embeddinggemma-300m

This repository contains an ONNX export and multiple quantized versions of google/embeddinggemma-300m-qat-q8_0-unquantized.
## Usage

```python
from sentence_transformers import SentenceTransformer

# Load the Q4 quantized ONNX model
model = SentenceTransformer(
    "jrc2139/embeddinggemma-300m",
    backend="onnx",
    model_kwargs={"file_name": "onnx/model_q4.onnx"},
    trust_remote_code=True,
)
```
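Embeddings produced by `model.encode(...)` are typically compared with cosine similarity. As a minimal, self-contained sketch (the stand-in vectors below are placeholders for real outputs of `model.encode`, which requires downloading the model):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: dot product of the vectors divided by
    # the product of their Euclidean norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in vectors; in practice these come from model.encode([...]).
query = np.array([0.1, 0.3, 0.5])
doc = np.array([0.2, 0.3, 0.4])
score = cosine_similarity(query, doc)  # close to 1.0 for similar vectors
print(score)
```

Quantized variants (e.g. `model_q4.onnx` vs. the full-precision export) trade a small amount of similarity-ranking quality for lower memory use and faster inference, so it is worth checking scores like this against the unquantized model on your own data.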
## Model tree for jrc2139/embeddinggemma-300m-qat-q8_0-unquantized-ONNX

Base model: google/embeddinggemma-300m