# siglip2-base-patch16-224 (ONNX)
This is Google's SigLIP 2 base/224 exported to ONNX format for CPU inference, used by Nebula for local, offline image search.
## What's inside
| File | Description |
|---|---|
| `model.onnx` | Combined vision + text encoder (~110 MB) |
| `tokenizer.json` | SigLIP tokenizer |
## Model inputs & outputs

The single `model.onnx` file contains both encoders. You can run either independently by passing a dummy tensor for the unused branch.
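As a sketch of that dummy-tensor trick for the text branch (the function name and session handling are assumptions; a real run needs `model.onnx` loaded into an `onnxruntime.InferenceSession`):

```python
import numpy as np

def encode_text(session, input_ids: np.ndarray) -> np.ndarray:
    """Run only the text branch of the combined model.

    input_ids: int64 array of shape [text_batch, seq_len].
    Returns text_embeds, float32 of shape [text_batch, 768].
    """
    # The vision branch still needs an input, so feed a single zero image
    # and simply ignore the image-side outputs.
    dummy_pixels = np.zeros((1, 3, 224, 224), dtype=np.float32)
    (text_embeds,) = session.run(
        ["text_embeds"],
        {"pixel_values": dummy_pixels, "input_ids": input_ids},
    )
    return text_embeds
```

With onnxruntime installed, `session = onnxruntime.InferenceSession("model.onnx")` plus tokenized `input_ids` is all `encode_text` needs; the image branch works symmetrically with a dummy `input_ids`.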
### Inputs
| Name | Shape | dtype |
|---|---|---|
| `pixel_values` | `[image_batch, 3, 224, 224]` | float32 |
| `input_ids` | `[text_batch, seq_len]` | int64 |
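A minimal sketch of building `pixel_values`, assuming SigLIP-style preprocessing (resize to 224×224, rescale to [0, 1], normalize with mean 0.5 and std 0.5 per channel) — verify against the model's preprocessor config before relying on it:

```python
import numpy as np

def preprocess(image_hwc_uint8: np.ndarray) -> np.ndarray:
    """uint8 HWC RGB image -> float32 [1, 3, 224, 224] pixel_values.

    Assumes SigLIP-style normalization: x / 255, then (x - 0.5) / 0.5,
    i.e. pixels mapped into [-1, 1]. Uses nearest-neighbor resizing to
    stay dependency-free; a real pipeline would use bilinear resampling.
    """
    h, w, _ = image_hwc_uint8.shape
    rows = np.arange(224) * h // 224
    cols = np.arange(224) * w // 224
    resized = image_hwc_uint8[rows][:, cols]    # [224, 224, 3]
    x = resized.astype(np.float32) / 255.0      # rescale to [0, 1]
    x = (x - 0.5) / 0.5                         # normalize to [-1, 1]
    return x.transpose(2, 0, 1)[None]           # [1, 3, 224, 224]
```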
### Outputs
| Name | Shape | dtype | Description |
|---|---|---|---|
| `image_embeds` | `[image_batch, 768]` | float32 | L2-normalizable image embedding |
| `text_embeds` | `[text_batch, 768]` | float32 | L2-normalizable text embedding |
| `logits_per_image` | `[image_batch, text_batch]` | float32 | Scaled image-text similarity logits |
| `logits_per_text` | `[text_batch, image_batch]` | float32 | Scaled similarity logits (transposed) |
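The embedding outputs are not unit-length out of the box; for cosine-similarity search they can be L2-normalized first. A minimal numpy sketch (note the model's `logits_per_image` additionally applies SigLIP's learned logit scale and bias, which raw cosine scores omit — for ranking, the ordering is the same):

```python
import numpy as np

def l2_normalize(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Scale rows to unit L2 norm so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def similarity(image_embeds: np.ndarray, text_embeds: np.ndarray) -> np.ndarray:
    """Cosine similarity matrix of shape [image_batch, text_batch]."""
    img = l2_normalize(image_embeds)
    txt = l2_normalize(text_embeds)
    return img @ txt.T
```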
## How it was exported

```shell
optimum-cli export onnx \
  --model google/siglip2-base-patch16-224 \
  --task zero-shot-image-classification \
  --opset 18 \
  ./models/
```
Requires `optimum[onnxruntime]` and `transformers`.
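Those dependencies can be installed with pip (version pins are left to the reader):

```shell
pip install "optimum[onnxruntime]" transformers
```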
## License
Inherits Apache 2.0 from the original Google SigLIP 2 model.