ImportError: cannot import name 'Siglip2VisionModel' from 'transformers' (/home/AAA/.conda/envs/internvl/lib/python3.9/site-packages/transformers/__init__.py)
Thank you very much for your help. I only want to use the encoder part. The code below is an example from the official documentation, but it fails with an error saying that Siglip2VisionModel cannot be imported. I have checked that my transformers version is the latest release, 4.49.0. My code is as follows:
from PIL import Image
import requests
from transformers import AutoProcessor, Siglip2VisionModel
model = Siglip2VisionModel.from_pretrained("google/siglip2-base-patch16-224")
processor = AutoProcessor.from_pretrained("google/siglip2-base-patch16-224")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)
last_hidden_state = outputs.last_hidden_state
pooled_output = outputs.pooler_output # pooled features
Judging from some of the other comments around here, you might need to update your HF transformers package.
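Before upgrading, it may help to confirm which transformers version the interpreter in that conda environment actually sees, and whether the class resolves there. Here is a minimal sketch using only the standard library; the helper names installed_version and can_import are made up for illustration and are not part of transformers.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def can_import(module_name, attr_name):
    """Return True if `attr_name` can be resolved on `module_name`, else False."""
    try:
        module = importlib.import_module(module_name)
        getattr(module, attr_name)
    except Exception:
        # Covers a missing module, a missing name, and failures raised
        # while a lazily loaded submodule is being imported.
        return False
    return True


if __name__ == "__main__":
    # Run this with the same interpreter that raised the ImportError.
    print("transformers version:", installed_version("transformers"))
    print("Siglip2VisionModel available:", can_import("transformers", "Siglip2VisionModel"))
```

If the second line prints False, the installed build does not expose Siglip2VisionModel, and upgrading transformers inside that same environment is the thing to try next.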
