How to use google/tipsv2-so400m14 with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("zero-shot-image-classification", model="google/tipsv2-so400m14", trust_remote_code=True)
    pipe(
        "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png",
        candidate_labels=["animals", "humans", "landscape"],
    )

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("google/tipsv2-so400m14", trust_remote_code=True, dtype="auto")

Fix broken image URL and add missing .cpu() in code snippets
README.md CHANGED

@@ -50,7 +50,7 @@ transform = transforms.Compose([
     transforms.ToTensor(),
 ])
 
-url = "https://huggingface.co/spaces/google/
+url = "https://huggingface.co/spaces/google/TIPSv2/resolve/main/examples/zeroseg/pascal_context_00049_image.png"
 image = Image.open(requests.get(url, stream=True).raw)
 pixel_values = transform(image).unsqueeze(0)
 out = model.encode_image(pixel_values)
@@ -85,7 +85,7 @@ import numpy as np
 from sklearn.decomposition import PCA
 
 spatial = out.patch_tokens.reshape(1, 32, 32, 1152)
-feat = spatial[0].detach().numpy().reshape(-1, 1152)
+feat = spatial[0].detach().cpu().numpy().reshape(-1, 1152)
 rgb = PCA(n_components=3, whiten=True).fit_transform(feat).reshape(32, 32, 3)
 rgb = 1 / (1 + np.exp(-2.0 * rgb))  # sigmoid for [0, 1] range with good contrast
 print(rgb.shape)  # (32, 32, 3) — PCA of patch features as RGB
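
A possible follow-up, not part of this commit: the README snippet hardcodes the patch grid as 32 x 32, which only holds when the encoder returns exactly 1024 patch tokens. The sketch below derives the grid side from the token count instead. The helper name `patch_grid_side` is hypothetical, and the usage comment assumes `out.patch_tokens` has shape `(batch, num_tokens, dim)`, which the diff does not confirm.

```python
import math


def patch_grid_side(num_patch_tokens: int) -> int:
    """Return the side length of a square patch grid (hypothetical helper).

    Raises ValueError if the token count is not a positive perfect square,
    for example when the input image is not square.
    """
    if num_patch_tokens <= 0:
        raise ValueError("num_patch_tokens must be positive")
    side = math.isqrt(num_patch_tokens)
    if side * side != num_patch_tokens:
        raise ValueError(
            f"{num_patch_tokens} patch tokens do not form a square grid"
        )
    return side


# Assumed usage with the README snippet (shape layout is an assumption):
#   side = patch_grid_side(out.patch_tokens.shape[1])
#   spatial = out.patch_tokens.reshape(1, side, side, -1)
print(patch_grid_side(1024))
```

With the 1024 tokens implied by the `reshape(1, 32, 32, 1152)` call, the helper returns 32, matching the hardcoded value in the README.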