How to use Q-MM/clip-vit-large-patch14-336 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("zero-shot-image-classification", model="Q-MM/clip-vit-large-patch14-336")
pipe(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png",
    candidate_labels=["animals", "humans", "landscape"],
)

# Load model directly
from transformers import AutoProcessor, AutoModelForZeroShotImageClassification

processor = AutoProcessor.from_pretrained("Q-MM/clip-vit-large-patch14-336")
model = AutoModelForZeroShotImageClassification.from_pretrained("Q-MM/clip-vit-large-patch14-336")
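
Loading the processor and model directly does not classify anything by itself. The sketch below shows one way the two objects might be combined to score an image against a set of labels. It assumes this checkpoint behaves like a standard CLIP model, so that the model output exposes `logits_per_image`, and that `torch`, `Pillow`, and `requests` are installed. The helper names `softmax`, `rank_labels`, and `classify` are hypothetical and are not part of Transformers.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(labels, logits):
    """Pair each label with its probability, highest probability first."""
    probs = softmax(logits)
    return sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)


def classify(image_url, candidate_labels, model_id="Q-MM/clip-vit-large-patch14-336"):
    """Score one image against candidate labels (hypothetical helper).

    Assumes a CLIP-style checkpoint whose output has `logits_per_image`.
    """
    # Heavy dependencies are imported here so the helpers above stay usable
    # without them.
    import requests
    import torch
    from PIL import Image
    from transformers import AutoModelForZeroShotImageClassification, AutoProcessor

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForZeroShotImageClassification.from_pretrained(model_id)

    # Download the image and make sure it has three colour channels.
    image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")

    # Tokenize the labels and preprocess the image in a single call.
    inputs = processor(
        text=candidate_labels, images=image, return_tensors="pt", padding=True
    )

    with torch.no_grad():
        outputs = model(**inputs)

    # One row per image, one column per candidate label.
    logits = outputs.logits_per_image[0].tolist()
    return rank_labels(candidate_labels, logits)
```

Calling `classify(...)` with the parrots image URL and the labels from the pipeline example should return a list of `(label, probability)` pairs sorted from most to least likely. The first call downloads the model weights, so it can take some time.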