Space: Granitagushi/Image_detector

Granitagushi committed on Apr 13, 2025

Commit a2b9220 (verified) · 1 parent: 9291f82

Upload 3 files


Files changed (3)

README.md +40 -0
app.py +42 -0
requirements.txt +2 -0

README.md ADDED

	@@ -0,0 +1,40 @@

+# CLIP Zero-Shot Classification on Oxford Pets Dataset
+## Model Details
+- **Model Name**: CLIP (Contrastive Language-Image Pre-training)
+- **Model Version**: openai/clip-vit-large-patch14
+- **Task**: Zero-shot Image Classification
+- **Dataset**: Oxford-IIIT Pet Dataset
+## Evaluation Results
+The model was evaluated on the Oxford Pets dataset using zero-shot classification. The following metrics were obtained:
+- **Accuracy**: 0.8800 -> 88%
+- **Precision**: 0.8768 -> 87.68%
+- **Recall**: 0.8800 -> 88%
+## Model Description
+CLIP (Contrastive Language-Image Pre-training) is a neural network trained on a variety of (image, text) pairs. It can be instructed in natural language to perform a great variety of classification benchmarks, without directly optimizing for the benchmark's performance. This zero-shot capability of CLIP is particularly useful for tasks where labeled data is scarce or expensive to obtain.
+## Dataset
+The Oxford-IIIT Pet Dataset is a 37-category pet dataset with roughly 200 images for each class. The images have large variations in scale, pose, and lighting. All images have an associated ground-truth annotation of breed.
+## Usage
+```python
+from transformers import pipeline
+# Load the model
+checkpoint = "openai/clip-vit-large-patch14"
+detector = pipeline(model=checkpoint, task="zero-shot-image-classification")
+# Define candidate labels
+labels = ['Siamese', 'Birman', 'shiba inu', 'staffordshire bull terrier', ...]
+# Run inference (`image` is supplied by the caller: a local path, a URL, or a PIL.Image)
+results = detector(image, candidate_labels=labels)
+```
+## Limitations
+- The model's performance may vary depending on the quality and characteristics of the input images
+- Zero-shot classification may not perform as well as fine-tuned models on specific tasks
+- The model's predictions are based on the provided candidate labels, so the quality of results depends on the relevance and completeness of these labels
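
Note on the evaluation results in README.md: the README does not say how precision and recall were averaged across the 37 classes. Recall equal to accuracy (0.8800) is what support-weighted averaging produces, since weighted recall always reduces to accuracy, so that is assumed here. The following is a minimal sketch of that calculation in plain Python; `weighted_metrics` and the toy labels are hypothetical and are not taken from the author's evaluation code.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall)."""
    n = len(y_true)
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    # Per-class precision weighted by class support; a class that is
    # never predicted contributes 0.
    precision = sum(
        support[c] / n * (correct[c] / predicted[c] if predicted[c] else 0.0)
        for c in support
    )
    # Per-class recall weighted by class support.
    recall = sum(support[c] / n * (correct[c] / support[c]) for c in support)
    return accuracy, precision, recall


# Toy labels for illustration only, not the actual evaluation data.
y_true = ["pug", "pug", "beagle", "Bengal"]
y_pred = ["pug", "beagle", "beagle", "Bengal"]
acc, prec, rec = weighted_metrics(y_true, y_pred)
```

With weighted averaging, `rec` always matches `acc`, while `prec` can differ, which is the pattern the reported numbers show.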

app.py ADDED

	@@ -0,0 +1,42 @@

+import gradio as gr
+from transformers import pipeline
+# Load models
+vit_classifier = pipeline("image-classification", model="kuhs/vit-base-oxford-iiit-pets")
+clip_detector = pipeline(model="openai/clip-vit-large-patch14", task="zero-shot-image-classification")
+labels_oxford_pets = [
+    'Siamese', 'Birman', 'shiba inu', 'staffordshire bull terrier', 'basset hound', 'Bombay', 'japanese chin',
+    'chihuahua', 'german shorthaired', 'pomeranian', 'beagle', 'english cocker spaniel', 'american pit bull terrier',
+    'Ragdoll', 'Persian', 'Egyptian Mau', 'miniature pinscher', 'Sphynx', 'Maine Coon', 'keeshond', 'yorkshire terrier',
+    'havanese', 'leonberger', 'wheaten terrier', 'american bulldog', 'english setter', 'boxer', 'newfoundland', 'Bengal',
+    'samoyed', 'British Shorthair', 'great pyrenees', 'Abyssinian', 'pug', 'saint bernard', 'Russian Blue', 'scottish terrier'
+]
+def classify_pet(image):
+    vit_results = vit_classifier(image)
+    vit_output = {result['label']: result['score'] for result in vit_results}
+    clip_results = clip_detector(image, candidate_labels=labels_oxford_pets)
+    clip_output = {result['label']: result['score'] for result in clip_results}
+    return {"ViT Classification": vit_output, "CLIP Zero-Shot Classification": clip_output}
+example_images = [
+    ["example_images/dog1.jpeg"],
+    ["example_images/dog2.jpeg"],
+    ["example_images/leonberger.jpg"],
+    ["example_images/snow_leopard.jpeg"],
+    ["example_images/cat.jpg"]
+]
+iface = gr.Interface(
+    fn=classify_pet,
+    inputs=gr.Image(type="filepath"),
+    outputs=gr.JSON(),
+    title="Pet Classification Comparison",
+    description="Upload an image of a pet, and compare results from a trained ViT model and a zero-shot CLIP model.",
+    examples=example_images
+)
+iface.launch()
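
Note on `classify_pet` in app.py: both pipelines return a list of `{"label": ..., "score": ...}` dicts, and the function flattens each list into a label-to-score mapping. The CLIP result has one entry per candidate label, so the JSON output lists all 37 breeds. One possible way to shorten it is sketched below; `top_k_scores` is a hypothetical helper that is not part of this commit, and the scores in `mock` are invented to show the shape of the data.

```python
def top_k_scores(results, k=3):
    """Keep the k highest-scoring labels from pipeline-style results."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return {r["label"]: round(r["score"], 4) for r in ranked[:k]}


# Mock data shaped like pipeline output; the scores are made up.
mock = [
    {"label": "beagle", "score": 0.10},
    {"label": "pug", "score": 0.85},
    {"label": "boxer", "score": 0.05},
]
top2 = top_k_scores(mock, k=2)
```

Inside `classify_pet`, the same helper could replace the two dict comprehensions if a shorter output is preferred.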

requirements.txt ADDED

	@@ -0,0 +1,2 @@


+transformers
+torch