13hbeltran committed
Commit 7e35296 · verified · 1 Parent(s): adf685b

Upload 3 files

Files changed (3)
  1. app.py +33 -0
  2. readme.md +47 -0
  3. requirements.txt +4 -0
app.py ADDED
@@ -0,0 +1,33 @@
+ import gradio as gr
+ from transformers import pipeline
+ from PIL import Image
+
+ # Load image classification pipeline
+ classifier = pipeline(
+     task="image-classification",
+     model="google/vit-base-patch16-224"
+ )
+
+ def classify_image(image):
+     if image is None:
+         return "No image provided."
+
+     # Convert to PIL Image if needed
+     if not isinstance(image, Image.Image):
+         image = Image.fromarray(image)
+
+     results = classifier(image)
+     return {r["label"]: r["score"] for r in results}
+
+
+ # Gradio interface
+ app = gr.Interface(
+     fn=classify_image,
+     inputs=gr.Image(type="pil", label="Upload an Animal Image"),
+     outputs=gr.Label(label="Prediction"),
+     title="Animal Image Classification",
+     description="Upload an image of an animal and the model will predict what it is."
+ )
+
+ if __name__ == "__main__":
+     app.launch()
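The callback in app.py can be exercised without downloading the model if the classifier is injected rather than created at import time. The following is a minimal sketch and not part of this commit: `make_classify_fn` and `fake_classifier` are hypothetical names, and the stand-in labels and scores are made up for illustration. It assumes only that the classifier returns a list of `{"label", "score"}` dicts and accepts a `top_k` argument, as the `image-classification` pipeline does.

```python
def make_classify_fn(classifier, top_k=3):
    """Wrap any pipeline-like callable in a Gradio-style callback (hypothetical helper)."""
    def classify(image):
        if image is None:
            return "No image provided."
        results = classifier(image, top_k=top_k)
        # gr.Label accepts a mapping of label -> confidence.
        return {r["label"]: float(r["score"]) for r in results}
    return classify


def fake_classifier(image, top_k=5):
    """Stand-in for the real pipeline; labels and scores are illustrative only."""
    ranked = [
        {"label": "tabby, tabby cat", "score": 0.71},
        {"label": "tiger cat", "score": 0.18},
        {"label": "Egyptian cat", "score": 0.06},
        {"label": "lynx, catamount", "score": 0.02},
    ]
    return ranked[:top_k]


classify = make_classify_fn(fake_classifier, top_k=2)
print(classify("any non-None placeholder"))
print(classify(None))
```

With the real model, the `classifier` object created in app.py would be passed to `make_classify_fn` in place of the stand-in.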
readme.md ADDED
@@ -0,0 +1,47 @@
+ Model Card: Vision Transformer (ViT) for Animal Image Classification
+ Model Description
+
+ This application uses a pretrained Vision Transformer (ViT) model from Hugging Face, google/vit-base-patch16-224, for animal image classification. Vision Transformers adapt the transformer architecture, originally developed for NLP models such as BERT, to image data by splitting each image into fixed-size patches (16×16 pixels for this model) and processing the patches as a sequence.
+
+ The model was pretrained on ImageNet-21k, fine-tuned on ImageNet-1k (1,000 classes), and is used as-is for inference. Images are resized to 224×224 pixels, which matches the model’s expected input size. No additional fine-tuning was performed for this assignment.
+
+ The goal of this project is to demonstrate how a pretrained computer vision model can be deployed as a simple interactive application that accepts an animal image and returns predicted classes.
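To make the patch idea concrete, the arithmetic below is a small sketch of how a 224×224 image becomes a token sequence for a patch size of 16, as the model name `vit-base-patch16-224` indicates. The function name `patch_grid` is hypothetical, and the extra classification token is an assumption about the standard ViT design rather than something stated in this README.

```python
def patch_grid(image_size=224, patch_size=16, channels=3):
    """Return patch counts and sizes for a square image split into square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    per_side = image_size // patch_size                     # patches along one edge
    num_patches = per_side * per_side                       # total patches per image
    values_per_patch = patch_size * patch_size * channels   # flattened patch length
    return {
        "patches_per_side": per_side,
        "num_patches": num_patches,
        "values_per_patch": values_per_patch,
        # Assumption: one extra [CLS] token is prepended, as in the standard ViT.
        "sequence_length": num_patches + 1,
    }


print(patch_grid())
# {'patches_per_side': 14, 'num_patches': 196, 'values_per_patch': 768, 'sequence_length': 197}
```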
+
+ Intended Uses & Limitations
+ Intended Uses
+
+ Animal Image Classification:
+ Classify images of animals using a pretrained vision model.
+
+ Educational Demonstration:
+ Showcase how Hugging Face models and Spaces can be used to build and deploy a simple ML application.
+
+ Limitations
+
+ The model was not fine-tuned specifically for animals. Its 1,000 ImageNet classes include many non-animal objects, so predictions may be inaccurate, or not an animal at all, for uncommon species or low-quality images.
+
+ Results depend heavily on image clarity, lighting, and background.
+
+ This application is intended for demonstration and learning, not production use.
+
+ How to Use
+
+ Upload an image of an animal using the interface.
+ The application preprocesses the image and returns the model’s top predicted labels with their confidence scores.
+
+ Internally, the app uses the Hugging Face image-classification pipeline to handle preprocessing, inference, and output formatting.
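The output-formatting step can be pictured as turning raw model scores (logits) into probabilities and keeping the highest-ranked labels. The sketch below is an illustration in plain Python under stated assumptions, not the pipeline's actual code: `softmax` and `top_k_labels` are hypothetical helpers, and the logits and labels are made up.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits, labels, k=5):
    """Return the k most probable labels as {"label", "score"} dicts, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return [{"label": label, "score": score} for label, score in ranked[:k]]


# Illustrative values only; the real model emits one logit per ImageNet class.
example = top_k_labels([2.0, 0.5, 4.0], ["zebra", "ostrich", "tiger"], k=2)
print(example)
```

The list of `{"label", "score"}` dicts has the same shape that `classify_image` in app.py converts into a label-to-score mapping for the Gradio `Label` output.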
+
+ Training Data
+
+ This project does not train a new model.
+ It relies on a pretrained Vision Transformer that was pretrained on ImageNet-21k and fine-tuned on ImageNet-1k, both large, publicly available image datasets.
+
+ Notes
+
+ This Space is part of a coursework assignment focused on:
+
+ Using pretrained models responsibly
+
+ Understanding model inputs and outputs
+
+ Deploying simple ML applications locally and via Hugging Face Spaces
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ gradio
+ transformers
+ torch
+ pillow
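Before launching the app locally, it can help to confirm that the four dependencies are importable. This is a minimal sketch using only the standard library; `missing_modules` and `REQUIRED_MODULES` are hypothetical names, and note that the `pillow` package is imported as `PIL`.

```python
import importlib.util

# Import names for the packages in requirements.txt (pillow installs the PIL module).
REQUIRED_MODULES = ["gradio", "transformers", "torch", "PIL"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```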