---
title: CLIP Zero-Shot Classifier
emoji: πŸ–ΌοΈ
colorFrom: indigo
colorTo: green
sdk: gradio
sdk_version: "4.24.0"
app_file: app.py
pinned: false
---
# πŸ–ΌοΈ CLIP Zero-Shot Classifier
This interactive web app demonstrates a **zero-shot image classification** system using **OpenAI's CLIP model** (`ViT-B/32`) and a custom Gradio interface.
## πŸš€ What It Does
CLIP embeds images and text into a shared embedding space, so their similarity can be compared directly. With this app, you can:
- Upload an image
- Enter any number of labels (comma-separated)
- Get predictions on how likely the image matches each label β€” **even without training!**
## πŸ’‘ How It Works
1. The input image is preprocessed and encoded using CLIP.
2. Your custom labels are tokenized and also encoded.
3. The cosine similarity between the image embedding and each text embedding is computed.
4. The similarities are converted to per-label probabilities (via softmax) and displayed with a probability score and a visual bar indicator.
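Steps 3 and 4 amount to normalizing the embeddings, taking dot products, and applying softmax. Here is a minimal sketch using toy NumPy vectors in place of CLIP's real 512-dimensional outputs (the labels and values below are purely illustrative, not taken from the app):

```python
import numpy as np

# Toy embeddings standing in for CLIP outputs
# (assumption: 4-dim for readability; ViT-B/32 actually emits 512-dim vectors).
image_emb = np.array([0.9, 0.1, 0.0, 0.4])
text_embs = np.array([
    [0.8, 0.2, 0.1, 0.5],  # "a cat"
    [0.0, 0.9, 0.3, 0.1],  # "a dog"
])

# L2-normalize so dot products equal cosine similarities
image_emb = image_emb / np.linalg.norm(image_emb)
text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)

# Cosine similarity of the image against each label
sims = text_embs @ image_emb

# CLIP scales similarities by ~100 (its learned logit scale) before softmax
logits = 100.0 * sims
exp = np.exp(logits - logits.max())
probs = exp / exp.sum()  # one probability per label, summing to 1
```

In the real app these embeddings come from `model.encode_image` and `model.encode_text`; only the normalize-dot-softmax math is shown here.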
## πŸ“¦ Technologies Used
- [Gradio](https://www.gradio.app/) β€” for the interactive web interface
- [OpenAI CLIP](https://github.com/openai/CLIP) β€” the core model for zero-shot classification
- PyTorch β€” model backend
- Hugging Face Spaces β€” for easy and free deployment
## πŸ“· Example Use Cases
- Test if an image matches multiple tags
- Quickly validate custom labels
- Educational demos for multimodal ML
## πŸ› οΈ How to Use
1. Upload an image.
2. Type in labels like: `a cat, a dog, a diagram, a spacecraft`
3. Click **Classify**.
4. See prediction probabilities and visual bars for each label.
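The comma-separated label field from step 2 can be split with a small helper like the one below (a sketch only: `parse_labels` is a hypothetical name, and the app's actual parsing may differ):

```python
def parse_labels(raw: str) -> list[str]:
    """Split a comma-separated label string, trimming whitespace
    and dropping empty entries."""
    return [label.strip() for label in raw.split(",") if label.strip()]

print(parse_labels("a cat, a dog, a diagram, a spacecraft"))
# → ['a cat', 'a dog', 'a diagram', 'a spacecraft']
```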
## πŸ“ Notes
- You can enter *any text labels* β€” even abstract or creative ones!
- Works best on natural images (e.g., animals, objects, scenes)
## πŸ““ Notebook
You can explore the companion Jupyter notebook here:
[πŸ“˜ Open notebook.ipynb](./notebook.ipynb)
---
## πŸ‘€ About Me
I'm **Nikko**, a Machine Learning Engineer and AI enthusiast with a Master's degree in Artificial Intelligence from the University of the Philippines Diliman. With over a decade of experience in ICT consulting and telecommunications, I now specialize in **vision-language models**, **LLMs**, and **generative AI applications**.
I'm passionate about creating systems where AI and humans can collaborate seamlessly β€” working toward a future where **smart cities** and intelligent automation become reality.
Feel free to connect with me on [LinkedIn](https://www.linkedin.com/in/nikkoyabut/).
---
Made with ❀️ using CLIP + Gradio