|
|
--- |
|
|
title: CLIP Zero-Shot Classifier |
|
|
emoji: πΌοΈ |
|
|
colorFrom: indigo |
|
|
colorTo: green |
|
|
sdk: gradio |
|
|
sdk_version: "4.24.0" |
|
|
app_file: app.py |
|
|
pinned: false |
|
|
--- |
|
|
|
|
|
# πΌοΈ CLIP Zero-Shot Classifier |
|
|
|
|
|
This interactive web app demonstrates a **zero-shot image classification** system using **OpenAI's CLIP model** (`ViT-B/32`) and a custom Gradio interface. |
|
|
|
|
|
## π What It Does |
|
|
|
|
|
CLIP can understand images and text in the same embedding space. With this app, you can: |
|
|
- Upload an image |
|
|
- Enter any number of labels (comma-separated) |
|
|
- Get predictions on how likely the image matches each label β **even without training!** |
|
|
|
|
|
## π‘ How It Works |
|
|
|
|
|
1. The input image is preprocessed and encoded using CLIP. |
|
|
2. Your custom labels are tokenized and also encoded. |
|
|
3. The cosine similarity between image and text embeddings is computed. |
|
|
4. The results are displayed with a probability score and a visual bar indicator. |
|
|
|
|
|
## π¦ Technologies Used |
|
|
|
|
|
- [Gradio](https://www.gradio.app/) β for the interactive web interface |
|
|
- [OpenAI CLIP](https://github.com/openai/CLIP) β the core model for zero-shot classification |
|
|
- PyTorch β model backend |
|
|
- Hugging Face Spaces β for easy and free deployment |
|
|
|
|
|
## π· Example Use Cases |
|
|
|
|
|
- Test if an image matches multiple tags |
|
|
- Quickly validate custom labels |
|
|
- Educational demos for multimodal ML |
|
|
|
|
|
## π οΈ How to Use |
|
|
|
|
|
1. Upload an image. |
|
|
2. Type in labels like: `a cat, a dog, a diagram, a spacecraft` |
|
|
3. Click **Classify**. |
|
|
4. See prediction probabilities and visual bars for each label. |
|
|
|
|
|
## π Notes |
|
|
|
|
|
- You can enter *any text labels* β even abstract or creative ones! |
|
|
- Works best on natural images (e.g., animals, objects, scenes) |
|
|
|
|
|
## π Notebook |
|
|
|
|
|
You can explore the companion Jupyter notebook here: |
|
|
[π Open notebook.ipynb](./notebook.ipynb) |
|
|
|
|
|
--- |
|
|
|
|
|
## π€ About Me |
|
|
|
|
|
I'm **Nikko**, a Machine Learning Engineer and AI enthusiast with a Master's degree in Artificial Intelligence from the University of the Philippines Diliman. With over a decade of experience in ICT consulting and telecommunications, I now specialize in **vision-language models**, **LLMs**, and **generative AI applications**. |
|
|
|
|
|
I'm passionate about creating systems where AI and humans can collaborate seamlessly β working toward a future where **smart cities** and intelligent automation become reality. |
|
|
|
|
|
Feel free to connect with me on [LinkedIn](https://www.linkedin.com/in/nikkoyabut/). |
|
|
|
|
|
--- |
|
|
|
|
|
Made with β€οΈ using CLIP + Gradio |
|
|
|