---
title: Food Matcher AI (SigLIP Edition)
emoji: 🍔
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 5.0.0
app_file: app.py
pinned: false
---

πŸ” Visual Dish Matcher AI

A computer vision app that suggests recipes and dishes based on visual similarity using Google's SigLIP model.

## 🎯 Project Overview

This project builds a Visual Search Engine for food. Instead of relying on text labels (which can be inaccurate or missing), we use Vector Embeddings to find dishes that look similar.

**Key Features:**

  • Multimodal Search: Find food using an image or a text description.
  • Advanced Data Cleaning: Automated detection of blurry or low-quality images.
  • Model Comparison: A scientific comparison between OpenAI CLIP and Google SigLIP to choose the best engine.

**Live Demo:** click the "App" tab above to view.


πŸ› οΈ Tech Stack

  • Model: Google SigLIP (google/siglip-base-patch16-224)
  • Frameworks: PyTorch, Transformers, Gradio, Datasets
  • Data Engineering: OpenCV (Feature Extraction), NumPy
  • Data Storage: Parquet (via Git LFS)
  • Visualization: Matplotlib, Seaborn, Scikit-Learn (t-SNE/PCA)

## 📊 Part 1: Data Analysis & Cleaning

**Dataset:** Food-101 (ETH Zurich), a subset of 5,000 images.

### 1. Exploratory Data Analysis (EDA)

Before any modeling, we analyzed the raw data to ensure quality and balance.

  • Class Balance Check: We verified that our random subset of 5,000 images maintained a healthy distribution across the 101 food categories (approx. 50 images per class).
  • Image Dimensions: We visualized the width and height distribution to identify unusually small or large images.
  • Outlier Detection: We plotted the distribution of Aspect Ratios and Brightness Levels.

*(Plots: class balance, image-dimension, aspect-ratio, and brightness distributions.)*
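As an illustration, the per-image features behind these plots can be computed with NumPy alone. This is a minimal sketch, not the project's actual code; the function and field names are made up for the example.

```python
import numpy as np

def image_stats(img: np.ndarray) -> dict:
    """Compute the EDA features used for outlier detection.

    img: H x W x 3 uint8 RGB array (as loaded by OpenCV or PIL).
    """
    h, w = img.shape[:2]
    # Luminance-weighted grayscale, then mean intensity in [0, 255]
    gray = img @ np.array([0.299, 0.587, 0.114])
    return {
        "width": w,
        "height": h,
        "aspect_ratio": max(w, h) / min(w, h),  # >= 1.0 by construction
        "brightness": float(gray.mean()),
    }

# Example: a uniform mid-gray 300x400 image
stats = image_stats(np.full((300, 400, 3), 128, dtype=np.uint8))
print(stats)  # brightness ~ 128.0, aspect_ratio ~ 1.33
```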

### 2. Data Cleaning

Based on the plots above, we removed images that were:

  • Too Dark (Avg Pixel Intensity < 20)
  • Too Bright/Washed out (Avg Pixel Intensity > 245)
  • Extreme Aspect Ratios (Too stretched or squashed, AR > 3.0)
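Applied as a filter, those thresholds amount to a few comparisons. A minimal sketch (the helper name and record layout are invented for illustration):

```python
def is_bad_image(brightness: float, aspect_ratio: float) -> bool:
    """Flag images that fail the cleaning thresholds from the EDA."""
    too_dark = brightness < 20        # average pixel intensity
    too_bright = brightness > 245
    too_stretched = aspect_ratio > 3.0
    return too_dark or too_bright or too_stretched

# Toy records standing in for the real per-image feature table
records = [
    {"file": "dark.jpg", "brightness": 12.0, "aspect_ratio": 1.3},
    {"file": "ok.jpg", "brightness": 130.0, "aspect_ratio": 1.5},
    {"file": "pano.jpg", "brightness": 140.0, "aspect_ratio": 4.2},
]
kept = [r["file"] for r in records
        if not is_bad_image(r["brightness"], r["aspect_ratio"])]
print(kept)  # ['ok.jpg']
```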

βš”οΈ Part 2: Model Comparison (CLIP vs. SigLIP vs metaclip)

To ensure the best search results, we ran a "Challenger" test between three leading multimodal models.

**The Contestants:**

  1. Baseline: OpenAI CLIP (clip-vit-base-patch32)
  2. Challenger: Google SigLIP (siglip-base-patch16-224)
  3. Challenger: Facebook MetaCLIP (facebook/metaclip-b32-400m)

**The Evaluation:**

We compared them using Silhouette Scores (measuring how distinct the food clusters are) and a visual "Taste Test" (checking nearest neighbors for specific dishes).

  • Metric: Silhouette Score
  • Winner: Google SigLIP (Produced cleaner, more distinct clusters and better visual matches).

Visual Comparison: We queried both models with the same image to see which returned more accurate similar foods.

*(Screenshot: nearest-neighbor results from both models for the same query image.)*
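The silhouette metric itself is a single scikit-learn call. The sketch below runs it on toy two-cluster data rather than real CLIP/SigLIP embeddings, so the near-perfect score is by construction:

```python
import numpy as np
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# Two tight, well-separated "embedding" clusters (stand-ins for dish classes)
emb = np.vstack([
    rng.normal(1.0, 0.1, size=(50, 8)),
    rng.normal(-1.0, 0.1, size=(50, 8)),
])
labels = np.array([0] * 50 + [1] * 50)

# Cosine distance matches how embedding similarity is usually measured
score = silhouette_score(emb, labels, metric="cosine")
print(round(score, 3))  # close to 1.0 for tight, distinct clusters
```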


## 🧠 Part 3: Embeddings & Clustering

Using the winning model (SigLIP), we applied dimensionality reduction to visualize how the AI groups food concepts.

  • Algorithm: K-Means Clustering (k=101 categories).
  • Visualization:
    • PCA: To see the global variance.
    • t-SNE: To see local groupings (e.g., "Sushi" clusters separately from "Burgers").

*(Plot: 2-D projection of the SigLIP embedding space.)*
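The clustering-plus-projection pipeline can be sketched as below; random vectors stand in for the real SigLIP embedding matrix, and k is reduced from 101 to 2 to match the toy data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
# Stand-in for the (n_images, embedding_dim) SigLIP matrix
emb = rng.normal(size=(300, 64)).astype(np.float32)
emb[:150] += 4.0  # fake a second, well-separated food "concept"

# k=101 in the real project (one cluster per Food-101 category)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(emb)

# 2-D projection for plotting; t-SNE is used the same way for local structure
coords = PCA(n_components=2).fit_transform(emb)
print(coords.shape)  # (300, 2)
```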


## 🚀 Part 4: The Application

The final product is a Gradio web application hosted on Hugging Face Spaces.

  1. Image-to-Image: Upload a photo (e.g., a burger) -> The app embeds it using SigLIP -> Finds the nearest 3 visual matches.
  2. Text-to-Image: Type "Spicy Tacos" -> The app finds images matching that description.
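Both modes reduce to a cosine-similarity search over the pre-computed embedding table. A minimal NumPy sketch with a toy database (the function name and shapes are illustrative; the real app loads its vectors from the parquet file):

```python
import numpy as np

def top_k_matches(query_vec: np.ndarray, db: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k database vectors most similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = db / np.linalg.norm(db, axis=1, keepdims=True)
    sims = d @ q                      # cosine similarity to every row
    return np.argsort(-sims)[:k]      # highest similarity first

# Toy database of 5 "dish embeddings"; row 2 is made identical to the query
rng = np.random.default_rng(1)
db = rng.normal(size=(5, 16))
query = db[2].copy()
print(top_k_matches(query, db))  # index 2 ranks first
```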

**Note:** The deployed application runs the CLIP model even though SigLIP won the comparison; the SigLIP model was too large to run on the Hugging Face Spaces free tier.

### How to Run Locally

1. Clone the repository:

   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/Food-Match
   cd Food-Match
   ```

2. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Run the app:

   ```bash
   python app.py
   ```

## 📂 Repository Structure

  • `app.py`: Main application logic (Gradio + SigLIP).
  • `food_embeddings_siglip.parquet`: Pre-computed SigLIP vector database.
  • `requirements.txt`: Python dependencies (includes sentencepiece, protobuf).
  • `README.md`: Project documentation.

## ✍️ Authors

Matan Kriel, Odeya Shmuel

Assignment #3: Embeddings, RecSys, and Spaces