Matan Kriel committed on
Commit b2aba87 · 1 Parent(s): 9b48fc7

added files
Assignment_3 SigLIP.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
README.md CHANGED
@@ -1,12 +1,145 @@
  ---
- title: Food Recommender
- emoji: 😻
- colorFrom: purple
- colorTo: indigo
  sdk: gradio
- sdk_version: 6.2.0
  app_file: app.py
  pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
---
title: Food Matcher AI (SigLIP Edition)
emoji: 🍔
colorFrom: green
colorTo: yellow
sdk: gradio
sdk_version: 5.0.0
app_file: app.py
pinned: false
---

# 🍔 Visual Dish Matcher AI

**A computer vision app that suggests recipes and dishes based on visual similarity, using Google's SigLIP model.**

## 🎯 Project Overview
This project builds a **Visual Search Engine** for food. Instead of relying on text labels (which can be inaccurate or missing), we use **vector embeddings** to find dishes that look similar.

**Key Features:**
* **Multimodal Search:** Find food using an image *or* a text description.
* **Advanced Data Cleaning:** Automated detection of blurry or low-quality images.
* **Model Comparison:** A head-to-head comparison between **OpenAI CLIP** and **Google SigLIP** to choose the better engine.

**Live Demo:** [Click the "App" tab above to view]

---

## 🛠️ Tech Stack
* **Model:** Google SigLIP (`google/siglip-base-patch16-224`)
* **Frameworks:** PyTorch, Transformers, Gradio, Datasets
* **Data Engineering:** OpenCV (feature extraction), NumPy
* **Data Storage:** Parquet (via Git LFS)
* **Visualization:** Matplotlib, Seaborn, scikit-learn (t-SNE/PCA)

---

## 📊 Part 1: Data Analysis & Cleaning
**Dataset:** [Food-101 (ETH Zurich)](https://huggingface.co/datasets/ethz/food101) (subset of 5,000 images).

### 1. Exploratory Data Analysis (EDA)
Before any modeling, we analyzed the raw data to ensure quality and balance.

* **Class Balance Check:** We verified that our random subset of 5,000 images maintained a healthy distribution across the 101 food categories (approx. 50 images per class).
* **Image Dimensions:** We visualized the width and height distributions to identify unusually small or large images.
* **Outlier Detection:** We plotted the distributions of **aspect ratios** and **brightness levels**.

![image](https://cdn-uploads.huggingface.co/production/uploads/67dfcd96d01eab4618a66f78/qe5z9j81mj2ahlENA2_5l.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/67dfcd96d01eab4618a66f78/_lh9-4RGOXCb8yy11Jar4.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/67dfcd96d01eab4618a66f78/6au3HUidoYBsiKreYTPJj.png)

### 2. Data Cleaning
Based on the plots above, **we deleted "bad" images** that were:
* Too dark (average pixel intensity < 20)
* Too bright / washed out (average pixel intensity > 245)
* Extremely stretched or squashed (aspect ratio > 3.0)

### 3. Advanced Feature Engineering
After removing the garbage data, we engineered deeper visual features to assess image content:

* **Sharpness Score:** Used Laplacian variance to find blurry photos.
* **Dominant Color (Hue):** Analyzed color clusters (e.g., green for salads vs. red for pizza).
* **Texture Complexity:** Calculated the pixel standard deviation to distinguish smooth from complex foods.

![image](https://cdn-uploads.huggingface.co/production/uploads/67dfcd96d01eab4618a66f78/0QMOkOCATUfePwu_-nm0z.png)

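The sharpness metric is the variance of the image Laplacian (with OpenCV this is `cv2.Laplacian(gray, cv2.CV_64F).var()`). A dependency-free NumPy sketch of the same idea, using the 4-neighbour Laplacian kernel:

```python
import numpy as np

def sharpness_score(gray: np.ndarray) -> float:
    """Variance of the 4-neighbour Laplacian; low = few edges = likely blurry."""
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

# A checkerboard is full of edges; a flat grey patch has none.
checker = ((np.indices((64, 64)).sum(axis=0) % 2) * 255).astype(np.uint8)
flat = np.full((64, 64), 128, dtype=np.uint8)
print(sharpness_score(flat), sharpness_score(checker) > 0.0)  # 0.0 True
```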
---

## ⚔️ Part 2: Model Comparison (CLIP vs. SigLIP)
To ensure the best search results, we ran a "challenger" test between two leading multimodal models.

### The Contestants:
1. **Baseline:** OpenAI CLIP (`clip-vit-base-patch32`)
2. **Challenger:** Google SigLIP (`siglip-base-patch16-224`)

### The Evaluation:
We compared them using **silhouette scores** (measuring how distinct the food clusters are) and a visual "taste test" (checking the nearest neighbors for specific dishes).

* **Metric:** Silhouette score
* **Winner:** **Google SigLIP** (produced cleaner, more distinct clusters and better visual matches)

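As a reminder of what the metric measures: silhouette scores live in [-1, 1], and higher means tighter, better-separated clusters. A small sketch with scikit-learn, where two synthetic blobs stand in for the real CLIP/SigLIP embeddings labelled by food category:

```python
import numpy as np
from sklearn.metrics import silhouette_score

# Two well-separated synthetic "classes" in 8 dimensions.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=0.1, size=(50, 8))
blob_b = rng.normal(loc=5.0, scale=0.1, size=(50, 8))
embeddings = np.vstack([blob_a, blob_b])
labels = np.array([0] * 50 + [1] * 50)

score = silhouette_score(embeddings, labels)
print(score > 0.9)  # True: the blobs are cleanly separated
```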
**Visual Comparison:**
We queried both models with the same image to see which returned more accurate similar foods.

![image](https://cdn-uploads.huggingface.co/production/uploads/67dfcd96d01eab4618a66f78/R4biFno1FUizlVRLRVCqM.png)

---

## 🧠 Part 3: Embeddings & Clustering
Using the winning model (**SigLIP**), we generated 768-dimensional vectors for the entire dataset, then applied dimensionality reduction to visualize how the model groups food concepts.

* **Algorithm:** K-Means clustering (k = 101 categories).
* **Visualization:**
  * **PCA:** to see the global variance.
  * **t-SNE:** to see local groupings (e.g., "Sushi" clusters separately from "Burgers").

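The clustering and projection steps above can be sketched with scikit-learn. Here 200 random 768-d vectors stand in for the real SigLIP embeddings, and k is reduced from 101 to 5 so the toy data can support it:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# Stand-in for the (5000, 768) SigLIP embedding matrix.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((200, 768)).astype(np.float32)

# K-Means assigns each embedding to one of k clusters.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(embeddings)

# t-SNE projects to 2-D for plotting (perplexity must stay below n_samples).
coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(embeddings)
print(cluster_ids.shape, coords.shape)  # (200,) (200, 2)
```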
![image](https://cdn-uploads.huggingface.co/production/uploads/67dfcd96d01eab4618a66f78/MT92BvvwToLxk83X0Yd12.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/67dfcd96d01eab4618a66f78/KyMPVz6VUsIGq2IMEZRYl.png)

---

## 🚀 Part 4: The Application
The final product is a **Gradio** web application hosted on Hugging Face Spaces.

1. **Image-to-Image:** Upload a photo (e.g., a burger) -> the app embeds it with SigLIP -> finds the 3 nearest visual matches.
2. **Text-to-Image:** Type "Spicy Tacos" -> the app finds images matching that description.

### How to Run Locally
1. **Clone the repository:**
   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/Food-Match
   cd Food-Match
   ```
2. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
3. **Run the app:**
   ```bash
   python app.py
   ```

---

## 📂 Repository Structure
* `app.py`: Main application logic (Gradio + SigLIP).
* `food_embeddings_siglip.parquet`: Pre-computed SigLIP vector database.
* `requirements.txt`: Python dependencies (includes `sentencepiece`, `protobuf`).
* `README.md`: Project documentation.

---

## ✍️ Authors
**Matan Kriel**
**Odeya Shmuel**
*Assignment #3: Embeddings, RecSys, and Spaces*
app.py ADDED
@@ -0,0 +1,93 @@
```python
import gradio as gr
import torch
import pandas as pd
import numpy as np
from PIL import Image
from transformers import AutoProcessor, AutoModel
from datasets import load_dataset
from torch.nn import functional as F

# --- 1. SETUP & CONFIG ---
MODEL_ID = "google/siglip-base-patch16-224"
DATA_FILE = "food_embeddings_siglip.parquet"

print(f"⏳ Starting App... Loading Model: {MODEL_ID}...")
try:
    model = AutoModel.from_pretrained(MODEL_ID)
    processor = AutoProcessor.from_pretrained(MODEL_ID)
except Exception as e:
    print(f"❌ Model Error: {e}")
    raise  # the app cannot run without the model

# --- 2. LOAD DATA ---
print("⏳ Loading Dataset...")
# Load the exact 5k subset used when the embeddings were computed
dataset = load_dataset("ethz/food101", split="train").shuffle(seed=42).select(range(5000))

# --- 3. LOAD EMBEDDINGS ---
print(f"⏳ Loading Embeddings from {DATA_FILE}...")
try:
    df = pd.read_parquet(DATA_FILE)
    db_features = torch.tensor(np.stack(df['embedding'].to_numpy()))
    db_features = F.normalize(db_features, p=2, dim=1)
    print("✅ System Ready!")
except Exception as e:
    print(f"❌ Error loading parquet file: {e}")
    print("⚠️ Please ensure 'food_embeddings_siglip.parquet' is uploaded to the Files tab.")
    db_features = None

# --- 4. CORE SEARCH LOGIC ---
def find_best_matches(query_features, top_k=3):
    if db_features is None:
        return []  # empty gallery if the DB failed to load

    # Normalize the query so the dot product equals cosine similarity
    query_features = F.normalize(query_features, p=2, dim=1)

    # Similarity search: (1, d) @ (d, n) -> (1, n)
    similarity = torch.mm(query_features, db_features.T)
    scores, indices = torch.topk(similarity, k=top_k)

    results = []
    for idx, score in zip(indices[0], scores[0]):
        idx = idx.item()
        img = dataset[idx]['image']
        label = df.iloc[idx]['label_name']
        results.append((img, f"{label} ({score.item():.2f})"))
    return results

# --- 5. GRADIO FUNCTIONS ---
def search_by_image(input_image):
    if input_image is None:
        return []
    inputs = processor(images=input_image, return_tensors="pt")
    with torch.no_grad():
        features = model.get_image_features(**inputs)
    return find_best_matches(features)

def search_by_text(input_text):
    if not input_text:
        return []
    # SigLIP expects max-length padding for text inputs
    inputs = processor(text=[input_text], return_tensors="pt", padding="max_length")
    with torch.no_grad():
        features = model.get_text_features(**inputs)
    return find_best_matches(features)

# --- 6. BUILD UI ---
with gr.Blocks(title="Food Matcher AI") as demo:
    gr.Markdown("# 🍔 Visual Dish Matcher")
    gr.Markdown("Upload a photo of food (or describe it) to find similar dishes in our database.")

    with gr.Tab("Image Search"):
        with gr.Row():
            img_input = gr.Image(type="pil", label="Upload Food Image")
            img_gallery = gr.Gallery(label="Top Matches")
        btn_img = gr.Button("Find Similar Dishes")
        btn_img.click(search_by_image, inputs=img_input, outputs=img_gallery)

    with gr.Tab("Text Search"):
        with gr.Row():
            txt_input = gr.Textbox(label="Describe the food (e.g., 'Spicy Tacos')")
            txt_gallery = gr.Gallery(label="Top Matches")
        btn_txt = gr.Button("Search by Description")
        btn_txt.click(search_by_text, inputs=txt_input, outputs=txt_gallery)

# Launch
demo.launch()
```
food_embeddings_siglip.parquet ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a6f7a90c748628bffd1d2b4b08afa9e70707c593e5d7c98cfcdf9773d658af4e
size 12925008
requirements.txt ADDED
@@ -0,0 +1,11 @@
gradio
torch
transformers
pandas
numpy
datasets
pyarrow
scikit-learn
sentencepiece
protobuf
pillow