ThirdFourthFifth committed on
Commit 1f8f3c7 · verified · 1 Parent(s): 3191302

Upload 4 files

Files changed (4)
  1. README.md +39 -9
  2. app.py +41 -19
  3. image_database.xlsx +0 -0
  4. requirements.txt +2 -0
README.md CHANGED
````diff
@@ -8,14 +8,39 @@ An image search application powered by Google's SigLIP2 model (`google/siglip2-s
 - 🖼️ Search through a curated database of images
 - 📊 Similarity scores for each result
 - 🎯 Adjustable number of results (top-k)
+- 📁 Easy image management via Excel spreadsheet
 
 ## How It Works
 
 The app uses the SigLIP2 vision-language model to:
-1. Encode all images in the database into embeddings
-2. Encode your text query into an embedding
-3. Find images with the highest similarity to your query
-4. Display the top matching results with similarity scores
+1. Load image URLs from an Excel spreadsheet (`image_database.xlsx`)
+2. Encode all images in the database into embeddings
+3. Encode your text query into an embedding
+4. Find images with the highest similarity to your query
+5. Display the top matching results with similarity scores
+
+## Image Database Format
+
+The app reads image URLs from an Excel file named `image_database.xlsx`. The Excel file should have:
+
+- **Required:** A column named `url` (or `URL`, `image_url`, `urls`, `link`, or `image`) containing the image URLs
+- **Optional:** Additional columns like `description`, `category`, etc. for your own reference
+
+### Example Excel Format:
+
+| url | description |
+|-----|-------------|
+| https://example.com/image1.jpg | Mountain landscape |
+| https://example.com/image2.jpg | Cat photo |
+| https://example.com/image3.jpg | Beach sunset |
+
+### To Update Your Image Database:
+
+1. Edit `image_database.xlsx` with your own image URLs
+2. Save the file
+3. Restart the Gradio app
+
+The app will automatically load all URLs from the Excel file at startup.
 
 ## Usage
@@ -39,10 +64,6 @@ The app uses the SigLIP2 vision-language model to:
 
 This space uses the **google/siglip2-so400m-patch16-naflex** model, a state-of-the-art vision-language model from Google.
 
-## Dataset
-
-The app searches through a fixed collection of sample images from Unsplash covering various categories like nature, animals, cities, food, and more.
-
 ## Local Setup
 
 To run this locally:
@@ -52,13 +73,22 @@ pip install -r requirements.txt
 python app.py
 ```
 
+Make sure you have `image_database.xlsx` in the same directory.
+
 ## Deployment on Hugging Face Spaces
 
 1. Create a new Space on Hugging Face
 2. Select "Gradio" as the SDK
-3. Upload `app.py` and `requirements.txt`
+3. Upload `app.py`, `requirements.txt`, and `image_database.xlsx`
 4. The Space will automatically build and deploy
 
+## Files Included
+
+- `app.py` - Main Gradio application
+- `requirements.txt` - Python dependencies
+- `image_database.xlsx` - Excel spreadsheet containing image URLs
+- `README.md` - This file
+
 ## License
 
 This application is provided as-is for demonstration purposes. The SigLIP2 model is provided by Google and subject to its own license terms.
````
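The flexible URL-column handling the README describes (accepting `url`, `URL`, `image_url`, etc., and skipping blank rows) can be sketched with pandas. This is a minimal sketch, not the app's exact code; the in-memory DataFrame stands in for the contents of `image_database.xlsx`:

```python
import pandas as pd

# Column names the app accepts for the URL column (matched case-insensitively).
ACCEPTED = {"url", "image_url", "image_urls", "urls", "link", "image"}

def find_url_column(df: pd.DataFrame) -> str:
    """Return the first column whose lowercased name is in ACCEPTED."""
    for col in df.columns:
        if str(col).lower() in ACCEPTED:
            return col
    raise ValueError(f"No URL column found; columns were {list(df.columns)}")

# Stand-in for pd.read_excel("image_database.xlsx"): one uppercase column name,
# one blank row, one URL with stray whitespace.
df = pd.DataFrame({
    "URL": ["https://example.com/image1.jpg ", None, "https://example.com/image3.jpg"],
    "description": ["Mountain landscape", "blank row", "Beach sunset"],
})

# Drop empty cells, then normalize every value to a stripped string.
urls = [str(u).strip() for u in df[find_url_column(df)].dropna()]
print(urls)
```

The same column-scan works unchanged on a real `pd.read_excel(...)` result, which is why extra columns like `description` are harmless.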
app.py CHANGED
````diff
@@ -6,6 +6,8 @@ import numpy as np
 from typing import List, Tuple
 import requests
 from io import BytesIO
+import pandas as pd
+import os
 
 # Initialize model and processor
 MODEL_NAME = "google/siglip2-so400m-patch16-naflex"
@@ -16,24 +18,42 @@ processor = AutoProcessor.from_pretrained(MODEL_NAME)
 model = AutoModel.from_pretrained(MODEL_NAME).to(device)
 model.eval()
 
-# Fixed database of images (using sample images from various sources)
-IMAGE_DATABASE = [
-    "https://images.unsplash.com/photo-1506905925346-21bda4d32df4?w=400",  # Mountain landscape
-    "https://images.unsplash.com/photo-1518791841217-8f162f1e1131?w=400",  # Cat
-    "https://images.unsplash.com/photo-1552053831-71594a27632d?w=400",  # Dog
-    "https://images.unsplash.com/photo-1506748686214-e9df14d4d9d0?w=400",  # Beach sunset
-    "https://images.unsplash.com/photo-1469474968028-56623f02e42e?w=400",  # Nature/Forest
-    "https://images.unsplash.com/photo-1519681393784-d120267933ba?w=400",  # Mountains
-    "https://images.unsplash.com/photo-1504893524553-b855bce32c67?w=400",  # City skyline
-    "https://images.unsplash.com/photo-1541963463532-d68292c34b19?w=400",  # Flowers
-    "https://images.unsplash.com/photo-1488590528505-98d2b5aba04b?w=400",  # Technology/laptop
-    "https://images.unsplash.com/photo-1546069901-ba9599a7e63c?w=400",  # Food
-    "https://images.unsplash.com/photo-1511919884226-fd3cad34687c?w=400",  # Car
-    "https://images.unsplash.com/photo-1473186578172-c141e6798cf4?w=400",  # Person running
-    "https://images.unsplash.com/photo-1464822759023-fed622ff2c3b?w=400",  # Mountain peaks
-    "https://images.unsplash.com/photo-1470071459604-3b5ec3a7fe05?w=400",  # Nature scene
-    "https://images.unsplash.com/photo-1441974231531-c6227db76b6e?w=400",  # Forest path
-]
+# Load image URLs from Excel file
+def load_image_database(excel_file: str = "image_database.xlsx") -> List[str]:
+    """Load image URLs from Excel spreadsheet"""
+    if not os.path.exists(excel_file):
+        raise FileNotFoundError(
+            f"Image database file '{excel_file}' not found. "
+            f"Please create an Excel file with a column named 'url' containing image URLs."
+        )
+
+    df = pd.read_excel(excel_file)
+
+    # Look for a column named 'url', 'URL', 'image_url', or similar
+    url_column = None
+    for col in df.columns:
+        if col.lower() in ['url', 'image_url', 'image_urls', 'urls', 'link', 'image']:
+            url_column = col
+            break
+
+    if url_column is None:
+        raise ValueError(
+            f"Could not find URL column in Excel file. "
+            f"Please use one of these column names: 'url', 'URL', 'image_url', 'urls', 'link', or 'image'. "
+            f"Found columns: {list(df.columns)}"
+        )
+
+    # Extract URLs and remove any NaN values
+    urls = df[url_column].dropna().tolist()
+
+    # Convert to strings and strip whitespace
+    urls = [str(url).strip() for url in urls]
+
+    print(f"Loaded {len(urls)} image URLs from {excel_file}")
+    return urls
+
+# Load the image database from Excel
+IMAGE_DATABASE = load_image_database()
 
 # Cache for loaded images
 image_cache = {}
@@ -129,6 +149,8 @@ with gr.Blocks(title="Image Search with SigLIP2") as demo:
     Search through a collection of images using natural language queries!
     The model used is **google/siglip2-so400m-patch16-naflex**.
 
+    Image URLs are loaded from **image_database.xlsx**.
+
     Try queries like:
     - "a cat"
     - "mountain landscape"
@@ -179,7 +201,7 @@ with gr.Blocks(title="Image Search with SigLIP2") as demo:
     gr.Markdown(
         """
         ---
-        **Note:** This demo uses a fixed set of sample images from Unsplash.
+        **Note:** This demo uses images from the **image_database.xlsx** file.
        The SigLIP2 model computes similarity between your text query and the images to find the best matches.
        """
    )
````
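Once the URLs are loaded, the search itself (encode the query, compare against image embeddings, rank) reduces to a cosine-similarity top-k over vectors. A minimal sketch with NumPy, using random 4-dimensional vectors in place of real SigLIP2 embeddings (the function names here are illustrative, not the app's actual API):

```python
import numpy as np

def top_k(query_emb: np.ndarray, image_embs: np.ndarray, k: int = 3):
    """Rank images by cosine similarity to the query embedding.

    Returns a list of (image_index, score) pairs, best match first.
    """
    q = query_emb / np.linalg.norm(query_emb)
    imgs = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    scores = imgs @ q                      # cosine similarity per image
    order = np.argsort(scores)[::-1][:k]   # indices of the k highest scores
    return [(int(i), float(scores[i])) for i in order]

# Toy embeddings standing in for SigLIP2 outputs; the query is a slightly
# perturbed copy of image 2's embedding, so image 2 should rank first.
rng = np.random.default_rng(0)
image_embs = rng.normal(size=(5, 4))
query_emb = image_embs[2] + 0.01 * rng.normal(size=4)

results = top_k(query_emb, image_embs, k=2)
print(results)
```

In the real app the embeddings come from the model's text and image towers, but the ranking step is the same dot product over normalized vectors.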
image_database.xlsx ADDED
Binary file (6.91 kB).
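A spreadsheet in the format this file uses can be generated with a short pandas script. This is a sketch assuming `pandas` and `openpyxl` are installed (pandas writes `.xlsx` through the openpyxl engine), with placeholder example.com URLs rather than the committed file's actual contents:

```python
import pandas as pd

# Placeholder rows in the layout the app expects: a `url` column plus an
# optional `description` column. The URLs are illustrative only.
rows = [
    {"url": "https://example.com/image1.jpg", "description": "Mountain landscape"},
    {"url": "https://example.com/image2.jpg", "description": "Cat photo"},
    {"url": "https://example.com/image3.jpg", "description": "Beach sunset"},
]

df = pd.DataFrame(rows)
# index=False keeps the row index out of the sheet, so column A is `url`.
df.to_excel("image_database.xlsx", index=False)
```

Restarting the app after writing the file is enough, since the URLs are read once at startup.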
 
requirements.txt CHANGED
````diff
@@ -4,3 +4,5 @@ transformers==4.46.0
 Pillow==10.1.0
 numpy==1.24.3
 requests==2.31.0
+pandas==2.1.1
+openpyxl==3.1.2
````