chandrakalagowda committed
Commit d0594bb · 1 Parent(s): 3157905

Upload 4 files
1_build_text_image_search_engine9.py ADDED
@@ -0,0 +1,266 @@
+ #!/usr/bin/env python
+ # coding: utf-8
+
+ # # Build a Milvus-Powered Text-Image Search Engine in Minutes
+ #
+ # This notebook illustrates how to build a text-image search engine from scratch using [Milvus](https://milvus.io/). Milvus is an open-source vector database built for AI applications that supports nearest-neighbor embedding search across tens of millions of entries. We'll walk through the text-image search procedure and evaluate its performance. The core functionality comes down to about a dozen lines of code, which you can use as a starting point for your own image search engine.
+
+ # ## Preparation
+ # ### Install Dependencies
+ # First, install the dependencies: pymilvus, towhee, gradio, and opencv-python.
+
+ # In[7]:
+
+
+ #! python -m pip install -q pymilvus towhee gradio opencv-python
+ # pip3 install transformers torch torchvision
+
+
+ # ### Prepare the data
+ #
+ # The dataset used in this demo is a subset of the ImageNet dataset (100 classes, 10 images per class), available via [GitHub](https://github.com/towhee-io/examples/releases/download/data/reverse_image_search.zip).
+ #
+ # The dataset is organized as follows:
+ # - **train**: directory of candidate images;
+ # - **test**: directory of test images;
+ # - **reverse_image_search.csv**: a CSV file containing an ***id***, ***path***, and ***label*** for each image.
+ #
+ # Let's take a quick look:
+
+ # In[8]:
+
+
+ get_ipython().system(' curl -L https://github.com/towhee-io/examples/releases/download/data/reverse_image_search.zip -O')
+ get_ipython().system(' unzip -q -o reverse_image_search.zip')
+
+
+ # In[9]:
+
+
+ import pandas as pd
+
+ df = pd.read_csv('reverse_image_search.csv')
+ df.head()
+
+
+ # In[10]:
+
+
+ dfnew = df.loc[df['label'] == 'hunger_games']
+ dfnew
+
+
+ # To use the dataset for text-image search, let's first define a helper function:
+ #
+ # - **read_images(results)**: read images by image IDs;
+
+ # In[11]:
+
+
+ import cv2
+ from towhee.types.image import Image
+
+ # Map image IDs to file paths for quick lookup.
+ id_img = df.set_index('id')['path'].to_dict()
+ def read_images(results):
+     imgs = []
+     for re in results:
+         path = id_img[re.id]
+         imgs.append(Image(cv2.imread(path), 'BGR'))
+     return imgs
+
+
+ # ### Create a Milvus Collection
+ #
+ # Before getting started, please make sure you have [installed Milvus](https://milvus.io/docs/v2.0.x/install_standalone-docker.md). Let's first create a `text_image_search` collection that uses the [L2 distance metric](https://milvus.io/docs/v2.0.x/metric.md#Euclidean-distance-L2) and an [IVF_FLAT index](https://milvus.io/docs/v2.0.x/index.md#IVF_FLAT).
+
+ # In[12]:
+
+
+ from pymilvus import connections, FieldSchema, CollectionSchema, DataType, Collection, utility
+
+ def create_milvus_collection(collection_name, dim):
+     connections.connect(host='127.0.0.1', port='19530')
+
+     # Start from a clean collection if one already exists.
+     if utility.has_collection(collection_name):
+         utility.drop_collection(collection_name)
+
+     fields = [
+         FieldSchema(name='id', dtype=DataType.INT64, description='ids', is_primary=True, auto_id=False),
+         FieldSchema(name='embedding', dtype=DataType.FLOAT_VECTOR, description='embedding vectors', dim=dim)
+     ]
+     schema = CollectionSchema(fields=fields, description='text image search')
+     collection = Collection(name=collection_name, schema=schema)
+
+     # Create an IVF_FLAT index for the collection.
+     index_params = {
+         'metric_type': 'L2',
+         'index_type': "IVF_FLAT",
+         'params': {"nlist": 512}
+     }
+     collection.create_index(field_name="embedding", index_params=index_params)
+     return collection
+
+ collection = create_milvus_collection('text_image_search', 512)
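The IVF_FLAT index created above partitions the vectors into `nlist` clusters and, at query time, scans only the `nprobe` clusters whose centroids are closest to the query. A minimal NumPy sketch of that idea (synthetic data, a plain Lloyd's k-means as a stand-in for Milvus' coarse quantizer; when `nprobe` equals `nlist` the search degenerates to an exact exhaustive scan):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 32)).astype(np.float32)

def kmeans(x, k, iters=10):
    # A few Lloyd iterations: assign points to nearest centroid, recompute means.
    centroids = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        assign = np.linalg.norm(x[:, None] - centroids[None], axis=-1).argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                centroids[j] = x[assign == j].mean(axis=0)
    return centroids, assign

centroids, assign = kmeans(data, k=16)

def ivf_flat_search(q, nprobe):
    # Probe the nprobe nearest clusters, then brute-force (FLAT) inside them.
    order = np.linalg.norm(centroids - q, axis=1).argsort()[:nprobe]
    cand = np.where(np.isin(assign, order))[0]
    return cand[np.linalg.norm(data[cand] - q, axis=1).argmin()]

q = rng.normal(size=32).astype(np.float32)
exact = np.linalg.norm(data - q, axis=1).argmin()
print(ivf_flat_search(q, nprobe=16) == exact)  # nprobe == nlist -> exhaustive, exact
```

Smaller `nprobe` trades recall for speed; Milvus exposes it as a search parameter at query time.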
+
+
+ # ## Text Image Search
+ #
+ # In this section, we'll show how to build our text-image search engine with Milvus. The basic idea behind text-image search is to extract embeddings from images and texts with a deep neural network and compare those embeddings with the ones stored in Milvus.
+ #
+ # We use [Towhee](https://towhee.io/), a machine learning framework for creating data processing pipelines; it also provides predefined operators that implement insert and query operations in Milvus.
+ #
+ # <img src="./workflow.png" width = "60%" height = "60%" align=center />
+
+ # ### Generate image and text embeddings with CLIP
+ #
+
+ # This operator extracts features from an image or a text with [CLIP](https://openai.com/blog/clip/), which generates embeddings for text and images by jointly training an image encoder and a text encoder to maximize their cosine similarity.
+
+ # In[13]:
+
+
+ from towhee import ops, pipe, DataCollection
+ import numpy as np
+
+
+ # In[14]:
+
+
+ ### This section needs teddy.png in the working directory; otherwise it will throw an error.
+ p = (
+     pipe.input('path')
+         .map('path', 'img', ops.image_decode.cv2('rgb'))
+         .map('img', 'vec', ops.image_text_embedding.clip(model_name='clip_vit_base_patch16', modality='image'))
+         .map('vec', 'vec', lambda x: x / np.linalg.norm(x))
+         .output('img', 'vec')
+ )
+
+ DataCollection(p('./teddy.png')).show()
+
+
+ # In[15]:
+
+
+ p2 = (
+     pipe.input('text')
+         .map('text', 'vec', ops.image_text_embedding.clip(model_name='clip_vit_base_patch16', modality='text'))
+         .map('vec', 'vec', lambda x: x / np.linalg.norm(x))
+         .output('text', 'vec')
+ )
+
+ DataCollection(p2("A teddybear on a skateboard in Times Square.")).show()
+
+ # Here is a detailed explanation of the code:
+ #
+ # - `map('path', 'img', ops.image_decode.cv2('rgb'))`: for each row, read and decode the image at `path` and put the pixel data into the `img` column;
+ #
+ # - `map('img', 'vec', ops.image_text_embedding.clip(model_name='clip_vit_base_patch16', modality='image'/'text'))`: extract the image or text embedding with `ops.image_text_embedding.clip`, an operator from the [Towhee hub](https://towhee.io/image-text-embedding/clip). This operator supports several models, including `clip_vit_base_patch16`, `clip_vit_base_patch32`, `clip_vit_large_patch14`, `clip_vit_large_patch14_336`, etc.
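The `lambda x: x / np.linalg.norm(x)` step in both pipelines matters because the collection uses the L2 metric: for unit-norm vectors, the squared L2 distance is ‖a − b‖² = 2 − 2·(a·b), so ranking by L2 distance is equivalent to ranking by cosine similarity, the objective CLIP was trained on. A quick numerical check with random vectors:

```python
import numpy as np

rng = np.random.default_rng(42)
a, b = rng.normal(size=(2, 512))
a /= np.linalg.norm(a)  # normalize to unit length, as in the pipelines
b /= np.linalg.norm(b)

l2_sq = np.sum((a - b) ** 2)
cosine = np.dot(a, b)  # dot product == cosine similarity for unit vectors
print(np.isclose(l2_sq, 2 - 2 * cosine))  # True
```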
+
+ # ### Load Image Embeddings into Milvus
+ #
+ # We first extract embeddings from images with the `clip_vit_base_patch16` model and insert the embeddings into Milvus for indexing. Towhee provides a [method-chaining style API](https://towhee.readthedocs.io/en/main/index.html) so that users can assemble a data processing pipeline with operators.
+
+ # In[16]:
+
+
+ ### If CUDA is available, add device=0 to the embedding operator; it is omitted in this code:
+ ### .map('img', 'vec', ops.image_text_embedding.clip(model_name='clip_vit_base_patch16', modality='image', device=0))
+ ### On a CPU, this cell takes about 5 to 8 minutes to run.
+
+
+ # In[17]:
+
+
+ get_ipython().run_cell_magic('time', '', "collection = create_milvus_collection('text_image_search', 512)\n\ndef read_csv(csv_path, encoding='utf-8-sig'):\n    import csv\n    with open(csv_path, 'r', encoding=encoding) as f:\n        data = csv.DictReader(f)\n        for line in data:\n            yield int(line['id']), line['path']\n\np3 = (\n    pipe.input('csv_file')\n        .flat_map('csv_file', ('id', 'path'), read_csv)\n        .map('path', 'img', ops.image_decode.cv2('rgb'))\n        .map('img', 'vec', ops.image_text_embedding.clip(model_name='clip_vit_base_patch16', modality='image'))\n        .map('vec', 'vec', lambda x: x / np.linalg.norm(x))\n        .map(('id', 'vec'), (), ops.ann_insert.milvus_client(host='127.0.0.1', port='19530', collection_name='text_image_search'))\n        .output()\n)\n\nret = p3('reverse_image_search.csv')\n")
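The escaped string above is how nbconvert exports a `%%time` cell. The `read_csv` generator buried inside it is easier to read on its own: it streams `(id, path)` pairs that `flat_map` feeds into the pipeline one row at a time. Shown standalone below, with a tiny made-up CSV for demonstration:

```python
import csv
import os
import tempfile

def read_csv(csv_path, encoding='utf-8-sig'):
    # Yield (id, path) pairs row by row, exactly as in the ingestion cell above.
    with open(csv_path, 'r', encoding=encoding) as f:
        for line in csv.DictReader(f):
            yield int(line['id']), line['path']

# Hypothetical two-row CSV, just to demonstrate the generator.
tmp = tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='')
tmp.write('id,path,label\n1,./train/a.jpg,a\n2,./train/b.jpg,b\n')
tmp.close()
rows = list(read_csv(tmp.name))
os.unlink(tmp.name)
print(rows)  # [(1, './train/a.jpg'), (2, './train/b.jpg')]
```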
+
+
+ # In[18]:
+
+
+ collection.load()
+
+
+ # In[19]:
+
+
+ print('Total number of inserted entities is {}.'.format(collection.num_entities))
+
+
+ # ### Query Matched Images from Milvus
+
+ # Now that embeddings for candidate images have been inserted into Milvus, we can query it for nearest neighbors. Again, we use Towhee to take the input text, compute an embedding vector, and use that vector to query Milvus. Because Milvus only returns image IDs and distance values, we provide a `read_image` function to fetch and display the original images by ID.
+
+ # In[20]:
+
+
+ import pandas as pd
+ import cv2
+
+ def read_image(image_ids):
+     df = pd.read_csv('reverse_image_search.csv')
+     id_img = df.set_index('id')['path'].to_dict()
+     imgs = []
+     decode = ops.image_decode.cv2('rgb')
+     for image_id in image_ids:
+         path = id_img[image_id]
+         imgs.append(decode(path))
+     return imgs
+
+
+ p4 = (
+     pipe.input('text')
+         .map('text', 'vec', ops.image_text_embedding.clip(model_name='clip_vit_base_patch16', modality='text'))
+         .map('vec', 'vec', lambda x: x / np.linalg.norm(x))
+         .map('vec', 'result', ops.ann_search.milvus_client(host='127.0.0.1', port='19530', collection_name='text_image_search', limit=5))
+         .map('result', 'image_ids', lambda x: [item[0] for item in x])
+         .map('image_ids', 'images', read_image)
+         .output('text', 'images')
+ )
+
+ DataCollection(p4("A white dog")).show()
+ DataCollection(p4("A black dog")).show()
+
+
+ # ## Release a Showcase
+
+ # We've finished the core functionality of our text-image search engine. Now it's time to build a showcase with an interface. [Gradio](https://gradio.app/) is a great tool for building demos. With Gradio, we simply wrap the data processing pipeline in a `search` function:
+
+ # In[21]:
+
+
+ search_pipeline = (
+     pipe.input('text')
+         .map('text', 'vec', ops.image_text_embedding.clip(model_name='clip_vit_base_patch16', modality='text'))
+         .map('vec', 'vec', lambda x: x / np.linalg.norm(x))
+         .map('vec', 'result', ops.ann_search.milvus_client(host='127.0.0.1', port='19530', collection_name='text_image_search', limit=5))
+         .map('result', 'image_ids', lambda x: [item[0] for item in x])
+         .output('image_ids')
+ )
+
+ def search(text):
+     df = pd.read_csv('reverse_image_search.csv')
+     id_img = df.set_index('id')['path'].to_dict()
+     image_ids = search_pipeline(text).to_list()[0][0]
+     return [id_img[image_id] for image_id in image_ids]
+
+
+ # In[22]:
+
+
+ import gradio
+
+ # Note: gradio.inputs / gradio.outputs still work in the pinned gradio==3.35.2,
+ # but are deprecated and removed in Gradio 4 (use gradio.Textbox / gradio.Image there).
+ interface = gradio.Interface(search,
+                              gradio.inputs.Textbox(lines=1),
+                              [gradio.outputs.Image(type="filepath", label=None) for _ in range(5)]
+                              )
+
+ interface.launch()
+
+
+ # In[ ]:
+
+
+
+
docker-compose.yml ADDED
@@ -0,0 +1,49 @@
+ version: '3.5'
+
+ services:
+   etcd:
+     container_name: milvus-etcd
+     image: quay.io/coreos/etcd:v3.5.5
+     environment:
+       - ETCD_AUTO_COMPACTION_MODE=revision
+       - ETCD_AUTO_COMPACTION_RETENTION=1000
+       - ETCD_QUOTA_BACKEND_BYTES=4294967296
+       - ETCD_SNAPSHOT_COUNT=50000
+     volumes:
+       - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/etcd:/etcd
+     command: etcd -advertise-client-urls=http://127.0.0.1:2379 -listen-client-urls http://0.0.0.0:2379 --data-dir /etcd
+
+   minio:
+     container_name: milvus-minio
+     image: minio/minio:RELEASE.2023-03-20T20-16-18Z
+     environment:
+       MINIO_ACCESS_KEY: minioadmin
+       MINIO_SECRET_KEY: minioadmin
+     volumes:
+       - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/minio:/minio_data
+     command: minio server /minio_data
+     healthcheck:
+       test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
+       interval: 30s
+       timeout: 20s
+       retries: 3
+
+   standalone:
+     container_name: milvus-standalone
+     image: milvusdb/milvus:v2.2.10
+     command: ["milvus", "run", "standalone"]
+     environment:
+       ETCD_ENDPOINTS: etcd:2379
+       MINIO_ADDRESS: minio:9000
+     volumes:
+       - ${DOCKER_VOLUME_DIRECTORY:-.}/volumes/milvus:/var/lib/milvus
+     ports:
+       - "19530:19530"
+       - "9091:9091"
+     depends_on:
+       - "etcd"
+       - "minio"
+
+ networks:
+   default:
+     name: milvus
requirements.txt ADDED
@@ -0,0 +1,123 @@
+ aiofiles==23.1.0
+ aiohttp==3.8.4
+ aiosignal==1.3.1
+ altair==5.0.1
+ anyio==3.7.0
+ appnope==0.1.3
+ asttokens==2.2.1
+ async-timeout==4.0.2
+ attrs==23.1.0
+ backcall==0.2.0
+ bleach==6.0.0
+ certifi==2023.5.7
+ charset-normalizer==3.1.0
+ click==8.1.3
+ comm==0.1.3
+ contourpy==1.1.0
+ cycler==0.11.0
+ debugpy==1.6.7
+ decorator==5.1.1
+ docutils==0.20.1
+ executing==1.2.0
+ fastapi==0.97.0
+ ffmpy==0.3.0
+ filelock==3.12.2
+ fonttools==4.40.0
+ frozenlist==1.3.3
+ fsspec==2023.6.0
+ gradio==3.35.2
+ gradio_client==0.2.7
+ grpcio==1.53.0
+ grpcio-tools==1.53.0
+ h11==0.14.0
+ httpcore==0.17.2
+ httpx==0.24.1
+ huggingface-hub==0.15.1
+ idna==3.4
+ importlib-metadata==6.7.0
+ ipykernel==6.23.2
+ ipython==8.14.0
+ jaraco.classes==3.2.3
+ jedi==0.18.2
+ Jinja2==3.1.2
+ jsonschema==4.17.3
+ jupyter_client==8.2.0
+ jupyter_core==5.3.1
+ keyring==24.0.1
+ kiwisolver==1.4.4
+ linkify-it-py==2.0.2
+ markdown-it-py==2.2.0
+ MarkupSafe==2.1.3
+ matplotlib==3.7.1
+ matplotlib-inline==0.1.6
+ mdit-py-plugins==0.3.3
+ mdurl==0.1.2
+ mmh3==4.0.0
+ more-itertools==9.1.0
+ mpmath==1.3.0
+ multidict==6.0.4
+ nest-asyncio==1.5.6
+ networkx==3.1
+ numpy==1.25.0
+ opencv-python==4.7.0.72
+ orjson==3.9.1
+ packaging==23.1
+ pandas==2.0.2
+ parso==0.8.3
+ pexpect==4.8.0
+ pickleshare==0.7.5
+ Pillow==9.5.0
+ pkginfo==1.9.6
+ platformdirs==3.7.0
+ prompt-toolkit==3.0.38
+ protobuf==4.23.3
+ psutil==5.9.5
+ ptyprocess==0.7.0
+ pure-eval==0.2.2
+ pydantic==1.10.9
+ pydub==0.25.1
+ Pygments==2.15.1
+ pymilvus==2.2.5
+ pyparsing==3.1.0
+ pyrsistent==0.19.3
+ python-dateutil==2.8.2
+ python-multipart==0.0.6
+ pytz==2023.3
+ PyYAML==6.0
+ pyzmq==25.1.0
+ readme-renderer==40.0
+ regex==2023.6.3
+ requests==2.31.0
+ requests-toolbelt==1.0.0
+ rfc3986==2.0.0
+ rich==13.4.2
+ safetensors==0.3.1
+ semantic-version==2.10.0
+ six==1.16.0
+ sniffio==1.3.0
+ stack-data==0.6.2
+ starlette==0.27.0
+ sympy==1.12
+ tabulate==0.9.0
+ tenacity==8.2.2
+ tokenizers==0.13.3
+ toolz==0.12.0
+ torch==2.0.1
+ torchvision==0.15.2
+ tornado==6.3.2
+ towhee==1.1.0
+ tqdm==4.65.0
+ traitlets==5.9.0
+ transformers==4.30.2
+ twine==4.0.2
+ typing_extensions==4.6.3
+ tzdata==2023.3
+ uc-micro-py==1.0.2
+ ujson==5.8.0
+ urllib3==2.0.3
+ uvicorn==0.22.0
+ wcwidth==0.2.6
+ webencodings==0.5.1
+ websockets==11.0.3
+ yarl==1.9.2
+ zipp==3.15.0
teddy.png ADDED