dabbu2000 committed on
Commit 8ea1ecf · 1 Parent(s): 942c848

Update germanShepherdStreamlit.py

  1. germanShepherdStreamlit.py +84 -158
germanShepherdStreamlit.py CHANGED
@@ -14,11 +14,11 @@ def main():
14
  """
15
  <style>
16
  [data-testid="stSidebar"][aria-expanded="true"] > div:first-child {
17
- width: 250px;
18
  }
19
  [data-testid="stSidebar"][aria-expanded="false"] > div:first-child {
20
- width: 250px;
21
- margin-left: -250px;
22
  }
23
  </style>
24
  """,
@@ -32,208 +32,134 @@ def main():
32
  Created by [Abhinit Sundar](https://github.com/asundar0128)
33
  """)
34
 
35
- germanShepherdEvaluationMethod = germanShepherdHeaderColumns[-1].button('Interested in learning how this works?')
36
- if display_method:
37
- describe_method()
38
  else:
39
- galaxy_search()
40
 
41
 
42
- def describe_method():
43
- st.button('Back to Galaxy Finder')
44
 
45
  st.markdown(
46
  """
47
- ### A bit about the method:
48
- - The similarity of two images is quite easy to judge by eye - but writing an algorithm to do the same is not as easy as one might think! This is because as humans we can easily identify and understand what object is in the image.
49
- - A machine is different - it simply looks at individual pixel values. Yet two images that to us have very similar properties and appearances will likely have vastly different pixel values. For example, imagine rotating a galaxy image by 90 degrees. It is obviously still the same galaxy, but the pixel values have completely changed.
50
- - So the first step is to teach a computer to understand what is actually in the image on a deeper level than just looking at pixel values. Unfortunately we do not have any information alongside the image specifying what type of galaxy is actually in it - so where do we start?
51
- - We used a type of machine learning called "self-supervised representation learning" to boil down each image into a concentrated vector of information, or "representation", that encapsulates the appearance and properties of the galaxy.
52
- - Self-supervised learning works by creating multiple versions of each image which approximate the observational symmetries, errors, and uncertainties within the dataset, such as image rotations, adding noise, blurring it, etc., and then teaching the machine to learn the same representation for all these versions of the same galaxy. In this way, we move beyond looking at pixel values, and teach the machine a deeper understanding of the image.
53
- - Once we have trained the machine learning model on millions of galaxies we calculate and save the representation of every image in the dataset, and precompute the similarity of any two galaxies. Then, you tell us what galaxy to use as a starting point, we find the representation belonging to the image of that galaxy, compare it to millions of other representations from all the other galaxies, and return the most similar images!
54
-
55
- **Please see [our overview paper](https://arxiv.org/abs/2110.13151) for more technical details, or see our recent application of the app to find [strong gravitational lenses](https://arxiv.org/abs/2012.13083) -- some of the rarest and most interesting objects in the universe!**
56
-
57
  Dataset:
58
 
59
- - We used galaxy images from [DECaLS DR9](https://www.legacysurvey.org/), randomly sampling 3.5 million galaxies to train the machine learning model. We then apply it on every galaxy in the dataset, about 42 million galaxies with z-band magnitude < 20, so most bright things in the sky should be included, with very dim and small objects likely missing - more to come soon!
60
- - The models were trained using images of size 96 pixels by 96 pixels centered on the galaxy. So features outside of this central region are not used to calculate the similarity, but are sometimes nice to look at
61
- Please note this project is ongoing, and results will continue to be updated and improved.
62
- Created by [George Stein](https://georgestein.github.io/)
63
  """
64
  )
65
- st.button('Back to Galaxy Finder', key='galaxies') # will change state and hence trigger rerun and hence reset should_tell_me_more
66
 
67
 
68
- def galaxy_search():
69
-
70
- # Hardcode parameter options
71
- ra_unit_formats = 'degrees or HH:MM:SS'
72
- dec_unit_formats = 'degrees or DD:MM:SS'
73
-
74
- similarity_types = ['most similar', 'least similar']
75
 
76
- # choices for number of images to display
77
- num_nearest_vals = [i**2 for i in range(4, 11)]
 
 
 
 
 
78
 
79
- # maximum number of similar objects allowed in data table
80
- num_nearest_max = 1000
81
-
82
- npix_types = [96, 152, 256]
83
-
84
- model_versions = ['v1', 'v2']
85
-
86
- # don't use galaxies up to this index, as lots can have weird observing errors
87
- index_use_min = 2500
88
-
89
- # Read in selected options and run program
90
- tstart = time.time()
91
-
92
- with st.sidebar.expander('Instructions'):
93
  st.markdown(
94
  """
95
- **Enter the coordinates of your favourite galaxy and we'll search for the most similar looking ones in the universe!**
96
 
97
- Click the 'search random galaxy' button, or try finding a cool galaxy at [legacysurvey.org](https://www.legacysurvey.org/viewer)
98
- - Use the south survey (select the <Legacy Surveys DR9-south images> option). Currently not all galaxies are included, but most bright ones should be.
99
  """
100
  )
101
- #st.sidebar.markdown('### Set up and submit your query!')
102
-
103
- ra_search = st.sidebar.text_input('RA', key='ra',
104
- help="Right Ascension of query galaxy ({:s})".format(ra_unit_formats),
105
- value='199.3324')
106
- dec_search = st.sidebar.text_input('Dec', key='dec',
107
- help="Declination of query galaxy ({:s})".format(dec_unit_formats),
108
- value='20.6382')
109
-
110
- ra_search, dec_search = radec_string_to_degrees(ra_search, dec_search, ra_unit_formats, dec_unit_formats)
111
-
112
- # similarity_option = st.sidebar.selectbox(
113
- # 'Want to see the most similar galaxies or the least similar?',
114
- # similarity_types)
115
-
116
- num_nearest = st.sidebar.select_slider('Number of similar galaxies to display', num_nearest_vals)
117
 
118
- npix_show = st.sidebar.select_slider('Image size (pixels)', npix_types, value=npix_types[1])
119
 
120
- model_version = st.sidebar.select_slider('Model version', model_versions, value=model_versions[-1])
121
 
122
- num_similar_query = 1000
123
 
124
- similarity_inv = False
125
- #if similarity_option == 'least similar':
126
- # similarity_inv = True
127
 
128
- start_search = st.sidebar.button('Search query')
129
- start_search_random = st.sidebar.button('Search random galaxy')
130
-
131
- # load in full datasets needed
132
  LC = LoadCatalogue()
133
- cat = LC.download_catalogue_files(include_extra_features=True)
134
-
135
- #cat = LC.load_catalogue_coordinates(include_extra_features=True)
136
- ngals_tot = cat['ngals_tot']
137
-
138
- # Set up class containing search operations
139
- CAT = Catalogue(cat)
140
 
 
141
 
142
- # start search when prompted by user
143
- if start_search or start_search_random:
144
- if start_search_random:
145
- # Galaxies are sorted by brightness, so earlier ones are more interesting to look at
146
- # Sample with this in mind by using lognormal distribution
147
 
148
- ind_max = ngals_tot-1
149
- ind_random = 0
150
- while (ind_random < index_use_min) or (ind_random > ind_max):
151
- #ind_random = int(np.random.lognormal(10., 2.)) # strongly biased towards bright galaxies
152
- ind_random = int(np.random.lognormal(12., 3.)) # biased towards bright galaxies
 
 
153
 
154
- radec_random = CAT.load_from_catalogue_indices(include_extra_features=False,
155
  inds_load=[ind_random])
156
- ra_search = radec_random['ra'][0]
157
- dec_search = radec_random['dec'][0]
158
-
159
- # Find index of closest galaxy to search location. This galaxy becomes query
160
- CAT.search_catalogue(ra_search, dec_search)
161
-
162
- print('Galaxy index used= ', CAT.query_ind)
163
- # Find indexes of similar galaxies to query
164
- #st.write('Searching through the brightest {:,} galaxies in the DECaLS survey to find the most similar to your request. More to come soon!'.format(ngals_tot))
165
-
166
- CAT.similarity_search(nnearest=num_similar_query+1,
167
- similarity_inv=similarity_inv,
168
- model_version=model_version) # +1 to include self
169
 
170
- # Get info for similar objects
171
- similarity_catalogue = CAT.load_from_catalogue_indices(include_extra_features=True)
172
- similarity_catalogue['similarity'] = CAT.similarity_score
173
-
174
- # Get urls from legacy survey
175
- urls = urls_from_coordinates(similarity_catalogue, npix=npix_show)
176
- similarity_catalogue['url'] = np.array(urls)
177
-
178
- # Plot query image. Put in center columns to ensure it remains centered upon display
179
 
180
- ncolumns = min(11, int(math.ceil(np.sqrt(num_nearest))))
181
- nrows = int(math.ceil(num_nearest/ncolumns))
182
-
183
- lab = 'Query galaxy'
184
- lab_radec = 'RA, Dec = ({:.4f}, {:.4f})'.format(similarity_catalogue['ra'][0], similarity_catalogue['dec'][0])
185
- cols = st.columns([2]+[1*ncolumns])
186
- cols[0].subheader(lab)
187
- cols[1].subheader('Most similar galaxies')
 
 
 
 
188
 
189
- cols = st.columns([2]+[1]*ncolumns)
190
- cols[0].image(urls[0],
191
  use_column_width='always',
192
  caption=lab_radec)#use_column_width='auto')
193
- # plot rest of images in smaller grid format
194
-
195
-
196
- iimg = 1 # start at 1 as we already included first image above
197
- for irow in range(nrows):
198
- for icol in range(ncolumns):
199
- url = urls[iimg]
200
- lab = 'Similarity={:.2f}\n'.format(similarity_catalogue['similarity'][iimg]) #+ lab
201
- if ncolumns > 5:
202
- lab = None
203
-
204
- # add image to grid
205
- cols[icol+1].image(url, caption=lab, use_column_width='always')
206
- iimg += 1
207
 
208
- # convert similarity_catalogue to pandas dataframe to display and download
209
  bands = ['g', 'r', 'z']
210
 
211
- similarity_catalogue_out = {} # split > 1D arrays into 1D columns
212
- for k, v in similarity_catalogue.items():
213
- # assume max dimensionality of 2
214
- if v.ndim == 2:
215
- for iband in range(v.shape[1]):
216
- similarity_catalogue_out['{:s}_{:s}'.format(k, bands[iband])] = v[:, iband]
217
 
218
  else:
219
- similarity_catalogue_out[k] = v
220
 
221
- # convert format of source_type, else does not show properly in table
222
- similarity_catalogue_out['source_type'] = similarity_catalogue_out['source_type'].astype('str')
223
- df = pd.DataFrame.from_dict(similarity_catalogue_out)
224
 
225
- # Sort columns to lead with the most useful ones
226
- cols_leading = ['ra', 'dec', 'similarity']
227
- cols = cols_leading + [col for col in df if col not in cols_leading]
228
- df = df[cols]
229
 
230
  # display table
231
- st.write(df.head(num_nearest_max))#vals[-1]))
232
 
233
  # show a downloadable link
234
- st.markdown(get_table_download_link(df), unsafe_allow_html=True)
235
 
236
- tend = time.time()
237
 
238
  st.set_page_config(
239
  page_title='Galaxy Finder',
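The method description removed on the old side of this diff says each image is reduced to a representation vector, and the query's representation is compared against all others to return the most similar images. A minimal sketch of that ranking step follows, for illustration only. It assumes cosine similarity as the comparison measure, which the diff does not specify, and the names `cosine_similarity`, `most_similar`, and the toy `representations` list are hypothetical rather than part of `Catalogue`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query_index, representations, k):
    """Return the indices of the k representations closest to the query.

    The query itself is included, which mirrors the "+1 to include self"
    comment on the old side of the diff.
    """
    query = representations[query_index]
    scored = [
        (cosine_similarity(query, rep), index)
        for index, rep in enumerate(representations)
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [index for _, index in scored[:k]]


# Toy two-dimensional representations, purely illustrative.
representations = [
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
]
nearest = most_similar(0, representations, k=3)
```

In this toy data the query ranks first, followed by the vector pointing in nearly the same direction; a real deployment would precompute or index the comparisons instead of scanning every representation.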
 
14
  """
15
  <style>
16
  [data-testid="stSidebar"][aria-expanded="true"] > div:first-child {
17
+ width: 400px;
18
  }
19
  [data-testid="stSidebar"][aria-expanded="false"] > div:first-child {
20
+ width: 400px;
21
+ margin-left: -400px;
22
  }
23
  </style>
24
  """,
 
32
  Created by [Abhinit Sundar](https://github.com/asundar0128)
33
  """)
34
 
35
+ germanShepherdEvaluationMethod = germanShepherdHeaderColumns[-1].button('Intrigued with German Shepherds?')
36
+ if germanShepherdEvaluationMethod:
37
+ germanShepherdExplainMethod()
38
  else:
39
+ germanShepherdEvaluation()
40
 
41
 
42
+ def germanShepherdExplainMethod():
43
+ st.button('Traverse Back to German Shepherd Dog Photos')
44
 
45
  st.markdown(
46
  """
 
 
 
 
 
 
 
 
 
 
47
  Dataset:
48
 
49
+ - We used German Shepherd Dog images from random online sources to train a ResNet-50 neural network. The trained model is then meant to represent every German Shepherd in the dataset and to act as a classifier and predictor for the parameter of interest, which is the German Shepherd breed of dogs.
50
+ Created by [Abhinit Sundar](https://github.com/asundar0128)
 
 
51
  """
52
  )
53
+ st.button('Traverse Back to German Shepherd Dog Photos', key='German Shepherd')
54
 
55
 
56
+ def germanShepherdEvaluation():
 
 
 
 
 
 
57
 
58
+ germanShepherdSimilarityMeasureIndex = ['maximum correlation and maximum similarity', 'least correlation and least similarity']
59
+ germanShepherdNearestValues = [w**2 for w in range(4, 11)]
60
+ germanShepherdMaximumProximity = 1000
61
+ germanShepherdPixel = [96, 152, 256]
62
+ germanShepherdModelVersions = ['v1', 'v2']
63
+ germanShepherdMinimalIndex = 2500
64
+ germanShepherdModelStartTime = time.time()
65
 
66
+ with st.sidebar.expander('German Shepherd Similarity Search Instructions'):
 
 
 
 
 
 
 
 
 
 
 
 
 
67
  st.markdown(
68
  """
69
+ **Select a photo of a particular German Shepherd of interest!**
70
 
71
+ Click the 'Find German Shepherd' button.
 
72
  """
73
  )
74
+ germanshepherdNumNearest = st.sidebar.select_slider('Total Quantity of Similar German Shepherds to Show', germanShepherdNearestValues)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
75
 
76
+ germanShepherdEvaluatePixel = st.sidebar.select_slider('Picture Size (In Pixels)', germanShepherdPixel, value=germanShepherdPixel[1])
77
 
78
+ germanShepherdModelVersion = st.sidebar.select_slider('German Shepherd Current Model Version', germanShepherdModelVersions, value=germanShepherdModelVersions[-1])
79
 
80
+ germanShepherdQuantitySimilarElements = 1000
81
 
82
+ germanShepherdSimilarityMetric = False
83
+ germanShepherdBeginSearch = st.sidebar.button('Enter your search parameter of interest for German Shepherd')
84
+ germanShepherdBeginRandomSearch = st.sidebar.button('Evaluate Random German Shepherd Images')
85
 
 
 
 
 
86
  LC = LoadCatalogue()
87
+ germanShepherdCategory = LC.download_catalogue_files(include_extra_features=True)
 
 
 
 
 
 
88
 
89
+ germanShepherdNumberTotal = germanShepherdCategory['ngals_tot']
90
 
91
+ CAT = Catalogue(germanShepherdCategory)
 
 
 
 
92
 
93
+ if germanShepherdBeginSearch or germanShepherdBeginRandomSearch:
94
+ if germanShepherdBeginRandomSearch:
95
+
96
+ germanShepherdMaximumIndex = germanShepherdNumberTotal-1
97
+ germanShepherdRandomIndex = 0
98
+ while (germanShepherdRandomIndex < germanShepherdMinimalIndex) or (germanShepherdRandomIndex > germanShepherdMaximumIndex):
99
+ germanShepherdRandomIndex = int(np.random.lognormal(12., 3.))
100
 
101
+ germanShepherdRawDecimalRandom = CAT.load_from_catalogue_indices(include_extra_features=False,
102
+ inds_load=[germanShepherdRandomIndex])
103
+
104
+ print('German Shepherd Image Index Used = ', CAT.query_ind)
 
 
 
 
 
 
 
 
 
 
 
105
 
106
+ CAT.similarity_search(nnearest=germanShepherdQuantitySimilarElements+1,
107
+ similarity_inv=germanShepherdSimilarityMetric,
108
+ model_version=germanShepherdModelVersion)
 
 
 
 
 
 
109
 
110
+ germanShepherdSimilarityValue = CAT.load_from_catalogue_indices(include_extra_features=True)
111
+ germanShepherdSimilarityValue['German Shepherd Proximity Value'] = CAT.similarity_score
112
+ germanShepherdLinks = urls_from_coordinates(germanShepherdSimilarityValue, npix=germanShepherdEvaluatePixel)
113
+ germanShepherdSimilarityValue['Link to German Shepherd'] = np.array(germanShepherdLinks)
114
+ germanShepherdNumberColumns = min(11, int(math.ceil(np.sqrt(germanshepherdNumNearest))))
115
+ germanShepherdNumberRows = int(math.ceil(germanshepherdNumNearest/germanShepherdNumberColumns))
116
+
117
+ germanShepherdLabel = 'Traverse Through German Shepherd'
118
+ germanShepherdLabelRawDecimal = 'RA, Dec = ({:.4f}, {:.4f})'.format(germanShepherdSimilarityValue['ra'][0], germanShepherdSimilarityValue['dec'][0])
119
+ germanShepherdColumns = st.columns([2]+[1*germanShepherdNumberColumns])
120
+ germanShepherdColumns[0].subheader(germanShepherdLabel)
121
+ germanShepherdColumns[1].subheader('Closest German Shepherd Photos')
122
 
123
+ germanShepherdColumns = st.columns([2]+[1]*germanShepherdNumberColumns)
124
+ germanShepherdColumns[0].image(germanShepherdLinks[0],
125
  use_column_width='always',
126
+ caption=germanShepherdLabelRawDecimal)#use_column_width='auto')
127
+
128
+ germanShepherdIndexImage = 1
129
+ for germanShepherdIndexRow in range(germanShepherdNumberRows):
130
+ for germanShepherdIndexColumn in range(germanShepherdNumberColumns):
131
+ germanShepherdLink = germanShepherdLinks[germanShepherdIndexImage]
132
+ germanShepherdLabelValue = 'German Shepherd Similarity Value={:.2f}\n'.format(germanShepherdSimilarityValue['German Shepherd Proximity Value'][germanShepherdIndexImage])
133
+ if germanShepherdNumberColumns > 5:
134
+ germanShepherdLabelValue = None
135
+ germanShepherdColumns[germanShepherdIndexColumn+1].image(germanShepherdLink, caption=germanShepherdLabelValue, use_column_width='always')
136
+ germanShepherdIndexImage += 1
 
 
 
 
137
 
 
138
  bands = ['g', 'r', 'z']
139
 
140
+ germanShepherdSimilarityValueOutput = {}
141
+ for q, x in germanShepherdSimilarityValue.items():
142
+ if x.ndim == 2:
143
+ for germanShepherdIndexValues in range(x.shape[1]):
144
+ germanShepherdSimilarityValueOutput['{:s}_{:s}'.format(q, bands[germanShepherdIndexValues])] = x[:, germanShepherdIndexValues]
 
145
 
146
  else:
147
+ germanShepherdSimilarityValueOutput[q] = x
148
 
149
+ germanShepherdSimilarityValueOutput['source_type'] = germanShepherdSimilarityValueOutput['source_type'].astype('str')
150
+ germanShepherdDataFrame = pd.DataFrame.from_dict(germanShepherdSimilarityValueOutput)
 
151
 
152
+ cols_leading = ['ra', 'dec', 'German Shepherd Proximity Value']
153
+ cols = cols_leading + [germanShepherdColumn for germanShepherdColumn in germanShepherdDataFrame if germanShepherdColumn not in cols_leading]
154
+ germanShepherdDataFrame = germanShepherdDataFrame[cols]
 
155
 
156
  # display table
157
+ st.write(germanShepherdDataFrame.head(germanShepherdMaximumProximity))
158
 
159
  # show a downloadable link
160
+ st.markdown(get_table_download_link(germanShepherdDataFrame), unsafe_allow_html=True)
161
 
162
+ germanShepherdTendency = time.time()
163
 
164
  st.set_page_config(
165
  page_title='Galaxy Finder',
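Two small computations survive the rename on the new side: drawing a random catalogue index from a lognormal distribution until it falls in the allowed range, and sizing the image grid from the requested number of images. The sketch below isolates both so they can be checked outside Streamlit. It assumes the standard library's `random.lognormvariate` in place of `np.random.lognormal`; `sample_index`, `grid_shape`, and the catalogue size of 1,000,000 are hypothetical names and values, not taken from the commit.

```python
import math
import random


def sample_index(min_index, max_index, mu=12.0, sigma=3.0, rng=random):
    """Draw lognormal values until one lands inside [min_index, max_index].

    This mirrors the while loop in the diff, which rejects indices below
    the minimal index or above the last catalogue entry.
    """
    if min_index > max_index:
        raise ValueError("empty index range")
    while True:
        candidate = int(rng.lognormvariate(mu, sigma))
        if min_index <= candidate <= max_index:
            return candidate


def grid_shape(num_images, max_columns=11):
    """Return (rows, columns) for a near-square grid capped at max_columns."""
    columns = min(max_columns, math.ceil(math.sqrt(num_images)))
    rows = math.ceil(num_images / columns)
    return rows, columns


# Hypothetical catalogue of 1,000,000 entries; the first 2500 are skipped,
# matching germanShepherdMinimalIndex in the diff.
rng = random.Random(0)
index = sample_index(2500, 1_000_000 - 1, rng=rng)
rows, columns = grid_shape(16)
```

Because the slider values in the diff are perfect squares from 16 to 100, the column cap of 11 is never reached in practice; it only matters if larger counts are allowed later.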