mihir-s committed on
Commit 9636c0b · verified · 1 Parent(s): 3f6015a

refined the plots

Files changed (1)
  1. app.py +85 -28
app.py CHANGED
@@ -161,36 +161,93 @@ District of Columbia Mayor’s Office. (2023, June 9). *Air quality continues to
 """
 )
 
+# -----------------------------------------------------------
+# Additional Trial and Error Visualizations (Unpolished/Cluttered)
+# Place this code block at the bottom of your existing code
+# -----------------------------------------------------------
+
+import altair as alt
+
+st.markdown("""
+### Early Trial Visualizations (Cluttered Prototypes)
+
+Below are some of our initial attempts at visualizing all pollutants together.
+These attempts are intentionally left here to demonstrate the "scaffolding" nature
+of our work. They are cluttered and not very user-friendly, but they show our trial-and-error process.
+""")
+
+# Melt the dataframe to a long format for plotting multiple pollutants at once
+long_df = df_air_quality.melt(id_vars="datetime", value_vars=air_quality_vars,
+    var_name="Pollutant", value_name="Concentration")
+
+# Attempt 1: A single line chart with ALL pollutants at once
+# This leads to a very cluttered chart where it's hard to distinguish individual lines.
+st.markdown("#### Attempt 1: All Pollutants in One Line Chart")
+all_in_one_line = alt.Chart(long_df).mark_line().encode(
+    x=alt.X('datetime:O', title='Date and Time'),
+    y=alt.Y('Concentration:Q', title='Concentration'),
+    color=alt.Color('Pollutant:N', legend=alt.Legend(title='Pollutant')),
+    tooltip=['datetime', 'Pollutant', 'Concentration']
+).properties(
+    width=700,
+    height=400,
+    title="A Very Overcrowded Line Chart"
+)
+st.altair_chart(all_in_one_line, use_container_width=True)
+
 st.markdown("""
-### Reflecting on the Process for Part 2
-
-Throughout this development phase, we approached the observatory as a prototype.
-We experimented with different data visualizations and refresh mechanisms. The prompt's pseudo-real-time requirement
-led us to implement a simple 'Refresh Data' button. This choice balanced complexity
-with user control.
-
-Before settling on our current format, we tried various approaches:
-- We tested line charts displaying multiple pollutants at once, but found it overwhelming.
-- We considered scatter plots to reveal correlations between pollutants. While interesting,
-it felt too complex for a first draft.
-- We focused instead on ensuring data quality, clarity in timestamps, and straightforward
-on-demand refreshing to highlight core functionalities.
-
-This phase wasn't about perfection. It was about testing ideas, learning what worked, and
-laying down a foundation for future enhancements.
+*As you can see, this single chart becomes difficult to interpret due to the sheer
+number of lines and colors overlapping. While it technically "works," it doesn't provide
+clear insights at a glance.*
 """)
 
+# Attempt 2: A scatter plot of all pollutants over time
+# Again, this will be cluttered. Each pollutant on the same time axis, different colors.
+# With so many pollutants, the chart becomes a mass of points.
+st.markdown("#### Attempt 2: Scatter Plot of All Pollutants Over Time")
+all_in_one_scatter = alt.Chart(long_df).mark_circle(size=40).encode(
+    x=alt.X('datetime:O', title='Date and Time'),
+    y=alt.Y('Concentration:Q', title='Concentration'),
+    color=alt.Color('Pollutant:N', legend=alt.Legend(title='Pollutant')),
+    tooltip=['datetime', 'Pollutant', 'Concentration']
+).properties(
+    width=700,
+    height=400,
+    title="Scatter Plot with All Pollutants"
+)
+st.altair_chart(all_in_one_scatter, use_container_width=True)
+
 st.markdown("""
-Early Prototype:
-Trying a line chart with multiple pollutants:
-This code was functional but visually cluttered.
+*This scatter plot presents all pollutants simultaneously as well. While we can see
+some variance in concentration over time, the chart is noisy and doesn't direct the user
+to any immediate insights. It's a good reminder that "more data on one chart"
+does not always mean "more understanding."*
+""")
+
+st.markdown("#### Attempt 3: Bar Chart of All Pollutants at a Single Timestamp")
+first_time = df_air_quality['datetime'].iloc[0]
+single_point_data = long_df[long_df['datetime'] == first_time]
+
+all_in_one_bar = alt.Chart(single_point_data).mark_bar().encode(
+    x=alt.X('Pollutant:N', sort=None, title='Pollutant'),
+    y=alt.Y('Concentration:Q', title='Concentration'),
+    tooltip=['Pollutant', 'Concentration']
+).properties(
+    width=700,
+    height=400,
+    title=f"Bar Chart at {first_time}"
+)
+st.altair_chart(all_in_one_bar, use_container_width=True)
+
+st.markdown("""
+*At a single timestamp, a bar chart of all pollutants quickly becomes unwieldy if
+we have too many pollutants. Even though it's simpler than a time-series plot,
+it's still not very informative due to the volume of categories.*
+
+---
+
+These attempts illustrate that while we can technically display all the data
+at once, it's not always the most practical or insightful approach.
+This helps us understand which visualizations to refine
+and which ones to discard or simplify in future iterations.
 """)
-selected_pollutant = "pm10"
-line_chart = alt.Chart(df_air_quality).mark_line(point=True).encode(
-    x=alt.X('datetime:O', sort='x'),
-    y=alt.Y(f'{selected_pollutant}:Q', title=selected_pollutant.upper()),
-    tooltip=['datetime', selected_pollutant]).properties(width=800, height=400)
-st.altair_chart(line_chart, use_container_width=True)
-
-#By leaving this commented code and narrative here, we give our peers insight into
-#our thought process; this is the "viz for peers" stage, not a polished final product.