refined the plots
app.py
CHANGED
|
@@ -161,36 +161,93 @@ District of Columbia Mayor’s Office. (2023, June 9). *Air quality continues to
|
|
| 161 |
"""
|
| 162 |
)
|
| 163 |
|
| 164 |
st.markdown("""
|
| 165 |
-
|
| 166 |
-
|
| 167 |
-
|
| 168 |
-
We experimented with different data visualizations and refresh mechanisms. The prompt's pseudo-real-time requirement
|
| 169 |
-
led us to implement a simple 'Refresh Data' button. This choice balanced complexity
|
| 170 |
-
with user control.
|
| 171 |
-
|
| 172 |
-
Before settling on our current format, we tried various approaches:
|
| 173 |
-
- We tested line charts displaying multiple pollutants at once, but found it overwhelming.
|
| 174 |
-
- We considered scatter plots to reveal correlations between pollutants. While interesting,
|
| 175 |
-
it felt too complex for a first draft.
|
| 176 |
-
- We focused instead on ensuring data quality, clarity in timestamps, and straightforward
|
| 177 |
-
on-demand refreshing to highlight core functionalities.
|
| 178 |
-
|
| 179 |
-
This phase wasn't about perfection. It was about testing ideas, learning what worked, and
|
| 180 |
-
laying down a foundation for future enhancements.
|
| 181 |
""")
|
| 182 |
|
| 183 |
st.markdown("""
|
| 184 |
-
|
| 185 |
-
|
| 186 |
-
|
| 187 |
""")
|
| 188 |
-
selected_pollutant = "pm10"
|
| 189 |
-
line_chart = alt.Chart(df_air_quality).mark_line(point=True).encode(
|
| 190 |
-
x=alt.X('datetime:O', sort='x'),
|
| 191 |
-
y=alt.Y(f'{selected_pollutant}:Q', title=selected_pollutant.upper()),
|
| 192 |
-
tooltip=['datetime', selected_pollutant]).properties(width=800, height=400)
|
| 193 |
-
st.altair_chart(line_chart, use_container_width=True)
|
| 194 |
-
|
| 195 |
-
#By leaving this commented code and narrative here, we give our peers insight into
|
| 196 |
-
#our thought process—this is the "viz for peers" stage, not a polished final product.
|
|
|
|
| 161 |
"""
|
| 162 |
)
|
| 163 |
|
| 164 |
+
# -----------------------------------------------------------
|
| 165 |
+
# Additional Trial and Error Visualizations (Unpolished/Cluttered)
|
| 166 |
+
# Appended at the bottom of app.py; relies on st, df_air_quality, and air_quality_vars defined earlier
|
| 167 |
+
# -----------------------------------------------------------
|
| 168 |
+
|
| 169 |
+
import altair as alt
|
| 170 |
+
|
| 171 |
+
st.markdown("""
|
| 172 |
+
### Early Trial Visualizations (Cluttered Prototypes)
|
| 173 |
+
|
| 174 |
+
Below are some of our initial attempts at visualizing all pollutants together.
|
| 175 |
+
These attempts are intentionally left here to demonstrate the "scaffolding" nature
|
| 176 |
+
of our work. They are cluttered and not very user-friendly, but they show our trial-and-error process.
|
| 177 |
+
""")
|
| 178 |
+
|
| 179 |
+
# Melt the dataframe to a long format for plotting multiple pollutants at once
|
| 180 |
+
long_df = df_air_quality.melt(id_vars="datetime", value_vars=air_quality_vars,
|
| 181 |
+
var_name="Pollutant", value_name="Concentration")
|
| 182 |
+
|
| 183 |
+
# Attempt 1: A single line chart with ALL pollutants at once
|
| 184 |
+
# This leads to a very cluttered chart where it's hard to distinguish individual lines.
|
| 185 |
+
st.markdown("#### Attempt 1: All Pollutants in One Line Chart")
|
| 186 |
+
all_in_one_line = alt.Chart(long_df).mark_line().encode(
|
| 187 |
+
x=alt.X('datetime:O', title='Date and Time'),
|
| 188 |
+
y=alt.Y('Concentration:Q', title='Concentration'),
|
| 189 |
+
color=alt.Color('Pollutant:N', legend=alt.Legend(title='Pollutant')),
|
| 190 |
+
tooltip=['datetime', 'Pollutant', 'Concentration']
|
| 191 |
+
).properties(
|
| 192 |
+
width=700,
|
| 193 |
+
height=400,
|
| 194 |
+
title="A Very Overcrowded Line Chart"
|
| 195 |
+
)
|
| 196 |
+
st.altair_chart(all_in_one_line, use_container_width=True)
|
| 197 |
+
|
| 198 |
st.markdown("""
|
| 199 |
+
*As you can see, this single chart becomes difficult to interpret due to the sheer
|
| 200 |
+
number of lines and colors overlapping. While it technically "works," it doesn't provide
|
| 201 |
+
clear insights at a glance.*
|
| 202 |
""")
|
| 203 |
|
| 204 |
+
# Attempt 2: A scatter plot of all pollutants over time
|
| 205 |
+
# Again, this will be cluttered. Each pollutant on the same time axis, different colors.
|
| 206 |
+
# With so many pollutants, the chart becomes a mass of points.
|
| 207 |
+
st.markdown("#### Attempt 2: Scatter Plot of All Pollutants Over Time")
|
| 208 |
+
all_in_one_scatter = alt.Chart(long_df).mark_circle(size=40).encode(
|
| 209 |
+
x=alt.X('datetime:O', title='Date and Time'),
|
| 210 |
+
y=alt.Y('Concentration:Q', title='Concentration'),
|
| 211 |
+
color=alt.Color('Pollutant:N', legend=alt.Legend(title='Pollutant')),
|
| 212 |
+
tooltip=['datetime', 'Pollutant', 'Concentration']
|
| 213 |
+
).properties(
|
| 214 |
+
width=700,
|
| 215 |
+
height=400,
|
| 216 |
+
title="Scatter Plot with All Pollutants"
|
| 217 |
+
)
|
| 218 |
+
st.altair_chart(all_in_one_scatter, use_container_width=True)
|
| 219 |
+
|
| 220 |
st.markdown("""
|
| 221 |
+
*This scatter plot presents all pollutants simultaneously as well. While we can see
|
| 222 |
+
some variation in concentration over time, the chart is noisy and doesn't direct the user
|
| 223 |
+
to any immediate insights. It's a good reminder that "more data on one chart"
|
| 224 |
+
does not always mean "more understanding."*
|
| 225 |
+
""")
|
| 226 |
+
|
| 227 |
+
st.markdown("#### Attempt 3: Bar Chart of All Pollutants at a Single Timestamp")
|
| 228 |
+
first_time = df_air_quality['datetime'].iloc[0]
|
| 229 |
+
single_point_data = long_df[long_df['datetime'] == first_time]
|
| 230 |
+
|
| 231 |
+
all_in_one_bar = alt.Chart(single_point_data).mark_bar().encode(
|
| 232 |
+
x=alt.X('Pollutant:N', sort=None, title='Pollutant'),
|
| 233 |
+
y=alt.Y('Concentration:Q', title='Concentration'),
|
| 234 |
+
tooltip=['Pollutant', 'Concentration']
|
| 235 |
+
).properties(
|
| 236 |
+
width=700,
|
| 237 |
+
height=400,
|
| 238 |
+
title=f"Bar Chart at {first_time}"
|
| 239 |
+
)
|
| 240 |
+
st.altair_chart(all_in_one_bar, use_container_width=True)
|
| 241 |
+
|
| 242 |
+
st.markdown("""
|
| 243 |
+
*At a single timestamp, a bar chart of all pollutants quickly becomes unwieldy if
|
| 244 |
+
we have too many pollutants. Even though it's simpler than a time-series plot,
|
| 245 |
+
it's still not very informative due to the volume of categories.*
|
| 246 |
+
|
| 247 |
+
---
|
| 248 |
+
|
| 249 |
+
These attempts illustrate that while we can technically display all the data
|
| 250 |
+
at once, it's not always the most practical or insightful approach.
|
| 251 |
+
This helps us understand which visualizations to refine
|
| 252 |
+
and which ones to discard or simplify in future iterations.
|
| 253 |
""")
|
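The wide-to-long reshaping step that all three attempts depend on can be tried in isolation. The sketch below is not part of the commit: the sample values and the two column names (`pm10`, `pm2_5`) are made up for illustration, and `air_quality_vars` is assumed to be a plain list of pollutant column names, as its use in `melt` suggests.

```python
import pandas as pd

# Hypothetical sample data standing in for df_air_quality (values are invented).
df_air_quality = pd.DataFrame({
    "datetime": ["2024-01-01 00:00", "2024-01-01 01:00"],
    "pm10": [12.0, 15.5],
    "pm2_5": [8.1, 9.4],
})

# Assumed shape of air_quality_vars: a list of pollutant column names.
air_quality_vars = ["pm10", "pm2_5"]

# Same melt call as in the diff: one row per (datetime, pollutant) pair.
long_df = df_air_quality.melt(
    id_vars="datetime",
    value_vars=air_quality_vars,
    var_name="Pollutant",
    value_name="Concentration",
)

# Same filter as Attempt 3: keep only the rows for the first timestamp.
first_time = df_air_quality["datetime"].iloc[0]
single_point_data = long_df[long_df["datetime"] == first_time]

print(long_df)
print(single_point_data)
```

With N timestamps and P pollutants, `long_df` has N × P rows, which is why the single-chart attempts get crowded as the pollutant list grows.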