# streamlit_app.py
"""
Chicago Parks in Motion — Streamlit app
Authors: Juhi Khare (jkhare2), Alisha Rawat (alishar4), Sutthana Koo-Anupong (sk188)
Primary dataset: Chicago Park District Activities
Data source (CSV endpoint): https://data.cityofchicago.org/resource/tn7v-6rnw.csv
"""
import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
st.set_page_config(page_title="Chicago Parks in Motion", layout="wide")
# -------------------------
# Helper: Load & preprocess
# -------------------------
@st.cache_data(ttl=3600)
def load_data():
    """Download the activities CSV and derive the columns the app relies on."""
    csv_url = "https://data.cityofchicago.org/resource/tn7v-6rnw.csv"
    try:
        df = pd.read_csv(csv_url, dtype=str)
    except Exception:
        st.error("Could not load dataset from the City of Chicago portal.")
        raise  # re-raise with the original traceback intact
    df.columns = [c.strip() for c in df.columns]
    if "fee" in df.columns:
        df["fee"] = pd.to_numeric(df["fee"], errors="coerce")

    # Extract lat/lon from "POINT (lon lat)" strings
    def extract_latlon(val):
        if pd.isna(val):
            return (np.nan, np.nan)
        sval = str(val)
        if "POINT" in sval:
            try:
                inside = sval.split("(", 1)[1].rstrip(")")
                lon, lat = map(float, inside.split())
                return lat, lon
            except (IndexError, ValueError):
                # Malformed point text: treat as missing coordinates
                return (np.nan, np.nan)
        return (np.nan, np.nan)

    if "location" in df.columns:
        latlon = df["location"].map(extract_latlon)
        df["latitude"] = latlon.map(lambda x: x[0])
        df["longitude"] = latlon.map(lambda x: x[1])
    else:
        df["latitude"] = np.nan
        df["longitude"] = np.nan

    # Dates
    for c in ["start_date", "end_date"]:
        if c in df.columns:
            df[c] = pd.to_datetime(df[c], errors="coerce")

    # Activity type clean
    if "activity_type" in df.columns:
        df["activity_type_clean"] = df["activity_type"].str.title().fillna("Unknown")
    else:
        df["activity_type_clean"] = "Unknown"

    # Park name: use the first candidate column that exists in the dataset
    possible_names = ["park_name", "park", "location_facility", "location_name", "site_name"]
    park_col = next((col for col in possible_names if col in df.columns), None)
    if park_col:
        df["park_name"] = df[park_col].astype(str).replace(["", "nan", "None"], "Unknown Park")
    else:
        df["park_name"] = "Unknown Park"
    return df
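
# Illustrative sketch, not part of the app's data flow: the POINT parsing inside
# load_data() could also be written as a standalone, testable function. The name
# `parse_wkt_point` and the regular expression are assumptions introduced here;
# the sketch assumes the "location" column holds "POINT (lon lat)" text.
import math
import re

_WKT_POINT = re.compile(r"POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)")


def parse_wkt_point(text):
    """Return (lat, lon) parsed from "POINT (lon lat)", or (nan, nan) if absent."""
    match = _WKT_POINT.search(str(text))
    if match is None:
        return (math.nan, math.nan)
    lon, lat = (float(group) for group in match.groups())
    return (lat, lon)
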
df = load_data()
# -------------------------
# Title
# -------------------------
st.title("Chicago Parks in Motion: How Our City Plays")
st.markdown("**Authors:** Juhi Khare (jkhare2), Alisha Rawat (alishar4), Sutthana Koo-Anupong (sk188)")
# -------------------------
# Sidebar filters
# -------------------------
st.sidebar.header("Filters & Settings")
categories = sorted(df["activity_type_clean"].dropna().unique())
chosen_category = st.sidebar.selectbox("Activity category", ["All"] + categories)
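# Season detection
def season_from_date(dt):
    """Map a timestamp to a season; missing dates become "Unknown"."""
    if pd.isna(dt):
        return "Unknown"
    m = dt.month
    if m in (12, 1, 2):
        return "Winter"
    if m in (3, 4, 5):
        return "Spring"
    if m in (6, 7, 8):
        return "Summer"
    return "Fall"


# Guard against a missing start_date column, as load_data() does for other optional fields
if "start_date" in df.columns:
    df["season"] = df["start_date"].map(season_from_date)
else:
    df["season"] = "Unknown"
seasons = sorted(df["season"].unique())
chosen_season = st.sidebar.selectbox("Season", ["All"] + seasons)
# Offer the fee slider only when some program has a positive fee;
# a 0-to-0 range would leave nothing to filter.
if "fee" in df.columns and df["fee"].fillna(0).max() > 0:
    max_fee = float(df["fee"].fillna(0).max())
    fee_limit = st.sidebar.slider("Maximum fee (USD)", 0.0, max_fee, max_fee)
else:
    fee_limit = None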
park_search = st.sidebar.text_input("Search park name (partial)")
# Accessibility hint
st.sidebar.caption("Filters help beginners explore the dataset easily without technical skills.")
# -------------------------
# Filtering logic
# -------------------------
filtered = df.copy()
if chosen_category != "All":
    filtered = filtered[filtered["activity_type_clean"] == chosen_category]
if chosen_season != "All":
    filtered = filtered[filtered["season"] == chosen_season]
if fee_limit is not None:
    # Programs with no listed fee are treated as free
    filtered = filtered[filtered["fee"].fillna(0) <= fee_limit]
if park_search:
    # Match the typed text literally so characters such as "(" are not parsed as a regex
    filtered = filtered[
        filtered["park_name"].str.contains(park_search, case=False, regex=False, na=False)
    ]
st.sidebar.write(f"Programs shown: **{len(filtered):,}**")
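
# Illustrative sketch (an assumption, not wired into the app): the park search
# above is a case-insensitive substring test. Written with plain strings, the
# hypothetical helper `park_matches` treats the query literally, so input such
# as "(" cannot be misread as a regular expression.
def park_matches(park_name, query):
    """Return True when `query` occurs in `park_name`, ignoring case."""
    if not query:
        return True  # an empty search keeps every park
    return query.casefold() in str(park_name).casefold()
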
# -------------------------
# CENTRAL VISUALIZATION
# -------------------------
st.header("Central Interactive Visualization — Programs by Park")
view = st.radio("Choose a view:", ["Map (recommended)", "Bar chart"], horizontal=True)
if view.startswith("Map"):
    # Aggregate program counts per park location; rows without coordinates are dropped
    agg = (
        filtered.groupby(["park_name", "latitude", "longitude"], dropna=True)
        .size()
        .reset_index(name="count")
    )
    if not agg.empty:
        fig_map = px.scatter_mapbox(
            agg,
            lat="latitude",
            lon="longitude",
            size="count",
            color="count",
            color_continuous_scale="Bluered",
            size_max=28,
            zoom=10,
            hover_name="park_name",
            hover_data={"count": True},
            height=600,
        )
        fig_map.update_layout(mapbox_style="open-street-map", margin=dict(l=0, r=0, b=0, t=0))
        st.plotly_chart(fig_map, use_container_width=True)
    else:
        st.warning("No geographic coordinates available for this filtered view.")
else:
    # Top 20 parks by number of programs
    agg = filtered.groupby("park_name").size().reset_index(name="count")
    agg = agg.sort_values("count", ascending=False).head(20)
    fig_bar = px.bar(
        agg,
        x="count",
        y="park_name",
        orientation="h",
        color="count",
        color_continuous_scale="Cividis",
        height=600,
    )
    fig_bar.update_layout(yaxis={"categoryorder": "total ascending"})
    st.plotly_chart(fig_bar, use_container_width=True)
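
# Illustrative sketch (an assumption, not used by the app): the bar-chart branch
# counts rows per park and keeps the 20 largest. The same idea with only the
# standard library, using the hypothetical helper `top_parks`:
from collections import Counter


def top_parks(park_names, n=20):
    """Return the n most frequent park names as (name, count) pairs, largest first."""
    return Counter(park_names).most_common(n)
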
# Explanation under central viz
st.markdown("""
**What this visualization shows:**
This is our main visualization because it shows where activities take place across Chicago’s parks.
The map draws each park as a circle; larger, redder circles mark locations with more programs.
This makes it easy to see which areas are activity hubs and which are quieter. The filters let anyone explore patterns by season,
category, price, or park, with no technical experience needed.
""")
# -------------------------
# CONTEXTUAL VISUALIZATION 1
# -------------------------
st.header("Contextual Visualization 1 — Activity Category Breakdown")
cat_counts = df["activity_type_clean"].value_counts().reset_index()
cat_counts.columns = ["activity_type", "count"]
fig_cat = px.pie(
    cat_counts,
    names="activity_type",
    values="count",
    hole=0.35,
    color_discrete_sequence=px.colors.sequential.RdBu,
)
st.plotly_chart(fig_cat, use_container_width=True)
st.markdown("""
**Why this matters:**
This chart shows what kinds of activities Chicago parks offer most often—such as sports, aquatics, arts, or youth programs.
It helps readers understand the variety of programs available across the city.
Using a simple color palette keeps the chart readable for people who may not be familiar with data visualization.
""")
# -------------------------
# CONTEXTUAL VISUALIZATION 2
# -------------------------
st.header("Contextual Visualization 2 — Programs by Season")
season_counts = df["season"].value_counts().reset_index()
season_counts.columns = ["Season", "Program Count"]
fig_season = px.bar(
    season_counts,
    x="Season",
    y="Program Count",
    color="Program Count",
    color_continuous_scale="Tealgrn",
    text="Program Count",
    height=500,
)
fig_season.update_traces(textposition="outside")
st.plotly_chart(fig_season, use_container_width=True)
st.markdown("""
**Why this is helpful:**
This chart shows when programs are most active throughout the year.
Comparing seasons helps readers see whether summer is the busiest time, or whether activities are spread evenly.
This makes it easier for residents and planners to understand how weather, school schedules, and community needs
shape the timing of park programs.
""")
# -------------------------
# FINAL 3-PARAGRAPH EXPLANATION
# -------------------------
st.header("📝 What this data story is showing")
st.markdown("""
Chicago’s parks offer many kinds of activities for people of all ages. These include sports, arts, fitness classes, youth programs, and seasonal events. Each row in this dataset represents one program offered at a park. Our main interactive map helps readers quickly see which parks offer the most activities. Bigger, redder circles show parks with more programs, making it easy to spot busy parks versus quieter ones.

Where a park is located also matters. Parks in larger or more central neighborhoods may offer more programs if they have more space, more facilities, and more visitors. With the filters on the left, anyone can explore the data by season, activity type, price, or park name. This makes the information easy to use even for someone with no data experience. For example, you can look for free programs, summer-only programs, or activities at a specific park in your neighborhood.

This project also highlights questions about access and opportunities. Some parks offer a wide range of programs, while others have fewer options or offer mostly one type of activity. By looking at categories, seasons, and fees, readers can start to see patterns in which communities have more choices and which ones may need more support. Our goal is to turn public data into something simple and useful, so Chicago residents and decision-makers can better understand how parks are serving their communities.
""")
# -------------------------
# CITATIONS
# -------------------------
st.markdown("---")
st.subheader("Citations & Data Sources")
st.markdown("""
**Primary dataset:**
Chicago Park District Activities — City of Chicago Data Portal
https://data.cityofchicago.org/Parks-Recreation/Chicago-Park-District-Activities/tn7v-6rnw
""")