# streamlit_app.py
"""
Chicago Parks in Motion — Streamlit app
Authors: Juhi Khare (jkhare2), Alisha Rawat (alishar4), Sutthana Koo-Anupong (sk188)
Primary dataset: Chicago Park District Activities
Data source (CSV endpoint): https://data.cityofchicago.org/resource/tn7v-6rnw.csv
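
Note on coordinates: the loader in this file reads park coordinates from a
`location` column and expects WKT-style strings such as `POINT (-87.63 41.88)`,
with longitude first. That format is an assumption made by this app, not
something confirmed against the portal's schema. A minimal standalone sketch of
the same parsing idea, using a hypothetical helper name `parse_wkt_point`:

```python
import math


def parse_wkt_point(value):
    # Hypothetical helper: returns (lat, lon) from a "POINT (lon lat)" string,
    # or (nan, nan) when the value is missing or malformed.
    nan_pair = (math.nan, math.nan)
    if not isinstance(value, str) or "POINT" not in value:
        return nan_pair
    try:
        inside = value.split("(", 1)[1].rstrip(")")
        lon, lat = map(float, inside.split())
    except (IndexError, ValueError):
        return nan_pair
    return lat, lon


print(parse_wkt_point("POINT (-87.63 41.88)"))  # (41.88, -87.63)
```

Rows whose value cannot be parsed get NaN coordinates and are left off the map
view, but they still count in the bar chart and the summary charts.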
"""

import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px


st.set_page_config(page_title="Chicago Parks in Motion", layout="wide")

# -------------------------
# Helper: Load & preprocess
# -------------------------
@st.cache_data(ttl=3600)
def load_data():
    csv_url = "https://data.cityofchicago.org/resource/tn7v-6rnw.csv"
    try:
        df = pd.read_csv(csv_url, dtype=str)
    except Exception:
        st.error("Could not load dataset from the City of Chicago portal.")
        raise  # re-raise so the original traceback is preserved

    df.columns = [c.strip() for c in df.columns]

    if "fee" in df.columns:
        df["fee"] = pd.to_numeric(df["fee"], errors="coerce")

    # Extract lat/lon
    def extract_latlon(val):
        if pd.isna(val):
            return (np.nan, np.nan)
        sval = str(val)
        if "POINT" in sval:
            try:
                inside = sval.split("(", 1)[1].rstrip(")")
                lon, lat = map(float, inside.split())
                return lat, lon
            except (ValueError, IndexError):
                return (np.nan, np.nan)
        return (np.nan, np.nan)

    if "location" in df.columns:
        latlon = df["location"].map(extract_latlon)
        df["latitude"] = latlon.map(lambda x: x[0])
        df["longitude"] = latlon.map(lambda x: x[1])
    else:
        df["latitude"] = np.nan
        df["longitude"] = np.nan

    # Dates
    for c in ["start_date", "end_date"]:
        if c in df.columns:
            df[c] = pd.to_datetime(df[c], errors="coerce")

    # Activity type clean
    if "activity_type" in df.columns:
        df["activity_type_clean"] = df["activity_type"].str.title().fillna("Unknown")
    else:
        df["activity_type_clean"] = "Unknown"

    # Park name
    possible_names = ["park_name", "park", "location_facility", "location_name", "site_name"]
    park_col = next((col for col in possible_names if col in df.columns), None)
    if park_col:
        df["park_name"] = df[park_col].astype(str).replace(["", "nan", "None"], "Unknown Park")
    else:
        df["park_name"] = "Unknown Park"

    return df


df = load_data()

# -------------------------
# Title
# -------------------------
st.title("Chicago Parks in Motion: How Our City Plays")
st.markdown("**Authors:** Juhi Khare (jkhare2), Alisha Rawat (alishar4), Sutthana Koo-Anupong (sk188)")

# -------------------------
# Sidebar filters
# -------------------------
st.sidebar.header("Filters & Settings")

categories = sorted(df["activity_type_clean"].dropna().unique())
chosen_category = st.sidebar.selectbox("Activity category", ["All"] + categories)

# Season detection
def season_from_date(dt):
    if pd.isna(dt):
        return "Unknown"
    m = dt.month
    if m in (12, 1, 2):
        return "Winter"
    if m in (3, 4, 5):
        return "Spring"
    if m in (6, 7, 8):
        return "Summer"
    return "Fall"

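# Guard against a missing start_date column, as the loader does for other columns.
if "start_date" in df.columns:
    df["season"] = df["start_date"].map(season_from_date)
else:
    df["season"] = "Unknown"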
seasons = sorted(df["season"].unique())
chosen_season = st.sidebar.selectbox("Season", ["All"] + seasons)

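if "fee" in df.columns:
    max_fee = float(df["fee"].fillna(0).max())
    if max_fee > 0:
        fee_limit = st.sidebar.slider("Maximum fee (USD)", 0.0, max_fee, max_fee)
    else:
        # No positive fees (all free, all missing, or no rows): a 0-to-0 slider
        # would be meaningless, so skip the fee filter.
        fee_limit = None
else:
    fee_limit = None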

park_search = st.sidebar.text_input("Search park name (partial)")

# Accessibility hint
st.sidebar.caption("Filters help beginners explore the dataset easily without technical skills.")

# -------------------------
# Filtering logic
# -------------------------
filtered = df.copy()
if chosen_category != "All":
    filtered = filtered[filtered["activity_type_clean"] == chosen_category]
if chosen_season != "All":
    filtered = filtered[filtered["season"] == chosen_season]
if fee_limit is not None:
    filtered = filtered[filtered["fee"].fillna(0) <= fee_limit]
if park_search:
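    # regex=False treats the search text literally, so input such as "(" cannot raise a regex error.
    filtered = filtered[filtered["park_name"].str.contains(park_search, case=False, regex=False, na=False)]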

st.sidebar.write(f"Programs shown: **{len(filtered):,}**")

# -------------------------
# CENTRAL VISUALIZATION
# -------------------------
st.header("Central Interactive Visualization — Programs by Park")

view = st.radio("Choose a view:", ["Map (recommended)", "Bar chart"], horizontal=True)

if view.startswith("Map"):
    # Aggregate for map
    agg = (
        filtered.groupby(["park_name", "latitude", "longitude"], dropna=True)
        .size().reset_index(name="count")
    )

    if agg.dropna().shape[0] > 0:
        fig_map = px.scatter_mapbox(
            agg,
            lat="latitude",
            lon="longitude",
            size="count",
            color="count",
            color_continuous_scale="Bluered",
            size_max=28,
            zoom=10,
            hover_name="park_name",
            hover_data={"count": True},
            height=600,
        )
        fig_map.update_layout(mapbox_style="open-street-map", margin=dict(l=0,r=0,b=0,t=0))
        st.plotly_chart(fig_map, use_container_width=True)
    else:
        st.warning("No geographic coordinates available for this filtered view.")
else:
    agg = filtered.groupby("park_name").size().reset_index(name="count")
    agg = agg.sort_values("count", ascending=False).head(20)

    fig_bar = px.bar(
        agg,
        x="count",
        y="park_name",
        orientation="h",
        color="count",
        color_continuous_scale="Cividis",
        height=600,
    )
    fig_bar.update_layout(yaxis={'categoryorder':'total ascending'})
    st.plotly_chart(fig_bar, use_container_width=True)

# Explanation under central viz
st.markdown("""
**What this visualization shows:**  
This is our main visualization because it helps readers understand where activities are happening across Chicago’s parks.  
The map shows each park as a circle, where larger, redder circles represent locations with more programs.  
This makes it easy to see which areas are activity hubs and which are quieter. The filters allow anyone to explore patterns by season, 
category, price, or park—without needing technical experience.
""")

# -------------------------
# CONTEXTUAL VISUALIZATION 1
# -------------------------
st.header("Contextual Visualization 1 — Activity Category Breakdown")

cat_counts = df["activity_type_clean"].value_counts().reset_index()
cat_counts.columns = ["activity_type", "count"]

fig_cat = px.pie(
    cat_counts,
    names="activity_type",
    values="count",
    hole=0.35,
    color_discrete_sequence=px.colors.sequential.RdBu
)
st.plotly_chart(fig_cat, use_container_width=True)

st.markdown("""
**Why this matters:**  
This chart shows what kinds of activities Chicago parks offer most often—such as sports, aquatics, arts, or youth programs.  
It helps readers understand the variety of programs available across the city.  
Using a simple color palette keeps the chart readable for people who may not be familiar with data visualization.
""")

# -------------------------
# CONTEXTUAL VISUALIZATION 2
# -------------------------
st.header("Contextual Visualization 2 — Programs by Season")

season_counts = df["season"].value_counts().reset_index()
season_counts.columns = ["Season", "Program Count"]

fig_season = px.bar(
    season_counts,
    x="Season",
    y="Program Count",
    color="Program Count",
    color_continuous_scale="Tealgrn",
    text="Program Count",
    height=500,
)
fig_season.update_traces(textposition="outside")

st.plotly_chart(fig_season, use_container_width=True)

st.markdown("""
**Why this is helpful:**  
This chart shows when programs are most active throughout the year.  
Comparing seasons helps readers see whether summer is the busiest time, or whether activities are spread evenly.  
This makes it easier for residents and planners to understand how weather, school schedules, and community needs 
shape the timing of park programs.
""")

# -------------------------
# FINAL 3-PARAGRAPH EXPLANATION
# -------------------------
st.header("📝 What this data story is showing")

st.markdown("""
Chicago’s parks offer many kinds of activities for people of all ages. These include sports, arts, fitness classes, youth programs, and seasonal events. Each row in this dataset represents one program offered at a park. Our main interactive map helps readers quickly see which parks offer the most activities. Bigger, redder circles show parks with more programs, making it easy to spot busy parks versus quieter ones.

Where a park is located also matters. Neighborhoods that are larger or more central usually have more programs because they have more space, more facilities, and more visitors. With the filters on the left, anyone can explore the data by season, activity type, price, or park name. This makes the information easy to use even for someone with no data experience. For example, you can look for free programs, summer-only programs, or activities at a specific park in your neighborhood.

This project also highlights questions about access and opportunities. Some parks offer a wide range of programs, while others have fewer options or mostly offer only one type of activity. By looking at categories, seasons, and fees, readers can start to see patterns in which communities have more choices and which ones may need more support. Our goal is to turn public data into something simple and useful, so Chicago residents and decision-makers can better understand how parks are serving their communities.
""")

# -------------------------
# CITATIONS
# -------------------------
st.markdown("---")
st.subheader("Citations & Data Sources")
st.markdown("""
**Primary dataset:**  
Chicago Park District Activities — City of Chicago Data Portal  
https://data.cityofchicago.org/Parks-Recreation/Chicago-Park-District-Activities/tn7v-6rnw  
""")