richtaing committed on
Commit
7cff1aa
·
verified ·
1 Parent(s): aa1653e

Update src/streamlit_app.py

Files changed (1)
  1. src/streamlit_app.py +26 -5
src/streamlit_app.py CHANGED
@@ -19,7 +19,6 @@ df_clean = load_data()
19
 
20
  alt.data_transformers.disable_max_rows()
21
 
22
-
23
  season_dropdown = alt.binding_select(options=df_clean['season'].unique().tolist(), name="Select Season: ")
24
  season_select = alt.selection_point(fields=['season'], bind=season_dropdown)
25
 
@@ -57,16 +56,38 @@ dashboard = map_chart | heatmap
57
 
58
  st.altair_chart(dashboard, use_container_width=True)
59
 
60
-
61
  st.divider()
62
  st.header("Analysis & Write-up")
63
 
64
- st.subheader("Visualization 1: Geographic Density Map")
65
  st.markdown("""
66
- This visualization highlights the spatial distribution of Bigfoot reports across the United States. I chose a **rectangular binning** approach (`mark_rect`) rather than plotting individual points to better handle the data density and avoid overplotting in high-activity areas like the Pacific Northwest. The **'inferno' color scheme** was selected to provide high contrast, where lighter colors immediately draw the viewer's eye to areas of high report density. This plot is interactive; it includes a dropdown to filter by season, allowing users to analyze whether sighting locations shift during different times of the year, and it acts as a filter for the second chart. **If I had more time**, I would overlay these bins onto a geographic base map (using `mark_geoshape`) to provide better context regarding state borders and physical geography.
67
  """)
68
 
69
  st.subheader("Visualization 2: Reports by State")
70
  st.markdown("""
71
- This bar chart highlights the frequency of reports aggregated by state. I used a **bar mark** as it effectively compares magnitudes across categories (states). The states are **sorted in descending order** of report counts to instantly reveal the most active locations without requiring the user to scan the entire axis. The color encoding represents the 'season', providing a secondary dimension of information consistent with the map. This visualization is linked to the map via a brush filter; dragging a selection box on the map dynamically updates this bar chart to show the state breakdown for only the selected region. **If I had more time**, I would normalize the data by state population or land area to provide a per-capita or per-square-mile perspective, which might reveal different hotspots than raw counts alone.
72
  """)
 
19
 
20
  alt.data_transformers.disable_max_rows()
21
 
 
22
  season_dropdown = alt.binding_select(options=df_clean['season'].unique().tolist(), name="Select Season: ")
23
  season_select = alt.selection_point(fields=['season'], bind=season_dropdown)
24
 
 
56
 
57
  st.altair_chart(dashboard, use_container_width=True)
58
 
 
59
  st.divider()
60
  st.header("Analysis & Write-up")
61
 
62
+ st.subheader("Changes from Homework #5")
63
  st.markdown("""
64
+ The two visualizations in this dashboard are based on the Altair code developed in Homework #5 (`Workbook (7).ipynb`).
65
+ The primary change for this assignment was migrating the code from a static Jupyter Notebook environment to this live, interactive Streamlit web application.
66
+
67
+ This involved:
68
+ * Refactoring the notebook code into the `streamlit_app.py` script.
69
+ * Using Streamlit commands (e.g., `st.title`, `st.markdown`, `st.altair_chart`) to build the user interface.
70
+ * Employing `st.cache_data` to optimize data loading for a web environment.
71
+
72
+ The write-up below is from Homework #5.
73
  """)
74
 
75
+ st.subheader("Visualization 1: Geographic Density (STATIC)")
76
+ st.markdown("""
77
+ >For the first visualization, I created a static density heatmap to represent the geographic distribution of Bigfoot reports across the United States. Instead of plotting individual points, which often results in overplotting and obscures patterns in large datasets, this view aggregates the data into a grid. This provides a clear, high-level overview of where sightings are most concentrated without needing user input.
78
+
79
+ * Encoding Types: I used the rect mark (`mark_rect`) to construct the heatmap. Latitude and longitude are mapped to the Y and X axes, respectively.
80
+ * Color Scheme: I used the `inferno` color scheme to encode the `count()` of reports within each grid cell. I chose this sequential palette (dark to bright) because it is perceptually uniform and draws the eye immediately to “hotspots” (bright yellow/orange) against the darker background, effectively communicating density differences.
81
+
82
+ I filtered the dataset in Python to remove rows with missing geographic coordinates (`NaN` in latitude/longitude). Additionally, I used Altair’s built-in binning transformation (`bin=alt.Bin(maxbins=60)`) on the coordinates to turn the raw, scattered data points into the structured grid seen in the plot.
83
+ """)
84
+
85
  st.subheader("Visualization 2: Reports by State")
86
  st.markdown("""
87
+ Seasonal Breakdown (Added Interaction): For the second visualization, I built a stacked bar chart that adds a layer of interactivity to the first plot. While the map shows where reports happen, this chart shows when they happen. I linked this chart to the map so that it dynamically updates based on the user’s selection, effectively turning the static map into an interactive exploration tool.
88
+
89
+ * Encoding Types: I used a bar mark with state on the Y-axis and count() on the X-axis.
90
+ * Color Scheme: I colored the bars by season using a nominal color palette. This “stacked” design allows users to see the seasonal composition of reports for any given state.
91
+
92
+ For interactivity, I added a brushing-and-linking interaction to connect the two plots. Users can click and drag on the map (Visualization 1) to select a geographic region, which filters Visualization 2 to show only the states in that selection. They can then isolate particular clusters (like the Pacific Northwest) and see the seasonal breakdown for that area. I implemented this as a cross-filter: the bar chart accepts the filter from the map’s brush (`transform_filter(brush)`). The interactivity works hierarchically: the Season Dropdown filters the entire dataset first (updating the map), and the Map Brush then filters those results spatially.
93
  """)
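
The write-up added in this commit describes a two-stage filter: the season dropdown narrows the dataset first, and the map brush then narrows that result spatially before the bar chart counts reports per state and season. The following is a minimal plain-Python sketch of that ordering only, not the Altair code in `src/streamlit_app.py`. The sample records, the field names (`state`, `season`, `latitude`, `longitude`), and the helper functions are all hypothetical and chosen for illustration.

```python
from collections import Counter

# Hypothetical sample records; the real app gets its data from load_data().
reports = [
    {"state": "Washington", "season": "Summer", "latitude": 47.5, "longitude": -121.8},
    {"state": "Washington", "season": "Fall", "latitude": 46.9, "longitude": -122.3},
    {"state": "Oregon", "season": "Summer", "latitude": 44.1, "longitude": -122.0},
    {"state": "Ohio", "season": "Summer", "latitude": 40.2, "longitude": -82.9},
    {"state": "Florida", "season": "Winter", "latitude": 28.5, "longitude": -81.4},
]


def filter_by_season(rows, season=None):
    """Stage 1 (dropdown): keep rows for one season; None keeps every row."""
    if season is None:
        return list(rows)
    return [r for r in rows if r["season"] == season]


def filter_by_brush(rows, lon_range, lat_range):
    """Stage 2 (map brush): keep rows inside an inclusive lon/lat rectangle."""
    lon_min, lon_max = lon_range
    lat_min, lat_max = lat_range
    return [
        r for r in rows
        if lon_min <= r["longitude"] <= lon_max
        and lat_min <= r["latitude"] <= lat_max
    ]


def counts_by_state_and_season(rows):
    """Aggregate the surviving rows the way a stacked bar chart would."""
    return Counter((r["state"], r["season"]) for r in rows)


# The season filter runs first; the brush then filters that result.
summer = filter_by_season(reports, "Summer")
pnw = filter_by_brush(summer, lon_range=(-125.0, -116.0), lat_range=(42.0, 49.0))
bars = counts_by_state_and_season(pnw)
print(bars)
```

Because both stages are simple row predicates, the surviving rows would be the same in either order. Applying the season filter first simply mirrors the description above, where the dropdown updates the map and the brush is then drawn on the already-filtered map.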