import streamlit as st
import pandas as pd
import altair as alt

st.set_page_config(layout="wide", page_title="BFRO Sightings Analysis")

st.title("BFRO Sightings Dashboard")
st.markdown("### Exploration of Bigfoot Reports using Streamlit and Altair")
st.write("This dashboard visualizes reports from the BFRO database, focusing on geographic distribution and seasonal patterns.")

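@st.cache_data
def load_data():
    """Download the BFRO reports CSV and drop rows missing any field the charts use."""
    url = "https://raw.githubusercontent.com/UIUC-iSchool-DataViz/is445_data/main/bfro_reports_fall2022.csv"
    df = pd.read_csv(url)
    # Both charts rely on season, state, and coordinates, so incomplete rows are removed up front.
    df_clean = df.dropna(subset=['season', 'state', 'latitude', 'longitude'])
    return df_clean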

df_clean = load_data()

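# Altair raises an error for datasets over 5,000 rows by default; lift that limit.
alt.data_transformers.disable_max_rows()

# Dropdown bound to a point selection on the `season` field.
season_dropdown = alt.binding_select(options=df_clean['season'].unique().tolist(), name="Select Season: ")
season_select = alt.selection_point(fields=['season'], bind=season_dropdown)

# Click-and-drag interval selection that links the map to the bar chart.
brush = alt.selection_interval()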

map_chart = alt.Chart(df_clean).mark_rect().encode(
    x=alt.X('longitude:Q', bin=alt.Bin(maxbins=60), title='Longitude'),
    y=alt.Y('latitude:Q', bin=alt.Bin(maxbins=40), title='Latitude'),
    color=alt.Color('count()', scale=alt.Scale(scheme='inferno'), title='Report Density'),
    tooltip=['count()']
).add_params(
    brush, season_select  
).transform_filter(
    season_select 
).properties(
    width=400,
    height=400,
    title='1. Geographic Density (Drag to Select)'
)

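# Stacked bar chart of report counts per state, colored by season.
bar_chart = alt.Chart(df_clean).mark_bar().encode(
    x=alt.X('count()', title='Total Reports'),
    y=alt.Y('state:N', title='State', sort='-x'),
    color=alt.Color('season:N', title='Season'),
    tooltip=['state', 'season', 'count()']
).transform_filter(
    season_select  # apply the dropdown first, as the write-up describes
).transform_filter(
    brush  # then keep only reports inside the map selection
).properties(
    width=300,
    height=400,
    title='2. Reports by State (Filtered by Map)'
)

dashboard = map_chart | bar_chart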

st.altair_chart(dashboard, use_container_width=True)

st.divider()
st.header("Analysis & Write-up")

st.subheader("Changes from Homework #5")
st.markdown("""
The two visualizations in this dashboard are based on the Altair code developed in Homework #5 (`Workbook (7).ipynb`). 
The primary change for this assignment was migrating the code from a static Jupyter Notebook environment to this live, interactive Streamlit web application. 

This involved:
* Refactoring the notebook code into the `streamlit_app.py` script.
* Using Streamlit commands (e.g., `st.title`, `st.markdown`, `st.altair_chart`) to build the user interface.
* Employing `st.cache_data` to optimize data loading for a web environment.
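
The effect of caching on repeated runs can be sketched without Streamlit. The snippet below uses the standard library's `functools.lru_cache` as a stand-in for `st.cache_data`; it is an analogy only, and `load_rows` and `call_count` are hypothetical names, not part of this app.

```python
from functools import lru_cache

call_count = 0  # tracks how often the expensive body actually runs


@lru_cache(maxsize=None)
def load_rows(source: str) -> tuple:
    # Stand-in for a slow download; `source` is a hypothetical key, not a real URL.
    global call_count
    call_count += 1
    return ("row-1", "row-2")


first = load_rows("reports")
second = load_rows("reports")  # served from the cache, so the body is skipped
```

After both calls the function body has run only once, which is the behavior the app relies on when Streamlit reruns the script on every interaction.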

The write-up below is from Homework #5.
""")

st.subheader("\"Visualization 1: Geographic Density (STATIC)\"")
st.markdown("""
\"For the first visualization, I created a static density heatmap to represent the geographic distribution of Bigfoot reports across the United States. Instead of plotting individual points, which often results in overplotting and obscures patterns in large datasets, this view aggregates the data into a grid. This provides a clear, high-level overview of where sightings are most concentrated without needing user input.

* Encoding Types: I used the `rect` mark to construct the heatmap. Latitude and Longitude are mapped to the Y and X axes respectively.
* Color Scheme: I used the `inferno` color scheme to encode the `count()` of reports within each grid cell. I chose this sequential palette (dark to bright) because it is perceptually uniform and draws the eye immediately to “hotspots” (bright yellow/orange) against the darker background, effectively communicating density differences.

I filtered the dataset in Python to remove rows with missing geographic coordinates (NaN in latitude/longitude). Additionally, I used Altair’s built-in binning transformation (`bin=alt.Bin(maxbins=60)` for longitude and `maxbins=40` for latitude) on the coordinates to transform the raw, scattered data points into the structured grid format seen in the plot.\"
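
As a rough illustration of what binning plus `count()` computes, the sketch below assigns each point to a grid cell and counts the points per cell in plain Python. The fixed 5-degree cell size, the `bin_counts` helper, and the sample coordinates are assumptions for illustration; Altair instead derives the cell size from `maxbins`.

```python
import math
from collections import Counter


def bin_counts(points, lon_step, lat_step):
    # Map each (longitude, latitude) pair to the lower-left corner of its
    # grid cell, then count how many points land in each cell.
    counts = Counter()
    for lon, lat in points:
        cell = (math.floor(lon / lon_step) * lon_step,
                math.floor(lat / lat_step) * lat_step)
        counts[cell] += 1
    return counts


# Made-up coordinates for illustration only (not rows from the BFRO dataset).
sample = [(-122.4, 47.1), (-121.8, 46.3), (-123.9, 45.2), (-84.2, 35.7)]
grid = bin_counts(sample, lon_step=5, lat_step=5)
```

Here the three western points fall into the same cell, so that cell carries the highest count and would be drawn with the brightest color.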
""")

st.subheader("\"Visualization 2: Reports by State\"")
st.markdown("""
\"**Seasonal Breakdown (Added Interaction).** For the second visualization, I built a stacked bar chart that adds a layer of interactivity to the first plot. While the map shows where reports happen, this chart explains when they happen. I linked this chart to the map so that it dynamically updates based on the user’s selection, effectively turning the static map into an interactive exploration tool.

* Encoding Types: I used a bar mark with state on the Y-axis and count() on the X-axis.
* Color Scheme: I colored the bars by season using a nominal color palette. This “stacked” design allows users to see the seasonal composition of reports for any given state.

For interactivity, I added a brushing-and-linking interaction to connect the two plots. Users can click and drag on the map (Visualization 1) to select a specific geographic region, which then filters Visualization 2 to show only the states included in that selection. They can then isolate specific clusters (like the Pacific Northwest) and see the seasonal breakdown for that area. I implemented this as a “cross-filter”: the bar chart accepts the map’s brush through `transform_filter(brush)`. The interactivity works hierarchically: the Season Dropdown filters the entire dataset first (updating the map), and the Map Brush then filters those results spatially.\"
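
The filtering order described above can be sketched in plain Python: apply the season filter, then the spatial brush, then count reports per state and season. The `cross_filter` helper and the sample rows are hypothetical and only mirror the logic; in the dashboard itself Altair evaluates these filters in the browser.

```python
from collections import Counter

# Made-up rows for illustration only (not taken from the BFRO dataset).
reports = [
    {"state": "Washington", "season": "Summer", "longitude": -122.0, "latitude": 47.0},
    {"state": "Washington", "season": "Fall", "longitude": -121.5, "latitude": 46.5},
    {"state": "Oregon", "season": "Summer", "longitude": -123.0, "latitude": 44.5},
    {"state": "Ohio", "season": "Summer", "longitude": -82.5, "latitude": 40.0},
]


def cross_filter(rows, season=None, lon_range=None, lat_range=None):
    # A filter left as None passes every row, mirroring an empty selection.
    counts = Counter()
    for row in rows:
        if season is not None and row["season"] != season:
            continue
        if lon_range is not None and not (lon_range[0] <= row["longitude"] <= lon_range[1]):
            continue
        if lat_range is not None and not (lat_range[0] <= row["latitude"] <= lat_range[1]):
            continue
        counts[(row["state"], row["season"])] += 1
    return counts


# Summer reports inside a box roughly covering the Pacific Northwest.
pnw_summer = cross_filter(reports, season="Summer",
                          lon_range=(-125, -116), lat_range=(42, 49))
```

With no arguments the helper counts every row, which corresponds to the dashboard's initial state before a season or region is chosen.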
""")