| import streamlit as st | |
| import pandas as pd | |
| import altair as alt | |
| #Config Setup | |
| st.set_page_config(page_title="Building Inventory Analysis", layout="centered") | |
| #Page title | |
| st.title("Gov buildings in Illinois") | |
| # Load the data once; st.cache_data reuses the result across reruns | |
| @st.cache_data | |
| def load_data(): | |
|     url = 'https://raw.githubusercontent.com/UIUC-iSchool-DataViz/is445_data/main/building_inventory.csv' | |
|     df = pd.read_csv(url) | |
|     return df | |
| df = load_data() | |
| # Prepare the data for visualization using Python. | |
| df['Year Acquired'] = pd.to_numeric(df['Year Acquired'], errors='coerce') | |
| df = df.dropna(subset=['Year Acquired']) | |
| df = df[df['Year Acquired'] >= 1750]  # Filter out years before 1750 | |
| df = df[df['Bldg Status'] != '0']  # Drop rows whose status is recorded as '0' | |
| # For Viz1, time is grouped into decades to simplify and declutter the chart. | |
| # Aggregate years by decade | |
| df['Decade'] = (df['Year Acquired'] // 10 * 10).astype(int)  # Floor each year to the start of its decade | |
| df_status = df.groupby(['Decade', 'Bldg Status']).size().reset_index(name='Building Count') | |
| #=============Viz1.1==================== | |
| st.header("Viz1.1: Building Status over the decades") | |
| st.write(""" | |
| **Desc:** | |
| This chart highlights how building status varies with the decade of acquisition. As a by-product, it also shows acquisition trends: grouping acquisitions by decade reveals significant increases in the late 20th century. The status segmentation shows that the vast majority of acquired buildings remain "In Use," with smaller proportions either "Abandon" or "In Progress." | |
| **Design Choices:** | |
| - **Bar Chart**: Shows absolute counts over time and highlights changes in acquisition volume and building status in each decade. | |
| - **Color Scheme**: Each building status has its own categorical color, making it easy to distinguish "In Use," "Abandon," and "In Progress" buildings. | |
| - **Tooltip**: Provides the exact count for each status within each decade, so values can be read precisely. | |
| - **Axis Labels**: Clear labels on both axes make the time scale easier to read: the x-axis shows the acquisition decades and the y-axis shows the number of buildings. | |
| **Improvements:** | |
| Future improvements could include adding interactivity, such as filtering by agency, to provide more targeted insights. Additionally, an option to view acquisitions by year (rather than by decade) could reveal more granular trends. Combining both improvements would be even more powerful. | |
| Users could then view acquired buildings year by year, filtered by agency name and building status. | |
| """) | |
| #altair code for bar chart | |
| bar_chart = alt.Chart(df_status).mark_bar().encode( | |
|     x=alt.X('Decade:O', title='Decade'), | |
|     y=alt.Y('Building Count:Q', title='Number of Buildings'), | |
|     color=alt.Color('Bldg Status:N', title='Building Status', scale=alt.Scale(scheme='set1')), | |
|     tooltip=['Decade', 'Bldg Status', 'Building Count'] | |
| ).properties( | |
|     width=600, | |
|     height=400 | |
| ) | |
| st.altair_chart(bar_chart, use_container_width=True) | |
| #============Viz1.2================ | |
| st.header("Viz1.2: Relative building status over time") | |
| st.write(""" | |
| **Desc:** | |
| This alternative visualization shows the relative proportions of building statuses over time. The main difference is the normalization of the y-axis. Now, one can observe how the distribution of "In Use," "Abandon," and "In Progress" statuses has changed over the decades, focusing on their relative sizes rather than absolute counts. | |
| Both versions of the first viz have their own advantages, but not enough to count as two separate visualizations. | |
| The motive behind the area chart was to highlight the proportions of abandoned and in-progress buildings, which I feel weren't highlighted adequately in the bar chart. | |
| **Design Choices:** | |
| - **Stacked Area Chart**: This provides a continuous and smooth view of acquisition trends and the relative proportion of building statuses. | |
| - **Normalized Stack**: This highlights the relative importance of each building status in each decade, making it easier to spot patterns in "Abandon" and "In Progress" buildings over time. | |
| - **Color Encoding**: Consistent color encoding across both charts allows for easy interpretation of statuses. | |
| - **Tooltip**: Provides exact counts for each status per decade, keeping the underlying numbers accessible. | |
| **Improvements:** | |
| With additional interactivity, users could toggle between absolute and relative counts, providing both perspectives in a single view. This would be ideal for this concept. | |
| """) | |
| #altair code for the stacked chart. | |
| area_chart = alt.Chart(df_status).mark_area().encode( | |
|     x=alt.X('Decade:O', title='Decade'), | |
|     y=alt.Y('Building Count:Q', title='Proportion of Buildings', stack='normalize'), | |
|     color=alt.Color('Bldg Status:N', title='Building Status', scale=alt.Scale(scheme='set1')), | |
|     tooltip=['Decade', 'Bldg Status', 'Building Count'] | |
| ).properties( | |
|     width=600, | |
|     height=400 | |
| ) | |
| st.altair_chart(area_chart, use_container_width=True) | |
| # ================= Viz2 ================= | |
| st.header("Viz2: Total Square Footage by Agency and Decade") | |
| st.write(""" | |
| **Desc:** | |
| This visualization shows the total square footage acquired by each agency over time, broken down by decade. This approach enables us to see not only which agencies have the largest holdings but also how their acquisitions have varied over the decades. | |
| **Design Choices:** | |
| - **Stacked Bar Chart**: This allows us to see both the total square footage per agency and the decade-wise contribution. | |
| - **Color Coding by Decade**: Decades are colored along a gradient. Darker shades show older decades, and brighter colors highlight recent ones. | |
| - **Sorting by Total Square Footage**: This is done to highlight the largest area holders. | |
| - **Tooltips**: Displaying the agency name, decade, and square footage for each segment improves readability. | |
| **Improvements:** | |
| An interactive filter could allow users to select specific decades or agencies for a closer look. Adding a toggle to switch between absolute and relative square footage contributions per decade could also make the viz better. Homework 6 will be about implementing these interactive elements in the viz. | |
| """) | |
| #Data prep using Python | |
| df['Year Acquired'] = pd.to_numeric(df['Year Acquired'], errors='coerce') # Ensure Year Acquired is numeric | |
| df = df.dropna(subset=['Year Acquired', 'Square Footage', 'Agency Name']) # Drop rows where essential data is NaN | |
| df = df[df['Year Acquired'] >= 1750] # Filter out years before 1750 | |
| #Data by agency and decade | |
| df['Decade'] = (df['Year Acquired'] // 10 * 10).astype(int) | |
| df_sqft_decade = df.groupby(['Agency Name', 'Decade'])['Square Footage'].sum().reset_index() | |
| df_sqft_decade = df_sqft_decade.sort_values(by='Square Footage', ascending=False) | |
| #altair code for the viz | |
| chart2 = alt.Chart(df_sqft_decade).mark_bar().encode( | |
|     x=alt.X('Square Footage:Q', title='Total Square Footage'), | |
|     y=alt.Y('Agency Name:N', sort='-x', title='Agency Name', axis=alt.Axis(labelLimit=200)), | |
|     color=alt.Color('Decade:O', title='Decade', scale=alt.Scale(scheme='viridis')),  # viridis: dark for older decades, bright for recent ones | |
|     tooltip=['Agency Name', 'Decade', 'Square Footage'] | |
| ).properties(width=600, height=600) | |
| st.altair_chart(chart2, use_container_width=True) | |
| # Footer | |
| st.write("---") | |
| st.write("Created by Mihir Sahasrabudhe for HW5-IS445") |
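
The decade grouping behind Viz1.1 and the normalized stack behind Viz1.2 can be illustrated in plain Python, without pandas or Altair, so the arithmetic is visible. This is only a minimal sketch: the sample rows and the names `to_decade`, `counts`, `totals`, and `shares` are hypothetical and are not part of the app above, and the share calculation is my reading of what a normalized stack displays rather than Altair's internal implementation.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows (year acquired, building status); NOT taken from the real CSV.
rows = [
    (1968, "In Use"),
    (1961, "In Use"),
    (1965, "Abandon"),
    (1994, "In Use"),
    (1999, "In Progress"),
]


def to_decade(year):
    """Floor a year to the start of its decade, mirroring `year // 10 * 10`."""
    return int(year // 10 * 10)


# Absolute counts per (decade, status): the quantity plotted by the bar chart.
counts = Counter((to_decade(year), status) for year, status in rows)

# Total buildings per decade, needed to scale each decade's stack to 1.0.
totals = defaultdict(int)
for (decade, _status), n in counts.items():
    totals[decade] += n

# Share of each status within its decade: the quantity a normalized stack shows.
shares = {key: n / totals[key[0]] for key, n in counts.items()}

for (decade, status), share in sorted(shares.items()):
    print(decade, status, counts[(decade, status)], round(share, 2))
```

Within each decade the shares sum to 1, which is why small categories such as "Abandon" and "In Progress" become easier to see in the normalized chart even when the absolute counts differ greatly between decades.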