| import streamlit as st | |
| import pandas as pd | |
| import altair as alt | |
| st.title('IS 445 Homework 5: Visualization with Streamlit') | |
| st.text("The URL for this app is: https://huggingface.co/spaces/namdini/is445_demo") | |
| source = "https://github.com/UIUC-iSchool-DataViz/is445_data/raw/main/licenses_fall2022.csv" | |
| license_df = pd.read_csv(source) | |
| # First visualization: License Status Distribution | |
| license_status = license_df['License Status'].value_counts().reset_index() | |
| license_status.columns = ['License Status', 'Count'] | |
| license_status = license_status.sort_values(by='Count', ascending=False) | |
| bar_plot = alt.Chart(license_status).mark_bar().encode( | |
| x = alt.X('License Status:N', title='License Status', sort='-y'), | |
| y = alt.Y('Count:Q', title='License Count'), | |
| color=alt.Color('License Status:N'), | |
| ).properties(title = alt.TitleParams(text="1. License Status Distribution", fontSize=30), width=550, height=300) | |
| st.altair_chart(bar_plot, theme="streamlit", use_container_width=True) | |
| st.text(""" | |
| This bar plot displays the distribution of license statuses. The x-axis was | |
| originally in alphabetical order, but has been reorganized by count to | |
| provide users with a more intuitive visualization, highlighting the | |
| most common license statuses first. It shows how many licenses are active, | |
| not renewed, cancelled, and so on, providing a clear overview of the | |
| current state of various licenses. I also made the font size bigger for | |
| each plot title for better readability. If I had more time, I would have | |
| removed the ellipses that truncate some status labels so that users | |
| could see the full status names. | |
| """) | |
| # Second visualization: Licenses Issued Over Time by License Type | |
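| # Parse the issue date; unparseable dates become NaT (errors='coerce') and yield a missing year | |
| license_df["Issue Year"] = pd.to_datetime(license_df['Original Issue Date'], errors='coerce').dt.year | |
| # Drop rows without a valid year, then count licenses per year and license type | |
| yearly_license_count = license_df.dropna(subset=['Issue Year']) | |
| yearly_license_count = yearly_license_count.groupby(['Issue Year', 'License Type']).size().reset_index(name='Count') | |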
| # Flag the three license types with the highest total count so they can be drawn with thicker lines | |
| top3_license_types = yearly_license_count.groupby('License Type')['Count'].sum().nlargest(3).index.tolist() | |
| yearly_license_count['Top3'] = yearly_license_count['License Type'].isin(top3_license_types).map({True: 'Top 3', False: 'Other'}) | |
| line_plot = alt.Chart(yearly_license_count).mark_line().encode( | |
| x = alt.X('Issue Year:O', title='Year of Issue'), | |
| y = alt.Y('Count:Q', title='License Issued Count'), | |
| color=alt.Color('License Type:N', title = "License Type"), | |
| size=alt.Size('Top3:N', scale=alt.Scale(domain=['Top 3', 'Other'], range=[3, 1])) | |
| ).properties(title = alt.TitleParams(text="2. Licenses Issued Over Time by License Type", fontSize=30), width=700, height=400) | |
| st.altair_chart(line_plot, theme="streamlit", use_container_width=True) | |
| st.text(""" | |
| This line plot shows the trends in license issuance for all license types | |
| over time. Each line represents a different license type, allowing users to | |
| see how the issuance of each type has changed over the years. The top 3 license | |
| types are represented with thicker lines to help users quickly identify the most | |
| significant trends. This visualization helps identify which license types have | |
| increased in popularity and which have seen declines. If I had more time, it | |
| would be great to have a slider widget that allows users to select a specific | |
| time range to focus on certain periods of time. | |
| """) | |