Aniruddha committed
Commit 51b00a4 · 1 Parent(s): d21976f
Files changed (1)
  1. app.py +37 -33
app.py CHANGED
@@ -2,86 +2,90 @@ import streamlit as st
 import pandas as pd
 import altair as alt
 
+# Set page configuration
 st.set_page_config(layout="wide")
 
 # App title
-st.title('Building Inventory Visualization – IS445 Homework')
+st.title('Building Inventory Data Visualization – IS445 Homework')
 
-# Load dataset
+# Load the dataset
 DATA_URL = "https://raw.githubusercontent.com/UIUC-iSchool-DataViz/is445_data/main/building_inventory.csv"
 df = pd.read_csv(DATA_URL)
+
+# Clean up column names (strip any extra spaces)
 df.columns = df.columns.str.strip()
 
-# Dataset preview
-st.markdown("### Dataset Preview")
+# Display dataset preview
+st.markdown("### Dataset Overview")
 st.dataframe(df.head())
 
-# === Chart 1: Number of Buildings by Usage Description ===
-st.markdown("### 1. Number of Buildings by Usage Description")
+# === Chart 1: Distribution of Buildings by Usage Type ===
+st.markdown("### 1. Distribution of Buildings by Usage Type")
 
 chart1 = (
     alt.Chart(df)
     .mark_bar()
     .encode(
-        x=alt.X('Usage Description:N', sort='-y', title='Usage Description'),
+        x=alt.X('Usage Description:N', sort='-y', title='Building Usage Type'),
         y=alt.Y('count()', title='Number of Buildings'),
         color=alt.Color('Usage Description:N', legend=None)
     )
-    .properties(width=700, height=400, title="Number of Buildings by Usage Description")
+    .properties(width=700, height=400, title="Distribution of Buildings by Usage Type")
 )
 
 st.altair_chart(chart1, use_container_width=True)
 
-# === Write-up for Chart 1 (Humanized + Justified) ===
-st.markdown("#### Write-up for Chart 1")
+# === Write-up for Chart 1 (Detailed Explanation) ===
+st.markdown("#### Explanation for Chart 1")
 st.markdown(
     """
     <div style='text-align: justify'>
-    This chart displays how many buildings are assigned to each type of use, like storage, education, or business.
-    I picked a bar chart for this because it makes it easy to compare different categories at a glance. The categories
-    are sorted from most to least frequent so that it’s easier to see which uses are most common. I also used colors
-    to help distinguish between them visually, without adding a separate legend.
-
-    If I had more time, I would try to add filters for different locations like city or county. That way, people could
-    explore the building usage more specifically. I also think some usage categories are very similar, so grouping them
-    might make things cleaner and more understandable.
+    This bar chart illustrates the distribution of buildings according to their designated usage types, such as storage,
+    education, or business. I chose a bar chart as it’s effective for comparing categories and visualizing frequencies.
+    The categories are arranged in descending order, which helps in quickly identifying the most common usage types.
+    I used color coding for each usage type to visually separate them, eliminating the need for an additional legend.
+
+    If I had more time, I would implement filters for location, such as by city or county, so that users could focus on
+    specific regions. Additionally, I would consider grouping similar usage categories to simplify the chart and make it more
+    digestible for viewers.
     </div>
     """,
     unsafe_allow_html=True
 )
 
-# === Chart 2: Average Square Footage by Usage Description ===
-st.markdown("### 2. Average Square Footage by Usage Description")
+# === Chart 2: Average Square Footage by Usage Type ===
+st.markdown("### 2. Average Square Footage for Different Usage Types")
 
+# Calculate average square footage by usage type
 avg_sqft = df.groupby("Usage Description")["Square Footage"].mean().reset_index()
 
 chart2 = (
     alt.Chart(avg_sqft)
     .mark_bar()
     .encode(
-        x=alt.X('Usage Description:N', sort='-y', title='Usage Description'),
+        x=alt.X('Usage Description:N', sort='-y', title='Building Usage Type'),
         y=alt.Y('Square Footage:Q', title='Average Square Footage'),
         color=alt.Color('Usage Description:N', legend=None)
     )
-    .properties(width=700, height=400, title="Average Square Footage by Usage Description")
+    .properties(width=700, height=400, title="Average Square Footage by Usage Type")
 )
 
 st.altair_chart(chart2, use_container_width=True)
 
-# === Write-up for Chart 2 (Humanized + Justified) ===
-st.markdown("#### Write-up for Chart 2")
+# === Write-up for Chart 2 (Detailed Explanation) ===
+st.markdown("#### Explanation for Chart 2")
 st.markdown(
     """
     <div style='text-align: justify'>
-    This chart shows how big the buildings are on average, depending on what they’re used for. Education buildings,
-    for example, tend to be a lot bigger than something like utility or public use buildings. I stuck with a bar
-    chart again for consistency, and sorted it from largest to smallest so you can quickly pick out the biggest categories.
-
-    If I had more time, I would include a tooltip or label that shows how many buildings are in each group, since
-    average size alone doesn’t always give the full picture. Maybe I’d even try a box plot instead of a bar chart
-    to show the range of building sizes, not just the average. An interactive dropdown to toggle between mean and
-    median could also make this more flexible for the user.
+    This chart visualizes the average size of buildings based on their usage type. For example, educational buildings
+    are generally much larger than those used for utilities or public services. I used a bar chart for consistency,
+    sorting from the largest to the smallest values for easier comparison across the categories.
+
+    If I had more time, I would incorporate additional features, such as tooltips or data labels, to display the total
+    number of buildings in each category. This would offer more context to the average size data. I also might consider
+    switching to a box plot to show the variation in building sizes rather than just the average. An interactive dropdown
+    could allow users to toggle between the mean and median, providing more flexibility in understanding the data.
     </div>
     """,
     unsafe_allow_html=True
-)
+)
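
Review note: the Chart 2 write-up mentions showing the number of buildings per category and toggling between mean and median. This commit does not implement either. A minimal sketch of the aggregation step might look like the following, assuming the same "Usage Description" and "Square Footage" columns. The sample rows, the `summarize_sqft` helper, and the output column names are hypothetical and not part of `app.py`.

```python
import pandas as pd

# Hypothetical sample rows for illustration only; the real app reads
# building_inventory.csv from DATA_URL instead.
sample = pd.DataFrame(
    {
        "Usage Description": ["Storage", "Storage", "Storage", "Education", "Education"],
        "Square Footage": [100, 200, 900, 4000, 6000],
    }
)


def summarize_sqft(df: pd.DataFrame) -> pd.DataFrame:
    """Return mean, median, and building count of square footage per usage type."""
    return (
        df.groupby("Usage Description")["Square Footage"]
        # Named aggregation: each keyword becomes an output column.
        .agg(mean_sqft="mean", median_sqft="median", building_count="count")
        .reset_index()
    )


summary = summarize_sqft(sample)
print(summary)
```

A frame like `summary` could then replace `avg_sqft` as the chart's data source. An `st.selectbox` could pick between `mean_sqft` and `median_sqft` for the y encoding, and `building_count` could be added as an Altair tooltip. That wiring is left out here because it depends on how the author wants the page laid out.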