import streamlit as st
import altair as alt
import pandas as pd
# Custom CSS for background, fonts, and text styling
st.markdown("""
""", unsafe_allow_html=True)
# Sidebar for navigation
st.sidebar.title("Navigation")
st.sidebar.markdown("Use the sidebar to navigate through different sections.")
# Title Section
st.title("1 : INTRODUCTION TO DATA ANALYSIS")
st.markdown("""
In this section, we'll explore the basics of data analysis using Python. **Data Analysis** involves collecting, cleaning, and analyzing data to extract valuable insights. Let's start by understanding what we mean by *data*.
""", unsafe_allow_html=True)
# Header Section
st.header("What does the term 'data' refer to?")
st.subheader("DATA")
st.markdown("""
Data refers to a collection of information gathered from various sources. Here are a few examples:
""", unsafe_allow_html=True)
st.markdown("""
- Images
- Text
- Videos
- Audio recordings
""", unsafe_allow_html=True)
# Data Classification Section with a chart
st.header("Data Classification")
st.subheader("Structured Data")
st.markdown("""
Structured data is organized and formatted, making it easy to search, analyze, and store in databases. Common examples include:
- Excel Documents
- SQL Databases
""", unsafe_allow_html=True)
st.image('https://cdn-uploads.huggingface.co/production/uploads/64c972774515835c4dadd754/dSbyOXaQ6N_Kg2TLxgEyt.png', width=400)
# Visualization example: bar chart of common data formats.
# The counts are illustrative placeholder values, not real measurements.
# CSV and JSON are classified as semi-structured later in this app, so the
# chart title refers to data formats in general.
data = pd.DataFrame({
    'Category': ['Excel', 'SQL', 'CSV', 'JSON'],
    'Count': [45, 35, 30, 40]
})
chart = alt.Chart(data).mark_bar().encode(
    x=alt.X('Category', title='Data Format'),
    y=alt.Y('Count', title='Count'),
    color=alt.Color('Category', legend=None)
).properties(
    title='Common Data Formats',
    width=500,
    height=300
).configure_title(
    fontSize=18,
    anchor='middle',
    font='Roboto',
    color='#343a40'
)
st.altair_chart(chart)
st.subheader("Unstructured Data")
st.markdown("""
Unstructured data has no predefined schema or format, which makes it harder to search, organize, and analyze with conventional database tools. Examples include:
- Images
- Videos
- Text documents
- Social Media Feeds
""", unsafe_allow_html=True)
st.image("https://cdn-uploads.huggingface.co/production/uploads/64c972774515835c4dadd754/xhaNBRanDaj8esumqo9hl.png", width=400)
st.subheader("Semi-Structured Data")
st.markdown("""
Semi-structured data contains elements of both structured and unstructured data. Examples include:
- CSV Files
- JSON Files
- Emails
- HTML Documents
""", unsafe_allow_html=True)
st.image("https://cdn-uploads.huggingface.co/production/uploads/64c972774515835c4dadd754/Nupc6BePInRVo9gJwLfWH.png", width=400)
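# A minimal sketch of why JSON counts as semi-structured: each record describes
# itself through keys and nesting, but records need not share the same fields.
# The records below are made up for illustration, and `raw_json`, `records`, and
# `flat_records` are hypothetical names that are not part of the original app.
import json

import pandas as pd
import streamlit as st  # both already imported above; repeated so the sketch stands alone

raw_json = """
[
  {"id": 1, "user": {"name": "Ada"}},
  {"id": 2, "user": {"name": "Lin", "email": "lin@example.com"}}
]
"""
records = json.loads(raw_json)

# json_normalize flattens nested keys into dotted column names (for example
# "user.name"); a field that is missing from a record becomes NaN in that row.
flat_records = pd.json_normalize(records)
st.dataframe(flat_records)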
# Introduction to Statistics
st.title("2 : INTRODUCTION TO STATISTICS")
st.markdown("""
_Statistics is a branch of mathematics focused on collecting, analyzing, interpreting, and presenting data. It can be divided into two main categories: descriptive statistics and inferential statistics._
""", unsafe_allow_html=True)
# Descriptive Statistics Section with interactive elements
st.subheader("2.1 Descriptive Statistics")
st.markdown("""
Descriptive statistics summarize and describe the main features of a dataset. Key concepts include:
- Measures of Central Tendency (Mean, Median, Mode)
- Measures of Dispersion (Range, Variance, Standard Deviation)
- Data Distributions (e.g., Normal (Gaussian), Uniform)
""", unsafe_allow_html=True)
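# Interactive example for central tendency.
# A range slider returns a (low, high) tuple, so the mean computed here is the
# mean of the two endpoints, i.e. the midpoint of the selected range.
values = st.slider('Select a range of values', 0, 100, (25, 75))
mean_value = sum(values) / len(values)
st.write(f"Mean of the two selected endpoints: {mean_value}")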
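# A minimal sketch of the measures of central tendency and dispersion listed
# above, computed with Python's standard `statistics` module. The values are
# made up for illustration, and `demo_values` and `descriptive_summary` are
# hypothetical names that are not part of the original app.
import statistics

import streamlit as st  # already imported above; repeated so the sketch stands alone

demo_values = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, not real measurements

descriptive_summary = {
    "Mean": statistics.mean(demo_values),
    "Median": statistics.median(demo_values),
    "Mode": statistics.mode(demo_values),
    "Range": max(demo_values) - min(demo_values),
    # Population formulas (divide by n). For a sample drawn from a larger
    # population, statistics.variance() and statistics.stdev() divide by n - 1.
    "Variance": statistics.pvariance(demo_values),
    "Standard deviation": statistics.pstdev(demo_values),
}
st.write(descriptive_summary)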
# Inferential Statistics Section
st.subheader("2.2 Inferential Statistics")
st.markdown("""
Inferential statistics involve making predictions or inferences about a population based on a sample. These methods are used to test hypotheses and estimate population parameters.
""", unsafe_allow_html=True)