import streamlit as st
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import cv2
import zipfile
import io
st.set_page_config(
page_title="HomePage",
page_icon="🚀",
layout="wide"
)
# Ensure session state for navigation
if "current_page" not in st.session_state:
st.session_state.current_page = "main"
# Navigation function
def navigate_to(page_name):
st.session_state.current_page = page_name
# Main Page
if st.session_state.current_page == "main":
# Reset scroll to top
st.query_params.update({})
# Page Title
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h2 style="color: #BB3385;">What is Data? 📊✨</h2>
</div>
""", unsafe_allow_html=True)
# Introduction Text
st.write("""
**Data** is a collection of measurements gathered as a source of information.
It refers to raw facts, figures, and observations that can be collected, stored, and processed.
On its own it carries little meaning; it becomes useful information once it is organized or analyzed.""")
# Types of Data Section
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h2 style="color: #2a52be;">Types of Data</h2>
</div>
""", unsafe_allow_html=True)
# Radio Button for Data Type Selection
data_type = st.radio(
"Select the type of Data:",
("Structured Data", "Unstructured Data", "Semi-Structured Data")
)
if data_type == "Structured Data":
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h3 style="color: #e25822;">What is Structured Data?</h3>
</div>
""", unsafe_allow_html=True)
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Definition:</h4>
</div>
""", unsafe_allow_html=True)
st.markdown("""
**Structured data** refers to information that is organized and formatted in a predefined manner,
making it easy to store, retrieve, and analyze.
It is typically stored in tabular formats like rows and columns, where each field contains a specific type of information.
""")
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Characteristics:</h4>
</div>
""", unsafe_allow_html=True)
st.write("""
- Follows a fixed schema.
- Can be easily searched using query languages like SQL.
- Suitable for quantitative analysis.
""")
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Examples:</h4>
</div>
""", unsafe_allow_html=True)
st.write("""
A database of students with fields like ID, name, age, and gender:
""")
student_data = {
"Id": [100, 101, 102, 103],
"Name": ["Lakshmi Harika", "Varshitha", "Hari Chandan", "Shamitha"],
"Age": [22, 23, 22, 23],
"Gender": ["Female", "Female", "Male", "Female"]
}
df = pd.DataFrame(student_data)
st.markdown(df.style.set_table_styles(
[{'selector': 'thead th', 'props': 'font-weight: bold;'}]
).hide(axis="index").to_html(), unsafe_allow_html=True)
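# Added sketch: querying structured data with SQL (illustrative only)
st.markdown("""
As a rough illustration of the SQL point above, here is a minimal sketch that loads the student rows into an in-memory SQLite database and filters them with a query. The table name `students` and the lowercase column names are assumptions made for this example; the rows mirror the student table on this page.

```python
import sqlite3

# Sample rows: (id, name, age, gender)
rows = [
    (100, "Lakshmi Harika", 22, "Female"),
    (101, "Varshitha", 23, "Female"),
    (102, "Hari Chandan", 22, "Male"),
    (103, "Shamitha", 23, "Female"),
]

# An in-memory database keeps the example self-contained
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER, name TEXT, age INTEGER, gender TEXT)")
conn.executemany("INSERT INTO students VALUES (?, ?, ?, ?)", rows)

# The fixed schema lets us filter and sort by column name
query = "SELECT name FROM students WHERE age = ? ORDER BY id"
names_aged_22 = [name for (name,) in conn.execute(query, (22,))]
conn.close()

print(names_aged_22)
```

Because every row follows the same schema, the query can refer to `age` and `id` without inspecting individual records first.
""")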
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Data Formats in Structured Data:</h4>
</div>
""", unsafe_allow_html=True)
st.write("Click to explore Structured Data Formats:")
col1, col2 = st.columns(2) # Define two columns using Streamlit's columns method
with col1:
if st.button("📊 Explore Excel"):
navigate_to("explore_excel")
st.query_params.update({}) # Ensure page starts from the top after navigation
elif data_type == "Unstructured Data":
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h3 style="color: #e25822;">What is Unstructured Data?</h3>
</div>
""", unsafe_allow_html=True)
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Definition:</h4>
</div>
""", unsafe_allow_html=True)
st.write("""
**Unstructured data** refers to information that does not follow a predefined format or structure.
It is typically raw data that lacks a clear, organized schema, making it harder to store and analyze using traditional tools.
Examples include multimedia files (images, videos, audio), emails, and social media posts.
""")
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Characteristics:</h4>
</div>
""", unsafe_allow_html=True)
st.write("""
- Does not follow a specific schema or structure.
- Cannot be stored in traditional tabular formats like rows and columns.
- Requires advanced tools like machine learning or natural language processing (NLP) for analysis.
""")
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Examples:</h4>
</div>
""", unsafe_allow_html=True)
st.write("""
- **Images**: Photos, screenshots, or scanned documents.
- **Audio**: Podcasts, voice recordings, or music files.
- **Videos**: Recorded lectures, surveillance footage, or YouTube videos.
- **Text**: Emails, social media posts, and blog articles.
""")
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Data Formats in Unstructured Data:</h4>
</div>
""", unsafe_allow_html=True)
st.write("Click to explore Unstructured Data Formats:")
col1, col2, col3 = st.columns(3)
with col1:
if st.button("📸 Images & Videos"):
navigate_to("explore_images_video")
st.query_params.update({})
with col2:
if st.button("🎵 Audio"):
navigate_to("explore_audio")
st.query_params.update({})
with col3:
if st.button("✍️ Text"):
navigate_to("explore_text")
st.query_params.update({})
elif data_type == "Semi-Structured Data":
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h3 style="color: #e25822;">What is Semi-Structured Data?</h3>
</div>
""", unsafe_allow_html=True)
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Definition:</h4>
</div>
""", unsafe_allow_html=True)
st.write("""
**Semi-Structured data** refers to information that does not follow a strict tabular format but contains tags or markers to separate data elements.
This type of data is more flexible than structured data but still organized enough to allow for easier analysis than unstructured data.
""")
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Characteristics:</h4>
</div>
""", unsafe_allow_html=True)
st.write("""
- Contains markers or tags (e.g., XML, JSON keys) to provide structure.
- More flexible than structured data, allowing for varying schemas.
- Easier to process than unstructured data.
- Can store hierarchical relationships.
""")
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Examples:</h4>
</div>
""", unsafe_allow_html=True)
st.write("""
Examples of semi-structured data include:
- **CSV**: Comma-separated values in plain-text files.
- **JSON**: A lightweight data-interchange format used in web applications.
- **XML**: Extensible Markup Language for structured document encoding.
- **HTML**: Markup language for web pages.
""")
st.markdown("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Data Formats in Semi-Structured Data:</h4>
</div>
""", unsafe_allow_html=True)
st.write("Click to explore Semi-Structured Data Formats:")
col1, col2, col3, col4 = st.columns(4)
with col1:
if st.button("📄 CSV"):
navigate_to("explore_csv")
st.query_params.update({})
with col2:
if st.button("📋 JSON"):
navigate_to("explore_json")
st.query_params.update({})
with col3:
if st.button("📜 XML"):
navigate_to("explore_xml")
st.query_params.update({})
with col4:
if st.button("🌐 HTML"):
navigate_to("explore_html")
st.query_params.update({})
#--------------------------------------------------------- Excel--------------------------------------------------------------------------------
# Page for Explore Excel
if st.session_state.current_page == "explore_excel":
st.query_params.update({})
# Main Heading
st.markdown("""
<h2 style="color: #BB3385;">Excel</h2>
""", unsafe_allow_html=True)
# Overview Section
st.write("""
- **Excel** is a spreadsheet application developed by Microsoft.
- It is widely used for:
  - Data organization
  - Analysis
  - Visualization
- Key features include:
  - Storing data in tabular format
  - Performing complex calculations
  - Creating charts
  - Applying various data manipulation techniques
- Excel is a common tool for managing and analyzing structured data across many industries.
""")
# Reading Excel Files Section
st.markdown("""
<h3 style="color: #5b2c6f;">Reading Excel Files in Python</h3>
""", unsafe_allow_html=True)
st.code("""
import pandas as pd
# Read the Excel file
data = pd.read_excel('path_to_file.xlsx')
print(data.head()) # Displays first 5 rows in Excel file
""", language="python")
st.markdown("""
<h3 style="color: #5b2c6f;">Working with Sheets in Excel</h3>
""", unsafe_allow_html=True)
# Importing a Single Sheet
st.write("#### Importing a Single Excel Sheet")
st.code("""
df = pd.read_excel('path_to_file.xlsx', sheet_name=0)
print(df)
""", language="python")
# Importing Multiple Sheets
st.write("#### Importing Multiple Sheets from Excel")
st.code("""
df_dict = pd.read_excel('path_to_file.xlsx', sheet_name=[0, 1, 2])
for sheet, data in df_dict.items():
    print(f"Sheet: {sheet}")
    print(data.head())
""", language="python")
# Exporting Data Section
st.markdown("""
<h3 style="color: #5b2c6f;">Exporting Data to Excel Files</h3>
""", unsafe_allow_html=True)
# Exporting a Single DataFrame
st.write("#### Exporting a Single DataFrame")
st.code("""
data = pd.DataFrame({
'name': ['a', 'b', 'c', 'd'],
'age': [12, 23, 44, 43]
})
# Export the DataFrame to an Excel file
data.to_excel('single_sheet_output.xlsx', index=False)
""", language="python")
# Exporting Multiple DataFrames
st.write("#### Exporting Multiple DataFrames to Different Sheets")
st.code("""
data1 = pd.DataFrame({
'name': ['a', 'b', 'c', 'd'],
'age': [12, 23, 44, 43]
})
data2 = pd.DataFrame({
'maths': [43, 32, 45, 45],
'science': [32, 54, 45, 13]
})
data3 = pd.DataFrame({
'hindi': [32, 45, 53, 53],
'english': [53, 32, 24, 65]
})
# Export multiple DataFrames to an Excel file with multiple sheets
with pd.ExcelWriter('multi_sheet_output.xlsx') as writer:
    data1.to_excel(writer, sheet_name='Personal Info', index=False)
    data2.to_excel(writer, sheet_name='Academic Scores', index=False)
    data3.to_excel(writer, sheet_name='Language Scores', index=False)
""", language="python")
# Issues Section
st.markdown("""
<h3 style="color: #BB3385;">Common Issues with Excel Files</h3>
""", unsafe_allow_html=True)
# 1. File Format Compatibility
st.markdown("""
<h4 style="color: #5b2c6f;">1. File Format Compatibility</h4>
""", unsafe_allow_html=True)
st.write("Excel files may come in different formats like `.xls` and `.xlsx`, which can lead to compatibility issues.")
st.code("""
data = pd.read_excel('file.xls', engine='xlrd') # For .xls files
data = pd.read_excel('file.xlsx', engine='openpyxl') # For .xlsx files
print(data.head())
""", language="python")
# 2. Encoding Issues
st.markdown("""
<h4 style="color: #5b2c6f;">2. Encoding Issues</h4>
""", unsafe_allow_html=True)
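st.write("`.xlsx` workbooks store text as Unicode, so `read_excel()` takes no `encoding` argument; encoding problems usually appear when the data is exported to a text format such as CSV.")
st.code("""
data = pd.read_excel('file.xlsx') # No encoding argument is needed when reading
data.to_csv('file.csv', index=False, encoding='utf-8') # Set the encoding explicitly when exporting to text
print(data.head())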
""", language="python")
# 3. Missing or Incomplete Data
st.markdown("""
<h4 style="color: #5b2c6f;">3. Missing or Incomplete Data</h4>
""", unsafe_allow_html=True)
st.write("Missing values can lead to errors during data processing.")
st.code("""
data = pd.read_excel('file.xlsx')
data.fillna(0, inplace=True) # Replace NaN values with 0 or other defaults
print(data.head())
""", language="python")
# 4. Large File Sizes
st.markdown("""
<h4 style="color: #5b2c6f;">4. Large File Sizes</h4>
""", unsafe_allow_html=True)
st.write("Large Excel files may cause performance issues or run out of memory.")
st.code("""
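# read_excel() has no chunksize argument; read blocks of rows with nrows and skiprows instead
chunk_size = 1000
first_chunk = pd.read_excel('large_file.xlsx', nrows=chunk_size)
# Skip the data rows already read (row 0 is the header) to get the next block
next_chunk = pd.read_excel('large_file.xlsx', skiprows=range(1, chunk_size + 1), nrows=chunk_size)
print(first_chunk.head())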
""", language="python")
# 5. Sheet Name Selection
st.markdown("""
<h4 style="color: #5b2c6f;">5. Sheet Name Selection</h4>
""", unsafe_allow_html=True)
st.write("Excel files may have multiple sheets, and reading the wrong one can lead to incorrect analysis.")
st.code("""
# Specify the sheet name explicitly
data = pd.read_excel('file.xlsx', sheet_name='Sheet1')
print(data.head())
""", language="python")
# 6. Data Type Conversion
st.markdown("""
<h4 style="color: #5b2c6f;">6. Data Type Conversion</h4>
""", unsafe_allow_html=True)
st.write("Excel files may have columns with inconsistent or incorrect data types.")
st.code("""
# Convert columns to appropriate data types
data = pd.read_excel('file.xlsx')
data['column_name'] = data['column_name'].astype(int) # Replace 'column_name' with your column
print(data.dtypes)
""", language="python")
# 7. Hidden Characters or Whitespace
st.markdown("""
<h4 style="color: #5b2c6f;">7. Hidden Characters or Whitespace</h4>
""", unsafe_allow_html=True)
st.write("Whitespace or hidden characters in the data can cause parsing issues.")
st.code("""
# Remove leading/trailing whitespaces
data = pd.read_excel('file.xlsx')
data.columns = data.columns.str.strip() # Remove whitespace from column names
data['column_name'] = data['column_name'].str.strip() # Clean specific column
print(data.head())
""", language="python")
# 8. Merged Cells
st.markdown("""
<h4 style="color: #5b2c6f;">8. Merged Cells</h4>
""", unsafe_allow_html=True)
st.write("Merged cells in Excel can lead to missing or misaligned data.")
st.code("""
# read_excel() has no merge_cells argument; merged cells load as NaN except for the first cell
data = pd.read_excel('file.xlsx')
data = data.ffill() # Fill forward so each row carries the merged value
print(data.head())
""", language="python")
col1, col2 = st.columns(2)
with col1:
st.markdown("""
<a href="https://colab.research.google.com/drive/1qSWM2h-_ND9Nv7GVW9q5onpdcrlFauHc?usp=sharing" target="_blank" title="Open the associated Google Colab file">
<button style="background-color: #4CAF50; color: white; padding: 10px; font-size: 16px; border: none; cursor: pointer;">
Open Google Colab File
</button>
</a>
""", unsafe_allow_html=True)
with col2:
if st.button("⬅️ Back to Previous Page"):
st.session_state.current_page = "main"
st.query_params.update({})
#--------------------------------------------------------- Images and Video --------------------------------------------------------------------------------
elif st.session_state.current_page == "explore_images_video":
st.query_params.update({})
st.markdown("""
<h2 style="color: #BB3385;">Introduction to Images and Videos 📸🖼️</h2>
""", unsafe_allow_html=True)
# Subheading 1: What is an Image?
st.write("""
<div style="text-align: left; margin-top: 20px;">
<h3 style="color: #5b2c6f;">What is an Image?</h3>
<p style="font-size: 16px; color: #333;">
An image is a two-dimensional representation of the visible light spectrum, often captured
using devices like cameras or scanners. It can store details such as <strong>colors</strong>, <strong>shapes</strong>, and <strong>textures</strong>,
enabling us to visually interpret and analyze information.
Common formats include JPEG, PNG, and BMP.
</p>
</div>
""", unsafe_allow_html=True)
# Subheading 2: What is a Video?
st.write("""
<div style="text-align: left; margin-top: 20px;">
<h3 style="color: #5b2c6f;">What is a Video?</h3>
<p style="font-size: 16px; color: #333;">
A video is a sequence of images, called frames, displayed in rapid succession. Each frame captures
a moment in time, and playing the frames in order creates the impression of continuous movement.
Videos have a frame rate (e.g., 30 or 60 frames per second), which determines how many frames
are displayed each second.
Common video formats include MP4, AVI, and MOV.
</p>
</div>
""", unsafe_allow_html=True)
# Subheading 3: Why is an Image Called a Grid-Like Structure?
st.write("""
<div style="text-align: left; margin-top: 20px;">
<h3 style="color: #5b2c6f;">Why is an Image Called a Grid-Like Structure?</h3>
<p style="font-size: 16px; color: #333;">
Images are called <strong>grid-like structures</strong> because they are composed of <strong>pixels</strong> arranged in rows and columns,
forming a rectangular grid. Each <strong>pixel</strong> stores only a <strong>color</strong> or intensity value; <strong>shapes</strong> and <strong>patterns</strong> emerge from
how neighboring pixels are arranged, and together they define the image's content.
The total number of <strong>pixels</strong> is determined by the image's height and width (resolution), and a higher resolution provides better clarity.
</p>
<p style="font-size: 16px; color: #333;">
In images, <strong>pixels</strong> act as features, and the entire grid represents a single data point. This combination
of features and data points gives images their grid-like nature.
</p>
<p style="font-size: 16px; color: #333;">
While images and tabular data are both grid-like, the difference lies in interpretation: in images, the
grid represents one data point, while in tabular data, rows represent data points, and columns represent features.
</p>
</div>
""", unsafe_allow_html=True)
# Interactive Pixel Grid Section
st.subheader("Interactive Pixel Grid")
# User Input for Height and Width
height = st.number_input("Enter Image Height (pixels):", min_value=1, max_value=50, value=10, step=1)
width = st.number_input("Enter Image Width (pixels):", min_value=1, max_value=50, value=10, step=1)
# Display Resolution
resolution = height * width
st.write(f"**Image Resolution**: {resolution} pixels")
# Generate and Display Pixel Grid
st.write("**Pixel Grid Visualization:**")
grid = np.random.rand(int(height), int(width)) # Generate random grid values
fig, ax = plt.subplots()
cax = ax.imshow(grid, cmap="Pastel1")
plt.colorbar(cax, ax=ax) # Add color bar for context
ax.set_title("Pixel Grid")
ax.set_xlabel("Width (pixels)", fontsize=8) # Set smaller font size
ax.set_ylabel("Height (pixels)", fontsize=8) # Set smaller font size
# Render the Plot
st.pyplot(fig)
# Section: What are Color Spaces?
st.write("""
<div style="text-align: left; margin-top: 20px;">
<h3 style="color: #5b2c6f;">What are Color Spaces?</h3>
<p style="font-size: 16px; color: #333;">
A <strong>color space</strong> is a technique used to represent the <strong>colors of an image</strong> in a
numerical format. It allows us to preserve the <strong>color information</strong> while converting it into a
form that machines can understand. Since machines cannot <strong>"see"</strong> images as humans do, they interpret
<strong>numerical values</strong>. Therefore, color spaces are crucial for converting images into a format
that can be processed by a machine.
</p>
</div>
""", unsafe_allow_html=True)
# Section: Example of How ML Models Work with Images
st.write("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #e25822;">For Example:</h4>
<p style="font-size: 16px; color: #333;">
Imagine you're building a <strong>machine learning model</strong> to classify images of
<strong>dogs and cats</strong>. You provide the model with images, but since the machine cannot understand
images directly, you need to convert them into <strong>numerical data</strong>. This is where
<strong>color spaces</strong> play a vital role. They help convert the <strong>color information</strong> in
the images into numbers that the machine can process, allowing it to <strong>learn from the data</strong>
and make accurate predictions.
</p>
</div>
""", unsafe_allow_html=True)
# Section: Common Color Spaces
st.write("""
<div style="text-align: left; margin-top: 20px;">
<h4 style="color: #5b2c6f;">Common Color Spaces</h4>
<p style="font-size: 16px; color: #333;">
These are some of the <strong>common color spaces</strong> used in <strong>image processing</strong>:
</p>
<ol style="font-size: 16px; color: #333;">
<li><strong>Black and White</strong></li>
<li><strong>Grayscale</strong></li>
<li><strong>Red, Green, Blue (RGB)</strong></li>
</ol>
</div>
""", unsafe_allow_html=True)
st.subheader("What is Black and White Color Space?")
st.write("""
Black and White color space, also known as binary color space, represents an image using only two colors:
**black** and **white**.
- **0** represents **black**.
- **1** or **255** (depending on the encoding) represents **white**.
Each pixel in this color space is either completely black or completely white.
Black and White color space eliminates all color information, focusing entirely on light intensity.
""")
st.image(
"https://huggingface.co/spaces/LakshmiHarika/MachineLearning/resolve/main/Images/black_white_img.png",
caption="Black and White Color Space.",
use_container_width=True)
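# Added sketch: producing a binary image by thresholding (illustrative only)
st.markdown("""
One common way to obtain a binary image from grayscale intensities is thresholding. The sketch below is a minimal NumPy version; the sample values and the threshold of 127 are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 2x3 grayscale patch with intensities from 0 to 255
gray = np.array([[0, 100, 127],
                 [128, 200, 255]], dtype=np.uint8)

# Pixels above the threshold become white (255); the rest become black (0)
threshold = 127
binary = np.where(gray > threshold, 255, 0).astype(np.uint8)

print(binary)
```
""")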
# Section: What is Grayscale Color Space?
st.subheader("What is Grayscale Color Space?")
st.write("""
Grayscale color space represents an image using different shades of gray, ranging from **black** to **white**.
- **0** represents **black** (no light intensity).
- **255** represents **white** (maximum light intensity).
- Values between **0 and 255** represent varying shades of gray.
Grayscale eliminates color information, focusing entirely on the intensity of light in an image. Each pixel has only one intensity value, making it a simpler and more compact representation compared to color images.
""")
# Create grayscale gradient with labeled intensity values
gradient = np.linspace(0, 255, 256) # Generate gradient values
gradient = np.tile(gradient, (10, 1)) # Repeat the gradient to make it visually clear
# Plot the gradient
fig, ax = plt.subplots(figsize=(8, 1), facecolor='none') # Reduce height by half
ax.imshow(gradient, cmap='gray', aspect='auto')
ax.set_xticks(np.linspace(0, 255, 11)) # Set ticks for every 25.5 (0, 25, ..., 255)
ax.set_xticklabels([str(int(x)) for x in np.linspace(0, 255, 11)], fontsize=8, color='red') # Adjust font size
ax.set_yticks([]) # Remove y-axis ticks
ax.set_title("Grayscale Representation", fontsize=10)
# Save the figure with a transparent background
plt.savefig('grayscale_representation.png', transparent=True)
# Render the plot in Streamlit
st.pyplot(fig)
st.image(
"https://huggingface.co/spaces/LakshmiHarika/MachineLearning/resolve/main/Images/gray_scale.jpg",
caption="Gray Scale Color Space.",
use_container_width=True)
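# Added sketch: reducing a color pixel to one intensity (illustrative only)
st.markdown("""
A color pixel can be reduced to a single intensity with a weighted sum of its channels. The sketch below uses one widely used weighting (the ITU-R BT.601 coefficients, which OpenCV also documents for its RGB-to-gray conversion); the helper name `rgb_to_gray` is made up for this example.

```python
def rgb_to_gray(red: int, green: int, blue: int) -> int:
    # Weighted sum of the three channels, rounded to a whole intensity value
    return round(0.299 * red + 0.587 * green + 0.114 * blue)

white = rgb_to_gray(255, 255, 255)
black = rgb_to_gray(0, 0, 0)
pure_red = rgb_to_gray(255, 0, 0)

print(white, black, pure_red)
```

Green carries the largest weight because the eye is most sensitive to it, so a pure red pixel maps to a fairly dark gray.
""")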
st.subheader("What is RGB Color Space?")
st.write("""
RGB color space represents an image using three primary colors: **Red**, **Green**, and **Blue**. These colors form the basis of digital images and can be combined in different intensities to create a wide range of colors.
A colored image in the RGB color space is split into three separate channels:
- **Red Channel**: Contains the intensity of the red color at each pixel.
- **Green Channel**: Contains the intensity of the green color at each pixel.
- **Blue Channel**: Contains the intensity of the blue color at each pixel.
Each of these channels is represented as a **2D array**, where:
- Each pixel in the 2D array contains a value ranging from **0** (no intensity) to **255** (maximum intensity) for that color.
By combining the three channels, a wide range of colors can be formed. For example:
- **(255, 0, 0)** represents pure **Red**.
- **(0, 255, 0)** represents pure **Green**.
- **(0, 0, 255)** represents pure **Blue**.
- **(255, 255, 255)** represents **White**, where all channels are at maximum intensity.
- **(0, 0, 0)** represents **Black**, where all channels have no intensity.
- Combining colors, such as **Red + Green = Yellow**, **Green + Blue = Cyan**, and **Blue + Red = Magenta**, creates even more colors. By adjusting the intensity of each channel, millions of unique colors can be generated.
Computers interpret RGB images as **3D arrays**:
- The **width** and **height** of the 3D array correspond to the dimensions of the image.
- The **depth** of the 3D array corresponds to the number of color channels.
Altogether, these three channels combine to form a complete color image, enabling vibrant, precise, and dynamic representations of colors in digital media.
""")
st.image(
"https://huggingface.co/spaces/LakshmiHarika/MachineLearning/resolve/main/Images/rgb_1.jpg",
use_container_width=True)
st.image(
"https://huggingface.co/spaces/LakshmiHarika/MachineLearning/resolve/main/Images/rgb_2.jpg",
use_container_width=True)
st.image(
"https://huggingface.co/spaces/LakshmiHarika/MachineLearning/resolve/main/Images/rgb_3.jpg",
use_container_width=True)
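# Added sketch: an RGB image as a 3D array (illustrative only)
st.markdown("""
To see the 3D-array view in code, here is a minimal NumPy sketch of a 2x2 RGB image. The pixel colors are arbitrary assumptions; note that NumPy orders the axes as (height, width, channels).

```python
import numpy as np

# Hypothetical 2x2 RGB image: shape is (height, width, channels)
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 0] = (255, 0, 0)      # pure red
image[0, 1] = (0, 255, 0)      # pure green
image[1, 0] = (0, 0, 255)      # pure blue
image[1, 1] = (255, 255, 0)    # red + green = yellow

# Each channel on its own is a 2D array of intensities
red_channel = image[:, :, 0]

print(image.shape)
print(red_channel)
```
""")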
st.write("""
In the next section, we'll dive into the exciting world of **image processing using OpenCV**. We'll cover how to:
- **Read, display, and manipulate images** programmatically.
- Understand the **core operations** used in computer vision.
- Transform images to uncover hidden insights.
Curious to see how? 👇 Click **Image Operations with OpenCV** to start your journey into OpenCV Basics! 🚀
""")
col1, col2 = st.columns(2)
with col1:
if st.button("⬅️ Back to Previous Page"):
navigate_to("main")
st.query_params.update({})
with col2:
if st.button("➡️ Image Operations with OpenCV"):
navigate_to("opencv_operations")
st.query_params.update({})
elif st.session_state.current_page == "opencv_operations":
st.query_params.update({})
# Introduction to OpenCV Page
st.markdown("""
<h2 style="color: #BB3385;">OpenCV (Open Source Computer Vision Library)</h2>
""", unsafe_allow_html=True)
# Informative Content
st.write("""
Before diving into OpenCV basics, let's understand a few key points:
- In Python, we have several libraries to work with images. One of the most powerful and popular libraries is **OpenCV**.
- With **OpenCV**, we can provide machines with **artificial vision**, enabling them to perceive and process visual information.
- OpenCV allows us to work with both **images and videos**, making it a versatile tool for various computer vision applications.
""")
# What is OpenCV Section
st.markdown("""
<h3 style="color: #9400d3;">What is OpenCV?</h3>
""", unsafe_allow_html=True)
st.write("""
OpenCV, short for **Open Source Computer Vision Library**, is a popular open-source library designed for real-time computer vision and image processing tasks.
**Key Points**:
- **Purpose**: OpenCV helps provide artificial vision to machines, enabling them to understand and process visual information like images and videos.
- **Features**: OpenCV allows you to work with images and videos for tasks like transformation, filtering, and enhancement. It also supports real-time processing, making it ideal for dynamic applications.
- **Applications**: Commonly used in tasks such as image recognition, motion detection, video analytics, and robotics.
OpenCV is cross-platform, free to use, and designed for high performance, making it an essential tool for computer vision projects.
""")
# Installing OpenCV Section
st.markdown("""
<h3 style="color: #9400d3;">Installing OpenCV</h3>
""", unsafe_allow_html=True)
st.write("""
To start working with OpenCV, you need to install it in your Python environment. Here's how:
""")
st.write("1. Install OpenCV using pip:")
st.code("pip install opencv-python", language="bash")
st.write("2. Import OpenCV in your Python script:")
st.code("""
import cv2
print(cv2.__version__) # This will display the installed OpenCV version
""", language="python")
st.write("With OpenCV installed, let's learn basic image handling in OpenCV.")
st.write("## Basic Operations in OpenCV")
# Heading for Reading Images with Custom Color
st.markdown("""
<h3 style="color: #9400d3;">Reading an Image</h3>
""", unsafe_allow_html=True)
# About imread() function
st.write("""
To read an image and convert it into a machine-readable format, we use the **imread()** function from the cv2 module.
It reads the image file and converts it into a numerical array, where each element represents pixel intensity.
""")
# Code example
st.code("""
# Read the image
img = cv2.imread('path_to_image.jpg') # Replace 'path_to_image.jpg' with the image file path
# Display the numerical matrix
print(img) # This will print the image as an array of pixel values
""", language="python")
# Explanation for Grayscale Conversion
st.write("""
By default, the `imread()` function reads a color image as a 3D array with the channels in **BGR** (Blue, Green, Red) order, not RGB.
To read the image in grayscale instead, pass `0` (the value of `cv2.IMREAD_GRAYSCALE`) as the second argument to `imread()`. This returns a 2D array where each pixel value represents intensity.
""")
# Code example for Grayscale Conversion
st.code("""
# Read the image in grayscale
gray_img = cv2.imread('path_to_image.jpg', 0) # Replace 'path_to_image.jpg' with your image file path
# Display the numerical matrix for the grayscale image
print(gray_img) # This will print the 2D array representing pixel intensity
""", language="python")
# Displaying Images with OpenCV in Custom Color
st.markdown("""
<h3 style="color: #9400d3;">Displaying Images with OpenCV</h3>
""", unsafe_allow_html=True)
# Explanation of the functions
st.write("""
After creating or reading an image, we can display it using OpenCV. Here's how the key functions work together:
#### imshow()
- The `imshow()` function creates a **pop-up window** to display the image.
- Internally, it converts the numerical array into a visual image.
- **Parameters**:
  1. `Window Name`: Title of the pop-up window (string).
  2. `Image Array`: The array representing the image.
#### waitKey()
- **Purpose**: Waits for a key press while keeping the pop-up window responsive.
- **Key Modes**:
  - `waitKey(0)` or `waitKey()`: Waits indefinitely until a key is pressed.
  - `waitKey(n)`: Waits up to `n` milliseconds for a key press, then returns so the script can continue.
#### destroyAllWindows()
- The `destroyAllWindows()` function closes every window opened by `imshow()` and frees the memory they used.
- Without it, windows may remain open or stay allocated in memory after the script moves on.
These three functions must work together to display and manage images effectively.
""")
st.code("""
# imshow()
cv2.imshow(window_name, img_array)
# window_name: The title of the pop-up window
# img_array: The image data (Array)
# waitKey()
cv2.waitKey(delay_in_milliseconds)
# delay_in_milliseconds: Time in milliseconds to keep the window open
# Use 0 for infinite delay until a key is pressed
# destroyAllWindows()
cv2.destroyAllWindows()
# This ensures all windows opened by imshow() are cleared from RAM
""", language="python")
# Heading for Saving Images
st.markdown("""
<h3 style="color: #9400d3;">Saving an Image</h3>
""", unsafe_allow_html=True)
# About imwrite() function
st.write("""
To save an image file in OpenCV, we use the **imwrite()** function.
It converts the numerical array (image data) back into an image file format, such as `.jpg`, `.png`, or `.bmp`.
""")
# Code example
st.code("""
# Example: Save an image
saved = cv2.imwrite('saved_image.jpg', image_array) # image_array is the image data; imwrite() returns True on success
print("Image saved successfully!" if saved else "Image could not be saved.")
""", language="python")
# Add a link to the OpenCV documentation
st.markdown("""
For more detailed information on OpenCV functions and tutorials, visit the official OpenCV documentation:
[OpenCV Documentation](https://docs.opencv.org/4.x/)
""")
st.write("""
In the next section, we'll take a closer look at **image creation and manipulation using OpenCV**. We'll discuss:
- **Creating different types of images** (black-and-white, grayscale, and RGB).
- **Splitting images** into individual channels.
- **Converting images** between various color spaces.
Curious to learn more? 👇 Click **Explore Image Creation and Manipulation** to continue your journey with OpenCV! 🚀
""")
# First row: Colab Notes and Main Page
col1, col2 = st.columns(2)
with col1:
st.markdown("""
<a href="https://colab.research.google.com/drive/1P9nT1HmOcCah5nTctxCm7hVdqqYPUkzn?usp=sharing" target="_blank" title="Open Colab Notes">
<button style="background-color: #4CAF50; color: white; padding: 10px; font-size: 16px; border: none; cursor: pointer;">
Colab Notes
</button>
</a>
""", unsafe_allow_html=True)
with col2:
if st.button("Unstructured Data"):
navigate_to("main")
st.query_params.update({}) # Ensure scroll resets
# Second row: Previous Page and Explore Operations
col3, col4 = st.columns(2)
with col3:
if st.button("⬅️ Overview Page"):
navigate_to("explore_images_video")
st.query_params.update({}) # Ensure scroll resets
with col4:
if st.button("➡️ Explore Operations"):
navigate_to("image_operations")
st.query_params.update({}) # Ensure scroll resets
elif st.session_state.current_page == "image_operations":
st.query_params.update({})
# Heading for the section
st.markdown("""
<h2 style="color: #BB3385;">Creating, Splitting, and Converting Images with OpenCV</h2>
""", unsafe_allow_html=True)
# Short introduction to the section
st.write("""
In this section, we'll learn how to create different types of images, split them into their color channels, and convert between various color spaces to manipulate images more effectively.
""")
# Heading for Creating Black and White Image
st.markdown("""
<h3 style="color: #9400d3;">Creating a Black and White Image</h3>
""", unsafe_allow_html=True)
# Explanation
st.write("""
In OpenCV, black and white images are created by filling a matrix with pixel values:
- **Black image**: All pixel values are set to 0.
- **White image**: All pixel values are set to 255.
""")
# Code example
st.code("""
white_img = np.full((500, 500), 255, dtype=np.uint8) # Create a white image
black_img = np.zeros((500, 500), dtype=np.uint8) # Create a black image
# Display the images
cv2.imshow("White", white_img)
cv2.imshow("Black", black_img)
cv2.waitKey(0) # 0 means infinite delay
cv2.destroyAllWindows()
""", language="python")
# Heading for Creating Grayscale Image
st.markdown("""
<h3 style="color: #9400d3;">Creating a Grayscale Image</h3>
""", unsafe_allow_html=True)
# Explanation
st.write("""
In OpenCV, grayscale images are created by filling a matrix with pixel intensity values. The values range from 0 (black) to 255 (white).
""")
# Code example
st.code("""
gray_img = np.full((500, 500), 127, dtype=np.uint8) # Create a grayscale image (127 represents medium gray)
# Display the grayscale image
cv2.imshow("Grayscale", gray_img)
cv2.waitKey(0) # 0 means infinite delay
cv2.destroyAllWindows()
""", language="python")
# Heading for cv2.merge() function
st.markdown("""
<h3 style="color: #e25822;">Merging Color Channels</h3>
""", unsafe_allow_html=True)
# About cv2.merge() function
st.write("""
To combine multiple single-channel images (like Red, Green, and Blue) into a single multi-channel image, we use the **cv2.merge()** function.
This function merges individual color channels into a complete color image.
""")
# Syntax example for cv2.merge()
st.code("""
# Merging individual color channels (Blue, Green, Red)
merged_image = cv2.merge([blue_channel, green_channel, red_channel])
# blue_channel, green_channel, and red_channel are single-channel images of the same size and type.
# OpenCV stores color images in BGR order, so the channels are passed as blue, green, red.
""", language="python")
# Heading for Creating RGB Image
st.markdown("""
<h3 style="color: #9400d3;">Creating a Colored RGB Image</h3>
""", unsafe_allow_html=True)
# Explanation
st.write("""
To create a colored image, we build individual color channels (Blue, Green, Red) and merge them using `cv2.merge()`.
In this example:
- The **Blue channel** is filled with 255 (full intensity).
- The **Green channel** is set to 0 (no intensity).
- The **Red channel** is also set to 0 (no intensity).
Because OpenCV expects the channels in Blue, Green, Red (BGR) order, the position of the full-intensity channel in the list decides the color of the merged image, which is then displayed using OpenCV.
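
As a rough check of what the merge step produces, the sketch below builds the same kind of channels and stacks them with NumPy's `np.dstack`. This is only a stand-in for `cv2.merge()` so that the snippet runs without OpenCV, and the helper name `stack_channels` is made up for this illustration.

```python
import numpy as np

def stack_channels(first, second, third):
    # Stack three single-channel (H, W) arrays into one (H, W, 3) array.
    # Assumption: np.dstack is used here as a stand-in for cv2.merge.
    return np.dstack([first, second, third])

full = np.full((300, 300), 255, dtype=np.uint8)  # channel at full intensity
empty = np.zeros((300, 300), dtype=np.uint8)     # channel at zero intensity

# OpenCV interprets the last axis as Blue, Green, Red.
blue_img = stack_channels(full, empty, empty)
green_img = stack_channels(empty, full, empty)
red_img = stack_channels(empty, empty, full)

print(blue_img.shape)            # (300, 300, 3)
print(blue_img[0, 0].tolist())   # [255, 0, 0]
print(green_img[0, 0].tolist())  # [0, 255, 0]
print(red_img[0, 0].tolist())    # [0, 0, 255]
```

Each pixel holds three values, and the slot that carries 255 determines which color OpenCV would display.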
""")
# Code example
st.code("""
# Create individual color channels
b = np.full((300, 300), 255, dtype=np.uint8) # Blue channel
g = np.zeros((300, 300), dtype=np.uint8) # Green channel
r = np.zeros((300, 300), dtype=np.uint8) # Red channel
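# Merge the channels in BGR order; the slot holding the full-intensity array sets the color
b_img = cv2.merge([b, g, r]) # Blue image: 255 in the blue slot
g_img = cv2.merge([g, b, r]) # Green image: 255 in the green slot
r_img = cv2.merge([r, g, b]) # Red image: 255 in the red slot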
# Display the images
cv2.imshow("Blue", b_img)
cv2.imshow("Green", g_img)
cv2.imshow("Red", r_img)
cv2.waitKey(0) # Wait until a key is pressed
cv2.destroyAllWindows() # Close all OpenCV windows
""", language="python")
st.image(
"https://huggingface.co/spaces/LakshmiHarika/MachineLearning/resolve/main/Images/merging_rgb.jpg",
use_container_width=True)
# Heading for Splitting Channels
st.markdown("""
<h3 style="color: #e25822;">Splitting Channels</h3>
""", unsafe_allow_html=True)
# About cv2.split() function
st.write("""
The `cv2.split()` function in OpenCV is used to divide an image into its individual color channels.
It creates separate single-channel arrays for each color, allowing you to work with them independently.
For example, it splits a color image loaded with `cv2.imread()` into its Blue, Green, and Red channels, in that order.
""")
# Syntax for cv2.split() function
st.code("""
# Syntax for cv2.split()
channels = cv2.split(image)
# image: The input image (e.g., a BGR color image).
# channels: A list of single-channel images (e.g., Blue, Green, Red).
""", language="python")
# Heading for the section
st.markdown("""
<h3 style="color: #9400d3;">Splitting and Merging Color Channels</h3>
""", unsafe_allow_html=True)
# Code Example for Splitting and Merging Color Channels
st.code("""
img = cv2.imread('path_to_image.jpg') # Read the image (replace with the path to your own file)
b, g, r = cv2.split(img) # Split the image into Blue, Green, and Red channels
zeros = np.zeros(img.shape[:-1], dtype=np.uint8) # Create a zeros array to hold the empty channels
blue_channel = cv2.merge([b, zeros, zeros]) # Merge the Blue channel with zeros for Green and Red
green_channel = cv2.merge([zeros, g, zeros]) # Merge the Green channel with zeros for Blue and Red
red_channel = cv2.merge([zeros, zeros, r]) # Merge the Red channel with zeros for Blue and Green
# Display the individual color channels and the original image
cv2.imshow("Blue_channel", blue_channel)
cv2.imshow("Green_channel", green_channel)
cv2.imshow("Red_channel", red_channel)
cv2.imshow("Original_img", cv2.merge([b, g, r]))
cv2.waitKey(0)
cv2.destroyAllWindows()""", language="python")
st.image(
"https://huggingface.co/spaces/LakshmiHarika/MachineLearning/resolve/main/Images/splitting_rgb.jpeg",
use_container_width=True)
st.write("Once you upload an image, it will be split into its color channels (Blue, Green, and Red), with each channel displayed separately. You can then download the processed image.")
# Allow user to upload an image
uploaded_file = st.file_uploader("Upload an image", type=["jpg", "png", "jpeg"])
if uploaded_file is not None:
try:
# Convert the uploaded image to an OpenCV-compatible format
image = np.array(bytearray(uploaded_file.read()), dtype=np.uint8)
img = cv2.imdecode(image, 1) # Decode into an image
if img is None:
st.error("Uploaded file is not a valid image. Please upload a valid JPG or PNG file.")
else:
# Split the image into Blue, Green, and Red channels
b, g, r = cv2.split(img)
# Create a zeros array to hold the empty channels
zeros = np.zeros(img.shape[:-1], dtype=np.uint8)
# Merge the Blue channel with zeros for Green and Red
blue_channel = cv2.merge([b, zeros, zeros])
green_channel = cv2.merge([zeros, g, zeros])
red_channel = cv2.merge([zeros, zeros, r])
# Display the images with captions
st.image(blue_channel, caption="Blue Channel", channels="BGR", use_container_width=True)
st.image(green_channel, caption="Green Channel", channels="BGR", use_container_width=True)
st.image(red_channel, caption="Red Channel", channels="BGR", use_container_width=True)
# Merge the channels back together for the original image
original_img = cv2.merge([b, g, r])
# Display the original image
st.image(original_img, caption="Original Image", channels="BGR", use_container_width=True)
# Provide a download link for the processed image
st.download_button(
label="Download Merged Image",
data=cv2.imencode('.jpg', original_img)[1].tobytes(),
file_name="merged_image.jpg",
mime="image/jpeg"
)
except Exception as e:
st.error(f"An error occurred: {e}")
else:
st.write("Please upload an image to proceed.")
# Heading for cv2.cvtColor() function
st.markdown("""
<h3 style="color: #9400d3;">Converting Color Spaces</h3>
""", unsafe_allow_html=True)
# About cv2.cvtColor() function
st.write("""
The **`cv2.cvtColor()`** function in OpenCV is used to convert an image from one color space to another.
This function is widely used for various color space transformations, such as converting a color image to grayscale or converting between RGB and HSV.
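
To give a feel for what a color-to-grayscale conversion computes, here is a small pure-Python sketch of the luma weighting (0.299 R + 0.587 G + 0.114 B) that OpenCV documents for this conversion. The function name `bgr_pixel_to_gray` is hypothetical, and OpenCV's own integer arithmetic may differ by one intensity level for some inputs.

```python
def bgr_pixel_to_gray(b, g, r):
    # Weighted sum of the three channel values, rounded to the nearest integer.
    return int(round(0.114 * b + 0.587 * g + 0.299 * r))

print(bgr_pixel_to_gray(255, 0, 0))      # pure blue
print(bgr_pixel_to_gray(0, 255, 0))      # pure green
print(bgr_pixel_to_gray(0, 0, 255))      # pure red
print(bgr_pixel_to_gray(255, 255, 255))  # white
```

Green contributes the most to the gray value and blue the least, which is why a pure blue pixel turns into a very dark gray.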
""")
# Syntax example for cv2.cvtColor()
st.code("""
# Converting a BGR image (as loaded by cv2.imread) to grayscale
gray_img = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)""", language="python") # Convert from BGR to Grayscale
st.image(
"https://huggingface.co/spaces/LakshmiHarika/MachineLearning/resolve/main/Images/rgb_to_gray.jpg",
use_container_width=True)
st.write("""
In the next section, we will dive into **video processing using OpenCV**. We will explore how to:
- Use various OpenCV functions for handling video data.
- Play videos using OpenCV.
- Capture images from live video streams.
Stay tuned for an exciting exploration of video handling!
""")
# First row: Colab Notes and Main Page
col1, col2 = st.columns(2)
with col1:
st.markdown("""
<a href="https://colab.research.google.com/drive/YOUR_COLAB_NOTEBOOK_LINK" target="_blank" title="Open Colab Notes">
<button style="background-color: #4CAF50; color: white; padding: 10px; font-size: 16px; border: none; cursor: pointer;">
Colab Notes
</button>
</a>
""", unsafe_allow_html=True)
with col2:
if st.button("Unstructured Data"):
navigate_to("main")
st.query_params.update({}) # Reset scroll position
# Second row: Custom Navigation Buttons
col3, col4 = st.columns(2)
with col3:
if st.button("⬅️ Back to Image Operations"):
navigate_to("opencv_operations") # Replace with the actual previous page identifier
st.query_params.update({}) # Reset scroll position
with col4:
if st.button("➡️ Explore Video Processing"):
navigate_to("video_processing") # Replace with the actual next page identifier
st.query_params.update({}) # Reset scroll position
elif st.session_state.current_page == "video_processing":
st.query_params.update({})
# Heading for Introduction to Video Processing
st.markdown("""
<h3 style="color: #9400d3;">Introduction to Video Processing</h3>
""", unsafe_allow_html=True)
# Explanation about Video Processing
st.write("""
In computer vision, **video processing** refers to the analysis and manipulation of video data, which is essentially a series of images (frames) displayed in sequence. Each frame is processed individually, and the sequence is used to analyze changes or actions over time.
Video processing allows us to work with various types of video data, including video files or real-time video streams. Using OpenCV, we can read, display, manipulate, and save video files, as well as capture video from a camera or webcam.
""")
# Heading for How OpenCV Handles Videos
st.markdown("""
<h3 style="color: #9400d3;">How OpenCV Handles Videos</h3>
""", unsafe_allow_html=True)
# Explanation about How OpenCV Handles Videos
st.write("""
OpenCV provides simple and efficient methods to handle videos. Videos are essentially a sequence of images (frames) shown in rapid succession. OpenCV reads and processes each frame of the video in real time, much like how it handles individual images.
To work with videos in OpenCV, the primary function is **`cv2.VideoCapture()`**, which allows you to:
- **Load** video files or live video streams.
- **Read** individual frames from the video.
- **Display** the frames in a window.
- **Process** each frame just like an image.
Once the video is loaded, OpenCV processes each frame in a loop until the video ends or the user stops it. You can apply image processing techniques to each frame, such as transformations, filtering, or object detection, before displaying or saving the modified video.
""")
# Heading for Playing Videos with OpenCV
st.markdown("""
<h3 style="color: #9400d3;">Playing Videos with OpenCV</h3>
""", unsafe_allow_html=True)
# Explanation
st.write("""
To play a video using OpenCV, we load the video with **`cv2.VideoCapture()`** and display each frame using **`cv2.imshow()`**. You can stop the video by pressing 'q'.
""")
# Code example for playing a video
st.code("""
# Load the video
vid = cv2.VideoCapture('path_to_video.mp4')
# Loop to read frames
while True:
    succ, img = vid.read() # Read each frame
    if not succ: # Exit if no frame is read
        break
    cv2.imshow("video", img) # Show the frame
    # Press 'q' to quit
    if cv2.waitKey(1) & 255 == ord("q"):
        break
# Release the video and close the window
vid.release()
cv2.destroyAllWindows()
""", language="python")
# Heading for cv2.read()
st.markdown("""
<h3 style="color: #e25822;"> Understanding vid.read()</h3>
""", unsafe_allow_html=True)
# Explanation for vid.read()
st.write("""
The **`vid.read()`** function is used to read one frame at a time from the video file.
It returns two values:
1. **`succ`**: A boolean that indicates whether the frame was successfully read.
- **True** if the frame was read successfully.
- **False** if the frame could not be read (usually when the video ends).
2. **`img`**: The actual frame (image) read from the video. This frame is returned as a NumPy array and can be processed just like any image.
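
The read loop built on these two return values can be sketched without a real video. The `FakeCapture` class below is a hypothetical stand-in for `cv2.VideoCapture` that hands out a fixed list of frames, so the example only illustrates the `(succ, img)` contract and not OpenCV itself.

```python
class FakeCapture:
    # Hypothetical stand-in for cv2.VideoCapture that yields a fixed number of frames.
    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        # Mimic the (success flag, frame) pair returned by vid.read().
        if self._frames:
            return True, self._frames.pop(0)
        return False, None

def count_frames(capture):
    # Read until the success flag turns False, then report how many frames were read.
    count = 0
    while True:
        succ, img = capture.read()
        if not succ:
            break
        count += 1
    return count

print(count_frames(FakeCapture(["frame0", "frame1", "frame2"])))  # 3
```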
""")
# Short Heading
st.markdown("""
<h3 style="color: #e25822;">Understanding cv2.waitKey()</h3>
""", unsafe_allow_html=True)
# Explanation
st.write("""
The line `if cv2.waitKey(1) & 255 == ord('q'):` is used in OpenCV to check if a specific key is pressed while processing video. Here's a simple explanation of what it does:
- **`cv2.waitKey(1)`**:
- Waits for a key press for **1 millisecond**.
- If a key is pressed, it returns the key's code. If no key is pressed, it returns `-1`.
- **`& 255`**:
- Ensures the key code is compatible across different systems.
- Keeps only the last **8 bits**, which represent the key code.
- **`ord('q')`**:
- Finds the ASCII code for the letter `'q'`.
- The ASCII code for `'q'` is **113**.
- This is used to check if the user pressed the `'q'` key to stop the program.
### Full Condition:
```python
if cv2.waitKey(1) & 255 == ord('q'):
break
```
This stops the video when the 'q' key is pressed.
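
The masking step can be tried out in plain Python. The raw codes below are made-up values chosen for illustration, on the assumption that some systems report key codes with extra high bits set; `is_key` is a hypothetical helper name.

```python
def is_key(raw_code, char):
    # Keep only the low 8 bits of the raw code, then compare with the character's code.
    # In Python, & binds tighter than ==, so no extra parentheses are needed.
    return raw_code & 255 == ord(char)

print(ord('q'))                     # 113
print(is_key(113, 'q'))             # True
print(is_key(0x100000 + 113, 'q'))  # True: the extra high bits are masked away
print(is_key(-1, 'q'))              # False: -1 & 255 is 255, which is not 113
```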
""")
# Heading for Converting BGR to Grayscale
st.markdown("""
<h3 style="color: #9400d3;">Converting BGR Video to Grayscale</h3>
""", unsafe_allow_html=True)
# Brief Explanation
st.write("""
You can handle video frames one at a time and process them as needed. The following example shows how to:
- Convert each frame of a video from BGR to grayscale.
- Display both the original and grayscale video frames side by side.
""")
# Code Example
st.code("""
vid = cv2.VideoCapture('path_to_video.mp4')
while True:
    succ, img = vid.read()
    if not succ:
        break
    # Convert frame from BGR to grayscale
    img1 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Display the original (colored) and grayscale frames
    cv2.imshow("Colored Video", img)
    cv2.imshow("Grayscale Video", img1)
    # Press 'q' to quit the video
    if cv2.waitKey(1) & 255 == ord("q"):
        break
vid.release()
cv2.destroyAllWindows()
""", language="python")
# Heading for Splitting Video into Color Channels
st.markdown("""
<h3 style="color: #9400d3;">Splitting Colored Video into Different Channels</h3>
""", unsafe_allow_html=True)
# Brief Explanation
st.write("""
Each frame of a colored video consists of three channels: Blue, Green, and Red (BGR).
The following example demonstrates how to:
- Split the video frames into separate Blue, Green, and Red channels.
- Display the original video alongside each color channel.
""")
# Code Example
st.code("""
vid = cv2.VideoCapture('path_to_video.mp4')
while True:
    succ, img = vid.read()
    if not succ:
        break
    # Convert frame to grayscale
    img1 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Split the frame into B, G, R channels
    b, g, r = cv2.split(img)
    z = np.zeros(img.shape[:-1], dtype=np.uint8)
    blue_channel = cv2.merge([b, z, z])
    green_channel = cv2.merge([z, g, z])
    red_channel = cv2.merge([z, z, r])
    # Display the frames
    cv2.imshow("Colored Video", img)
    cv2.imshow("Grayscale Video", img1)
    cv2.imshow("Blue Channel", blue_channel)
    cv2.imshow("Green Channel", green_channel)
    cv2.imshow("Red Channel", red_channel)
    # Press 'q' to quit
    if cv2.waitKey(1) & 255 == ord("q"):
        break
vid.release()
cv2.destroyAllWindows()
""", language="python")
# Heading for Capturing Frames via Webcam
st.markdown("""
<h3 style="color: #9400d3;">Capturing Frames While Live Streaming Using Webcam</h3>
""", unsafe_allow_html=True)
# Brief Explanation
st.write("""
OpenCV allows you to access your webcam for live video streaming. The `cv2.VideoCapture()` function is used to activate the webcam. Here's how it works:
- **`cv2.VideoCapture(0)`**:
- The argument `0` tells OpenCV to access the default webcam on your computer.
- If you have multiple cameras, you can pass other IDs (like `1`, `2`) to access them.
- It creates a connection with the webcam and starts capturing frames in real time.
The following example demonstrates how to:
- Activate the webcam.
- Display the live stream.
- Close the webcam window by pressing the 'p' key.
""")
# Code Example
st.code("""
vid = cv2.VideoCapture(0) # 0 indicates the default webcam
while True:
    succ, img = vid.read()
    if not succ: # Stop if the webcam does not deliver a frame
        print("Camera not working")
        break
    # Display the live stream
    cv2.imshow("Live Stream", img)
    # Press 'p' to stop the live stream
    if cv2.waitKey(1) & 255 == ord("p"):
        break
vid.release()
cv2.destroyAllWindows()
""", language="python")
# Heading for Capturing and Saving Frames
st.markdown("""
<h3 style="color: #9400d3;">Capturing and Saving Frames</h3>
""", unsafe_allow_html=True)
# Brief Explanation
st.write("""
This code uses OpenCV to access the webcam, display the video feed, and save specific frames as image files:
- **Webcam Activation**: The `cv2.VideoCapture(0)` function initializes the default webcam.
- **Capturing Frames**: Press **'s'** to capture and save the current frame to a specified directory.
- **Stopping the Stream**: Press **'p'** to stop the webcam and close the application.
""")
# Code Example
st.code("""
vid = cv2.VideoCapture(0) # Open webcam
c = 0 # Counter for naming saved images
while True:
    succ, img = vid.read()
    if not succ: # Check if the webcam is working
        print("Camera not working")
        break
    cv2.imshow("Live Stream", img) # Display live stream
    key = cv2.waitKey(1) & 255 # Read the key once per frame so no key press is missed
    # Save frame as an image file when 's' is pressed
    if key == ord("s"):
        cv2.imwrite('path_to_save_directory/{}.jpg'.format(c), img)
        print("Image is captured and saved")
        c += 1 # Increment counter for next image name
    # Quit live stream when 'p' is pressed
    elif key == ord("p"):
        break
vid.release()
cv2.destroyAllWindows()
""", language="python")
# Concluding the Current Section
st.write("""
In the next section, we will explore **image transformations using OpenCV**. We will cover how to:
- Rotate images at various angles.
- Flip images horizontally and vertically.
- Scale and resize images to different dimensions.
Get ready to learn about powerful image transformation techniques!
""")
# First row: Colab Notes and Main Page
col1, col2 = st.columns(2)
with col1:
st.markdown("""
<a href="https://colab.research.google.com/drive/1zzovaAr4NNJoJSR-lA9ApSyPguLKf3UB?usp=sharing" target="_blank">
<button style="background-color: #4CAF50; color: white; padding: 10px; font-size: 16px; border: none; cursor: pointer;">
Colab Notes
</button>
</a>
""", unsafe_allow_html=True)
with col2:
if st.button("Unstructured Data"):
navigate_to("main") # Main page: Images & Videos
st.query_params.update({}) # Ensure scroll resets
# Second row: Previous Page and Video Transformations
col3, col4 = st.columns(2)
with col3:
if st.button("⬅️ Back to Image Operations"):
navigate_to("image_operations") # Previous page: Explore Image Creation and Manipulation
st.query_params.update({}) # Ensure scroll resets
with col4:
if st.button("➡️ Explore Image Transformations"):
navigate_to("image_transformations") # Next page: Explore Image Transformations with OpenCV
st.query_params.update({}) # Ensure scroll resets
elif st.session_state.current_page == "image_transformations":
st.query_params.update({})
st.markdown("""
<h2 style="color: #BB3385;">Image Augmentation Techniques</h2>
""", unsafe_allow_html=True)
# Page: What is Image Augmentation?
# Heading
st.markdown("""
<h3 style="color: #9400d3;">What is Image Augmentation?</h3>
""", unsafe_allow_html=True)
# Definition
st.write("""
Image augmentation is a method used to enhance the size and variety of an image dataset by applying transformations to existing images.
These transformations introduce variations while preserving the core features of the image, making it useful for training machine learning models to handle diverse inputs.
**How It Works**
Image augmentation applies transformations like resizing, rotation, flipping, and more to the original image. These changes simulate real-world variations, ensuring that machine learning models can identify patterns even with differences in perspective, scale, or lighting conditions.
The key idea is to preserve the original features of the image while introducing diversity. For example, if we take an image and apply five different transformations, we generate five new variations of that image. This provides the model with more data to learn from, improving its performance and ability to generalize.
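
As a minimal sketch of the "five transformations, five variations" idea, the snippet below produces five variants of one array using NumPy flips and quarter-turn rotations. The choice of these five operations and the helper name `make_variants` are assumptions made for illustration; real augmentation pipelines usually add randomness and many more operations.

```python
import numpy as np

def make_variants(img):
    # Return five transformed versions of a 2D or 3D image array.
    return [
        np.fliplr(img),       # horizontal flip
        np.flipud(img),       # vertical flip
        np.rot90(img, k=1),   # 90 degrees counterclockwise
        np.rot90(img, k=2),   # 180 degrees
        np.rot90(img, k=-1),  # 90 degrees clockwise
    ]

img = np.arange(12, dtype=np.uint8).reshape(3, 4)
variants = make_variants(img)
print(len(variants))  # 5
```

Every variant contains exactly the same pixel values as the original, only arranged differently, which is what "preserving the original features" means here.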
""")
# Types of Image Augmentation
st.markdown("""
<h3 style="color: #9400d3;">Types of Image Augmentation</h3>
""", unsafe_allow_html=True)
st.write("""
Image augmentation is broadly categorized into two types:
1. **Affine Transformations**
2. **Non-Affine Transformations**
""")
# Affine Transformations
st.markdown("""
<h3 style="color: #9400d3;">Affine Transformations</h3>
""", unsafe_allow_html=True)
st.write("""
**Affine Transformations** are transformations where:
1. The transformed image and the original image maintain **parallelism between lines**.
2. In some cases, the **angle between lines** and the **length of the lines** may also be preserved.
These transformations ensure that the geometric relationships within the image remain intact, even as the image is resized, rotated, or shifted.
Affine transformations are performed using a 2×3 **affine matrix**, which maps the original image coordinates to new coordinates.
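
The sketch below applies a matrix of the form [[a, b, tx], [c, d, ty]] to individual points in plain Python and checks that two parallel segments stay parallel afterwards. The matrix values and the helper names `apply_affine`, `direction`, and `is_parallel` are chosen purely for illustration.

```python
def apply_affine(matrix, point):
    # matrix is a nested list [[a, b, tx], [c, d, ty]]; point is (x, y).
    (a, b, tx), (c, d, ty) = matrix
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)

def direction(p, q):
    # Direction vector from point p to point q.
    return (q[0] - p[0], q[1] - p[1])

def is_parallel(u, v):
    # Two 2D vectors are parallel when their cross product is zero.
    return abs(u[0] * v[1] - u[1] * v[0]) < 1e-9

# Example matrix: a shear along x combined with a shift.
M = [[1, 0.5, 10], [0, 1, 20]]

# Two parallel segments in the original image.
seg1 = ((0, 0), (2, 4))
seg2 = ((5, 1), (7, 5))

new1 = tuple(apply_affine(M, p) for p in seg1)
new2 = tuple(apply_affine(M, p) for p in seg2)

print(new1)  # ((10.0, 20), (14.0, 24))
print(is_parallel(direction(*new1), direction(*new2)))  # True
```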
""")
st.markdown("""
<h3 style="color: #e25822;">Common Affine Transformations:</h3>
""", unsafe_allow_html=True)
st.write("""
1. **Scaling**: Changing the size of the image while maintaining its proportions.
2. **Translation**: Shifting the image horizontally, vertically, or both.
3. **Rotation**: Rotating the image around a specified center point.
4. **Shearing**: Slanting the image along the x or y axis, creating a skewed effect.
5. **Cropping**: Extracting a specific portion of the image, usually to focus on a region of interest.
These transformations keep straight lines straight and parallel lines parallel, so the relationships between points in the image remain consistent.
""")
st.image(
"https://huggingface.co/spaces/LakshmiHarika/MachineLearning/resolve/main/Images/affine_transformations.png",
use_container_width=True)
# Explanation for Translation
st.markdown("""
<h3 style="color: #9400d3;">Translation</h3>
""", unsafe_allow_html=True)
st.write("""
**Translation** involves moving an image from one location to another along the x-axis, y-axis, or both. It adjusts the position of the image on the canvas without modifying its original content.
The transformation is performed using a translation matrix:
""")
st.write("""
The translation matrix is represented as:
[[1, 0, tx], [0, 1, ty]]
Here:
- **tx**: Specifies the shift along the x-axis (horizontal axis).
- **ty**: Specifies the shift along the y-axis (vertical axis).
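
For whole-pixel shifts, the effect of this matrix can be imitated with NumPy slicing. The helper `translate` below is a hypothetical stand-in, not OpenCV's implementation: it assumes a 2D image, non-negative integer shifts, an output of the same size, and zeros in the area the image moved away from.

```python
import numpy as np

def translate(img, tx, ty):
    # Shift a 2D image right by tx and down by ty (non-negative integers),
    # keeping the output the same size and filling uncovered pixels with 0.
    h, w = img.shape
    out = np.zeros_like(img)
    if tx < w and ty < h:
        out[ty:, tx:] = img[:h - ty, :w - tx]
    return out

img = np.arange(1, 13, dtype=np.uint8).reshape(3, 4)
shifted = translate(img, tx=1, ty=1)
print(shifted.tolist())  # [[0, 0, 0, 0], [0, 1, 2, 3], [0, 5, 6, 7]]
```

The value that started at column 0, row 0 ends up at column 1, row 1, and whatever is pushed past the right or bottom edge is lost.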
""")
st.code("""
# Load the image
img = cv2.imread('path_to_image.jpg')
# Define translation parameters
tx = 100 # Shift 100 pixels along the x-axis
ty = 50 # Shift 50 pixels along the y-axis
# Create the translation matrix
translation_matrix = np.array([[1, 0, tx], [0, 1, ty]], dtype=np.float32)
# Apply translation; the output size is given as (width, height)
h, w = img.shape[:2]
translated_img = cv2.warpAffine(img, translation_matrix, (w, h))
# Display the images
cv2.imshow("Original Image", img)
cv2.imshow("Translated Image", translated_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
""", language="python")
# Explanation for Rotation
st.markdown("""
<h3 style="color: #9400d3;">Rotation</h3>
""", unsafe_allow_html=True)
st.write("""
**Rotation** involves rotating an image around a specified center point by a given angle. It changes the orientation of the image while preserving its content.
The rotation is performed using a rotation matrix:
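[[cos(θ), -sin(θ), tx], [sin(θ), cos(θ), ty]]
Here:
- **θ (theta)**: Specifies the rotation angle. `cv2.getRotationMatrix2D()` takes it in degrees, and positive values rotate counterclockwise.
- **tx, ty**: Specifies the adjustments to reposition the rotated image.
- **Scale**: A factor that multiplies the cos(θ) and sin(θ) terms to resize the image during rotation.
Because image coordinates have the y-axis pointing down, `cv2.getRotationMatrix2D()` returns this matrix with the signs of the two sin(θ) terms swapped.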
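
To see how the center, angle, and scale turn into matrix entries, the sketch below rebuilds the matrix in plain Python from the formula given in the OpenCV documentation for `cv2.getRotationMatrix2D()`. The function names `rotation_matrix` and `apply` are hypothetical, and the code is an illustration of that formula rather than OpenCV's own implementation.

```python
import math

def rotation_matrix(center, angle_deg, scale=1.0):
    # alpha = scale * cos(angle), beta = scale * sin(angle)
    # matrix = [[alpha, beta, (1 - alpha) * cx - beta * cy],
    #           [-beta, alpha, beta * cx + (1 - alpha) * cy]]
    cx, cy = center
    theta = math.radians(angle_deg)
    alpha = scale * math.cos(theta)
    beta = scale * math.sin(theta)
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]

def apply(matrix, point):
    # Map a single (x, y) point through the 2x3 matrix.
    (a, b, tx), (c, d, ty) = matrix
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)

M = rotation_matrix(center=(150, 100), angle_deg=50)
print(apply(M, (150, 100)))  # the center stays (approximately) where it was
```

The third column exists only to keep the chosen center fixed; without it the image would rotate around the top-left corner.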
""")
# Code Example
st.code("""
# Load the image
img = cv2.imread('path_to_image.jpg')
# Define the rotation matrix around the image center
h, w = img.shape[:2]
r_m = cv2.getRotationMatrix2D((w / 2, h / 2), 50, 1) # Center of the image, Rotate by 50 degrees, Scale = 1
# Apply rotation; the output size is given as (width, height)
r_img = cv2.warpAffine(img, r_m, (w, h), borderMode=cv2.BORDER_DEFAULT)
# Display the images
cv2.imshow("Original Image", img)
cv2.imshow("Rotated Image", r_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
""", language="python")
# Explanation for Direct Rotation
st.markdown("""
<h3 style="color: #9400d3;">Direct Rotation Using cv2.rotate</h3>
""", unsafe_allow_html=True)
st.write("""
OpenCV provides a direct method for rotating images with predefined angles: `cv2.rotate`.
This method simplifies rotation operations for 90°, 180°, and 270° (clockwise or counterclockwise) without requiring a custom rotation matrix.
- **`cv2.ROTATE_180`**: Rotates the image by 180°.
- **`cv2.ROTATE_90_CLOCKWISE`**: Rotates the image by 90° clockwise.
- **`cv2.ROTATE_90_COUNTERCLOCKWISE`**: Rotates the image by 90° counterclockwise.
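
What these three modes do to the pixel layout can be checked on a tiny array. The sketch uses NumPy's `np.rot90` as a stand-in so it runs without OpenCV; the assumption is that it yields the same arrangement of pixels as the corresponding `cv2.rotate` mode.

```python
import numpy as np

img = np.array([[1, 2, 3],
                [4, 5, 6]], dtype=np.uint8)

rot_180 = np.rot90(img, k=2)   # same layout as cv2.ROTATE_180
rot_cw = np.rot90(img, k=-1)   # same layout as cv2.ROTATE_90_CLOCKWISE
rot_ccw = np.rot90(img, k=1)   # same layout as cv2.ROTATE_90_COUNTERCLOCKWISE

print(rot_cw.tolist())   # [[4, 1], [5, 2], [6, 3]]
print(rot_ccw.tolist())  # [[3, 6], [2, 5], [1, 4]]
print(rot_180.tolist())  # [[6, 5, 4], [3, 2, 1]]
```

Note that the quarter turns swap height and width: the 2×3 input becomes a 3×2 output.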
""")
# Code Example
st.code("""
# Rotate the image using predefined rotation modes
img1 = cv2.rotate(img, cv2.ROTATE_180) # Rotate 180 degrees
img2 = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE) # Rotate 90 degrees clockwise
img3 = cv2.rotate(img, cv2.ROTATE_90_COUNTERCLOCKWISE) # Rotate 90 degrees counterclockwise
# Display the images
cv2.imshow("Original Image", img)
cv2.imshow("Rotated 180°", img1)
cv2.imshow("Rotated 90° Clockwise", img2)
cv2.imshow("Rotated 90° Counterclockwise", img3)
cv2.waitKey(0)
cv2.destroyAllWindows()
""", language="python")
# Explanation for Shearing
st.markdown("""
<h3 style="color: #9400d3;">Shearing</h3>
""", unsafe_allow_html=True)
st.write("""
**Shearing** is a transformation that slants the shape of an image along the x-axis, y-axis, or both. It skews the content of the image, creating a shifted or stretched effect.
The transformation is performed using a shearing matrix:
""")
st.write("""
The shearing matrix is represented as:
For x-axis shear:
[[1, shx, 0], [0, 1, 0]]
For y-axis shear:
[[1, 0, 0], [shy, 1, 0]]
Here:
- **shx**: Shear factor along the x-axis.
- **shy**: Shear factor along the y-axis.
""")
st.code("""
# Load the image
img = cv2.imread('path_to_image.jpg')
# Define shearing parameters
shx = 1 # Shear factor along the x-axis
shy = 3 # Shear factor along the y-axis
tx = 0 # Translation along the x-axis
ty = 0 # Translation along the y-axis
# Create the shearing matrix
shearing_matrix = np.array([[1, shx, tx], [shy, 1, ty]], dtype=np.float32)
# Apply the shearing transformation
sheared_img = cv2.warpAffine(img, shearing_matrix, (300, 300))
# Display the original and sheared images
cv2.imshow("Original Image", img)
cv2.imshow("Sheared Image", sheared_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
""", language="python")
# Explanation for Scaling
st.markdown("""
<h3 style="color: #9400d3;">Scaling</h3>
""", unsafe_allow_html=True)
st.write("""
**Scaling** is a transformation that changes the size of an image. It can be used to enlarge or shrink the image while maintaining its original proportions or altering them.
Scaling is performed using a scaling matrix:
""")
st.write("""
The scaling matrix is represented as:
[[sx, 0, 0], [0, sy, 0]]
Here:
- **sx**: Scaling factor along the x-axis.
- **sy**: Scaling factor along the y-axis.
- If `sx` and `sy` are greater than 1, the image is enlarged.
- If `sx` and `sy` are less than 1, the image is shrunk.
""")
st.code("""
# Load the image
img = cv2.imread('path_to_image.jpg')
# Define scaling and translation parameters
sx, sy = 2, 1 # Scale by 2 along the x-axis and 1 along the y-axis
tx, ty = 0, 0 # No translation
# Create the scaling matrix
scaling_matrix = np.array([[sx, 0, tx], [0, sy, ty]], dtype=np.float32)
# Apply scaling; the output size (width, height) grows with the scale factors
h, w = img.shape[:2]
scaled_img = cv2.warpAffine(img, scaling_matrix, (int(w * sx), int(h * sy)))
# Display the images
cv2.imshow("Original Image", img)
cv2.imshow("Scaled Image", scaled_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
""", language="python")
# Explanation for Cropping
st.markdown("""
<h3 style="color: #9400d3;">Cropping</h3>
""", unsafe_allow_html=True)
st.write("""
**Cropping** is a transformation that extracts a specific portion of an image, usually to focus on a region of interest.
It is achieved by selecting a rectangular region of the image using pixel coordinates.
The process involves defining the coordinates for:
- **Top-left corner (x1, y1)**: Starting point of the crop.
- **Bottom-right corner (x2, y2)**: Ending point of the crop.
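
Because an OpenCV image is a NumPy array, cropping is plain array slicing, with rows (y) given before columns (x). The sketch below uses a blank array instead of a loaded file and also shows a detail that is easy to miss: a slice is a view of the original data, so `.copy()` is needed for an independent crop. The coordinates are example values.

```python
import numpy as np

img = np.zeros((300, 400, 3), dtype=np.uint8)  # height 300, width 400

x1, y1 = 50, 50    # top-left corner
x2, y2 = 200, 150  # bottom-right corner

view = img[y1:y2, x1:x2]         # rows (y) first, then columns (x)
copy = img[y1:y2, x1:x2].copy()  # independent copy of the same region

print(view.shape)  # (100, 150, 3)

view[:] = 255      # writing to the slice also changes the original image
print(img[y1, x1].tolist())  # [255, 255, 255]
print(copy[0, 0].tolist())   # [0, 0, 0]
```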
""")
st.code("""
# Load the image
img = cv2.imread('path_to_image.jpg')
# Define crop coordinates
x1, y1 = 50, 50 # Top-left corner
x2, y2 = 200, 200 # Bottom-right corner
# Crop the image
cropped_img = img[y1:y2, x1:x2]
# Display the images
cv2.imshow("Original Image", img)
cv2.imshow("Cropped Image", cropped_img)
cv2.waitKey(0)
cv2.destroyAllWindows()
""", language="python")
# Function to apply affine transformations
def apply_affine_transformation(image, transformation_type):
transformed_images = []
rows, cols, _ = image.shape
for i in range(1, 11): # Generate 10 variations
if transformation_type == "Rotation":
angle = i * 10
M = cv2.getRotationMatrix2D((cols / 2, rows / 2), angle, 1)
elif transformation_type == "Scaling":
scale = 1 + (i * 0.05) # Reduced scale increments
M = np.float32([[scale, 0, 0], [0, scale, 0]])
elif transformation_type == "Translation":
tx, ty = i * 5, i * 5 # Reduced translation
M = np.float32([[1, 0, tx], [0, 1, ty]])
elif transformation_type == "Shearing":
shear = 0.05 * i # Reduced shear factor
M = np.float32([[1, shear, 0], [shear, 1, 0]])
elif transformation_type == "Cropping":
# Simple cropping: reduce the size incrementally
x1, y1 = i * 5, i * 5
x2, y2 = cols - i * 5, rows - i * 5
transformed_image = image[y1:y2, x1:x2]
transformed_images.append(transformed_image)
continue # Skip warpAffine for cropping
else:
st.error("Invalid transformation type!")
return []
transformed_image = cv2.warpAffine(image, M, (cols, rows))
transformed_images.append(transformed_image)
return transformed_images
# Streamlit App
st.title("Dynamic Affine Transformation Tool")
st.write("Select a transformation type to proceed and learn how it works before uploading an image.")
# Transformation Options
transformation = st.selectbox(
"Step 1: Select a transformation type:",
["Select a Transformation", "Rotation", "Scaling", "Translation", "Shearing", "Cropping"]
)
# Ensure the user selects a valid transformation
if transformation != "Select a Transformation":
# Provide guidance based on the selected transformation
if transformation == "Rotation":
st.info("Rotation rotates the image around a fixed point. Angles are applied in steps of 10 degrees.")
elif transformation == "Scaling":
st.info("Scaling adjusts the size of the image. The scale factor increases incrementally.")
elif transformation == "Translation":
st.info("Translation shifts the image horizontally and vertically in small steps.")
elif transformation == "Shearing":
st.info("Shearing skews the image along the x-axis or y-axis, creating a slanted effect.")
elif transformation == "Cropping":
st.info("Cropping trims the image edges step by step to focus on a smaller region.")
# Image Uploader (Only appears after selection)
uploaded_file = st.file_uploader("Step 2: Now, upload an image", type=["jpg", "jpeg", "png"])
if uploaded_file:
# Read the uploaded file into a numpy array using OpenCV
file_bytes = np.asarray(bytearray(uploaded_file.read()), dtype=np.uint8)
image = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
# Display the uploaded image
st.image(cv2.cvtColor(image, cv2.COLOR_BGR2RGB), caption="Uploaded Image", use_container_width=True)
# Automatically apply the transformation after upload
transformed_images = apply_affine_transformation(image, transformation)
if transformed_images:
st.write(f"Generated {len(transformed_images)} images using {transformation}:")
# Display all transformed images
for i, img in enumerate(transformed_images):
st.image(cv2.cvtColor(img, cv2.COLOR_BGR2RGB), caption=f"{transformation} {i+1}", use_container_width=True)
# Create ZIP file for download
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as zip_file:
for i, img in enumerate(transformed_images):
# Save each image as bytes
_, img_encoded = cv2.imencode('.jpg', img)
zip_file.writestr(f"{transformation}_image_{i+1}.jpg", img_encoded.tobytes())
zip_buffer.seek(0)
st.download_button(
label=f"Download All {transformation} Images",
data=zip_buffer,
file_name=f"{transformation}_transformed_images.zip",
mime="application/zip"
)
else:
st.warning("Please select a valid transformation type to proceed.")
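st.markdown("""
As a rough illustration of what the 2x3 matrices used by this tool do, the sketch below applies a matrix to a single point in plain Python. The helper names `apply_affine` and `rotation_matrix` are hypothetical, the sample coordinates are made up, and `rotation_matrix` follows the formula documented for `cv2.getRotationMatrix2D`.

```python
import math

def apply_affine(M, x, y):
    # M is a 2x3 matrix [[a, b, tx], [c, d, ty]]
    new_x = M[0][0] * x + M[0][1] * y + M[0][2]
    new_y = M[1][0] * x + M[1][1] * y + M[1][2]
    return new_x, new_y

def rotation_matrix(center, angle_deg, scale=1.0):
    # Same layout as the matrix returned by cv2.getRotationMatrix2D
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_deg))
    beta = scale * math.sin(math.radians(angle_deg))
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]

# Translation by 5 pixels in x and y
translation = [[1, 0, 5], [0, 1, 5]]
print(apply_affine(translation, 10, 20))  # (15, 25)

# Rotation by 90 degrees around (50, 50): the center itself does not move
rotation = rotation_matrix((50, 50), 90)
print(apply_affine(rotation, 50, 50))
```

`cv2.warpAffine` applies the same kind of mapping to every pixel of the image rather than to a single point.
""")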
col1, col2 = st.columns(2)
# First row: Colab Notes and Unstructured Data
with col1:
st.markdown("""
<a href="https://colab.research.google.com/drive/19mQM-7VY0F0y9NyecrhbU5kyOY9j9oIk?usp=sharing" target="_blank" title="Open Colab Notes">
<button style="background-color: #4CAF50; color: white; padding: 10px; font-size: 16px; border: none; cursor: pointer;">
Colab Notes
</button>
</a>
""", unsafe_allow_html=True)
with col2:
if st.button("Back to Unstructured Data"):
st.session_state.current_page = "main"
st.query_params.update({})
# Second row: Back and Explore navigation
col3, col4 = st.columns(2)
with col3:
if st.button("⬅️ Back to Video Processing"):
st.session_state.current_page = "video_processing"
st.query_params.update({})
with col4:
if st.button("➡️ Explore OpenCV Projects"):
st.session_state.current_page = "opencv_projects"
st.query_params.update({})
elif st.session_state.current_page == "opencv_projects":
st.query_params.update({})
st.markdown("""
<h2 style="color: #BB3385;">OpenCV Projects</h2>
""", unsafe_allow_html=True)
# Project 1: Converting an Image into Tabular Data
st.markdown("""
<h3 style="color: #5b2c6f;">Converting an Image into Tabular Data</h3>
<p>
This project explains how to convert an image into tabular data by extracting pixel values
and representing them as structured rows and columns for analysis or machine learning tasks.
</p>
""", unsafe_allow_html=True)
st.markdown("""
<a href="https://github.com/lakshmiharikaa34/Machine-Learning/blob/main/Extract%20Images%20to%20Tabular%20Data.ipynb"
target="_blank" style="color: #2a52be;">
Check out the project on GitHub
</a>
""", unsafe_allow_html=True)
# Project 2: Converting a Video into Tabular Data
st.markdown("""
<h3 style="color: #5b2c6f;">Converting a Video into Tabular Data</h3>
<p>
Learn to process videos frame by frame and extract pixel data from each frame.
This project demonstrates how to represent video data in a structured tabular format.
</p>
""", unsafe_allow_html=True)
st.markdown("""
<a href="https://github.com/lakshmiharikaa34/Machine-Learning/blob/main/Extract%20Videos%20to%20Tabular%20Data.ipynb"
target="_blank" style="color: #2a52be;">
Check out the project on GitHub
</a>
""", unsafe_allow_html=True)
# Project 3: Animation Project
st.markdown("""
<h3 style="color: #5b2c6f;">A Tale of Integrity: Finding Money, Choosing Honesty</h3>
<p>
This animation tells the story of a young boy who finds a bundle of money. Torn between keeping it or doing the right thing,
he remembers his grandfather’s words: <em>"True character shines through in moments of choice."</em>
He chooses integrity and returns the money to its rightful owner, reminding us all that honesty is priceless.
</p>
""", unsafe_allow_html=True)
st.video("https://huggingface.co/spaces/LakshmiHarika/MachineLearning/resolve/main/Images/full_frame_video.mp4")
st.markdown("""
<a href="https://github.com/lakshmiharikaa34/Machine-Learning/blob/main/Animation%20Project.ipynb"
target="_blank" style="color: #2a52be;">
Check out the animation on GitHub
</a>
""", unsafe_allow_html=True)
# Project 4: GIF Project
st.markdown("""
<h3 style="color: #5b2c6f;">The Coding Journey: Debugging Woes Turned Joy (GIF)</h3>
<p>
This humorous and relatable GIF portrays every coder’s struggle: A boy sits at his desk exclaiming,
<em>"My code is not working; I don’t know what to do!"</em> Moments later, he joyfully discovers,
<em>"It’s working perfectly!"</em> – capturing the emotional highs and lows of debugging.
</p>
""", unsafe_allow_html=True)
st.video("https://huggingface.co/spaces/LakshmiHarika/MachineLearning/resolve/main/Images/giphy_animation%20(1).mp4")
st.markdown("""
<a href="https://github.com/lakshmiharikaa34/Machine-Learning/blob/main/Giffy.ipynb"
target="_blank" style="color: #2a52be;">
Check out the GIF on GitHub
</a>
""", unsafe_allow_html=True)
col1, col2 = st.columns(2)
# Column 1 - Button to go back to the Image Transformations page
with col1:
if st.button("⬅️ Image Transformations with OpenCV", help="Navigate back to the Image Transformations section"):
navigate_to("image_transformations")
st.query_params.update({}) # Ensure scroll resets
# Column 2 - Button for the main page (Images & Videos)
with col2:
if st.button("Unstructured Data", help="Navigate to the main page for Images & Videos"):
navigate_to("main")
st.query_params.update({}) # Ensure scroll resets
#--------------------------------------------------------- Audio--------------------------------------------------------------------------------
elif st.session_state.current_page == "explore_audio":
st.markdown("""
<h3 style="color: #e25822;">Exploring Audio</h3>
""", unsafe_allow_html=True)
st.write("""
Audio formats include MP3 and WAV for storing sound.
""")
if st.button("Go Back"):
navigate_to("main")
#--------------------------------------------------------- text--------------------------------------------------------------------------------
elif st.session_state.current_page == "explore_text":
st.markdown("""
<h3 style="color: #e25822;">Exploring Text</h3>
""", unsafe_allow_html=True)
st.write("""
Text includes unstructured data like emails or plain-text files.
""")
if st.button("Go Back"):
navigate_to("main")
#--------------------------------------------------------- CSV --------------------------------------------------------------------------------
elif st.session_state.current_page == "explore_csv":
st.query_params.update({})
st.markdown("""
<h2 style="color: #BB3385;">Comma-Separated Values (CSV)</h2>
""", unsafe_allow_html=True)
st.write("""
- **CSV (Comma-Separated Values)** is a simple file format used to store tabular data.
- Each line in a CSV file corresponds to a row of data, with columns separated by a delimiter (a comma by default).
- It is widely used for:
- Data storage and exchange between applications.
- Importing/exporting data in tools like Excel, databases, and programming languages.
- CSV files are lightweight, easy to read, and supported by most data-handling tools.
""")
st.markdown("""
<h3 style="color: #5b2c6f;">Reading CSV File</h3>
""", unsafe_allow_html=True)
st.code("""
# Read the CSV file
data = pd.read_csv('path_to_file.csv')
print(data.head()) # Displays the first 5 rows
""", language="python")
st.markdown("""
<h3 style="color: #5b2c6f;">Exporting CSV File</h3>
""", unsafe_allow_html=True)
st.code("""
# Sample DataFrame
data = pd.DataFrame({
'sepal_length': [1.5, 1.4, 1.5],
'sepal_width': [2.5, 2.8, 2.5],
'petal_length': [2.3, 2.2, 2.3],
'petal_width': [1.5, 1.1, 1.5],
'species': ['setosa', 'versicolor', 'virginica']
})
# Export the DataFrame to a CSV file
data.to_csv('iris_dataset.csv', index=False)
""", language="python")
# Issues Section
st.markdown("""
<h3 style="color: #5b2c6f;">Common Issues with CSV Files</h3>
""", unsafe_allow_html=True)
st.markdown("""
<h3 style="color: #2a52be;">1. ParserError</h3>
""", unsafe_allow_html=True)
st.write("""
- This error occurs when some rows contain more fields than pandas expects, so the number of columns is inconsistent across rows.
- It is commonly caused when a CSV file is manually edited in a text editor, leading to structural inconsistencies.
- Use the `on_bad_lines='warn'` or `on_bad_lines='skip'` parameter in `pd.read_csv()` (available in pandas 1.3 and later) to handle problematic rows.
- Clean the CSV file to ensure consistent formatting.
""")
st.code("""
# Reading the CSV file with bad lines handled
df = pd.read_csv('sample.csv', on_bad_lines='warn')
print(df)
""", language="python")
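st.markdown("""
To locate the rows that would trigger a `ParserError` before loading the file, one option is to count the fields on each line with Python's built-in `csv` module. This is only a sketch: the sample rows and the helper name `find_bad_rows` are made up for illustration.

```python
import csv

def find_bad_rows(lines):
    # Return (line_number, field_count) for rows whose length differs from the header
    reader = csv.reader(lines)
    header = next(reader)
    bad_rows = []
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            bad_rows.append((line_number, len(row)))
    return bad_rows

sample_lines = [
    "id,name,age",
    "1,Asha,30",
    "2,Ravi,25,extra",
    "3,Mina,41",
]
print(find_bad_rows(sample_lines))  # [(3, 4)]
```

With a real file, an open file object can be passed to `find_bad_rows` instead of the list of strings.
""")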
st.markdown("""
<h3 style="color: #2a52be;">2. Encoding Error</h3>
""", unsafe_allow_html=True)
st.write("""
- Encoding is the process of converting characters into bytes so that text can be stored or transmitted.
- It specifies how characters are stored and interpreted in files or streams.
- Common encoding formats include:
- **UTF-8**: The most widely used encoding format, compatible with most languages.
- **ISO-8859-1** (Latin-1): Often used for Western European languages.
- **ASCII**: Represents basic English characters.
""")
st.write("""
- This error occurs when a CSV file was saved in one encoding (e.g., `ISO-8859-1`) but is read with another (pandas assumes `UTF-8` by default).
- It often results in unreadable characters or errors while loading the file in Python.
- Use the `encoding` parameter in `pd.read_csv()` to specify the correct encoding format.
- Common encodings to try:
- `encoding='utf-8'`
- `encoding='latin1'`
- `encoding='iso-8859-1'` (the same encoding as `latin1`, under its standard name)
""")
st.code("""
df = pd.read_csv('sample.csv', encoding='utf-8') # Specify the encoding format
print(df)
""", language="python")
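st.markdown("""
The effect of an encoding mismatch can be reproduced without any file, using only Python strings and bytes. The sample word below is chosen purely for illustration.

```python
text = "café"

latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")

# Decoding with the encoding that was used for writing works
print(latin1_bytes.decode("latin-1"))  # café

# Decoding Latin-1 bytes as UTF-8 raises an error
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)  # True

# Decoding UTF-8 bytes as Latin-1 does not fail, but the text comes out garbled
garbled = utf8_bytes.decode("latin-1")
print(garbled == text)  # False
```

The last case is the reason a file can load without errors and still show unreadable characters.
""")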
st.write("""
When the encoding of a CSV file is unknown, we can iterate through the encodings Python supports and try to read the file with each one.
A successful read only shows that the bytes can be decoded, so inspect the resulting text to confirm that the characters look right.
""")
st.code("""
import pandas as pd
from encodings.aliases import aliases

file_path = 'sample.csv'  # Path to the CSV file being checked

# Get the canonical names of all encodings Python knows about
encoding_list = sorted(set(aliases.values()))
# Check which encodings can read the file
for encoding in encoding_list:
    try:
        # Attempt to read the file with the current encoding
        pd.read_csv(file_path, encoding=encoding)
        print(f"{encoding} can read the file")
    except Exception:
        # Covers decoding errors as well as codecs that are not text encodings
        print(f"{encoding} cannot read the file")
""", language="python")
st.markdown("""
<h3 style="color: #2a52be;">3. Out of Memory Issue</h3>
""", unsafe_allow_html=True)
st.write("""
- Out of Memory issues occur when the size of the CSV file is too large to fit into the available memory (RAM) of the system.
- This typically happens when:
- Files contain millions of rows or a large number of columns.
- The system's RAM is insufficient to load the entire file at once.
- Read the CSV file in smaller chunks using the `chunksize` parameter in `pd.read_csv()`.
- Process and save chunks incrementally to avoid memory overload.
- Use efficient data types to reduce memory usage.
""")
st.code("""
import pandas as pd

# Reading a large CSV file in chunks
chunk_size = 100000 # Number of rows per chunk
chunks = []
for chunk in pd.read_csv('large_file.csv', chunksize=chunk_size):
    # Filter or aggregate each chunk here so that only the needed rows are kept
    chunks.append(chunk)
# Combine the chunks into a single DataFrame only if the result fits in memory
df = pd.concat(chunks, axis=0)
print(df)
""", language="python")
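st.markdown("""
When the combined DataFrame would not fit in memory, one possible pattern is to aggregate each chunk and keep only the running result. The sketch below uses a small in-memory CSV as a stand-in for a large file; the column names and values are invented for illustration.

```python
import io
import pandas as pd

# A small in-memory CSV stands in for a large file on disk
csv_text = '''city,sales
Pune,10
Delhi,20
Pune,30
Delhi,40
Pune,50
'''

totals = {}
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    # Aggregate the current chunk, then let it be discarded
    for city, sales in chunk.groupby("city")["sales"].sum().items():
        totals[city] = totals.get(city, 0) + int(sales)

print(totals)
```

Only the `totals` dictionary and one chunk are held in memory at any time.
""")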
col1, col2 = st.columns(2)
with col1:
st.markdown("""
<a href="https://colab.research.google.com/drive/1fwKrcdVlZDT-iozan7OT7SyLFFILx9a6?usp=sharing" target="_blank">
<button style="background-color: #4CAF50; color: white; padding: 10px; font-size: 16px; border: none; cursor: pointer;">
Open Google Colab File
</button>
</a>
""", unsafe_allow_html=True)
with col2:
if st.button("⬅️ Back to Previous Page"):
st.session_state.current_page = "main"
st.query_params.update({}) # Ensure scroll resets
#--------------------------------------------------------- Json --------------------------------------------------------------------------------
elif st.session_state.current_page == "explore_json":
st.query_params.update({})
st.markdown("""
<h2 style="color: #BB3385;">JavaScript Object Notation (JSON)</h2>
""", unsafe_allow_html=True)
st.write("""
- **JSON (JavaScript Object Notation)** is a lightweight data-interchange format.
- It is easy for humans to read and write, and easy for machines to parse and generate.
- JSON is used to represent data as key-value pairs and supports hierarchical structures.
- Commonly used for:
- Web APIs for sending and receiving data.
- Configuration files.
- Storing structured and semi-structured data.
""")
st.markdown("""
<h3 style="color: #5b2c6f;">Default JSON Format</h3>
""", unsafe_allow_html=True)
st.write("""
- JSON format is similar to a Python dictionary with key-value pairs.
- The main difference between JSON and a Python dictionary is:
- **In JSON**:
- Keys must be in string format.
- Values can be of various types (e.g., strings, numbers, arrays, objects).
- **In Python Dictionary**:
- Keys can be any hashable type (e.g., strings, numbers, tuples).
""")
st.markdown("""
<h4 style="color: #2a52be;">Example</h4>
""", unsafe_allow_html=True)
st.code("""
# JSON Format
{
"name": ["a", "b", "c"],
"age": [11, 12, 13]
}
""", language="json")
st.code("""
# Python Dictionary
{
"name": ["a", "b", "c"],
"age": [11, 12, 13]
}
""", language="python")
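st.markdown("""
The difference in key types can be seen with the standard `json` module: when a dictionary is serialized, a non-string key such as `1` is written as the string `"1"`, and Python's `True` and `None` become `true` and `null`. The dictionary below is a made-up example.

```python
import json

python_dict = {1: "one", "name": ["a", "b"], "active": True, "score": None}

# Serialize the dictionary to JSON text
json_text = json.dumps(python_dict)
print(json_text)

# Parse the JSON text back: the integer key is now a string
restored = json.loads(json_text)
print(list(restored.keys()))  # ['1', 'name', 'active', 'score']
```
""")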
st.markdown("""
<h3 style="color: #5b2c6f;">JSON in Structured Data</h3>
""", unsafe_allow_html=True)
st.write("""
- JSON is considered structured when it has a consistent format with uniform key-value pairs for all entries.
- This allows direct conversion into a tabular format, such as a DataFrame.
""")
st.code("""
# Example of Structured JSON
[
{ "Id": 100, "Name": "Lakshmi Harika", "Age": 22, "Gender": "Female" },
{ "Id": 101, "Name": "Varshitha", "Age": 23, "Gender": "Female" },
{ "Id": 102, "Name": "Hari Chandan", "Age": 22, "Gender": "Male" },
{ "Id": 103, "Name": "Shamitha", "Age": 23, "Gender": "Female" }
]
""", language="json")
st.code("""
# Reading a structured JSON file
df = pd.read_json('structured_data.json')
print(df)
""", language="python")
st.markdown("""
<h3 style="color: #5b2c6f;">JSON Orientations in Structured Data</h3>
""", unsafe_allow_html=True)
st.write("""
- JSON can represent data in various orientations using the `orient` parameter in `DataFrame.to_json()` or `pandas.read_json()`.""")
st.markdown("""
<h4 style="color: #2a52be;">JSON with Orient = 'index'</h4>
""", unsafe_allow_html=True)
st.write("""
- When **`orient='index'`**:
- In this format, keys represent row indices, and the values are dictionaries of column names and their respective data.
- It is useful when the data is naturally indexed.
""")
st.code("""
# Example of JSON with orient='index'
{
"0": { "Id": 100, "Name": "Lakshmi Harika", "Age": 22, "Gender": "Female" },
"1": { "Id": 101, "Name": "Varshitha", "Age": 23, "Gender": "Female" },
"2": { "Id": 102, "Name": "Hari Chandan", "Age": 22, "Gender": "Male" },
"3": { "Id": 103, "Name": "Shamitha", "Age": 23, "Gender": "Female" }
}
""", language="json")
st.code("""
from io import StringIO
import pandas as pd

# Creating a DataFrame
data = pd.DataFrame({
    "Id": [100, 101, 102, 103],
    "Name": ["Lakshmi Harika", "Varshitha", "Hari Chandan", "Shamitha"],
    "Age": [22, 23, 22, 23],
    "Gender": ["Female", "Female", "Male", "Female"]
})
# Exporting to JSON with orient='index'
json_data = data.to_json(orient='index')
print(json_data)
# Reading back from JSON with orient='index'
# (recent pandas versions expect a file-like object, so the string is wrapped in StringIO)
df = pd.read_json(StringIO(json_data), orient='index')
print(df)
""", language="python")
st.markdown("""
<h4 style="color: #2a52be;">JSON with Orient = 'columns'</h4>
""", unsafe_allow_html=True)
st.write("""
- When **`orient='columns'`**:
- Keys represent column names, and the values are dictionaries where each key is the row index, and the value is the data.
- This is the default orientation when exporting DataFrames to JSON.
""")
st.code("""
# Example of JSON with orient='columns'
{
"Id": { "0": 100, "1": 101, "2": 102, "3": 103 },
"Name": { "0": "Lakshmi Harika", "1": "Varshitha", "2": "Hari Chandan", "3": "Shamitha" },
"Age": { "0": 22, "1": 23, "2": 22, "3": 23 },
"Gender": { "0": "Female", "1": "Female", "2": "Male", "3": "Female" }
}
""", language="json")
st.code("""
from io import StringIO
import pandas as pd

# Creating a DataFrame
data = pd.DataFrame({
    "Id": [100, 101, 102, 103],
    "Name": ["Lakshmi Harika", "Varshitha", "Hari Chandan", "Shamitha"],
    "Age": [22, 23, 22, 23],
    "Gender": ["Female", "Female", "Male", "Female"]
})
# Exporting to JSON with orient='columns'
json_data = data.to_json(orient='columns')
print(json_data)
# Reading back from JSON with orient='columns'
# (recent pandas versions expect a file-like object, so the string is wrapped in StringIO)
df = pd.read_json(StringIO(json_data), orient='columns')
print(df)
""", language="python")
st.markdown("""
<h4 style="color: #2a52be;">JSON with Orient = 'values'</h4>
""", unsafe_allow_html=True)
st.write("""
- When **`orient='values'`**:
- The JSON represents the data as an array of arrays.
- Each inner array corresponds to a row of data, and the order matches the DataFrame’s column order.
""")
st.code("""
# Example of JSON with orient='values'
[
[100, "Lakshmi Harika", 22, "Female"],
[101, "Varshitha", 23, "Female"],
[102, "Hari Chandan", 22, "Male"],
[103, "Shamitha", 23, "Female"]
]
""", language="json")
st.code("""
from io import StringIO
import pandas as pd

# Creating a DataFrame
data = pd.DataFrame({
    "Id": [100, 101, 102, 103],
    "Name": ["Lakshmi Harika", "Varshitha", "Hari Chandan", "Shamitha"],
    "Age": [22, 23, 22, 23],
    "Gender": ["Female", "Female", "Male", "Female"]
})
# Exporting to JSON with orient='values'
json_data = data.to_json(orient='values')
print(json_data)
# Reading back from JSON with orient='values'
# (column names are not stored in this orientation, so the columns come back numbered 0 to 3)
df = pd.read_json(StringIO(json_data), orient='values')
print(df)
""", language="python")
st.markdown("""
<h4 style="color: #2a52be;">JSON with Orient = 'split'</h4>
""", unsafe_allow_html=True)
st.write("""
- When **`orient='split'`**:
- The JSON structure splits the data into three parts:
1. `index`: Contains the row indices.
2. `columns`: Contains the column names.
3. `data`: Contains the actual data as a 2D array.
- This orientation is useful for reconstructing the original DataFrame structure.
""")
st.code("""
# Example of JSON with orient='split'
{
"index": [0, 1, 2, 3],
"columns": ["Id", "Name", "Age", "Gender"],
"data": [
[100, "Lakshmi Harika", 22, "Female"],
[101, "Varshitha", 23, "Female"],
[102, "Hari Chandan", 22, "Male"],
[103, "Shamitha", 23, "Female"]
]
}
""", language="json")
st.code("""
from io import StringIO
import pandas as pd

# Creating a DataFrame
data = pd.DataFrame({
    "Id": [100, 101, 102, 103],
    "Name": ["Lakshmi Harika", "Varshitha", "Hari Chandan", "Shamitha"],
    "Age": [22, 23, 22, 23],
    "Gender": ["Female", "Female", "Male", "Female"]
})
# Exporting to JSON with orient='split'
json_data = data.to_json(orient='split')
print(json_data)
# Reading back from JSON with orient='split'
# (recent pandas versions expect a file-like object, so the string is wrapped in StringIO)
df = pd.read_json(StringIO(json_data), orient='split')
print(df)
""", language="python")
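st.markdown("""
To make the relationship between the orientations concrete, the sketch below builds each layout by hand from a list of row objects (the layout pandas calls `orient='records'`). It uses only the standard library and two shortened sample rows; it mirrors the shapes described above rather than the exact text pandas emits.

```python
import json

records = [
    {"Id": 100, "Name": "Lakshmi Harika"},
    {"Id": 101, "Name": "Varshitha"},
]

columns = list(records[0].keys())

# orient='values': rows only, no labels
values_layout = [[row[col] for col in columns] for row in records]

# orient='split': index, columns and data kept apart
split_layout = {
    "index": list(range(len(records))),
    "columns": columns,
    "data": values_layout,
}

# orient='index': one entry per row, keyed by the row label
index_layout = {str(i): row for i, row in enumerate(records)}

# orient='columns': one entry per column, keyed by the row label
columns_layout = {
    col: {str(i): row[col] for i, row in enumerate(records)} for col in columns
}

print(json.dumps(split_layout))
```
""")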
st.markdown("""
<h3 style="color: #5b2c6f;">JSON in Semi-Structured Data</h3>
""", unsafe_allow_html=True)
st.write("""
- If the JSON file is in semi-structured format, we can use `pd.json_normalize()` to convert it into a DataFrame.
- **Semi-Structured JSON**:
- A JSON structure is considered semi-structured when one or more fields contain nested data, such as dictionaries or lists of dictionaries.
- This format requires flattening or normalization to be converted into a tabular structure.
- `pd.json_normalize()` expects parsed Python objects (a dictionary or a list of dictionaries). Passing a raw JSON string raises an error, so parse it first, for example with `json.loads()`.
""")
st.code("""
# Example Nested JSON
x = [
{"name": "Lakshmi Harika", "age": 23, "gender": "f", "marks": [{"maths": 75, "English": 82}]},
{"name": "Varshitha", "age": 43, "gender": "f", "marks": [{"maths": 65, "English": 72}]},
{"name": "Hari Chandan", "age": 28, "gender": "m", "marks": [{"maths": 85, "English": 92}]},
{"name": "Shamitha", "age": 21, "gender": "f", "marks": [{"maths": 90, "English": 88}]}
]
""", language="python")
st.markdown("""
<h4 style="color: #5b2c6f;">Parameters to Understand</h4>
""", unsafe_allow_html=True)
st.write("""
1. **record_path**:
- Specifies the path to nested lists or dictionaries that need to be flattened.
- Example: For the key `marks`, the path would be `'marks'`.
2. **meta**:
- Specifies fields to include as metadata in the resulting DataFrame.
- These fields remain unchanged and are added to the resulting DataFrame.
3. **max_level**:
- Controls the depth of the flattening.
- Default is `None` (flattens everything), but setting it to a specific number limits the depth.
""")
st.code("""
# Example of Semi-Structured JSON
df = pd.json_normalize(
x,
record_path=['marks'], # Specifies the nested list to flatten
meta=['name', 'age', 'gender'], # Fields to include as metadata
max_level=1 # Specifies the depth of flattening
)
print(df)
""", language="python")
st.write("""
The resulting DataFrame will look like this:
""")
st.table({
"maths": [75, 65, 85, 90],
"English": [82, 72, 92, 88],
"name": ["Lakshmi Harika", "Varshitha", "Hari Chandan", "Shamitha"],
"age": [23, 43, 28, 21],
"gender": ["f", "f", "m", "f"]
})
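st.markdown("""
For nested dictionaries (as opposed to lists of records), flattening amounts to joining the keys along each path, which is why `pd.json_normalize()` produces dotted column names such as `marks.maths`. The helper below is a hand-written sketch of that idea, not pandas' implementation; the name `flatten` and the sample record are hypothetical.

```python
def flatten(record, parent_key="", sep="."):
    # Recursively turn nested dictionaries into a single-level dictionary
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat

student = {
    "name": "Varshitha",
    "marks": {"maths": 65, "English": 72},
    "address": {"city": {"name": "Hyderabad"}},
}
print(flatten(student))
# {'name': 'Varshitha', 'marks.maths': 65, 'marks.English': 72, 'address.city.name': 'Hyderabad'}
```
""")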
col1, col2 = st.columns(2)
with col1:
st.markdown("""
<a href="https://colab.research.google.com/drive/1V43v3cNXODQYHLzuC_Rwt2hFSxa378I-?usp=sharing" target="_blank">
<button style="background-color: #4CAF50; color: white; padding: 10px; font-size: 16px; border: none; cursor: pointer;">
Open Google Colab File
</button>
</a>
""", unsafe_allow_html=True)
with col2:
if st.button("⬅️ Back to Previous Page"):
st.session_state.current_page = "main"
st.query_params.update({}) # Ensure scroll resets
#--------------------------------------------------------- XML -------------------------------------------------------------------------------
elif st.session_state.current_page == "explore_xml":
st.query_params.update({})
st.markdown("""
<h2 style="color: #BB3385;">Extensible Markup Language (XML)</h2>
""", unsafe_allow_html=True)
st.write("""
- **XML (Extensible Markup Language)** is a markup language used to store and transport data.
- It is designed to be self-descriptive, making it easy to read and understand.
- XML uses a tree structure where data is organized into nested elements.
- Commonly used for:
- Configuration files
- Data interchange between systems
- Storing structured data
""")
st.markdown("""
<h3 style="color: #5b2c6f;">Element-wise XML Structure</h3>
""", unsafe_allow_html=True)
st.write("""
An element-wise XML structure organizes data into nested elements, where each piece of information is an individual element.""")
st.code("""
<items>
<item>
<name>Item 1</name>
<price>100</price>
<category>Category A</category>
</item>
<item>
<name>Item 2</name>
<price>200</price>
<category>Category B</category>
</item>
</items>
""", language="xml")
st.code("""
# Reading an element-wise XML file
df = pd.read_xml('element_structure.xml', xpath='/items/item')
print(df)
""", language="python")
st.write("""
**Output DataFrame**:
""")
st.table({
'name': ['Item 1', 'Item 2'],
'price': [100, 200],
'category': ['Category A', 'Category B']
})
st.write("""
- `xpath='/items/item'`: Extracts all `<item>` elements from within `<items>`.
- Useful for XML structures with data organized by child elements.
""")
st.markdown("""
<h3 style="color: #5b2c6f;">Attribute XML Structure</h3>
""", unsafe_allow_html=True)
st.write("""
An attribute XML structure stores data as attributes of tags, rather than as child elements.""")
st.code("""
<items>
<item name="Item 1" price="100" category="Category A" />
<item name="Item 2" price="200" category="Category B" />
</items>
""", language="xml")
st.code("""
# Reading an attribute-based XML file
df = pd.read_xml('attribute_structure.xml', xpath='/items/item')
print(df)
""", language="python")
st.write("""
**Output DataFrame**:
""")
st.table({
'name': ['Item 1', 'Item 2'],
'price': [100, 200],
'category': ['Category A', 'Category B']
})
st.write("""
- `xpath='/items/item'`: Extracts attributes of `<item>` elements.
- Useful for XML structures where data is stored in attributes instead of nested elements.
""")
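st.markdown("""
The two structures carry the same information, which can be checked with the standard library's `xml.etree.ElementTree`. This is a sketch on shortened sample XML and is independent of how `pd.read_xml()` works internally; unlike pandas, it keeps every value as a string.

```python
import xml.etree.ElementTree as ET

element_xml = '''<items>
  <item><name>Item 1</name><price>100</price></item>
  <item><name>Item 2</name><price>200</price></item>
</items>'''

attribute_xml = '''<items>
  <item name="Item 1" price="100" />
  <item name="Item 2" price="200" />
</items>'''

# Element-wise: each child tag becomes a column
element_rows = [
    {child.tag: child.text for child in item}
    for item in ET.fromstring(element_xml).findall("item")
]

# Attribute-based: each attribute becomes a column
attribute_rows = [
    dict(item.attrib) for item in ET.fromstring(attribute_xml).findall("item")
]

print(element_rows == attribute_rows)  # True
```
""")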
st.markdown("""
<h3 style="color: #5b2c6f;">Exporting Element-wise XML Structure</h3>
""", unsafe_allow_html=True)
st.code("""
# Sample DataFrame
data = pd.DataFrame({
'name': ['Item 1', 'Item 2'],
'price': [100, 200],
'category': ['Category A', 'Category B']
})
# Export the DataFrame to an element-wise XML file
data.to_xml('element_structure.xml', index=False, root_name='items', row_name='item')
""", language="python")
st.markdown("""
<h3 style="color: #5b2c6f;">Exporting Attribute XML Structure</h3>
""", unsafe_allow_html=True)
st.code("""
# Sample DataFrame
data = pd.DataFrame({
'name': ['Item 1', 'Item 2'],
'price': [100, 200],
'category': ['Category A', 'Category B']
})
# Export the DataFrame to an attribute-based XML file
data.to_xml('attribute_structure.xml', index=False, root_name='items', row_name='item', attr_cols=['name', 'price', 'category'])
""", language="python")
col1, col2 = st.columns(2)
with col1:
st.markdown("""
<a href="https://colab.research.google.com/drive/1HnbPeyrcQ5C6oHhrshBisJZbUxxMEvjD?usp=sharing" target="_blank">
<button style="background-color: #4CAF50; color: white; padding: 10px; font-size: 16px; border: none; cursor: pointer;">
Open Google Colab File
</button>
</a>
""", unsafe_allow_html=True)
with col2:
if st.button("⬅️ Back to Previous Page"):
st.session_state.current_page = "main"
st.query_params.update({}) # Ensure scroll resets
#--------------------------------------------------------- HTML --------------------------------------------------------------------------------
elif st.session_state.current_page == "explore_html":
st.query_params.update({})
st.markdown("""
<h2 style="color: #BB3385;">HyperText Markup Language (HTML)</h2>
""", unsafe_allow_html=True)
st.write("""
- **HTML (HyperText Markup Language)** is the standard language used to create and structure web pages.
- It uses a combination of elements (tags) to define the content and layout of a webpage.
- Key features include:
- Structuring text with headings, paragraphs, and lists.
- Embedding multimedia content like images, videos, and audio.
- Adding interactivity with forms and hyperlinks.
""")
st.markdown("""
<h3 style="color: #5b2c6f;">Basic HTML Structure</h3>
""", unsafe_allow_html=True)
st.code("""
<!DOCTYPE html>
<html>
<head>
<title>Sample HTML</title>
</head>
<body>
<h1>Welcome to HTML</h1>
<p>This is a paragraph.</p>
</body>
</html>
""", language="html")
st.markdown("""
<h3 style="color: #5b2c6f;">Reading HTML Files</h3>
""", unsafe_allow_html=True)
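st.code("""
import pandas as pd

# Reading the tables in an HTML file into DataFrames
# (read_html parses <table> elements only and raises an error if the file contains none)
dfs = pd.read_html('sample.html') # Returns a list of DataFrames, one per table
for df in dfs:
    print(df)
""", language="python")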
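st.markdown("""
`pd.read_html()` works on the `<table>` elements of a page. As a rough picture of what extracting a table involves, the sketch below collects the cell text of each row with the standard library's `html.parser`. The class name `TableParser` and the sample markup are made up, and the sketch handles only a single simple table.

```python
from html.parser import HTMLParser

class TableParser(HTMLParser):
    # Collects the text of every <td>/<th> cell, grouped by <tr> row
    def __init__(self):
        super().__init__()
        self.rows = []
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self.in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self.in_cell = False

    def handle_data(self, data):
        # Text between rows and cells (whitespace) is ignored
        if self.in_cell:
            self.rows[-1][-1] += data.strip()

html_text = '''<table>
  <tr><th>name</th><th>age</th></tr>
  <tr><td>John</td><td>28</td></tr>
  <tr><td>Jane</td><td>34</td></tr>
</table>'''

parser = TableParser()
parser.feed(html_text)
print(parser.rows)  # [['name', 'age'], ['John', '28'], ['Jane', '34']]
```
""")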
st.markdown("""
<h3 style="color: #5b2c6f;">Writing HTML Files</h3>
""", unsafe_allow_html=True)
st.code("""
# Sample DataFrame
data = pd.DataFrame({
'name': ['John', 'Jane', 'Doe'],
'age': [28, 34, 29],
'department': ['HR', 'IT', 'Finance']
})
# Write the DataFrame to an HTML file
data.to_html('output.html', index=False)
""", language="python")
col1, col2 = st.columns(2)
with col1:
st.markdown("""
<a href="https://colab.research.google.com/drive/1oQniP5t1sdk5r17dVNZkRGIhe4jCe4yR?usp=sharing" target="_blank">
<button style="background-color: #4CAF50; color: white; padding: 10px; font-size: 16px; border: none; cursor: pointer;">
Open Google Colab File
</button>
</a>
""", unsafe_allow_html=True)
with col2:
if st.button("⬅️ Back to Previous Page"):
st.session_state.current_page = "main"
st.query_params.update({}) # Ensure scroll resets