Update pages/8Model Training.py
pages/8Model Training.py CHANGED +106 -99
|
@@ -3,154 +3,161 @@ import numpy as np
|
|
| 3 |
import matplotlib.pyplot as plt
|
| 4 |
import pandas as pd
|
| 5 |
|
| 6 |
-
|
| 7 |
st.set_page_config(
|
| 8 |
page_title="Model Building",
|
| 9 |
page_icon="🚀",
|
| 10 |
-
layout="wide"
|
| 11 |
-
)
|
| 12 |
|
| 13 |
-
# Optional navigation state (you can remove if unused here)
|
| 14 |
if "current_page" not in st.session_state:
|
| 15 |
st.session_state.current_page = "main"
|
| 16 |
|
| 17 |
def navigate_to(page_name):
|
| 18 |
st.session_state.current_page = page_name
|
| 19 |
|
| 20 |
-
# ---------------------
|
| 21 |
-
# 📘 Model Building Content Starts Here
|
| 22 |
-
# ---------------------
|
| 23 |
|
| 24 |
st.markdown("""
|
| 25 |
-
<h1 style="text-align: center; color: #BB3385;"
|
| 26 |
<p style="text-align: center; font-size: 18px;">Welcome to one of the most exciting parts of machine learning – teaching the machine how to learn!</p>
|
| 27 |
""", unsafe_allow_html=True)
|
| 28 |
|
| 29 |
-
# What is Training?
|
| 30 |
-
st.markdown("## 🤖 So, What is Model Training?")
|
| 31 |
st.markdown("""
|
| 32 |
-
|
|
|
|
| 33 |
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
|
|
|
|
|
|
| 37 |
|
| 38 |
-
|
| 39 |
-
""")
|
| 40 |
|
| 41 |
-
# Who are we training?
|
| 42 |
-
st.markdown("## 👨💻 Who are we actually training?")
|
| 43 |
st.markdown("""
|
| 44 |
-
|
| 45 |
-
|
| 46 |
|
| 47 |
-
You
|
| 48 |
-
We (programmers) guide it using:
|
| 49 |
-
- The data we have
|
| 50 |
-
- The algorithm we choose
|
| 51 |
-
""")
|
| 52 |
|
| 53 |
-
|
| 54 |
-
|
| 55 |
-
|
| 56 |
-
|
| 57 |
-
|
| 58 |
-
|
|
|
|
|
|
|
|
|
|
| 59 |
|
| 60 |
-
If the model is not learning properly, and we can’t fix the data, we usually try switching to a better algorithm.
|
| 61 |
-
""")
|
| 62 |
|
| 63 |
-
# Importance of preprocessing
|
| 64 |
-
st.markdown("## 🧹 Why does Preprocessing Matter?")
|
| 65 |
st.markdown("""
|
| 66 |
-
|
| 67 |
-
|
| 68 |
-
|
| 69 |
|
| 70 |
-
Good learning happens when:
|
| 71 |
-
- Data is cleaned and clear
|
| 72 |
-
- The algorithm matches the task
|
| 73 |
-
""")
|
| 74 |
|
| 75 |
-
# Choosing algorithm type
|
| 76 |
-
st.markdown("## 🤔 Picking the Right Learning Style")
|
| 77 |
st.markdown("""
|
| 78 |
-
|
|
|
|
| 79 |
|
| 80 |
-
|
| 81 |
-
|
| 82 |
-
|
| 83 |
-
|
| 84 |
-
|
|
|
|
|
|
|
|
|
|
| 85 |
|
| 86 |
-
Most of the time, we start with **Supervised Learning**.
|
| 87 |
-
""")
|
| 88 |
|
| 89 |
-
# Inside Supervised
|
| 90 |
-
st.markdown("## 🧭 Inside Supervised Learning – Classification vs Regression")
|
| 91 |
st.markdown("""
|
| 92 |
-
|
| 93 |
-
|
| 94 |
-
|
|
|
|
| 95 |
|
| 96 |
-
|
|
|
|
|
|
|
|
|
|
| 97 |
""")
|
| 98 |
|
| 99 |
-
# Data Representation
|
| 100 |
-
st.markdown("## 🧾 How Do We Represent Data to the Model?")
|
| 101 |
st.markdown("""
|
| 102 |
-
|
| 103 |
|
| 104 |
-
|
|
|
|
| 105 |
|
| 106 |
-
|
|
|
|
| 107 |
|
| 108 |
-
|
| 109 |
-
|
|
|
|
|
|
|
| 110 |
|
| 111 |
-
If yi is a category → it’s **classification**
|
| 112 |
-
If yi is a number → it’s **regression**
|
| 113 |
-
""")
|
| 114 |
|
| 115 |
-
# Preparing data
|
| 116 |
-
st.markdown("## 📋 Preparing Data Before Training")
|
| 117 |
st.markdown("""
|
| 118 |
-
|
|
|
|
| 119 |
|
| 120 |
-
|
| 121 |
-
- For example, in the Iris dataset:
|
| 122 |
-
- Features = sepal length, petal length, etc.
|
| 123 |
-
- Target = species of flower
|
| 124 |
-
""")
|
| 125 |
|
| 126 |
-
|
| 127 |
-
st.markdown("## ✂️ Splitting the Data")
|
| 128 |
-
st.markdown("""
|
| 129 |
-
We don’t train on all data.
|
| 130 |
|
| 131 |
-
|
| 132 |
-
- **Training Set** – the data we use to teach the model
|
| 133 |
-
- **Testing Set** – the data we use to check how well the model learned
|
| 134 |
|
| 135 |
-
|
| 136 |
-
|
| 137 |
-
- Writing a test paper (testing)
|
| 138 |
|
| 139 |
-
|
| 140 |
|
| 141 |
-
|
| 142 |
-
|
| 143 |
-
- Each data point should have equal chance to be in either group
|
| 144 |
-
""")
|
| 145 |
|
| 146 |
-
# Naming convention
|
| 147 |
-
st.markdown("## 🧾 Naming Things After Split")
|
| 148 |
st.markdown("""
|
| 149 |
-
|
|
|
|
| 150 |
|
| 151 |
-
|
| 152 |
-
|
| 153 |
-
|
| 154 |
|
| 155 |
-
|
| 156 |
-
|
|
|
|
| 3 |
import matplotlib.pyplot as plt
|
| 4 |
import pandas as pd
|
| 5 |
|
| 6 |
+
|
| 7 |
st.set_page_config(
|
| 8 |
page_title="Model Building",
|
| 9 |
page_icon="🚀",
|
| 10 |
+
layout="wide")
|
|
|
|
| 11 |
|
|
|
|
| 12 |
if "current_page" not in st.session_state:
|
| 13 |
st.session_state.current_page = "main"
|
| 14 |
|
| 15 |
def navigate_to(page_name):
|
| 16 |
st.session_state.current_page = page_name
|
| 17 |
|
|
|
|
|
|
|
|
|
|
| 18 |
|
| 19 |
st.markdown("""
|
| 20 |
+
<h1 style="text-align: center; color: #BB3385;">Model Building</h1>
|
| 21 |
<p style="text-align: center; font-size: 18px;">Welcome to one of the most exciting parts of machine learning – teaching the machine how to learn!</p>
|
| 22 |
""", unsafe_allow_html=True)
|
| 23 |
|
|
|
|
|
|
|
| 24 |
st.markdown("""
|
| 25 |
+
<h2 style='color:#9400d3;'>What is Model Training?</h2>
|
| 26 |
+
<p><strong>Model training</strong> is the process of teaching a machine learning model to recognize patterns in data.</p>
|
| 27 |
|
| 28 |
+
<p>The model learns using:</p>
|
| 29 |
+
<ul>
|
| 30 |
+
<li><strong>Data</strong> – examples we already know the answers to</li>
|
| 31 |
+
<li><strong>Algorithm</strong> – a method that helps the model learn from the data</li>
|
| 32 |
+
</ul>
|
| 33 |
|
| 34 |
+
<p>Once trained, the model can make predictions or decisions on new, unseen data.</p>
|
| 35 |
+
""", unsafe_allow_html=True)
|
| 36 |
|
|
|
|
|
|
|
| 37 |
st.markdown("""
|
| 38 |
+
<h3 style='color:#2a52be;'>For Example</h3>
|
| 39 |
+
<p>Think of yourself as a <strong>teacher</strong>, and the machine as a <strong>student</strong>.</p>
|
| 40 |
|
| 41 |
+
<p>You show your student several math problems (inputs) along with their answers (outputs). Over time, the student begins to recognize patterns and learns how to solve similar problems on their own.</p>
|
|
|
|
|
|
|
|
|
|
|
|
|
| 42 |
|
| 43 |
+
<p>That’s exactly what happens in model training:</p>
|
| 44 |
+
<ul>
|
| 45 |
+
<li>The <strong>machine is the student</strong></li>
|
| 46 |
+
<li>The <strong>data is the set of math problems and their answers</strong></li>
|
| 47 |
+
<li>The <strong>algorithm is the learning technique</strong></li>
|
| 48 |
+
</ul>
|
| 49 |
+
|
| 50 |
+
<p>After training, the model (student) is ready to solve new problems!</p>
|
| 51 |
+
""", unsafe_allow_html=True)
|
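As a rough sketch of the teacher and student analogy above, the snippet below fits a one-parameter model to a few worked examples and then answers a new question. The example values, the `fit_slope` and `predict` names, and the choice of a least-squares line through the origin are assumptions made for illustration; the page itself does not prescribe an algorithm.

```python
# Hypothetical "math problems" (inputs) and their known answers (outputs).
inputs = [1.0, 2.0, 3.0, 4.0]
answers = [2.0, 4.0, 6.0, 8.0]


def fit_slope(xs, ys):
    """Least-squares slope for a line through the origin: w = sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def predict(w, x):
    """Apply the learned pattern to a new input."""
    return w * x


w = fit_slope(inputs, answers)   # "training": learn the pattern from the examples
print(predict(w, 5.0))           # a new problem the model has not seen
```

Here the single number `w` plays the role of what the student has learned; a real model usually has many such parameters.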
| 52 |
|
|
|
|
|
|
|
| 53 |
|
|
|
|
|
|
|
| 54 |
st.markdown("""
|
| 55 |
+
<h2 style='color:#9400d3;'>Who Are We Actually Training?</h2>
|
| 56 |
+
<p>We are not training robots or humans. We are training something called a <strong>machine learning model</strong>.</p>
|
| 57 |
+
|
| 58 |
+
<p>This model is like a smart system that doesn’t know anything in the beginning. It needs examples and a method to understand those examples.</p>
|
| 59 |
+
|
| 60 |
+
<p>As programmers, we guide the machine to learn by giving it:</p>
|
| 61 |
+
<ul>
|
| 62 |
+
<li>Data – the examples to learn from</li>
|
| 63 |
+
<li>An Algorithm – the way it should learn from those examples</li>
|
| 64 |
+
</ul>
|
| 65 |
+
|
| 66 |
+
<p>With the right guidance, the machine can learn how to make decisions on its own.</p>
|
| 67 |
+
""", unsafe_allow_html=True)
|
| 68 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 69 |
|
|
|
|
|
|
|
| 70 |
st.markdown("""
|
| 71 |
+
<h2 style='color:#9400d3;'>What Does the Model Need to Learn?</h2>
|
| 72 |
+
<p>For a machine to learn, it needs just two important things:</p>
|
| 73 |
|
| 74 |
+
<p><strong>First, it needs data</strong>. This is the information the machine looks at to understand how things work.</p>
|
| 75 |
+
|
| 76 |
+
<p><strong>Second, it needs an algorithm</strong>. This tells the machine how to learn from that data.</p>
|
| 77 |
+
|
| 78 |
+
<p>The machine follows the steps given by the algorithm to learn from the data. If the learning doesn’t go well and we can’t improve the data, we usually try a different algorithm that suits the data better.</p>
|
| 79 |
+
|
| 80 |
+
<p>So, how we guide the machine using the algorithm is very important for its learning.</p>
|
| 81 |
+
""", unsafe_allow_html=True)
|
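One way to picture "trying a different algorithm on the same data" is to score two simple learners with the same error measure and keep the better one. This is only a sketch: the data values, the two learners (`fit_mean` and `fit_line`), and the use of mean absolute error are illustrative assumptions, not something the page specifies.

```python
# Hypothetical data that follows a straight-line pattern.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]


def fit_mean(xs, ys):
    """Algorithm 1: ignore the input and always predict the average output."""
    avg = sum(ys) / len(ys)
    return lambda x: avg


def fit_line(xs, ys):
    """Algorithm 2: least-squares straight line y = a * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    b = mean_y - a * mean_x
    return lambda x: a * x + b


def mean_abs_error(model, xs, ys):
    """Average size of the prediction mistakes (lower is better)."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


# Score both algorithms on the same data and keep the one with the lower error.
errors = {
    "mean": mean_abs_error(fit_mean(xs, ys), xs, ys),
    "line": mean_abs_error(fit_line(xs, ys), xs, ys),
}
best = min(errors, key=errors.get)
print(errors, best)
```

For brevity the error is measured on the same data used for fitting; a fairer comparison would score both learners on data they have not seen.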
| 82 |
|
|
|
|
|
|
|
| 83 |
|
|
|
|
|
|
|
| 84 |
st.markdown("""
|
| 85 |
+
<h2 style='color:#9400d3;'>Picking the Right Learning Style</h2>
|
| 86 |
+
<p>Once we have data, we need to choose how the machine should learn from it.</p>
|
| 87 |
+
|
| 88 |
+
<p>There are different learning styles, just like there are different ways people learn.</p>
|
| 89 |
|
| 90 |
+
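- **Supervised** – learning from labeled data
|
| 91 |
+
- **Unsupervised** – learning patterns from data without labeled answers
|
| 92 |
+
- **Semi-supervised** – learning from a small amount of labeled data plus a larger amount of unlabeled data
|
| 93 |
+
- **Reinforcement** – learning by trial and error, guided by rewards and penalties
|
| 94 |
""", unsafe_allow_html=True)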
|
| 95 |
|
|
|
|
|
|
|
| 96 |
st.markdown("""
|
| 97 |
+
<p>In supervised learning, there are two main types of tasks — classification and regression. Let’s understand the difference in a simple way.</p>
|
| 98 |
|
| 99 |
+
<p><strong>Classification</strong> is used when we want the machine to predict a category or a group.</p>
|
| 100 |
+
<p>For example, the output could be something like "Yes" or "No", or it could be types like "Apple", "Banana", or "Orange".</p>
|
| 101 |
|
| 102 |
+
<p><strong>Regression</strong> is used when we want the machine to predict a number or a value.</p>
|
| 103 |
+
<p>The output could be something like a price, a temperature, or a score.</p>
|
| 104 |
|
| 105 |
+
<p>So, the choice depends on what kind of answer we expect — a category or a number.</p>
|
| 106 |
+
|
| 107 |
+
<p>Both are powerful, and which one you use depends on the kind of problem you're solving.</p>
|
| 108 |
+
""", unsafe_allow_html=True)
|
| 109 |
|
|
|
|
|
|
|
|
|
|
| 110 |
|
|
|
|
|
|
|
| 111 |
st.markdown("""
|
| 112 |
+
<h2 style='color:#9400d3;'>How Do We Represent Data to the Model?</h2>
|
| 113 |
+
<p>When we train a machine learning model, we need to give the data in a proper structure that the model understands.</p>
|
| 114 |
|
| 115 |
+
<p>We usually write it like this:</p>
|
|
|
|
|
|
|
|
|
|
|
|
|
| 116 |
|
| 117 |
+
<p><strong>D = { (x<sub>i</sub>, y<sub>i</sub>) }</strong></p>
|
|
|
|
|
|
|
|
|
|
| 118 |
|
| 119 |
+
<p>This simply means we have a group of data points. Each data point has two parts:</p>
|
|
|
|
|
|
|
| 120 |
|
| 121 |
+
<p><strong>x<sub>i</sub></strong> is the input: the information we give to the model.</p>
|
| 122 |
+
<p><strong>y<sub>i</sub></strong> is the output: the result we want the model to learn or predict.</p>
|
|
|
|
| 123 |
|
| 124 |
+
<p>The kind of output decides the type of problem:</p>
|
| 125 |
+
<ul>
|
| 126 |
+
<li>If the output is a label or category, then it's a classification problem.</li>
|
| 127 |
+
<li>If the output is a number, then it's a regression problem.</li>
|
| 128 |
+
</ul>
|
| 129 |
|
| 130 |
+
<p>This is how we prepare the data so the model can start learning from it.</p>
|
| 131 |
+
""", unsafe_allow_html=True)
|
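The (input, output) pairing described above can be written directly as a list of tuples. The sketch below uses two made-up datasets and a hypothetical helper, `task_type`, that guesses the task from the outputs. It is a simplification: class labels stored as integers (for example 0 and 1) would be reported as regression, so the check is a rule of thumb rather than a definition.

```python
# Hypothetical datasets written as lists of (x_i, y_i) pairs.
fruit_data = [((150, "red"), "Apple"), ((120, "yellow"), "Banana")]
house_data = [((2, 70.0), 210000.0), ((3, 95.0), 285000.0)]


def task_type(dataset):
    """Guess the task from the outputs: numbers -> regression, otherwise classification."""
    targets = [y for _, y in dataset]
    # bool is excluded because Python treats True/False as integers.
    numeric = all(
        isinstance(y, (int, float)) and not isinstance(y, bool) for y in targets
    )
    return "regression" if numeric else "classification"


for name, data in [("fruit", fruit_data), ("house", house_data)]:
    x0, y0 = data[0]  # one data point: its input and its output
    print(name, x0, y0, task_type(data))
```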
|
|
|
|
|
|
| 132 |
|
|
|
|
|
|
|
| 133 |
st.markdown("""
|
| 134 |
+
<h2 style='color:#9400d3;'>Preparing the Data Before Training</h2>
|
| 135 |
+
<p>Before we train a model, we need to prepare our data in the right way.</p>
|
| 136 |
|
| 137 |
+
<p>Every dataset has two parts:</p>
|
| 138 |
+
<ul>
|
| 139 |
+
<li><strong>Features</strong>: These are the inputs. They are the columns that help us make predictions.</li>
|
| 140 |
+
<li><strong>Target</strong> (or label): This is the output. It is the column we want the machine to learn and predict.</li>
|
| 141 |
+
</ul>
|
| 142 |
+
|
| 143 |
+
<p>We first separate the features and the target from the dataset. This helps the machine understand what to learn from and what to predict.</p>
|
| 144 |
+
|
| 145 |
+
<p>This step is important because the machine needs to know what to look at (features) and what result to learn (target).</p>
|
| 146 |
+
""", unsafe_allow_html=True)
|
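Separating features from the target can be sketched in a few lines of plain Python. The table below is a tiny, illustrative sample in the style of the Iris dataset; the column names and the variables `X` and `y` are assumptions chosen to match common convention, not code taken from this page.

```python
# A tiny hypothetical table in the style of the Iris dataset (values are illustrative).
rows = [
    {"sepal_length": 5.1, "petal_length": 1.4, "species": "setosa"},
    {"sepal_length": 7.0, "petal_length": 4.7, "species": "versicolor"},
    {"sepal_length": 6.3, "petal_length": 6.0, "species": "virginica"},
]

target_column = "species"
feature_columns = [c for c in rows[0] if c != target_column]

# X holds the inputs (features); y holds the output (target) for each row.
X = [[row[c] for c in feature_columns] for row in rows]
y = [row[target_column] for row in rows]

print(feature_columns)
print(X)
print(y)
```

Row `i` of `X` and entry `i` of `y` always describe the same flower, which is what lets the model connect inputs to outputs.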
| 147 |
+
|
| 148 |
+
st.markdown("""
|
| 149 |
+
<h2 style='color:#9400d3;'>✂️ Splitting the Data</h2>
|
| 150 |
+
<p>Once we separate the features and the target, the next step is to split the data into two parts:</p>
|
| 151 |
+
|
| 152 |
+
<p><strong>One part is for training the model</strong>. This is the data the machine will use to learn.</p>
|
| 153 |
+
|
| 154 |
+
<p><strong>The other part is for testing the model</strong>. This helps us check whether the model has really learned the pattern or has only memorized the training examples.</p>
|
| 155 |
+
|
| 156 |
+
<p>Most of the time, a larger portion of the data is kept for training and a smaller portion for testing. Some common splits are 80% training and 20% testing, or 70% training and 30% testing. In some cases, 60% training and 40% testing is also used.</p>
|
| 157 |
+
|
| 158 |
+
<p>The data should be split randomly so that each data point has an equal chance of being selected. Also, the same data point should not appear in both the training and testing sets.</p>
|
| 159 |
+
|
| 160 |
+
<p>After the split, the input and output values for training are called X_train and y_train. The input and output values for testing are called X_test and y_test.</p>
|
| 161 |
|
| 162 |
+
<p>This step is important because it helps check how well the model performs on new data that was not used during training.</p>
|
| 163 |
+
""", unsafe_allow_html=True)
|
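The random, non-overlapping split described above can be sketched with the standard library alone. The function name `split_train_test`, its `test_ratio` and `seed` parameters, and the sample data are assumptions for illustration; libraries such as scikit-learn provide an equivalent `train_test_split` helper that returns the four parts in the same order.

```python
import random


def split_train_test(X, y, test_ratio=0.2, seed=0):
    """Shuffle row indices, then cut them into disjoint training and testing parts."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")
    indices = list(range(len(X)))
    # A uniform shuffle gives every row the same chance of landing in either part.
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(indices) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Hypothetical data: ten inputs and their outputs.
X = [[i] for i in range(10)]
y = [2 * i for i in range(10)]

X_train, X_test, y_train, y_test = split_train_test(X, y, test_ratio=0.2)
print(len(X_train), len(X_test))
```

Because each index is used exactly once, no data point can appear in both the training and the testing set, and fixing `seed` makes the split repeatable.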