Update pages/8Model Training.py
pages/8Model Training.py +101 -104
pages/8Model Training.py
CHANGED
|
@@ -7,160 +7,157 @@ import pandas as pd
|
|
| 7 |
st.set_page_config(
|
| 8 |
page_title="Model Building",
|
| 9 |
page_icon="🚀",
|
| 10 |
-
layout="wide"
|
| 11 |
-
|
| 12 |
|
| 13 |
st.markdown("""
|
| 14 |
-
<h1 style="text-align: center; color: #BB3385;"
|
| 15 |
-
<p style="text-align: center; font-size: 18px;">
|
| 16 |
-
Let’s explore how machines learn — from raw data to smart predictions.
|
| 17 |
-
</p>
|
| 18 |
""", unsafe_allow_html=True)
|
| 19 |
|
| 20 |
-
# What is model training
|
| 21 |
st.markdown("""
|
| 22 |
-
<h3 style='color:#
|
| 23 |
-
<p>Model training is the
|
| 24 |
|
| 25 |
-
<p>
|
| 26 |
|
| 27 |
-
<p>
|
| 28 |
""", unsafe_allow_html=True)
|
| 29 |
|
| 30 |
-
# Who are we training
|
| 31 |
st.markdown("""
|
| 32 |
-
|
| 33 |
-
|
| 34 |
|
| 35 |
-
|
| 36 |
""", unsafe_allow_html=True)
|
| 37 |
|
| 38 |
-
|
| 39 |
st.markdown("""
|
| 40 |
-
<h3 style='color:#
|
| 41 |
-
<p>
|
| 42 |
|
| 43 |
-
<p
|
| 44 |
-
<p
|
| 45 |
|
| 46 |
-
<p>If the
|
| 47 |
""", unsafe_allow_html=True)
|
| 48 |
|
| 49 |
-
|
| 50 |
st.markdown("""
|
| 51 |
-
<h3 style='color:#
|
| 52 |
-
<p>
|
| 53 |
|
| 54 |
-
<p
|
| 55 |
-
<p><strong>Unsupervised learning</strong> – when the data has no labels, and the model has to discover patterns on its own.</p>
|
| 56 |
-
<p><strong>Semi-supervised</strong> – when some data is labeled and some is not.</p>
|
| 57 |
-
<p><strong>Reinforcement learning</strong> – when the model learns by trial and error through rewards.</p>
|
| 58 |
|
| 59 |
-
|
| 60 |
-
|
| 61 |
|
| 62 |
-
|
| 63 |
-
|
| 64 |
-
<
|
| 65 |
-
<p>
|
| 66 |
|
| 67 |
-
<p
|
| 68 |
-
<p
|
| 69 |
|
| 70 |
-
<p>The choice depends on the type of output
|
|
|
|
| 71 |
""", unsafe_allow_html=True)
|
| 72 |
|
| 73 |
-
|
| 74 |
st.markdown("""
|
| 75 |
-
<h3 style='color:#
|
| 76 |
-
<p>
|
| 77 |
|
|
|
|
| 78 |
<p><strong>D = { (xi, yi) }</strong></p>
|
| 79 |
|
| 80 |
-
<p>
|
| 81 |
-
<p>xi is the input, and yi is the output the model should learn to predict.</p>
|
| 82 |
-
|
| 83 |
-
<p>If the output is a label, it’s classification. If it’s a number, it’s regression.</p>
|
| 84 |
-
""", unsafe_allow_html=True)
|
| 85 |
|
| 86 |
-
|
| 87 |
-
|
| 88 |
-
<h3 style='color:#9400d3;'>Preparing the Data Before Training</h3>
|
| 89 |
-
<p>Every dataset has two parts:</p>
|
| 90 |
|
| 91 |
-
<p>
|
| 92 |
-
<p
|
|
|
|
| 93 |
|
| 94 |
-
<p>
|
| 95 |
""", unsafe_allow_html=True)
|
| 96 |
|
| 97 |
-
|
| 98 |
st.markdown("""
|
| 99 |
-
<h3 style='color:#
|
| 100 |
-
<p>Once the features and target are ready, the next step is to divide the data into two sets.</p>
|
| 101 |
|
| 102 |
-
<p>
|
| 103 |
-
<p>The other set is for testing. This helps check if the model has learned well enough to work on new data.</p>
|
| 104 |
|
| 105 |
-
<p>
|
| 106 |
-
<p>
|
|
|
|
| 107 |
|
| 108 |
-
<p>
|
| 109 |
-
<p>X_train and y_train for the training data.</p>
|
| 110 |
-
<p>X_test and y_test for the testing data.</p>
|
| 111 |
|
| 112 |
-
<p>
|
| 113 |
-
|
|
|
|
| 114 |
|
|
|
|
|
|
|
|
|
|
|
|
|
| 115 |
|
| 116 |
-
|
| 117 |
-
<h2 style='color:#9400d3;'>Summary</h2>
|
| 118 |
-
<p>Model building is all about teaching a machine to learn from data.</p>
|
| 119 |
-
|
| 120 |
-
<p>✔️ Model training means helping a machine understand patterns using data and an algorithm.</p>
|
| 121 |
-
<p>✔️ We train a machine learning model — a smart system that learns to make predictions.</p>
|
| 122 |
-
<p>✔️ The machine needs data (examples) and an algorithm (learning method).</p>
|
| 123 |
-
<p>✔️ We choose how the machine learns — supervised, unsupervised, semi-supervised, or reinforcement.</p>
|
| 124 |
-
<p>✔️ In supervised learning, we use classification (for categories) or regression (for numbers).</p>
|
| 125 |
-
<p>✔️ Data is represented as input-output pairs like (xi, yi).</p>
|
| 126 |
-
<p>✔️ We prepare data by separating inputs (features) and outputs (target).</p>
|
| 127 |
-
<p>✔️ Finally, we split the data into training and testing sets to help the model learn and to evaluate its performance.</p>
|
| 128 |
-
""", unsafe_allow_html=True)
|
| 129 |
|
| 130 |
|
| 131 |
-
|
| 132 |
-
<h3 style='color:#9400d3;'>🤖 Try It Out: Train a Simple Model</h3>
|
| 133 |
-
<p>This example shows how a basic machine learning model is trained using a few lines of code.</p>
|
| 134 |
""", unsafe_allow_html=True)
|
| 135 |
|
| 136 |
-
from sklearn.linear_model import LinearRegression
|
| 137 |
-
from sklearn.model_selection import train_test_split
|
| 138 |
-
from sklearn.metrics import mean_squared_error
|
| 139 |
|
| 140 |
-
|
| 141 |
-
|
| 142 |
-
|
|
|
|
| 143 |
|
| 144 |
-
#
|
| 145 |
-
|
|
|
|
| 146 |
|
| 147 |
-
|
| 148 |
-
model = LinearRegression()
|
| 149 |
-
model.fit(X_train, y_train)
|
| 150 |
|
| 151 |
-
#
|
| 152 |
-
|
|
|
|
| 153 |
|
| 154 |
-
#
|
| 155 |
-
|
| 156 |
-
|
| 157 |
|
| 158 |
-
# Show plot
|
| 159 |
-
fig, ax = plt.subplots()
|
| 160 |
-
ax.scatter(X_test, y_test, label="Actual", color="blue")
|
| 161 |
-
ax.plot(X_test, y_pred, color="red", label="Predicted Line")
|
| 162 |
-
ax.set_xlabel("X")
|
| 163 |
-
ax.set_ylabel("y")
|
| 164 |
-
ax.set_title("Actual vs Predicted")
|
| 165 |
-
ax.legend()
|
| 166 |
st.pyplot(fig)
|
|
|
|
| 7 |
st.set_page_config(
|
| 8 |
page_title="Model Building",
|
| 9 |
page_icon="🚀",
|
| 10 |
+
layout="wide")
|
| 11 |
+
|
| 12 |
+
if "current_page" not in st.session_state:
|
| 13 |
+
st.session_state.current_page = "main"
|
| 14 |
+
|
| 15 |
+
def navigate_to(page_name):
|
| 16 |
+
st.session_state.current_page = page_name
|
| 17 |
+
|
| 18 |
|
| 19 |
st.markdown("""
|
| 20 |
+
<h1 style="text-align: center; color: #BB3385;">Model Building</h1>
|
| 21 |
+
<p style="text-align: center; font-size: 18px;">Welcome to one of the most exciting parts of machine learning – teaching the machine how to learn!</p>
|
| 22 |
""", unsafe_allow_html=True)
|
| 23 |
|
|
|
|
| 24 |
st.markdown("""
|
| 25 |
+
<h3 style='color:#2a52be;'>What is Model Training?</h3>
|
| 26 |
+
<p><strong>Model training</strong> is the process of teaching a machine learning model to understand patterns from data.</p>
|
| 27 |
|
| 28 |
+
<p>The model learns with the help of:</p>
|
| 29 |
|
| 30 |
+
<p>• <strong>Data</strong> – examples that already have correct answers</p>
|
| 31 |
+
<p>• <strong>Algorithm</strong> – a method that helps the model learn from the data</p>
|
| 32 |
+
|
| 33 |
+
<p>Once the training is complete, the model can start making predictions or decisions on new, unseen data.</p>
|
| 34 |
""", unsafe_allow_html=True)
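To make the idea of "data plus algorithm" concrete, here is a minimal sketch of training in plain Python. It is not the page's own code: the function name `fit_line`, the sample numbers, and the choice of ordinary least squares as the learning algorithm are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Learn slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the inputs, and how inputs and outputs vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: examples that already have correct answers.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# "Training" means running the algorithm on the data.
slope, intercept = fit_line(xs, ys)

# After training, the model can predict for an input it has never seen.
prediction = slope * 10.0 + intercept
print(slope, intercept, prediction)
```

The learned pair (`slope`, `intercept`) is the "model" here; a library estimator such as scikit-learn's `LinearRegression` plays the same role with more features.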
|
| 35 |
|
|
|
|
| 36 |
st.markdown("""
|
| 37 |
+
<h4 style='color:#BB3385;'>For Example</h4>
|
| 38 |
+
Think of yourself as a teacher, and the machine as a student.
|
| 39 |
+
|
| 40 |
+
You show math problems (inputs) and answers (outputs). The student starts to learn patterns.
|
| 41 |
+
|
| 42 |
+
Just like that:
|
| 43 |
+
- Machine = student
|
| 44 |
+
- Data = problems with their answers
|
| 45 |
+
- Algorithm = learning method
|
| 46 |
|
| 47 |
+
After training, the model is ready to solve new problems.
|
| 48 |
""", unsafe_allow_html=True)
|
| 49 |
|
| 50 |
+
|
| 51 |
st.markdown("""
|
| 52 |
+
<h3 style='color:#2a52be;'>Who Are We Actually Training?</h3>
|
| 53 |
+
<p>We are training machines to learn: not robots or humans, but something called a <strong>machine learning model</strong>. This model is like a smart system that doesn’t know anything in the beginning. It needs examples and a method to understand those examples.</p>
|
| 54 |
+
|
| 55 |
+
<p>As programmers, we guide the machine to learn by providing:</p>
|
| 56 |
|
| 57 |
+
<p>• <strong>Data</strong> – the examples it should learn from</p>
|
| 58 |
+
<p>• <strong>Algorithm</strong> – the method it should use to learn from the data</p>
|
| 59 |
|
| 60 |
+
<p>With the right guidance, the machine can learn how to make decisions on its own. The machine follows the steps given by the algorithm to learn from the data. If the learning doesn’t go well, we usually don’t change the data. Instead, we try using a better algorithm that suits the data. So, how we guide the machine using the algorithm is very important for its learning.</p>
|
| 61 |
""", unsafe_allow_html=True)
|
| 62 |
|
| 63 |
+
|
| 64 |
st.markdown("""
|
| 65 |
+
<h3 style='color:#2a52be;'>Picking the Right Learning Style</h3>
|
| 66 |
+
<p>Now that the data is ready, we need to choose how the machine should learn from it.</p>
|
| 67 |
|
| 68 |
+
<p>There are different learning styles, just like there are different ways people learn.</p>
|
| 69 |
|
| 70 |
+
- **Supervised** – learning from labeled data
|
| 71 |
+
- **Unsupervised** – learning without answers
|
| 72 |
+
- **Semi-supervised** – mix of both
|
| 73 |
+
- **Reinforcement** – learn by doing
|
| 74 |
|
| 75 |
+
<p>In supervised learning, there are two main types of tasks — classification and regression.</p>
|
| 76 |
+
|
| 77 |
+
<p>• <strong>Classification</strong> is used when the goal is to predict a category or group.</p>
|
| 78 |
+
<p> For example: "Yes" or "No", or types like "Apple", "Banana", or "Orange".</p>
|
| 79 |
|
| 80 |
+
<p>• <strong>Regression</strong> is used when the goal is to predict a number or value.</p>
|
| 81 |
+
<p> For example: price, temperature, or a score.</p>
|
| 82 |
|
| 83 |
+
<p>The choice depends on the type of output expected — category or number.</p>
|
| 84 |
+
<p>Both are powerful and used in different kinds of problems.</p>
|
| 85 |
""", unsafe_allow_html=True)
|
| 86 |
|
| 87 |
+
|
| 88 |
st.markdown("""
|
| 89 |
+
<h3 style='color:#2a52be;'>How Do We Represent Data to the Model?</h3>
|
| 90 |
+
<p>When training a machine learning model, the data must be given in a proper structure that the model can understand.</p>
|
| 91 |
|
| 92 |
+
<p>This structure usually looks like this:</p>
|
| 93 |
<p><strong>D = { (xi, yi) }</strong></p>
|
| 94 |
|
| 95 |
+
<p>This means the dataset contains pairs of input and output values. Each pair has two parts:</p>
|
| 96 |
|
| 97 |
+
<p>• <strong>xi</strong> is the input — the information passed to the model.</p>
|
| 98 |
+
<p>• <strong>yi</strong> is the output — the result the model should learn or predict.</p>
|
| 99 |
|
| 100 |
+
<p>How to know what kind of problem it is:</p>
|
| 101 |
+
<p>• If the output is a label or category, it's a <strong>classification</strong> problem.</p>
|
| 102 |
+
<p>• If the output is a number, it's a <strong>regression</strong> problem.</p>
|
| 103 |
|
| 104 |
+
<p>This is how the data is organized so the model can start learning from it effectively.</p>
|
| 105 |
""", unsafe_allow_html=True)
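As a rough illustration of the D = { (xi, yi) } structure described above, the sketch below stores a few made-up input-output pairs and separates them into inputs and outputs. The sample values and the helper name `guess_task_type` are hypothetical and not part of the page's code; the rule it applies is the one stated in the text (labels mean classification, numbers mean regression).

```python
# Hypothetical toy dataset D = {(x_i, y_i)}: each pair is (input, output).
# Inputs here are (hours studied, hours slept); outputs are exam scores.
dataset = [
    ((2.0, 7.0), 55.0),
    ((4.0, 6.0), 68.0),
    ((6.0, 8.0), 81.0),
]

# Separate the pairs into inputs (X) and outputs (y).
X = [x for x, _ in dataset]
y = [target for _, target in dataset]


def guess_task_type(targets):
    """Return 'regression' for numeric targets, else 'classification'.

    A simplification for illustration: numeric class codes such as 0/1
    would be reported as regression by this check.
    """
    numeric = all(
        isinstance(t, (int, float)) and not isinstance(t, bool) for t in targets
    )
    return "regression" if numeric else "classification"


print(guess_task_type(y))              # numeric scores
print(guess_task_type(["Yes", "No"]))  # text labels
```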
|
| 106 |
|
| 107 |
+
|
| 108 |
st.markdown("""
|
| 109 |
+
<h3 style='color:#2a52be;'>Preparing and Splitting the Data</h3>
|
|
|
|
| 110 |
|
| 111 |
+
<p>Before training a machine learning model, the data must be prepared properly. This step is very important because it helps the model understand what to learn and how to learn it.</p>
|
|
|
|
| 112 |
|
| 113 |
+
<p>Every dataset has two main parts:</p>
|
| 114 |
+
<p>- <strong>Features</strong>: These are the input columns. They provide the information used to make predictions.</p>
|
| 115 |
+
<p>- <strong>Target</strong>: This is the output column. It contains the values the model needs to learn and predict.</p>
|
| 116 |
|
| 117 |
+
<p>First, the features and the target are separated. This helps the model focus on what to learn from and what to predict.</p>
|
| 118 |
|
| 119 |
+
<p>Then, the data is split into two sets:</p>
|
| 120 |
+
<p>- One set for <strong>training</strong>: used to teach the model.</p>
|
| 121 |
+
<p>- One set for <strong>testing</strong>: used to check how well the model learned.</p>
|
| 122 |
|
| 123 |
+
<p>Common ways to split the data include:</p>
|
| 124 |
+
<p>- 80% training and 20% testing</p>
|
| 125 |
+
<p>- 70% training and 30% testing</p>
|
| 126 |
+
<p>- 60% training and 40% testing</p>
|
| 127 |
|
| 128 |
+
<p>The split should be random so that every data point has a fair chance. A data point should appear in only one of the two sets — never both.</p>
|
| 129 |
|
| 130 |
+
<p>After splitting, these names are used:</p>
|
| 131 |
+
<p>- <strong>X_train</strong>: inputs for training</p>
|
| 132 |
+
<p>- <strong>y_train</strong>: target values for training</p>
|
| 133 |
+
<p>- <strong>X_test</strong>: inputs for testing</p>
|
| 134 |
+
<p>- <strong>y_test</strong>: target values for testing</p>
|
| 135 |
|
| 136 |
+
<p>This process ensures the model is trained properly and can be tested fairly on data it hasn’t seen before.</p>
|
| 137 |
""", unsafe_allow_html=True)
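The splitting rules described above (random assignment, each data point in exactly one set, an 80/20 ratio) can be sketched with the standard library alone. The helper name `split_train_test`, its parameters, and the toy data are assumptions for illustration; in practice scikit-learn's `train_test_split(X, y, test_size=0.2, random_state=42)` does the same job and returns the four parts in the same order.

```python
import random


def split_train_test(X, y, test_ratio=0.2, seed=42):
    """Randomly split paired inputs and targets into train and test sets."""
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")
    indices = list(range(len(X)))
    # Shuffle row positions so every data point has a fair chance;
    # the seed only makes the shuffle reproducible.
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(X) * test_ratio))
    test_idx = indices[:n_test]
    train_idx = indices[n_test:]  # disjoint from test_idx by construction
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Hypothetical data: 10 rows with one feature each, target = 2 * x + 1.
X = [[i] for i in range(10)]
y = [2 * i + 1 for i in range(10)]

X_train, X_test, y_train, y_test = split_train_test(X, y, test_ratio=0.2)
print(len(X_train), len(X_test))  # 8 2
```

Splitting by shuffled indices keeps each input aligned with its own target, which is easy to get wrong when the two lists are shuffled separately.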
|
| 138 |
|
| 139 |
|
| 140 |
+
st.markdown("""
|
| 141 |
+
<h4 style='color:#2a52be;'>Visual: Train/Test Split</h4>
|
| 142 |
+
<p>This diagram shows how the dataset is divided into training and testing sets.</p>
|
| 143 |
+
""", unsafe_allow_html=True)
|
| 144 |
|
| 145 |
+
# Sample split: 80% train, 20% test
|
| 146 |
+
train_ratio = 0.8
|
| 147 |
+
test_ratio = 0.2
|
| 148 |
|
| 149 |
+
fig, ax = plt.subplots(figsize=(6, 1.5))
|
| 150 |
|
| 151 |
+
# Plotting the train/test areas
|
| 152 |
+
ax.barh(y=0, width=train_ratio, color="#66c2a5", edgecolor='black', label='Training Data')
|
| 153 |
+
ax.barh(y=0, width=test_ratio, left=train_ratio, color="#fc8d62", edgecolor='black', label='Testing Data')
|
| 154 |
|
| 155 |
+
# Formatting
|
| 156 |
+
ax.set_xlim(0, 1)
|
| 157 |
+
ax.set_yticks([])
|
| 158 |
+
ax.set_xticks([0.1, 0.3, 0.5, 0.7, 0.9])
|
| 159 |
+
ax.set_title("Train/Test Split (80/20)", fontsize=12)
|
| 160 |
+
ax.legend(loc="upper right")
|
| 161 |
+
ax.axis("off")
|
| 162 |
|
| 163 |
st.pyplot(fig)
|