Update pages/8Model Training.py
pages/8Model Training.py CHANGED (+27 -102)
|
@@ -22,23 +22,18 @@ st.markdown("""
|
|
| 22 |
""", unsafe_allow_html=True)
|
| 23 |
|
| 24 |
st.markdown("""
|
| 25 |
-
|
| 26 |
-
|
| 27 |
|
| 28 |
-
|
|
|
|
| 29 |
|
| 30 |
-
|
| 31 |
-
<p>• <strong>Algorithm</strong> – a method that helps the model learn from the data</p>
|
| 32 |
-
|
| 33 |
-
<p>Once the training is complete, the model can start making predictions or decisions on new, unseen data.</p>
|
| 34 |
""", unsafe_allow_html=True)
|
| 35 |
|
| 36 |
st.markdown("""
|
| 37 |
<h4 style='color:#BB3385;'>For Example</h4>
|
| 38 |
-
Think of yourself as a teacher, and the machine as a student.
|
| 39 |
-
|
| 40 |
-
You show math problems (inputs) and answers (outputs). The student starts to learn patterns.
|
| 41 |
-
|
| 42 |
Just like that:
|
| 43 |
- Machine = student
|
| 44 |
- Data = problem
|
|
@@ -49,115 +44,45 @@ After training, the model is ready to solve new problems.
|
|
| 49 |
|
| 50 |
|
| 51 |
st.markdown("""
|
| 52 |
-
|
| 53 |
-
|
| 54 |
|
| 55 |
-
|
| 56 |
|
| 57 |
-
|
| 58 |
-
|
| 59 |
|
| 60 |
-
|
| 61 |
""", unsafe_allow_html=True)
|
| 62 |
|
| 63 |
|
| 64 |
st.markdown("""
|
| 65 |
-
|
| 66 |
-
|
| 67 |
-
|
| 68 |
-
<p>There are different learning styles, just like there are different ways people learn.</p>
|
| 69 |
|
| 70 |
- **Supervised** – learning from labeled data
|
| 71 |
- **Unsupervised** – learning without answers
|
| 72 |
- **Semi-supervised** – mix of both
|
| 73 |
- **Reinforcement** – learn by doing
|
| 74 |
|
| 75 |
-
|
| 76 |
-
|
| 77 |
-
<p>• <strong>Classification</strong> is used when the goal is to predict a category or group.</p>
|
| 78 |
-
<p> For example: "Yes" or "No", or types like "Apple", "Banana", or "Orange".</p>
|
| 79 |
-
|
| 80 |
-
<p>• <strong>Regression</strong> is used when the goal is to predict a number or value.</p>
|
| 81 |
-
<p> For example: price, temperature, or a score.</p>
|
| 82 |
-
|
| 83 |
-
<p>The choice depends on the type of output expected — category or number.</p>
|
| 84 |
-
<p>Both are powerful and used in different kinds of problems.</p>
|
| 85 |
-
""", unsafe_allow_html=True)
|
| 86 |
-
|
| 87 |
-
|
| 88 |
-
st.markdown("""
|
| 89 |
-
<h3 style='color:#2a52be;'>How Do We Represent Data to the Model?</h3>
|
| 90 |
-
<p>When training a machine learning model, the data must be given in a proper structure that the model can understand.</p>
|
| 91 |
-
|
| 92 |
-
<p>This structure usually looks like this:</p>
|
| 93 |
-
<p><strong>D = { (xi, yi) }</strong></p>
|
| 94 |
-
|
| 95 |
-
<p>This means the dataset contains pairs of input and output values. Each pair has two parts:</p>
|
| 96 |
-
|
| 97 |
-
<p>• <strong>xi</strong> is the input — the information passed to the model.</p>
|
| 98 |
-
<p>• <strong>yi</strong> is the output — the result the model should learn or predict.</p>
|
| 99 |
-
|
| 100 |
-
<p>How to know what kind of problem it is:</p>
|
| 101 |
-
<p>• If the output is a label or category, it's a <strong>classification</strong> problem.</p>
|
| 102 |
-
<p>• If the output is a number, it's a <strong>regression</strong> problem.</p>
|
| 103 |
|
| 104 |
-
|
| 105 |
-
|
| 106 |
-
|
| 107 |
-
|
| 108 |
-
st.markdown("""
|
| 109 |
-
<h3 style='color:#2a52be;'>Preparing and Splitting the Data</h3>
|
| 110 |
-
|
| 111 |
-
<p>Before training a machine learning model, the data must be prepared properly. This step is very important because it helps the model understand what to learn and how to learn it.</p>
|
| 112 |
-
|
| 113 |
-
<p>Every dataset has two main parts:</p>
|
| 114 |
-
<p>- <strong>Features</strong>: These are the input columns. They provide the information used to make predictions.</p>
|
| 115 |
-
<p>- <strong>Target</strong>: This is the output column. It contains the values the model needs to learn and predict.</p>
|
| 116 |
-
|
| 117 |
-
<p>First, the features and the target are separated. This helps the model focus on what to learn from and what to predict.</p>
|
| 118 |
-
|
| 119 |
-
<p>Then, the data is split into two sets:</p>
|
| 120 |
-
<p>- One set for <strong>training</strong>: used to teach the model.</p>
|
| 121 |
-
<p>- One set for <strong>testing</strong>: used to check how well the model learned.</p>
|
| 122 |
|
| 123 |
-
|
| 124 |
-
<p>- 80% training and 20% testing</p>
|
| 125 |
-
<p>- 70% training and 30% testing</p>
|
| 126 |
-
<p>- 60% training and 40% testing</p>
|
| 127 |
-
|
| 128 |
-
<p>The split should be random so that every data point has a fair chance. A data point should appear in only one of the two sets — never both.</p>
|
| 129 |
-
|
| 130 |
-
<p>After splitting, these names are used:</p>
|
| 131 |
-
<p>- <strong>X_train</strong>: inputs for training</p>
|
| 132 |
-
<p>- <strong>y_train</strong>: target values for training</p>
|
| 133 |
-
<p>- <strong>X_test</strong>: inputs for testing</p>
|
| 134 |
-
<p>- <strong>y_test</strong>: target values for testing</p>
|
| 135 |
-
|
| 136 |
-
<p>This process ensures the model is trained properly and can be tested fairly on data it hasn’t seen before.</p>
|
| 137 |
""", unsafe_allow_html=True)
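The splitting steps described in this block can be sketched in plain Python. This is a minimal illustration, not the app's own code: the helper name `split_train_test`, the toy data, and the fixed seed are assumptions made for the example. In practice, scikit-learn's `train_test_split` is the usual tool for this job.

```python
import random


def split_train_test(features, targets, test_ratio=0.2, seed=42):
    """Randomly split paired features/targets into train and test parts.

    Hypothetical helper for illustration only.
    """
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    if not 0 < test_ratio < 1:
        raise ValueError("test_ratio must be between 0 and 1")

    # Shuffle row indices so every row has an equal chance of landing in either set.
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)

    # The first n_test shuffled indices form the test set, the rest the training set,
    # so no row can appear in both.
    n_test = round(len(indices) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    X_train = [features[i] for i in train_idx]
    X_test = [features[i] for i in test_idx]
    y_train = [targets[i] for i in train_idx]
    y_test = [targets[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


# Toy dataset: 10 rows with two feature columns and one target column.
rows = [[i, i * 2] for i in range(10)]
labels = [i % 2 for i in range(10)]

X_train, X_test, y_train, y_test = split_train_test(rows, labels, test_ratio=0.2)
print(len(X_train), len(X_test))  # 8 2
```

Fixing the seed makes the split reproducible, which helps when comparing algorithms on the same training and testing rows.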
|
| 138 |
|
| 139 |
-
|
| 140 |
st.markdown("""
|
| 141 |
-
|
| 142 |
-
<p>This diagram shows how the dataset is divided into training and testing sets.</p>
|
| 143 |
-
""", unsafe_allow_html=True)
|
| 144 |
-
|
| 145 |
-
# Sample split: 80% train, 20% test
|
| 146 |
-
train_ratio = 0.8
|
| 147 |
-
test_ratio = 0.2
|
| 148 |
-
|
| 149 |
-
fig, ax = plt.subplots(figsize=(6, 1.5))
|
| 150 |
|
| 151 |
-
|
| 152 |
-
|
| 153 |
-
|
| 154 |
|
| 155 |
-
|
| 156 |
-
|
| 157 |
-
|
| 158 |
-
ax.set_xticks([0.1, 0.3, 0.5, 0.7, 0.9])
|
| 159 |
-
ax.set_title("Train/Test Split (80/20)", fontsize=12)
|
| 160 |
-
ax.legend(loc="upper right")
|
| 161 |
-
ax.axis("off")
|
| 162 |
|
| 163 |
-
|
| 22 |
""", unsafe_allow_html=True)
|
| 23 |
|
| 24 |
st.markdown("""
|
| 25 |
+
<h3 style='color:#2a52be;'>What is Model Training?</h3>
|
| 26 |
+
<p><strong>Model training</strong> is the process of teaching a machine learning model to recognize patterns in data. The model learns with the help of:</p>
|
| 27 |
|
| 28 |
+
<p>→ <strong>Data</strong> – examples that already have correct answers</p>
|
| 29 |
+
<p>→ <strong>Algorithm</strong> – a method that helps the model learn from the data</p>
|
| 30 |
|
| 31 |
+
<p>Once the training is complete, the model can start making predictions or decisions on new, unseen data.</p>
|
| 32 |
""", unsafe_allow_html=True)
|
| 33 |
|
| 34 |
st.markdown("""
|
| 35 |
<h4 style='color:#BB3385;'>For Example</h4>
|
| 36 |
+
Think of yourself as a teacher, and the machine as a student. You show math problems (inputs) and answers (outputs). The student starts to learn patterns.
|
| 37 |
Just like that:
|
| 38 |
- Machine = student
|
| 39 |
- Data = problem
|
|
|
|
| 44 |
|
| 45 |
|
| 46 |
st.markdown("""
|
| 47 |
+
<h3 style='color:#2a52be;'>Who Are We Actually Training?</h3>
|
| 48 |
+
<p>We are not training robots or humans. We are training something called a <strong>machine learning model</strong>. This model is like a smart system that knows nothing at the beginning. It needs examples and a method for learning from those examples.</p>
|
| 49 |
|
| 50 |
+
<p>As programmers, we guide the machine's learning by providing:</p>
|
| 51 |
|
| 52 |
+
<p>▶ <strong>Data</strong> – the examples it should learn from</p>
|
| 53 |
+
<p>▶ <strong>Algorithm</strong> – the method it should use to learn from the data</p>
|
| 54 |
|
| 55 |
+
<p>With the right guidance, the machine can learn how to make decisions on its own. The machine follows the steps given by the algorithm to learn from the data. If the learning doesn’t go well, we usually don’t change the data. Instead, we try using a better algorithm that suits the data. So, how we guide the machine using the algorithm is very important for its learning.</p>
|
| 56 |
""", unsafe_allow_html=True)
|
| 57 |
|
| 58 |
|
| 59 |
st.markdown("""
|
| 60 |
+
<h3 style='color:#2a52be;'>Picking the Right Learning Style</h3>
|
| 61 |
+
<p>Now that the data is ready, we need to choose how the machine should learn from it. There are different learning styles, just like there are different ways people learn.</p>
|
| 62 |
|
| 63 |
- **Supervised** – learning from labeled data
|
| 64 |
- **Unsupervised** – learning without answers
|
| 65 |
- **Semi-supervised** – mix of both
|
| 66 |
- **Reinforcement** – learn by doing
|
| 67 |
|
| 68 |
+
<p>In supervised learning, there are two main types of tasks — classification and regression.</p>
|
| 69 |
|
| 70 |
+
<p>• <strong>Classification</strong> is used when the goal is to predict a category or group. For example: "Yes" or "No", or types like "Apple", "Banana", or "Orange".</p>
|
| 71 |
+
<p>• <strong>Regression</strong> is used when the goal is to predict a number or value. For example: price, temperature, or a score.</p>
|
| 72 |
|
| 73 |
+
<p>The choice depends on the type of output expected: a category or a number. Both are widely used, each for different kinds of problems.</p>
|
| 74 |
""", unsafe_allow_html=True)
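The rule stated in this block (category output means classification, numeric output means regression) can be sketched as a small check on the target values. This is only a rough heuristic under stated assumptions: the function name `guess_task_type` is hypothetical, and integer-coded class labels such as 0/1 would be misread as regression, so the final call still belongs to whoever knows the data.

```python
from numbers import Number


def guess_task_type(targets):
    """Return 'regression' if every target is a number, else 'classification'.

    Hypothetical helper for illustration only.
    """
    if not targets:
        raise ValueError("targets must not be empty")

    def is_numeric(value):
        # bool is a subclass of int in Python, so exclude it explicitly:
        # True/False targets are treated as categories, not numbers.
        return isinstance(value, Number) and not isinstance(value, bool)

    if all(is_numeric(t) for t in targets):
        return "regression"
    return "classification"


print(guess_task_type(["Apple", "Banana", "Orange"]))  # classification
print(guess_task_type([199.99, 249.50, 310.00]))       # regression
```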
|
| 75 |
|
|
|
|
| 76 |
st.markdown("""
|
| 77 |
+
<h3 style='color:#2a52be;'>Preparing and Splitting the Data</h3>
|
| 78 |
|
| 79 |
+
<p>Before training a machine learning model, the data must be prepared properly. This step is very important because it helps the model understand what to learn and how to learn it. Every dataset has two main parts:</p>
|
| 80 |
+
<p>- <strong>Features</strong>: These are the input columns. They provide the information used to make predictions.</p>
|
| 81 |
+
<p>- <strong>Target</strong>: This is the output column. It contains the values the model needs to learn and predict.</p>
|
| 82 |
|
| 83 |
+
<p>First, the features and the target are separated. This helps the model focus on what to learn from and what to predict. Then, the data is split into two sets:</p>
|
| 84 |
+
<p>- One set for <strong>training</strong>: used to teach the model.</p>
|
| 85 |
+
<p>- One set for <strong>testing</strong>: used to check how well the model learned.</p>
|
| 86 |
|
| 87 |
+
<p>Common ways to split the data are 80% training and 20% testing, 70% training and 30% testing, or 60% training and 40% testing.</p>
|
| 88 |
+
<p>The split should be random so that every data point has an equal chance of landing in either set. A data point should appear in only one of the two sets. After splitting, these names are used:</p>
|