Upload 61 files
This view is limited to 50 files because it contains too many changes.
- .gitattributes +3 -0
- .gitignore +0 -0
- AlphanumericQR.ttf +3 -0
- BallpointPenNHK-Regular.ttf +3 -0
- Carnival.ttf +0 -0
- Chesilin.ttf +0 -0
- Daemon.otf +0 -0
- Daemon_Training.png +0 -0
- EULA BallpointPenNHK.pdf +3 -0
- FortuneTeller.ttf +0 -0
- Jajapanan.ttf +0 -0
- Kakuji.ttf +0 -0
- Ling_Ling_-_Cost_vs_Accuracy.png +0 -0
- Ling_Ling_-_Six_Predictions.png +0 -0
- LogisticRegression.md +22 -0
- MusicNotation.ttf +0 -0
- Photograph Signature.ttf +0 -0
- PyTorchClass.zip +3 -0
- README.md +608 -14
- ShallowNeuralNetwork.md +86 -0
- SoftmaxRegression.md +16 -0
- UpperLower.svg +309 -0
- ZXX Bold.otf +0 -0
- ZXX Camo.otf +0 -0
- ZXX False.otf +0 -0
- ZXX Noise.otf +0 -0
- ZXX Sans.otf +0 -0
- ZXX Xed.otf +0 -0
- after_logistic.png +0 -0
- after_shallow.png +0 -0
- after_softmax.png +0 -0
- after_training_words.png +0 -0
- after_training_words_Chesilin.png +0 -0
- app.py +219 -0
- before_logistic.png +0 -0
- before_shallow.png +0 -0
- before_softmax.png +0 -0
- bullets4.ttf +0 -0
- convolutional_neural_networks.md +703 -0
- convolutional_neural_networks.py +69 -0
- dataset_loader.py +69 -0
- deep_networks.md +356 -0
- deep_networks.py +80 -0
- final_project.md +29 -0
- final_project.py +260 -0
- font.py +138 -0
- install.bat +7 -0
- install.sh +7 -0
- logistic_regression.py +65 -0
- metrics.png +0 -0
.gitattributes
CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+AlphanumericQR.ttf filter=lfs diff=lfs merge=lfs -text
+BallpointPenNHK-Regular.ttf filter=lfs diff=lfs merge=lfs -text
+EULA[[:space:]]BallpointPenNHK.pdf filter=lfs diff=lfs merge=lfs -text
.gitignore
ADDED
File without changes
AlphanumericQR.ttf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c16ec58d9b2ece98c013a72666a031d4b13b7349d33b19e306b9966303121977
size 105364
BallpointPenNHK-Regular.ttf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:82108314e1dcd918c09403edc6ec88218f56af0628c95bfc1d475386d7b4ae83
size 309684
Carnival.ttf
ADDED
Binary file (13.3 kB)

Chesilin.ttf
ADDED
Binary file (31.8 kB)

Daemon.otf
ADDED
Binary file (23.3 kB)

Daemon_Training.png
ADDED
EULA BallpointPenNHK.pdf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fa81b3b9cb5e2ea540922ade44185601ba9b2a745ce477f563dddf5351c51889
size 643136
FortuneTeller.ttf
ADDED
Binary file (38.4 kB)

Jajapanan.ttf
ADDED
Binary file (21 kB)

Kakuji.ttf
ADDED
Binary file (26.3 kB)

Ling_Ling_-_Cost_vs_Accuracy.png
ADDED

Ling_Ling_-_Six_Predictions.png
ADDED
LogisticRegression.md
ADDED
@@ -0,0 +1,22 @@
7️⃣ **Cross Entropy Loss: Teaching AI to Learn Better** 🎯
- AI makes mistakes, so we **measure how bad they are** using a **loss function**.
- Cross Entropy Loss helps AI **learn from its mistakes** and get better.
- Instead of guessing randomly, the AI **adjusts itself to improve its answers**.

8️⃣ **Backpropagation: AI Fixing Its Own Mistakes** 🔄
- AI learns by **guessing, checking, and fixing mistakes**.
- It uses **backpropagation** to update itself, just like **learning from practice**.
- This helps AI **get smarter every time it trains**.

9️⃣ **Multi-Class Neural Networks: Picking the Best Answer** 🎨
- AI doesn’t always choose between **just two things**; sometimes, it picks from **many choices**!
- It uses **Softmax** to figure out which answer is **most likely**.
- This helps in **image recognition, language processing, and more**!

🔟 **Activation Functions: Helping AI Think Faster** ⚡
- AI uses **activation functions** to **decide which patterns matter**.
- Three important ones:
  - **Sigmoid** → Helps with probabilities.
  - **Tanh** → Balances data better.
  - **ReLU** → Fastest and most useful!
- These make AI **learn faster and make better decisions**!
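The "guessing, checking, and fixing mistakes" loop from the backpropagation notes above can be sketched for a single sigmoid neuron trained with cross-entropy. This is an illustrative stand-alone example, not code from the repository; the input, label, starting weight, and learning rate are all made up:

```python
import math

def sigmoid(z):
    # Squash a real number into (0, 1) so it can act as a probability.
    return 1.0 / (1.0 + math.exp(-z))

# One training example: input x with true label y = 1.
x, y = 2.0, 1.0
w, b, lr = 0.1, 0.0, 0.5   # made-up starting weight, bias, learning rate

for step in range(3):
    p = sigmoid(w * x + b)            # guess
    loss = -math.log(p)               # check (cross-entropy for y = 1)
    dz = p - y                        # backpropagation: gradient at the output
    w -= lr * dz * x                  # fix the weight...
    b -= lr * dz                      # ...and the bias
    print(f"step {step}: loss={loss:.3f}")
```

The printed loss shrinks at every step: each "fix" moves the weight and bias in the direction that makes the guess less wrong.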
MusicNotation.ttf
ADDED
Binary file (47.9 kB)

Photograph Signature.ttf
ADDED
Binary file (37.3 kB)

PyTorchClass.zip
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:3d1c7a965cb19f8338176d8377fa077731290baeaa2844144c5ac4e0174cf0e5
size 22431051
README.md
CHANGED
@@ -1,14 +1,608 @@
# 🧠 **What is Logistic Regression?**
Imagine you have a **robot** that tries to guess if a fruit is an 🍎 **apple** or a 🍌 **banana**.
- The robot uses **Logistic Regression** to make its guess.
- It looks at things like the fruit’s **color**, **shape**, and **size** to decide.
- The robot gives a score from **0 to 1**:
  - 0 → Definitely a banana 🍌
  - 1 → Definitely an apple 🍎
  - 0.5 → The robot is unsure 🤖

## 🔥 **What does the notebook do?**
1. **Makes fake data** → It creates pretend fruits with made-up colors and sizes.
2. **Builds the Logistic Regression model** → This is the robot that learns how to guess.
3. **Trains the robot** → It lets the robot practice guessing until it gets better.
4. **Shows why bad initialization is bad** → If the robot starts with **wrong guesses**, it takes a long time to learn.
   - Good start ➡️ 🟢 The robot learns fast.
   - Bad start ➡️ 🔴 The robot takes forever or never learns properly.
5. **Shows how to fix bad initialization** → We can **reinitialize** the robot with **random weights** to start with good guesses.
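The robot's 0-to-1 score comes from the sigmoid (logistic) function applied to a weighted sum of the features. A minimal stdlib-Python sketch; the feature values and weights below are made up for illustration, not taken from the notebook:

```python
import math

def sigmoid(z: float) -> float:
    # Squash any real number into the (0, 1) range.
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias):
    # Weighted sum of the inputs, then sigmoid -> probability of "apple".
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

# Hypothetical fruit: [redness, roundness] with made-up weights.
p = predict([0.9, 0.8], [2.0, 1.5], -1.0)
print(round(p, 3))  # 0.881 -> probably an apple
```

Training (and reinitializing) just means adjusting `weights` and `bias` until these probabilities match the labels.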
# 🧠 **What is Cross-Entropy?**
Imagine you are playing a **guessing game** with a 🦉 **wise owl**.
- The owl has to guess if a fruit is an 🍎 **apple** or a 🍌 **banana**.
- The owl makes a **prediction** (for example: 90% sure it’s an apple).
- If the owl is **right**, it gets a ⭐️.
- If the owl is **wrong**, it gets a 👎.

**Cross-Entropy** is like a **scorekeeper**:
- If the owl guesses correctly ➡️ **low score** 🟢 (good)
- If the owl guesses wrong ➡️ **high score** 🔴 (bad)

## 🔥 **What does the notebook do?**
1. **Makes fake fruit data** → It creates pretend fruits with random colors and shapes.
2. **Builds the Logistic Regression model** → This is the owl’s brain that makes guesses.
3. **Trains the model with Cross-Entropy** → It helps the owl learn by keeping score.
4. **Improves accuracy** → The owl gets better at guessing with practice by trying to lower its Cross-Entropy score.
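The scorekeeper has a simple formula. For a true label `y_true` (1 = apple, 0 = banana) and a predicted probability `y_pred`, binary cross-entropy can be computed like this (a stand-alone sketch, not the notebook's PyTorch code):

```python
import math

def cross_entropy(y_true: float, y_pred: float) -> float:
    # Confident correct guesses give a low score,
    # confident wrong guesses a very high one.
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

print(round(cross_entropy(1, 0.9), 3))   # good guess -> low loss (0.105)
print(round(cross_entropy(1, 0.1), 3))   # bad guess  -> high loss (2.303)
```

Training means nudging the model's weights so this score goes down on average.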
# 🧠 **What is Softmax?**
Imagine you have a bag of colorful candies. Each candy represents a possible answer (like cat, dog, or bird). The **Softmax function** is like a magical machine that takes all the candies and tells you the **probability** of each one being picked.

For example:
- 🍬 → 😺 **Cat** → 70% chance
- 🍬 → 🐶 **Dog** → 20% chance
- 🍬 → 🐦 **Bird** → 10% chance

Softmax makes sure that all the probabilities add up to **100%** (because one of them will definitely be the right answer).

## 🔥 **What does the notebook do?**
1. **Makes fake data** → It creates some pretend candies (data points) to practice with.
2. **Builds the Softmax classifier** → This is the machine that guesses which candy you will pick based on its features.
3. **Trains the model** → It lets the machine practice guessing so it gets better at it.
4. **Shows the results** → It checks how good the machine is at guessing the correct candy.
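The "magical machine" is a short formula: exponentiate each raw score and divide by the total. A stdlib sketch with made-up scores for cat/dog/bird:

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability, exponentiate, then
    # normalize so the outputs are positive and sum to 1 (i.e. 100%).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for cat / dog / bird:
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 2) for p in probs])  # largest score -> largest probability
```

Note that softmax keeps the ranking of the raw scores; it only rescales them into probabilities.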
# 📚 Understanding Softmax and MNIST 🖊️

## 1️⃣ What are we doing?
We want to teach a computer how to recognize numbers (0-9) by looking at images. Just like how you can tell the difference between a "2" and a "5", we want the computer to do the same!

## 2️⃣ What is MNIST? 🤔
MNIST is a big collection of handwritten numbers. People have written digits (0-9) on paper, and all those images were put into a dataset for computers to learn from.

## 3️⃣ What is a Softmax Classifier? 🤖
A **Softmax Classifier** is like a decision-maker. When it sees a number, it checks **how sure** it is that the number is a 0, 1, 2, etc. It picks the number it is most confident about.

Think of it like:
- You see a blurry animal. 🐶🐱🐭
- You think: "It **looks** like a dog, but **maybe** a cat."
- You decide: "I'm **80% sure** it's a dog, **15% sure** it's a cat, and **5% sure** it's a mouse."
- You pick the one you're most sure about → 🐶 Dog!

That's exactly how Softmax works, but with numbers instead of animals!

## 4️⃣ How do we train the computer? 🎓
1. We **show** the computer many images of numbers. 📸
2. It **tries to guess** what number is in the image. 🔢
3. If it's wrong, we **correct** it and help it learn. 📚
4. After training, it becomes **really good** at recognizing numbers! 🚀

## 5️⃣ What will we do in the notebook? 📝
- Load the MNIST dataset. 📊
- Build a Softmax Classifier. 🏗️
- Train it to recognize numbers. 🏋️‍♂️
- Test if it works! ✅

Let's start teaching our computer to recognize numbers! 🧠💡
# 🧠 Building a Simple Neural Network! 🤖

## 1️⃣ What are we doing? 🎯
We are teaching a computer to recognize patterns! It will learn from examples and make smart guesses, just like how you learn from practice.

## 2️⃣ What is a Neural Network? 🕸️
A **neural network** is like a **tiny brain** inside a computer. It looks at data, finds patterns, and makes decisions.

Imagine your brain trying to recognize your best friend:
- Your **eyes** see their face. 👀
- Your **brain** processes what you see. 🧠
- You **decide**: "Hey, that's my friend!" 🎉

A neural network does the same thing but with numbers!

## 3️⃣ What is a Hidden Layer? 🤔
A **hidden layer** is like a smart helper inside the network. It helps break down complex problems step by step.

Think of it like:
- 🏠 A house → **Too big to understand at once!**
- 🧱 A hidden layer **breaks it down**: first walls, then windows, then doors!
- 🏗️ This makes it easier to recognize and understand!

## 4️⃣ How do we train the computer? 🎓
1. We **show** it some data (like numbers or pictures). 👀
2. It **guesses** what it sees. 🤔
3. If it’s **wrong**, we **correct** it! ✏️
4. After **practicing a lot**, it becomes **really good** at guessing. 🚀

## 5️⃣ What will we do in the notebook? 📝
- **Build a simple neural network** with **one hidden layer**. 🏗️
- **Give it some data** to learn from. 📊
- **Train it** so it gets better. 🏋️‍♂️
- **Test it** to see if it works! ✅

By the end, our computer will be **smarter** and ready to recognize patterns! 🧠💡
# 🤖 Making a Smarter Neural Network! 🧠

## 1️⃣ What are we doing? 🎯
We are making a **better and smarter brain** for the computer! Instead of just one smart helper (neuron), we will have **many neurons working together**!

## 2️⃣ What are Neurons? ⚡
Neurons are like **tiny workers** inside a neural network. They take information, process it, and pass it along. The more neurons we have, the **smarter** our network becomes!

Think of it like:
- 🏗️ A simple house = **one worker** 🛠️ (slow)
- 🏙️ A big city = **many workers** 🏗️ (faster & better!)

## 3️⃣ Why More Neurons? 🤔
More neurons mean:
✅ The network **understands more details**.
✅ It **learns better** and makes **fewer mistakes**.
✅ It can solve **harder problems**!

Imagine:
- One person trying to solve a big puzzle 🧩 = **hard**
- A team of people working together = **faster & easier!**

## 4️⃣ How do we train it? 🎓
1. **Give it some data** 📊
2. **Let the neurons think** 🧠
3. **If it’s wrong, we correct it** 📚
4. **After practice, it gets really smart!** 🚀

## 5️⃣ What will we do in the notebook? 📝
- **Build a bigger neural network** with more neurons! 🏗️
- **Feed it data to learn from** 📊
- **Train it to get better** 🏋️‍♂️
- **Test it to see how smart it is!** ✅

By the end, our computer will be **super smart** at recognizing patterns! 🧠💡
# 🤖 Teaching a Computer to Solve XOR! 🧠

## 1️⃣ What are we doing? 🎯
We are teaching a computer to understand a special kind of problem called **XOR**. It's like a puzzle where the answer is only "Yes" when things are different.

## 2️⃣ What is XOR? ❌🔄✅
XOR is a rule that works like this:
- If two things are the **same** → ❌ NO
- If two things are **different** → ✅ YES

Example:

| Input 1 | Input 2 | XOR Output |
|---------|---------|------------|
| 0       | 0       | 0 ❌       |
| 0       | 1       | 1 ✅       |
| 1       | 0       | 1 ✅       |
| 1       | 1       | 0 ❌       |

It's like a **light switch** that only turns on if one switch is flipped!

## 3️⃣ Why is XOR tricky for computers? 🤔
A single neuron **can’t learn XOR** by itself — the network needs a **hidden layer** with **multiple neurons** to figure it out!

## 4️⃣ What do we do in this notebook? 📝
- **Create a neural network** with one hidden layer 🏗️
- **Train it** to learn the XOR rule 🎓
- **Try different numbers of neurons** (1, 2, 3...) to see what works best! ⚡

By the end, our computer will **solve the XOR puzzle** and be smarter! 🧠🚀
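The notebook trains such a network; here is a hand-wired sketch showing that one hidden layer with two neurons is enough to compute XOR. The weights and thresholds below are chosen by hand for illustration, not learned:

```python
def step(z):
    # Hard threshold activation: fires (1) when the input is positive.
    return 1 if z > 0 else 0

def xor_net(a, b):
    # Hidden neuron 1 fires for "a OR b"; hidden neuron 2 fires for "a AND b".
    h1 = step(a + b - 0.5)
    h2 = step(a + b - 1.5)
    # Output neuron: OR minus AND, i.e. "exactly one input is on".
    return step(h1 - h2 - 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))
```

No single neuron can draw one straight line that separates the ✅ rows from the ❌ rows in the table above, which is exactly why the hidden layer is needed.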
# 🧠 Teaching a Computer to Read Numbers! 🔢🤖

## 1️⃣ What are we doing? 🎯
We are training a **computer brain** to look at pictures of numbers (0-9) and guess what they are!

## 2️⃣ What is the MNIST Dataset? 📸
MNIST is a **big collection of handwritten numbers** that we use to teach computers how to recognize digits.

## 3️⃣ How does the Computer Learn? 🏗️
- The computer looks at **lots of examples** of numbers. 👀
- It tries to guess what number each image shows. 🤔
- If it’s **wrong**, we help it learn and get better! 📚
- After **lots of practice**, it becomes really smart! 🚀

## 4️⃣ What’s Special About This Network? 🤔
We are using a **simple neural network** with **one hidden layer**. This layer helps the computer **understand patterns** in the numbers!

## 5️⃣ What Will We Do in This Notebook? 📝
- **Build a simple neural network** with **one hidden layer**. 🏗️
- **Train it** to recognize numbers. 🎓
- **Test it** to see how smart it is! ✅

By the end, our computer will **read numbers just like you!** 🧠💡
# ⚡ Making the Computer Think Better! 🧠

## 1️⃣ What are we doing? 🎯
We are learning about **activation functions** – special rules that help a computer **decide things**!

## 2️⃣ What is an Activation Function? 🤔
Think of a **light switch**! 💡
- If you turn it **ON**, the light shines.
- If you turn it **OFF**, the light is dark.

Activation functions help a computer **decide** what to focus on, just like flipping a switch!

## 3️⃣ Types of Activation Functions 🔢
We will learn about:
- **Sigmoid**: A soft switch that makes decisions slowly.
- **Tanh**: A stronger version of Sigmoid.
- **ReLU**: The fastest and strongest switch for learning!

## 4️⃣ What Will We Do in This Notebook? 📝
- **Learn about different activation functions** ⚡
- **Try them in a neural network** 🏗️
- **See which one works best** ✅

By the end, we’ll know how computers **make smart choices!** 🤖
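The three switches can be compared side by side with plain Python. Sigmoid maps to (0, 1), tanh to (-1, 1) and is zero-centered (the "balanced" one), and ReLU simply zeroes out negatives:

```python
import math

def sigmoid(z):
    # Maps to (0, 1): handy when the output should be a probability.
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    # Maps to (-1, 1): zero-centered, so layer outputs stay "balanced".
    return math.tanh(z)

def relu(z):
    # Passes positives through, zeroes negatives: cheap and fast to train.
    return max(0.0, z)

for z in (-2.0, 0.0, 2.0):
    print(z, round(sigmoid(z), 3), round(tanh(z), 3), relu(z))
```

The same three inputs give visibly different outputs under each rule, which is why swapping the activation changes how a network learns.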
# 🔢 Helping a Computer Read Numbers Better! 🧠🤖

## 1️⃣ What are we doing? 🎯
We are testing **three different activation functions** to see which one helps the computer **read numbers the best!**

## 2️⃣ What is an Activation Function? 🤔
An activation function helps the computer **decide things**!
It’s like a **brain switch** that turns information **ON or OFF** so the computer can learn better.

## 3️⃣ What Activation Functions Are We Testing? ⚡
- **Sigmoid**: Soft decision-making. 🧐
- **Tanh**: A stronger version of Sigmoid. 🔥
- **ReLU**: The fastest and most powerful! ⚡

## 4️⃣ What Will We Do in This Notebook? 📝
- **Train a computer** to read handwritten numbers! 🔢
- **Use different activation functions** and compare them. ⚡
- **See which one works best** for accuracy! ✅

By the end, we’ll know which function helps the computer **think the smartest!** 🧠🚀
# 🧠 What is a Deep Neural Network? 🤖

## 1️⃣ What are we doing? 🎯
We are building a **Deep Neural Network (DNN)** to help a computer **understand and recognize numbers**!

## 2️⃣ What is a Deep Neural Network? 🤔
A Deep Neural Network is a **super smart computer brain** with **many layers**.
Each layer **learns something new** and helps the computer make better decisions.

Think of it like:
- 👶 **A baby** trying to recognize a cat 🐱 → It might get confused!
- 👦 **A child** learning from books 📚 → Gets better at it!
- 🧑 **An expert** who has seen many cats 🏆 → Can recognize them instantly!

A **Deep Neural Network** works the same way—it **learns step by step**!

## 3️⃣ Why is a Deep Neural Network better? 🚀
✅ **More layers** = **More learning!**
✅ Can understand **complex patterns**.
✅ Can make **smarter decisions**!

## 4️⃣ What Will We Do in This Notebook? 📝
- **Build a Deep Neural Network** with multiple layers 🏗️
- **Train it** to recognize handwritten numbers 🔢
- **Try different activation functions** (Sigmoid, Tanh, ReLU) ⚡
- **See which one works best!** ✅

By the end, our computer will be **super smart** at recognizing patterns! 🧠🚀
# 🌀 Teaching a Computer to See Spirals! 🤖

## 1️⃣ What are we doing? 🎯
We are teaching a **computer brain** to look at points in a spiral shape and **figure out which group they belong to**!

## 2️⃣ Why is this tricky? 🤔
The points are **twisted into spirals** 🌀, so the computer needs to be **really smart** to tell them apart.
It needs a **deep neural network** to **understand the swirl**!

## 3️⃣ How does the Computer Learn? 🏗️
- It looks at **many points** 👀
- It **guesses** which spiral they belong to ❓
- If it’s **wrong**, we help it fix mistakes! 🚀
- After **lots of practice**, it gets really good at sorting them! ✅

## 4️⃣ What’s Special About This Network? 🧠
- We use **ReLU activation** ⚡ to make learning **faster and better**!
- We **train it** to separate the spiral points into **different colors**! 🎨

## 5️⃣ What Will We Do in This Notebook? 📝
- **Build a deep neural network** with **many layers** 🏗️
- **Train it** to separate spirals 🌀
- **Check if it gets them right**! ✅

By the end, our computer will **see the spirals just like us!** 🧠✨
# 🎓 Teaching a Computer to Be Smarter with Dropout! 🤖

## 1️⃣ What are we doing? 🎯
We are training a **computer brain** to make better predictions by using **Dropout**!

## 2️⃣ What is Dropout? 🤔
Dropout is like **playing a game with one eye closed**! 👀
- It makes the computer **forget** some parts of what it learned **on purpose**!
- This helps it **not get stuck** memorizing the training examples.
- Instead, it learns to **think better** and make **stronger predictions**!

## 3️⃣ Why is Dropout Important? 🧠
Imagine learning math but only using the same **five problems** over and over.
- You’ll **memorize** them but struggle with new ones! 😕
- Dropout **mixes things up** so the computer learns **general rules**, not just examples! 🚀

## 4️⃣ What Will We Do in This Notebook? 📝
- **Make some data** to train our computer. 📊
- **Build a neural network** and use Dropout. 🏗️
- **Train it using Batch Gradient Descent** (a way to help the computer learn step by step). 🏃
- **See how Dropout helps prevent overfitting!** ✅

By the end, our computer will **make smarter decisions** instead of just memorizing! 🧠✨
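Under the hood, dropout is just a random mask applied during training. A minimal sketch of "inverted" dropout, the variant PyTorch's `nn.Dropout` uses (this stdlib version is for illustration only):

```python
import random

def dropout(activations, p=0.5, training=True, rng=random):
    # During training, zero each activation with probability p and scale
    # the survivors by 1/(1-p) so the expected value stays unchanged.
    # At evaluation time the layer does nothing.
    if not training:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

random.seed(0)
print(dropout([1.0, 2.0, 3.0, 4.0], p=0.5))
```

Because a different random half of the neurons is silenced on every step, no single neuron can be relied on, which is what forces the network to learn general rules.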
# 📉 Teaching a Computer to Predict Numbers with Dropout! 🤖

## 1️⃣ What is Regression? 🤔
Regression is when a computer **learns from past numbers** to **predict future numbers**!
For example:
- If you save **$5 every week**, how much will you have in **10 weeks**? 💰
- The computer **looks at patterns** and **makes a smart guess**!

## 2️⃣ Why Do We Need Dropout? 🚀
Sometimes, the computer **memorizes too much** and doesn’t learn the real pattern. 😵
Dropout **randomly turns off** parts of the computer’s learning, so it **thinks smarter** instead of just remembering numbers.

## 3️⃣ What’s Happening in This Notebook? 📝
- **We make number data** for the computer to learn from. 📊
- **We build a model** using PyTorch to predict numbers. 🏗️
- **We add Dropout** to stop the model from memorizing. ❌🧠
- **We check if Dropout helps the model predict better!** ✅

By the end, our computer will be **smarter at guessing numbers!** 🧠✨
# 🏗️ Why Can't We Start with the Same Weights? 🤖

## 1️⃣ What is Weight Initialization? 🤔
When a computer **learns** using a neural network, it starts with **random numbers** (weights) and adjusts them over time to get better.

## 2️⃣ What Happens if We Use the Same Weights? 🚨
If all the starting weights are **the same**, the computer gets **confused**! 😵
- Every neuron learns **the exact same thing** → No variety!
- The network **doesn’t improve**, and learning **gets stuck**.

## 3️⃣ What Will We Do in This Notebook? 📝
- **Make a simple neural network** to test this. 🏗️
- **Initialize all weights the same way** to see what happens. ⚖️
- **Try using different random weights** and compare the results! 🎯

By the end, we’ll see why **random weight initialization is important** for a smart neural network! 🧠✨
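The "no variety" problem is easy to demonstrate: two neurons initialized with identical weights compute identical outputs, so they also receive identical gradients and identical updates, and they stay clones forever. A tiny sketch with made-up numbers (activation omitted for clarity):

```python
def neuron(inputs, weights, bias):
    # A single linear neuron: weighted sum plus bias.
    return sum(w * x for w, x in zip(weights, inputs)) + bias

x = [0.5, -1.0, 2.0]          # one made-up input vector
same = [0.1, 0.1, 0.1]        # both hidden neurons start identically

out1 = neuron(x, same, 0.0)
out2 = neuron(x, same, 0.0)

# Same output -> same gradient -> same update, on every training step.
print(out1, out2, out1 == out2)
```

Random initialization breaks this symmetry, letting each neuron specialize on a different part of the pattern.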
# 🎯 Helping a Computer Learn Better with Xavier Initialization! 🤖

## 1️⃣ What is Weight Initialization? 🤔
When a neural network **starts learning**, it needs to begin with **some numbers** (called weights).
If we **pick bad starting numbers**, the network **won't learn well**!

## 2️⃣ What is Xavier Initialization? ⚖️
Xavier Initialization is a **smart way** to pick these starting numbers.
It **balances** them so they’re **not too big** or **too small**.
This helps the computer **learn faster** and **make better decisions**! 🚀

## 3️⃣ What Will We Do in This Notebook? 📝
- **Build a neural network** to recognize handwritten numbers. 🔢
- **Use Xavier Initialization** to set up good starting weights. 🎯
- **Compare** how well the network learns! ✅

By the end, we’ll see why **starting right** helps a neural network **become smarter!** 🧠✨
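"Not too big or too small" has a precise recipe. In the uniform Xavier/Glorot scheme, weights are drawn from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)), which keeps the variance of signals roughly constant from layer to layer. A stdlib sketch (the notebook itself would use PyTorch's built-in initializer):

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng=random):
    # Draw fan_in * fan_out weights from U(-a, a),
    # a = sqrt(6 / (fan_in + fan_out)).
    a = math.sqrt(6.0 / (fan_in + fan_out))
    return [rng.uniform(-a, a) for _ in range(fan_in * fan_out)]

random.seed(0)
w = xavier_uniform(256, 128)
bound = math.sqrt(6.0 / (256 + 128))
print(max(abs(v) for v in w) <= bound)  # True: every weight stays in bounds
```

Bigger layers get smaller starting weights, which is exactly the "balancing" the text describes.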
# 🚀 Helping a Computer Learn Faster with Momentum! 🤖

## 1️⃣ What is a Polynomial Function? 📈
A polynomial function is a math equation with **powers** (like squared or cubed numbers).
For example:
- \( y = x^2 + 3x + 5 \)
- \( y = x^3 - 2x^2 + x \)

These are tricky for a computer to learn! 😵

## 2️⃣ What is Momentum? ⚡
Imagine rolling a ball down a hill. ⛰️🏀
- If the ball **stops at every step**, it takes **a long time** to reach the bottom.
- But if we give it **momentum**, it **keeps going** and moves faster! 🚀

Momentum helps a neural network **move in the right direction** without getting stuck.

## 3️⃣ What Will We Do in This Notebook? 📝
- **Teach a computer to learn polynomial functions.** 📊
- **Use Momentum** to help it learn faster. 🏃
- **Compare it to normal learning** and see why Momentum is better! ✅

By the end, we’ll see how **Momentum helps a neural network** learn tricky math problems **faster and smarter!** 🧠✨
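The rolling-ball picture maps onto a two-line update rule: a velocity term accumulates past gradients, and the position follows the velocity. A minimal sketch minimizing the polynomial y = x² (gradient 2x); the learning rate and momentum factor are typical illustrative values, not the notebook's settings:

```python
def minimize(grad, x0, lr=0.1, beta=0.9, steps=200):
    # Gradient descent with momentum: v accumulates past gradients,
    # so the update keeps "rolling" in a consistent direction.
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v - lr * grad(x)
        x = x + v
    return x

# Minimize y = x^2 (gradient 2x), starting far from the minimum at 0.
print(minimize(lambda x: 2 * x, x0=10.0))
```

With `beta = 0.0` the velocity is just the current gradient and this reduces to plain gradient descent; the momentum version typically crosses flat stretches faster.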
# 🏃‍♂️ Helping a Neural Network Learn Faster with Momentum! 🚀

## 1️⃣ What is a Neural Network? 🤖
A neural network is a **computer brain** that learns by **adjusting numbers (weights)** to make good predictions.

## 2️⃣ What is Momentum? ⚡
Imagine pushing a heavy box. 📦
- If you **push and stop**, it moves slowly. 😴
- But if you **keep pushing**, it **gains speed** and moves **faster**! 🚀

Momentum helps a neural network **keep moving in the right direction** without getting stuck!

## 3️⃣ What Will We Do in This Notebook? 📝
- **Train a neural network** to recognize patterns. 🎯
- **Use Momentum** to help it learn faster. 🏃‍♂️
- **Compare it to normal learning** and see why Momentum is better! ✅

By the end, we’ll see how **Momentum helps a neural network** become **faster and smarter!** 🧠✨
# 🚀 Helping a Neural Network Learn Better with Batch Normalization! 🤖

## 1️⃣ What is a Neural Network? 🧠

A neural network is like a **computer brain** that learns by adjusting **numbers (weights)** to make smart decisions.

## 2️⃣ What is Batch Normalization? ⚖️

Imagine a race where everyone starts at **different speeds**. Some are too slow, and some are too fast. 🏃♂️💨
Batch Normalization **balances the speeds** so everyone runs **smoothly together**!

For a neural network, this means:
- **Making learning faster** 🚀
- **Stopping extreme values** that cause bad learning ❌
- **Helping the network work better** with deep layers! 🏗️

## 3️⃣ What Will We Do in This Notebook? 📝

- **Train a neural network** to recognize patterns. 🎯
- **Use Batch Normalization** to help it learn better. ⚖️
- **Compare it to normal learning** and see the difference! ✅

By the end, we’ll see why **Batch Normalization** makes neural networks **faster and smarter!** 🧠✨
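The "balancing the speeds" idea in one small sketch: features that start at wildly different scales come out of `nn.BatchNorm1d` with roughly mean 0 and spread 1 (the input sizes here are made up for the demo):

```python
import torch
from torch import nn

torch.manual_seed(0)
# A batch of 64 samples whose 10 features start at very different "speeds"
x = torch.randn(64, 10) * 100 + 37

bn = nn.BatchNorm1d(10)  # one normalizer per feature
out = bn(x)              # training mode: normalize using this batch's stats

print(out.mean(dim=0).abs().max().item())  # close to 0
print(out.std(dim=0).mean().item())        # close to 1
```

The layer also learns a scale and shift per feature, so the network can undo the normalization later if that helps.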
# 👀 How Do Computers See? Understanding Convolution! 🤖

## 1️⃣ What is Convolution? 🔍

Convolution is like **giving a computer glasses** to help it focus on parts of an image! 🕶️
- It **looks at small parts** of a picture instead of the whole thing at once. 🖼️
- It **finds patterns**, like edges, shapes, or textures. 🔲

## 2️⃣ Why Do We Use It? 🎯

Imagine finding **Waldo** in a giant picture! 🔎👦
- Instead of looking at everything at once, we **scan** small parts at a time.
- Convolution helps computers **scan images smartly** to recognize objects! 🏆

## 3️⃣ What Will We Do in This Notebook? 📝

- **Learn how convolution works** step by step. 🛠️
- **See how it helps computers find patterns** in images. 🖼️
- **Understand why convolution is used in AI** for image recognition! 🤖✅

By the end, we’ll see how convolution helps computers **see and understand pictures like humans!** 🧠✨
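The "scan small parts at a time" step can be written out by hand: slide a small kernel over the image, multiply, and add. The image and kernel values below are made up for the demo:

```python
import torch

# A tiny 4x4 "image" and a 2x2 kernel, invented for illustration
image = torch.tensor([[1., 2., 0., 1.],
                      [0., 1., 3., 1.],
                      [2., 1., 0., 0.],
                      [1., 0., 1., 2.]])
kernel = torch.tensor([[1., -1.],
                       [-1., 1.]])

k = kernel.shape[0]
size = image.shape[0] - k + 1      # (4 - 2) + 1 = 3
amap = torch.zeros(size, size)     # the "activation map"
for i in range(size):
    for j in range(size):
        patch = image[i:i+k, j:j+k]           # slide the window
        amap[i, j] = (patch * kernel).sum()   # multiply, then add

print(amap.shape)  # torch.Size([3, 3])
```

The double loop is exactly what `torch.nn.functional.conv2d` does, just much faster.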
# 🖼️ How Do Computers See Images? Understanding Activation & Max Pooling! 🤖

## 1️⃣ What is an Activation Function? ⚡

Activation functions **help the computer make smart decisions**! 🧠
- They decide **which patterns are important** in an image.
- Without them, the computer wouldn’t know what to focus on! 🎯

## 2️⃣ What is Max Pooling? 🔍

Max Pooling is like **shrinking an image** while keeping the best parts!
- It **takes the most important details** and removes extra noise. 🎛️
- This makes the computer **faster and better at recognizing objects!** 🚀

## 3️⃣ What Will We Do in This Notebook? 📝

- **See how activation functions work** to find patterns. 🔎
- **Learn how max pooling makes images smaller but useful.** 📉
- **Understand why these tricks make AI smarter!** 🤖✅

By the end, we’ll see how **activation & pooling help computers "see" images like we do!** 🧠✨
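Both ideas in a few lines: ReLU keeps only the positive evidence, and a 2×2 max pool keeps the strongest value in each region (the input is a random stand-in for a real activation map):

```python
import torch
from torch import nn

torch.manual_seed(0)
fmap = torch.randn(1, 1, 4, 4)  # a fake activation map: (batch, channel, H, W)

relu = nn.ReLU()        # zeroes out negatives, keeps positive evidence
pool = nn.MaxPool2d(2)  # keeps the max of each 2x2 region

out = pool(relu(fmap))
print(out.shape)                # torch.Size([1, 1, 2, 2]) - half the size
print((out >= 0).all().item())  # True - ReLU removed everything negative
```

The 4×4 map became 2×2: smaller, but each surviving value is the strongest signal from its neighborhood.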
# 🌈 How Do Computers See Color? Understanding Multiple Channel Convolution! 🤖

## 1️⃣ What is a Channel in an Image? 🎨

Think of a picture on your screen. 🖼️
- A **black & white** image has **1 channel** (just light & dark). ⚫⚪
- A **color image** has **3 channels**: **Red, Green, and Blue (RGB)!** 🌈

Computers **combine these channels** to see full-color pictures!

## 2️⃣ What is Multiple Channel Convolution? 🔍

- Instead of looking at just one channel, the computer **processes all 3 (RGB)** at the same time. 🔴🟢🔵
- This helps it **find edges, textures, and patterns in color images**! 🎯

## 3️⃣ What Will We Do in This Notebook? 📝

- **See how convolution works on multiple channels.** 👀
- **Understand how computers recognize colors & details.** 🖼️
- **Learn why this is important for AI and image recognition!** 🤖✅

By the end, we’ll see how **computers process full-color images like we do!** 🧠✨
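The shapes tell the story: each filter in a multi-channel `Conv2d` spans all three input channels at once (image size and filter count below are illustrative choices):

```python
import torch
from torch import nn

torch.manual_seed(0)
rgb = torch.randn(1, 3, 8, 8)  # one color image: 3 channels (R, G, B)

# Each of the 4 filters looks at all 3 channels at the same time
conv = nn.Conv2d(in_channels=3, out_channels=4, kernel_size=3)
out = conv(rgb)

print(conv.weight.shape)  # torch.Size([4, 3, 3, 3]): 4 filters x 3 channels x 3x3
print(out.shape)          # torch.Size([1, 4, 6, 6]): (8 - 3) + 1 = 6
```

So a "3×3 filter" on an RGB image is really a 3×3×3 block of weights, and its per-channel results are summed into one output map.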
# 🖼️ How Do Computers Recognize Pictures? Understanding CNNs! 🤖

## 1️⃣ What is a Convolutional Neural Network (CNN)? 🧠

A CNN is a special **computer brain** designed to **look at pictures** and find patterns! 🔍
- It **scans an image** like our eyes do. 👀
- It learns to recognize **shapes, edges, and objects**. 🎯
- This helps AI **identify things in pictures**, like cats 🐱, dogs 🐶, or numbers 🔢!

## 2️⃣ How Does a CNN Work? ⚙️

A CNN has **layers** that help it learn step by step:
1. **Convolution Layer** – Finds small details like edges and corners. 🔲
2. **Pooling Layer** – Shrinks the image but keeps the important parts. 📉
3. **Fully Connected Layer** – Makes the final decision! ✅

## 3️⃣ What Will We Do in This Notebook? 📝

- **Build a simple CNN** that can recognize images. 🏗️
- **See how each layer helps the computer "see" better.** 👀
- **Understand why CNNs are great at image recognition!** 🚀

By the end, we’ll see how **CNNs help computers recognize pictures just like humans do!** 🧠✨

---
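The three layers above can be sketched as one tiny PyTorch module (the channel counts and 28×28 input size are illustrative assumptions, not the notebook's actual model):

```python
import torch
from torch import nn

class TinyCNN(nn.Module):
    """Convolution -> pooling -> fully connected, the three steps above."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3, padding=1)  # find small details
        self.pool = nn.MaxPool2d(2)                            # shrink, keep the best
        self.fc = nn.Linear(8 * 14 * 14, num_classes)          # final decision

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))  # 28x28 -> 14x14
        return self.fc(x.flatten(1))             # one score per class

model = TinyCNN()
logits = model(torch.randn(2, 1, 28, 28))  # two fake 28x28 images
print(logits.shape)  # torch.Size([2, 10])
```

Each image comes out as ten scores, one per possible class; the biggest score is the network's guess.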
# 🖼️ Teaching a Computer to See Small Pictures! 🤖

## 1️⃣ What is a CNN? 🧠

A **Convolutional Neural Network (CNN)** is a special AI that **looks at pictures and finds patterns**! 🔍
- It scans images **piece by piece** like a puzzle. 🧩
- It learns to recognize **shapes, edges, and objects**. 🎯
- CNNs help AI recognize **faces, animals, and numbers**! 🐱🔢👀

## 2️⃣ Why Small Images? 📏

Small images are **harder to understand** because they have **fewer details**!
- A CNN needs to **work extra hard** to find important features. 💪
- We use **smaller filters and layers** to capture details. 🎛️

## 3️⃣ What Will We Do in This Notebook? 📝

- **Train a CNN on small images.** 🏗️
- **See how it learns to recognize patterns.** 🔎
- **Understand why CNNs work well, even with tiny pictures!** 🚀

By the end, we’ll see how **computers can recognize even small images with AI!** 🧠✨

---
# 🖼️ Teaching a Computer to See Small Pictures with Batches! 🤖

## 1️⃣ What is a CNN? 🧠

A **Convolutional Neural Network (CNN)** is a special AI that **looks at pictures and learns patterns**! 🔍
- It **finds shapes, edges, and objects** in an image. 🎯
- It helps AI recognize **faces, animals, and numbers**! 🐱🔢👀

## 2️⃣ What is a Batch? 📦

Instead of looking at **one image at a time**, the computer looks at **a group (batch) of images** at once!
- This **makes learning faster**. 🚀
- It helps the CNN **understand patterns better**. 🧠✅

## 3️⃣ Why Small Images? 📏

Small images have **fewer details**, so the CNN must **work harder to find patterns**. 💪
- We **train in batches** to help the computer **learn faster and better**. 🎛️

## 4️⃣ What Will We Do in This Notebook? 📝

- **Train a CNN on small images using batches.** 🏗️
- **See how it learns to recognize objects better.** 🔎
- **Understand why batching helps AI train efficiently!** ⚡

By the end, we’ll see how **CNNs learn faster and smarter with batches!** 🧠✨

---
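Batching is handled by `DataLoader`: hand it a dataset and a `batch_size`, and it serves groups of images. A minimal sketch with random stand-in data (the sizes are made up for the demo):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
images = torch.randn(100, 1, 8, 8)    # 100 tiny fake images
labels = torch.randint(0, 10, (100,))

# batch_size=32: the network sees 32 images per training step instead of 1
loader = DataLoader(TensorDataset(images, labels), batch_size=32, shuffle=True)

batch_images, batch_labels = next(iter(loader))
print(batch_images.shape)  # torch.Size([32, 1, 8, 8])
print(len(loader))         # 4 batches: 32 + 32 + 32 + 4
```

`shuffle=True` reorders the data every epoch, so each batch is a fresh mix of examples.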
# 🖼️ Teaching a Computer to Recognize Handwritten Numbers! 🤖

## 1️⃣ What is a CNN? 🧠

A **Convolutional Neural Network (CNN)** is a smart AI that **looks at pictures and learns patterns**! 🔍
- It **finds shapes, lines, and curves** in images. 🔢
- It helps AI recognize **digits and handwritten numbers**! ✏️

## 2️⃣ Why Handwritten Numbers? 🔢

Handwritten numbers are **tricky** because everyone writes differently!
- A CNN must **learn the different ways** people write the same number.
- This helps it **recognize digits** even if they are messy. 💡

## 3️⃣ What Will We Do in This Notebook? 📝

- **Train a CNN to classify images of handwritten numbers.** 🏗️
- **See how it learns to recognize different digits.** 🔎
- **Understand how AI can analyze images of handwritten numbers!** 🚀

By the end, we’ll see how **computers can recognize handwritten numbers just like we do!** 🧠✨
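The training loop behind all of these notebooks has the same shape: predict, measure the loss, adjust, repeat. A minimal sketch using random stand-in "digits" (real MNIST data would plug into the same loop; the model and step count are illustrative):

```python
import torch
from torch import nn, optim

torch.manual_seed(0)
# Stand-in data: random 28x28 "digits" with random labels 0-9
images = torch.randn(64, 1, 28, 28)
labels = torch.randint(0, 10, (64,))

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
opt = optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for _ in range(20):                          # a few training steps
    opt.zero_grad()                          # clear old gradients
    loss = loss_fn(model(images), labels)    # how wrong are we?
    loss.backward()                          # backpropagation
    opt.step()                               # adjust the weights

accuracy = (model(images).argmax(1) == labels).float().mean().item()
print(loss.item(), accuracy)
```

Even on random data the loss falls from about ln(10) ≈ 2.3 as the model memorizes the batch; with real digits the same loop learns patterns that generalize.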
ShallowNeuralNetwork.md
ADDED
@@ -0,0 +1,86 @@
1️⃣ **Introduction to Neural Networks (One Hidden Layer)** 🤖
- A neural network is like a **thinking machine** that makes decisions.
- It **learns from data** and gets better over time.
- We build a network with **one hidden layer** to help it **think smarter**.

2️⃣ **More Neurons, Better Learning!** 🧠
- If a network **isn’t smart enough**, we add **more neurons**!
- More neurons = **better decision-making**.
- We train the network to **recognize patterns more accurately**.

3️⃣ **Neural Networks with Multiple Inputs** 🔢
- Instead of just **one piece of data**, we give the network **many inputs**.
- This helps it **understand more complex problems**.
- Too many neurons = **overfitting (too specific)**, too few = **underfitting (too simple)**.

4️⃣ **Multi-Class Neural Networks** 🎨
- Instead of choosing between **two options**, the network can choose **many!**
- It learns to **classify things into multiple groups**, like recognizing **different animals**.
- The Softmax function helps it **pick the best answer**.

5️⃣ **Backpropagation: Learning from Mistakes** 🔄
- The network **makes a guess**, checks if it’s right, and **fixes itself**.
- It does this using **backpropagation**, which adjusts the neurons.
- This is how AI **gets smarter with time**!

6️⃣ **Activation Functions: Helping AI Decide** ⚡
- Activation functions **control how neurons react**.
- Three common types:
  - **Sigmoid** → Good for probabilities.
  - **Tanh** → Helps balance data.
  - **ReLU** → Fastest and most useful!
- These functions help the network **learn efficiently**.

# 📖 AI Terms and Definitions (Based on the Videos) 🤖

### 🧠 **Neural Network**
A **computer brain** that learns by adjusting numbers (weights) to make decisions.

### 🎯 **Classification**
Teaching AI to **sort things into groups**, like recognizing cats 🐱 and dogs 🐶 in pictures.

### ⚡ **Activation Function**
A rule that helps AI **decide which information is important**. Examples:
- **Sigmoid** → Soft decision-making.
- **Tanh** → Balances positive and negative values.
- **ReLU** → Fast and effective!

### 🔄 **Backpropagation**
AI’s way of **fixing mistakes** by looking at errors and adjusting itself.

### 📉 **Loss Function**
A **score** that tells AI **how wrong** it was, so it can improve.

### 🚀 **Gradient Descent**
A method that helps AI **learn step by step** by making small changes to improve.

### 🏗️ **Hidden Layer**
A **middle part of a neural network** that helps process complex information.

### 🌀 **Softmax Function**
Helps AI **choose the best answer** when there are multiple choices.

### ⚖️ **Cross Entropy Loss**
A way to measure **how well AI is learning** when making choices.

### 📊 **Multi-Class Neural Networks**
AI models that can **choose from many options**, not just two.

### 🏎️ **Momentum**
A trick that helps AI **learn faster** by keeping track of past updates.

### 🔍 **Overfitting**
When AI **memorizes too much** and struggles with new data.

### 😕 **Underfitting**
When AI **doesn’t learn enough** and makes bad predictions.

### 🎨 **Convolutional Neural Network (CNN)**
A special AI for **understanding images**, used in things like face recognition.

### 📦 **Batch Processing**
Instead of training on **one piece of data at a time**, AI looks at **many pieces at once** to learn faster.

### 🏗️ **PyTorch**
A tool that helps build and train neural networks easily.
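The pieces above fit together in a few lines: one hidden layer, and the three activation functions side by side (layer sizes and inputs are made up for illustration):

```python
import torch
from torch import nn

torch.manual_seed(0)
x = torch.linspace(-3, 3, 7)

# The three activations compared above, on the same inputs
print(torch.sigmoid(x))  # squashed into (0, 1): good for probabilities
print(torch.tanh(x))     # squashed into (-1, 1): balanced around zero
print(torch.relu(x))     # zero for negatives, unchanged for positives: fast

# A shallow network: input -> one hidden layer -> output
net = nn.Sequential(nn.Linear(2, 5), nn.ReLU(), nn.Linear(5, 2))
out = net(torch.randn(4, 2))  # 4 samples with 2 inputs each
print(out.shape)  # torch.Size([4, 2])
```

Swapping `nn.ReLU()` for `nn.Sigmoid()` or `nn.Tanh()` is all it takes to compare the three in training.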
SoftmaxRegression.md
ADDED
@@ -0,0 +1,16 @@
🔢 **Softmax: Helping AI Pick the Best Answer** 🎯
- AI sometimes has **many choices**, not just **yes or no**.
- The **Softmax function** helps AI **decide which answer is most likely**.
- It looks at **all possibilities** and picks the **best one**!

🖼️ **Softmax for Images: Teaching AI to Recognize Pictures** 🧠
- AI can look at a picture and **guess what it is**.
- It checks **different parts** of the image and compares them.
- Softmax helps AI **pick the right label** (like "cat" or "dog").

📊 **Softmax in PyTorch: Making AI Smarter** ⚙️
- AI needs **training** to get better at choosing answers.
- PyTorch helps AI **use Softmax** to **learn from mistakes**.
- After training, AI **makes better predictions**!

By the end, AI will **think smarter** and **recognize things better**! 🚀
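Softmax in one small sketch: it turns raw scores into probabilities that sum to 1, and the highest score wins (the scores and the cat/dog/bird labels are made up for the example):

```python
import torch

# Made-up raw scores ("logits") for three choices: cat, dog, bird
logits = torch.tensor([2.0, 1.0, 0.1])
probs = torch.softmax(logits, dim=0)

print(probs)                  # all positive, summing (numerically) to 1
print(probs.argmax().item())  # 0 -> the highest score wins ("cat")
```

Because softmax only rescales, the ranking of the choices never changes; it just expresses the scores as confidence.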
UpperLower.svg
ADDED

ZXX Bold.otf
ADDED
Binary file (11.8 kB). View file

ZXX Camo.otf
ADDED
Binary file (31 kB). View file

ZXX False.otf
ADDED
Binary file (13.7 kB). View file

ZXX Noise.otf
ADDED
Binary file (18.8 kB). View file

ZXX Sans.otf
ADDED
Binary file (11.7 kB). View file

ZXX Xed.otf
ADDED
Binary file (21.2 kB). View file

after_logistic.png
ADDED

after_shallow.png
ADDED

after_softmax.png
ADDED

after_training_words.png
ADDED

after_training_words_Chesilin.png
ADDED

app.py
ADDED
@@ -0,0 +1,219 @@
import os

import gradio as gr
import matplotlib.pyplot as plt
import torch
import torchvision.transforms as transforms
from matplotlib import font_manager
from torch import nn, optim
from torch.utils.data import DataLoader

# Import your modules
from logistic_regression import LogisticRegressionModel
from softmax_regression import SoftmaxRegressionModel
from shallow_neural_network import ShallowNeuralNetwork
import convolutional_neural_networks
from dataset_loader import CustomMNISTDataset
from final_project import train_final_model, get_dataset_options, FinalCNN


def number_to_char(number):
    """Map a class index (0-61) to its character: digits, then a-z, then A-Z."""
    if 0 <= number <= 9:
        return str(number)       # 0-9
    elif 10 <= number <= 35:
        return chr(number + 87)  # a-z (10 -> 'a', 35 -> 'z')
    elif 36 <= number <= 61:
        return chr(number + 29)  # A-Z (36 -> 'A', 61 -> 'Z')
    else:
        return ''


def visualize_predictions_svg(model, train_loader, stage):
    """Visualize predictions and return an SVG string for Gradio display."""
    # Load the Daemon font
    font_path = './Daemon.otf'  # Path to your Daemon font
    prop = font_manager.FontProperties(fname=font_path)

    fig, ax = plt.subplots(6, 3, figsize=(12, 16))  # 6 rows x 3 columns for 18 images

    model.eval()
    images, labels = next(iter(train_loader))
    images, labels = images[:18], labels[:18]  # Take 18 images and labels

    with torch.no_grad():
        outputs = model(images)
        _, predictions = torch.max(outputs, 1)

    for i in range(18):  # Iterate over the 18 images
        ax[i // 3, i % 3].imshow(images[i].squeeze(), cmap='gray')

        # Convert predictions and labels to characters
        pred_char = number_to_char(predictions[i].item())
        label_char = number_to_char(labels[i].item())

        # Show "=" or "!=" depending on whether the prediction is correct
        if pred_char == label_char:
            title_text = f"{pred_char} = {label_char}"
            color = 'green'  # Green if correct
        else:
            title_text = f"{pred_char} != {label_char}"
            color = 'red'  # Red if incorrect

        # Set the title with the Daemon font and color
        ax[i // 3, i % 3].set_title(title_text, fontproperties=prop, fontsize=12, color=color)
        ax[i // 3, i % 3].axis('off')

    # Convert the figure to SVG
    svg_str = figure_to_svg(fig)
    save_svg_to_output_folder(svg_str, f"{stage}_predictions.svg")  # Save SVG to the output folder
    plt.close(fig)

    return svg_str


def figure_to_svg(fig):
    """Convert a matplotlib figure to an SVG string."""
    from io import StringIO
    from matplotlib.backends.backend_svg import FigureCanvasSVG
    canvas = FigureCanvasSVG(fig)
    output = StringIO()
    canvas.print_svg(output)
    return output.getvalue()


def save_svg_to_output_folder(svg_str, filename):
    """Save the SVG string to the output folder."""
    output_path = f'./output/{filename}'  # Ensure the output folder exists
    with open(output_path, 'w') as f:
        f.write(svg_str)


def plot_metrics_svg(losses, accuracies):
    """Generate training metrics as an SVG string."""
    fig, ax = plt.subplots(1, 2, figsize=(12, 5))

    ax[0].plot(losses, label='Loss', color='red')
    ax[0].set_title('Training Loss')
    ax[0].set_xlabel('Epoch')
    ax[0].set_ylabel('Loss')
    ax[0].legend()

    ax[1].plot(accuracies, label='Accuracy', color='green')
    ax[1].set_title('Training Accuracy')
    ax[1].set_xlabel('Epoch')
    ax[1].set_ylabel('Accuracy')
    ax[1].legend()

    plt.tight_layout()
    svg_str = figure_to_svg(fig)
    save_svg_to_output_folder(svg_str, "training_metrics.svg")  # Save metrics SVG to the output folder
    plt.close(fig)

    return svg_str


def train_model_interface(module, dataset_name, epochs=100, lr=0.01):
    """Train the selected model with the chosen dataset."""
    transform = transforms.Compose([
        transforms.Resize((28, 28)),
        transforms.Grayscale(num_output_channels=1),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.5], std=[0.5])
    ])

    # Load the dataset using CustomMNISTDataset
    train_dataset = CustomMNISTDataset(os.path.join("data", dataset_name, "raw"), transform=transform)
    train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

    # Select the model
    if module == "Logistic Regression":
        model = LogisticRegressionModel(input_size=1)
    elif module == "Softmax Regression":
        model = SoftmaxRegressionModel(input_size=2, num_classes=2)
    elif module == "Shallow Neural Networks":
        model = ShallowNeuralNetwork(input_size=2, hidden_size=5, output_size=2)
    elif module == "Deep Networks":
        import deep_networks
        model = deep_networks.DeepNeuralNetwork(input_size=10, hidden_sizes=[20, 10], output_size=2)
    elif module == "Convolutional Neural Networks":
        model = convolutional_neural_networks.ConvolutionalNeuralNetwork()
    elif module == "AI Calligraphy":
        model = FinalCNN()
    else:
        # Invalid module: return None for the model so the caller can detect it
        return None, None, None, None, None, None

    # Visualize before training
    before_svg = visualize_predictions_svg(model, train_loader, "Before")

    # Train the model
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=lr)

    losses, accuracies = train_final_model(model, criterion, optimizer, train_loader, epochs)

    # Visualize after training
    after_svg = visualize_predictions_svg(model, train_loader, "After")

    # Metrics SVG
    metrics_svg = plot_metrics_svg(losses, accuracies)

    return model, losses, accuracies, before_svg, after_svg, metrics_svg


def list_datasets():
    """List all available datasets dynamically."""
    dataset_options = get_dataset_options()
    if not dataset_options:
        return ["No datasets found"]
    return dataset_options


### 🎯 Gradio Interface ###
def run_module(module, dataset_name, epochs, lr):
    """Gradio interface callback."""
    # Train the model
    model, losses, accuracies, before_svg, after_svg, metrics_svg = train_model_interface(
        module, dataset_name, epochs, lr
    )

    if model is None:
        return "Error: Invalid selection.", None, None, None

    # Pass the raw SVG strings to the gr.HTML components for rendering
    return (
        f"Training completed for {module} with {epochs} epochs.",
        before_svg,   # Before-training predictions
        after_svg,    # After-training predictions
        metrics_svg   # Training metrics
    )


### 🌟 Gradio UI ###
with gr.Blocks() as app:
    with gr.Tab("Techniques"):
        gr.Markdown("### 🧠 Select Model to Train")

        module_select = gr.Dropdown(
            choices=[
                "AI Calligraphy"
            ],
            label="Select Module"
        )

        dataset_list = gr.Dropdown(choices=list_datasets(), label="Select Dataset")
        epochs = gr.Slider(10, 1024, value=100, step=10, label="Epochs")
        lr = gr.Slider(0.001, 0.1, value=0.01, step=0.001, label="Learning Rate")

        train_button = gr.Button("Train Model")

        output = gr.Textbox(label="Training Output")
        before_svg = gr.HTML(label="Before Training Predictions")
        after_svg = gr.HTML(label="After Training Predictions")
        metrics_svg = gr.HTML(label="Metrics")

        train_button.click(
            run_module,
            inputs=[module_select, dataset_list, epochs, lr],
            outputs=[output, before_svg, after_svg, metrics_svg]
        )

# Launch the Gradio app
app.launch(server_name="127.0.0.1", server_port=5555, share=True)
before_logistic.png
ADDED

before_shallow.png
ADDED

before_softmax.png
ADDED

bullets4.ttf
ADDED
Binary file (88.7 kB). View file
convolutional_neural_networks.md
ADDED
@@ -0,0 +1,703 @@
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Convolution** in Neural Networks! 🧠🖼️

## 🤔 What is Convolution?

Convolution helps computers **understand pictures** by looking at **patterns** instead of exact positions! 🖼️🔍

Imagine you have **two images** that look almost the same, but one is a little **moved**.
A computer might think they are totally **different**! 😲
**Convolution fixes this problem!** ✅

---

## 🛠️ How Convolution Works

We use something called a **kernel** (a small filter 🔲) that slides over an image.
It **checks different parts** of the picture and creates a new image called an **activation map**!

1️⃣ The **image** is a grid of numbers 🖼️
2️⃣ The **kernel** is a small grid 🔳 that moves across the image
3️⃣ It **multiplies** numbers in the image with the numbers in the kernel ✖️
4️⃣ The results are **added together** ➕
5️⃣ We move to the next spot and **repeat!** 🔄
6️⃣ The final result is the **activation map** 🎯

---

## 📏 How Big is the Activation Map?

The size of the **activation map** depends on:
- **M (image size)** 📏
- **K (kernel size)** 🔳
- **Stride** (how far the kernel moves) 👣

With a stride of 1, the formula is:
```
New size = (Image size - Kernel size) + 1
```

Example:
- **4×4 image** 📷
- **2×2 kernel** 🔳
- Activation map = **3×3** ✅

---
|
| 46 |
+
|
| 47 |
+
## 👣 What is Stride?
|
| 48 |
+
|
| 49 |
+
Stride is **how far** the kernel moves each time!
|
| 50 |
+
- **Stride = 1** ➝ Moves **one step** at a time 🐢
|
| 51 |
+
- **Stride = 2** ➝ Moves **two steps** at a time 🚶♂️
|
| 52 |
+
- **Bigger stride** = **Smaller** activation map! 📏
|
| 53 |
+
|
| 54 |
+
---
|
| 55 |
+
|
| 56 |
+
## 🛑 What is Zero Padding?
|
| 57 |
+
|
| 58 |
+
Sometimes, the kernel **doesn’t fit** perfectly in the image. 😕
|
| 59 |
+
So, we **add extra rows and columns of zeros** around the image! 0️⃣0️⃣0️⃣
|
| 60 |
+
|
| 61 |
+
This makes sure the **kernel covers everything**! ✅
|
| 62 |
+
|
| 63 |
+
Formula:
|
| 64 |
+
```
|
| 65 |
+
New Image Size = Old Size + 2 × Padding
|
| 66 |
+
```
|
| 67 |
+
|
| 68 |
+
---
|
| 69 |
+
|
| 70 |
+
## 🎨 What About Color Images?
|
| 71 |
+
|
| 72 |
+
For **black & white** images, we use **Conv2D** with **one channel** (grayscale). 🌑
|
| 73 |
+
For **color images**, we use **three channels** (Red, Green, Blue - RGB)! 🎨🌈
|
| 74 |
+
|
| 75 |
+
---
|
| 76 |
+
|
| 77 |
+
## 🏆 Summary
|
| 78 |
+
|
| 79 |
+
✅ Convolution helps computers **find patterns** in images!
|
| 80 |
+
✅ We use a **kernel** to create an **activation map**!
|
| 81 |
+
✅ **Stride & padding** change how the convolution works!
|
| 82 |
+
✅ This is how computers **"see"** images! 👀🤖
|
| 83 |
+
|
| 84 |
+
---
|
| 85 |
+
|
| 86 |
+
🎉 **Great job!** Now, let’s try convolution in the lab! 🏗️🤖✨
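The size formula can be checked directly in PyTorch. This is a minimal sketch: a 4×4 single-channel "image" pushed through a 2×2 convolution with stride 1 and no padding, which should give the 3×3 activation map predicted by `(4 - 2) + 1 = 3`.

```python
import torch
import torch.nn as nn

# A 4×4 "image" with one channel and batch size 1: shape (batch, channels, H, W)
image = torch.tensor([[1., 2., 3., 4.],
                      [5., 6., 7., 8.],
                      [9., 8., 7., 6.],
                      [5., 4., 3., 2.]]).reshape(1, 1, 4, 4)

# A 2×2 kernel, stride 1, no padding (bias disabled to keep it a pure convolution)
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=2, stride=1, bias=False)

activation_map = conv(image)
print(activation_map.shape)  # torch.Size([1, 1, 3, 3]) — matches (4 - 2) + 1 = 3
```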
---

🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Activation Functions** and **Max Pooling**! 🚀🔢

## 🤖 What is an Activation Function?

Activation functions help a neural network **decide** what’s important! 🧠
They change the values in the activation map to **help the model learn better**.

---

## 🔥 Example: ReLU Activation Function

1️⃣ We take an **input image** 🖼️
2️⃣ We apply **convolution** to create an **activation map** 📊
3️⃣ We apply **ReLU (Rectified Linear Unit)**:
   - **If a value is negative** ➝ Change it to **0** ❌
   - **If a value is positive** ➝ Keep it ✅

### 🛠 Example Calculation

| Before ReLU | After ReLU |
|-------------|------------|
| -4          | 0          |
| 0           | 0          |
| 4           | 4          |

All **negative numbers** become **zero**! ✨

In PyTorch, we apply the ReLU function **after convolution**:

```python
import torch.nn.functional as F

output = F.relu(conv_output)
```

---

## 🌊 What is Max Pooling?

Max Pooling helps the network **focus on important details** while making images **smaller**! 📏🔍

### 🏗 How It Works

1️⃣ We **divide** the image into small regions (e.g., **2×2** squares)
2️⃣ We **keep only the largest value** in each region
3️⃣ We **move the window** and repeat until we’ve covered the whole image

### 📊 Example: 2×2 Max Pooling (stride 2)

Before pooling (4×4):

| 1 | 6 | 2 | 3 |
|---|---|---|---|
| 5 | **8** | **7** | 4 |
| **9** | 2 | 3 | **7** |
| 4 | 1 | 5 | 6 |

After pooling (2×2):

| 8 | 7 |
|---|---|
| 9 | 7 |

**Only the biggest number** in each 2×2 region is kept! ✅

---

## 🏆 Why Use Max Pooling?

✅ **Reduces image size** ➝ Makes training faster! 🚀
✅ **Ignores small changes** in images ➝ More stable results! 🔄
✅ **Helps find important features** in the picture! 🖼️

In PyTorch, we apply **Max Pooling** like this:

```python
import torch.nn.functional as F

output = F.max_pool2d(activation_map, kernel_size=2, stride=2)
```

---

🎉 **Great job!** Now, let’s try using activation functions and max pooling in our own models! 🏗️🤖✨
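The 2×2 max-pooling table above can be reproduced with `F.max_pool2d` on the same 4×4 grid:

```python
import torch
import torch.nn.functional as F

# The 4×4 example grid from the table, shaped (batch, channels, H, W)
x = torch.tensor([[1., 6., 2., 3.],
                  [5., 8., 7., 4.],
                  [9., 2., 3., 7.],
                  [4., 1., 5., 6.]]).reshape(1, 1, 4, 4)

pooled = F.max_pool2d(x, kernel_size=2, stride=2)
print(pooled.reshape(2, 2))
# tensor([[8., 7.],
#         [9., 7.]])
```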
---

🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Convolution with Multiple Channels**! 🖼️🤖

## 🤔 What’s a Channel?

A **channel** is like a layer of an image! 🌈
- **Black & White Images** ➝ **1 channel** (grayscale) 🏳️
- **Color Images** ➝ **3 channels** (Red, Green, Blue - RGB) 🎨

Neural networks **see** images by looking at these channels separately! 👀

---

## 🎯 1. Multiple Output Channels

Usually, we use **one kernel** to create **one activation map** 📊
But what if we want to detect **different things** in an image? 🤔
- **Solution:** Use **multiple kernels**! Each kernel **finds different features**! 🔍

### 🔥 Example: Detecting Lines

1️⃣ A **vertical line kernel** finds **vertical edges** 📏
2️⃣ A **horizontal line kernel** finds **horizontal edges** 📐

**More kernels = More ways to see the image!** 👀✅

---

## 🎨 2. Multiple Input Channels

Color images have **3 channels** (Red, Green, Blue).
To process them, we use **a separate kernel for each channel**! 🎨

1️⃣ Apply a **Red kernel** to the Red part of the image 🔴
2️⃣ Apply a **Green kernel** to the Green part of the image 🟢
3️⃣ Apply a **Blue kernel** to the Blue part of the image 🔵
4️⃣ **Add the results together** to get one activation map!

This helps the neural network understand **colors and patterns**! 🌈

---

## 🔄 3. Multiple Input & Output Channels

Now, let’s **combine everything**! 🚀
- **Multiple input channels** (like RGB images)
- **Multiple output channels** (different filters detecting different things)

Each output channel gets its own **set of kernels**, one for each input channel.
We **apply the kernels, add the results**, and get multiple **activation maps**! 🎯

---

## 🏗 Example in PyTorch

```python
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3)
```

This means:
✅ **3 input channels** (Red, Green, Blue)
✅ **5 output channels** (5 different filters detecting different things)

---

## 🏆 Why is This Important?

✅ Helps the neural network find **different patterns** 🎨
✅ Works for **color images** and **complex features** 🤖
✅ Makes the network **more powerful**! 💪

---

🎉 **Great job!** Now, let’s try convolution with multiple channels in our own models! 🏗️🤖✨
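The channel bookkeeping can be verified by inspecting the shapes: each of the 5 output channels owns one 3×3 kernel per input channel, and the layer produces 5 activation maps.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3)

rgb_image = torch.randn(1, 3, 8, 8)   # one 8×8 RGB image (batch, channels, H, W)
out = conv(rgb_image)

print(conv.weight.shape)  # torch.Size([5, 3, 3, 3]) — 5 filters, each with 3 per-channel kernels
print(out.shape)          # torch.Size([1, 5, 6, 6]) — 5 activation maps, (8 - 3) + 1 = 6
```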
---

🎵 **Music Playing**

👋 **Welcome!** Today, we’re building a **CNN for MNIST**! 🏗️🔢
MNIST is a dataset of **handwritten numbers (0-9)**. ✍️🖼️

---

## 🏗 CNN Structure

📏 **Image Size:** 16×16 (to make training faster)
🔄 **Layers:**
- **First Convolution Layer** ➝ 16 output channels
- **Second Convolution Layer** ➝ 32 output channels
- **Final Layer** ➝ 10 output neurons (one for each digit)

---

## 🛠 Building the CNN in PyTorch

### 📌 Step 1: Define the CNN

```python
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(kernel_size=2)
        self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, padding=2)
        self.fc = nn.Linear(32 * 4 * 4, 10)  # Fully connected layer (512 inputs, 10 outputs)

    def forward(self, x):
        x = self.pool(nn.ReLU()(self.conv1(x)))  # First layer: Conv + ReLU + Pool
        x = self.pool(nn.ReLU()(self.conv2(x)))  # Second layer: Conv + ReLU + Pool
        x = x.view(-1, 512)  # Flatten the 4x4x32 output to 1D (512 elements)
        x = self.fc(x)       # Fully connected layer for classification
        return x
```

---

## 🔍 Understanding the Output Shape

After the second round of **Max Pooling**, the image shrinks to **4×4 pixels** (16 ➝ 8 ➝ 4).
Since we have **32 channels**, the total output is:
```
4 × 4 × 32 = 512 elements
```
Each neuron in the final layer gets **512 inputs**, and since we have **10 digits (0-9)**, we use **10 neurons**.

---

## 🔄 Forward Step

1️⃣ **Apply First Convolution Layer** ➝ Activation ➝ Max Pooling
2️⃣ **Apply Second Convolution Layer** ➝ Activation ➝ Max Pooling
3️⃣ **Flatten the Output (4×4×32 → 512)**
4️⃣ **Apply the Final Output Layer (10 Neurons for 10 Digits)**

---

## 🏋️‍♂️ Training the Model

Check the **lab** to see how we train the CNN using:
✅ **Backpropagation**
✅ **Stochastic Gradient Descent (SGD)**
✅ **Loss Function & Accuracy Check**

---

🎉 **Great job!** Now, let’s train our CNN to recognize handwritten digits! 🏗️🔢🤖
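The shape arithmetic can be sanity-checked by pushing a batch of random 16×16 images through the network (the class is reproduced here so the snippet is self-contained):

```python
import torch
import torch.nn as nn

class CNN(nn.Module):  # same architecture as Step 1 above
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(kernel_size=2)
        self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, padding=2)
        self.fc = nn.Linear(32 * 4 * 4, 10)

    def forward(self, x):
        x = self.pool(nn.ReLU()(self.conv1(x)))  # 16×16 → 16×16 → 8×8
        x = self.pool(nn.ReLU()(self.conv2(x)))  # 8×8 → 8×8 → 4×4
        x = x.view(-1, 512)                      # 32 channels × 4 × 4 = 512
        return self.fc(x)

batch = torch.randn(4, 1, 16, 16)  # four fake 16×16 grayscale images
out = CNN()(batch)
print(out.shape)  # torch.Size([4, 10]) — one score per digit
```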
---

🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Convolutional Neural Networks (CNNs)!** 🤖🖼️

## 🤔 What is a CNN?

A **Convolutional Neural Network (CNN)** is a special type of neural network that **understands images!** 🎨
It learns to find patterns, like:
✅ **Edges** (lines & shapes)
✅ **Textures** (smooth or rough areas)
✅ **Objects** (faces, animals, letters)

---

## 🏗 How Does a CNN Work?

A CNN is made of **three main steps**:

1️⃣ **Convolution Layer** 🖼️➝🔍
   - Uses **kernels** (small filters) to **detect patterns** in an image
   - Creates an **activation map** that highlights important features

2️⃣ **Pooling Layer** 🔄➝📏
   - **Shrinks** the activation map to keep only the most important parts
   - **Max Pooling** picks the **biggest** values in each small region

3️⃣ **Fully Connected Layer** 🏗️➝🎯
   - The final layer makes a **decision** (like cat 🐱 or dog 🐶)

---

## 🎨 Example: Detecting Lines

We train a CNN to recognize **horizontal** and **vertical** lines:

1️⃣ **Input Image (X)**
2️⃣ **First Convolution Layer**
   - Uses **two kernels** to create two **activation maps**
   - Applies **ReLU** (activation function) to remove negative values
   - Uses **Max Pooling** to make learning easier
3️⃣ **Second Convolution Layer**
   - Takes **two input channels** from the first layer
   - Uses **two new kernels** to create **one activation map**
   - Again, applies **ReLU + Max Pooling**
4️⃣ **Flattening** ➝ Turns the 2D image into **1D data**
5️⃣ **Final Prediction** ➝ Uses a **fully connected layer** to decide:
   - `0` = **Vertical Line**
   - `1` = **Horizontal Line**

---

## 🔄 How to Build a CNN in PyTorch

### 🏗 CNN Constructor

```python
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=2, out_channels=1, kernel_size=3, padding=1)
        self.fc = nn.Linear(49, 2)  # Fully connected layer (49 inputs, 2 outputs)

    def forward(self, x):
        x = self.pool(nn.ReLU()(self.conv1(x)))  # First layer: Conv + ReLU + Pool
        x = self.pool(nn.ReLU()(self.conv2(x)))  # Second layer: Conv + ReLU + Pool
        x = x.view(-1, 49)  # Flatten to 1D
        x = self.fc(x)      # Fully connected layer
        return x
```

(With 28×28 inputs, the padded convolutions preserve size and the two pooling layers shrink the image 28 ➝ 14 ➝ 7, so the single output channel flattens to 7 × 7 = 49 elements.)

---

## 🏋️‍♂️ Training the CNN

We train the CNN using **backpropagation** and **gradient descent**:

1️⃣ **Load the dataset** (images of lines) 📊
2️⃣ **Create a CNN model** 🏗️
3️⃣ **Define a loss function** (to measure mistakes) ❌
4️⃣ **Choose an optimizer** (to improve learning) 🔄
5️⃣ **Train the model** until it **gets better**! 🚀

As training progresses:
📉 **Loss goes down** ➝ Model makes fewer mistakes!
📈 **Accuracy goes up** ➝ Model gets better at predictions!

---

## 🏆 Why Use CNNs?

✅ **Finds patterns** in images 🔍
✅ **Works with real-world data** (faces, animals, objects) 🖼️
✅ **More efficient** than regular neural networks 💡

---

🎉 **Great job!** Now, let’s build and train our own CNN! 🏗️🤖✨
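The five training steps can be sketched end to end. This is a minimal sketch: the `make_batch` generator producing synthetic line images is my own stand-in for the course dataset, and the model is the two-layer CNN from the constructor section (28×28 inputs → 49 flattened elements).

```python
import torch
import torch.nn as nn

class CNN(nn.Module):  # the model from the constructor section
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 2, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(2, 1, kernel_size=3, padding=1)
        self.fc = nn.Linear(49, 2)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        return self.fc(x.view(-1, 49))

def make_batch(n=8):
    """Synthetic stand-in data: vertical (label 0) and horizontal (label 1) lines."""
    images = torch.zeros(n, 1, 28, 28)
    labels = torch.randint(0, 2, (n,))
    for i in range(n):
        pos = torch.randint(4, 24, (1,)).item()
        if labels[i] == 0:
            images[i, 0, :, pos] = 1.0  # vertical line
        else:
            images[i, 0, pos, :] = 1.0  # horizontal line
    return images, labels

model = CNN()                                            # Step 2: create the model
criterion = nn.CrossEntropyLoss()                        # Step 3: loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # Step 4: optimizer

for epoch in range(50):                                  # Step 5: training loop
    images, labels = make_batch()                        # Step 1: load a batch
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```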
---

🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning how to use **Pretrained TorchVision Models**! 🤖🖼️

## 🤔 What is a Pretrained Model?

A **pretrained model** is a neural network that has already been **trained by experts** on a large dataset.
✅ **Saves time** (no need to train from scratch) ⏳
✅ **Works better** (already optimized) 🎯
✅ **We only train the final layer** for our own images! 🔄

---

## 🔄 Using ResNet18 (A Pretrained Model)

We will use **ResNet18**, a powerful model trained on **color images**. 🎨
It has **skip connections** (we won’t go into details, but they help learning).

We only **replace the last layer** to match our dataset! 🔁

---

## 🛠 Steps to Use a Pretrained Model

### 📌 Step 1: Load the Pretrained Model
```python
import torchvision.models as models

model = models.resnet18(pretrained=True)  # Load pretrained ResNet18
```

### 📌 Step 2: Normalize Images (Required for ResNet18)
```python
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize image
    transforms.ToTensor(),          # Convert to tensor
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # Normalize
])
```

### 📌 Step 3: Prepare the Dataset
Create a **dataset object** for your own images with **training and testing data**. 📊

### 📌 Step 4: Replace the Output Layer
- The **last hidden layer** has **512 neurons**
- We create a **new output layer** for **our dataset**

Example: **If we have 7 classes**, we create a layer with **7 outputs**:
```python
import torch.nn as nn

for param in model.parameters():
    param.requires_grad = False  # Freeze pretrained layers

model.fc = nn.Linear(512, 7)  # Replace output layer (512 inputs → 7 outputs)
```

---

## 🏋️‍♂️ Training the Model

### 📌 Step 5: Create Data Loaders
```python
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=15, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=10, shuffle=False)
```

### 📌 Step 6: Set Up Training
```python
import torch.optim as optim

criterion = nn.CrossEntropyLoss()  # Loss function
optimizer = optim.Adam(model.fc.parameters(), lr=0.001)  # Optimizer (only for last layer)
```

### 📌 Step 7: Train the Model
1️⃣ **Set model to training mode** 🏋️
```python
model.train()
```
2️⃣ Train for **20 epochs**
3️⃣ **Set model to evaluation mode** when predicting 📊
```python
model.eval()
```

---

## 🏆 Why Use Pretrained Models?

✅ **Saves time** (no need to train from scratch)
✅ **Works better** (pretrained on millions of images)
✅ **We only change one layer** for our dataset!

---

🎉 **Great job!** Now, try using a pretrained model for your own images! 🏗️🤖✨
---

🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning how to use **GPUs in PyTorch**! 🚀💻

## 🤔 Why Use a GPU?

A **Graphics Processing Unit (GPU)** can **train models MUCH faster** than a CPU!
✅ Faster computation ⏩
✅ Better for large datasets 📊
✅ Helps train deep learning models efficiently 🤖

---

## 🔥 What is CUDA?

CUDA is a **platform** made by **NVIDIA** that allows us to use **GPUs for AI tasks**. 🎮🚀
In **PyTorch**, we use **torch.cuda** to work with GPUs.

---

## 🛠 Step 1: Check if a GPU is Available

```python
import torch

# Check if a GPU is available
torch.cuda.is_available()  # Returns True if a GPU is detected
```

---

## 🎯 Step 2: Set Up the GPU

```python
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```

- `"cuda:0"` = First available GPU 🎮
- `"cpu"` = Use the CPU if no GPU is found

---

## 🏗 Step 3: Sending Tensors to the GPU

In PyTorch, **data is stored in Tensors**.
To move data to the GPU, use `.to(device)`.

```python
tensor = torch.randn(3, 3)  # Create a random tensor
tensor = tensor.to(device)  # Move it to the GPU
```

✅ **Faster processing on the GPU!** ⚡

---

## 🔄 Step 4: Using a GPU with a CNN

You **don’t need to change** your CNN code! Just **move the model to the GPU** after creating it:

```python
model = CNN()     # Create CNN model
model.to(device)  # Move the model to the GPU
```

This **converts** all layers to **CUDA tensors** for GPU computation! 🎮

---

## 🏋️‍♂️ Step 5: Training a Model on a GPU

Training is the same, but **you must send your data to the GPU**!

```python
for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)  # Move data to GPU
    optimizer.zero_grad()              # Clear gradients
    outputs = model(images)            # Forward pass (on GPU)
    loss = criterion(outputs, labels)  # Compute loss
    loss.backward()                    # Backpropagation
    optimizer.step()                   # Update weights
```

✅ **The model trains much faster!** 🚀

---

## 🎯 Step 6: Testing the Model

For testing, **only move the images** (not the labels) to the GPU:

```python
for images, labels in test_loader:
    images = images.to(device)  # Move images to GPU
    outputs = model(images)     # Get predictions
```

✅ **Saves memory and speeds up testing!** ⚡

---

## 🏆 Summary

✅ **GPUs make training faster** 🎮
✅ Use **torch.cuda** to work with GPUs
✅ Move **data & models** to the GPU with `.to(device)`
✅ Training & testing are the same, but data **must be on the GPU**

---

🎉 **Great job!** Now, try training a model using a GPU in PyTorch! 🏗️🚀
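Because `device` falls back to `"cpu"`, the exact same code runs on any machine. A tiny sketch of the pattern with a single linear layer:

```python
import torch
import torch.nn as nn

# Pick the first GPU if one exists, otherwise the CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = nn.Linear(3, 2).to(device)  # move model parameters to the device
x = torch.randn(5, 3).to(device)    # move data to the same device

out = model(x)
print(out.device)  # cuda:0 if a GPU was found, otherwise cpu
```

Keeping model and data on the same device is required; mixing a CPU tensor with a CUDA model raises a runtime error.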
convolutional_neural_networks.py
ADDED
```python
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt

# Convolutional Neural Network Model
class ConvolutionalNeuralNetwork(nn.Module):
    def __init__(self):
        super(ConvolutionalNeuralNetwork, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=5)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=5)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.fc1 = nn.Linear(32 * 4 * 4, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # 28 → 24 → 12
        x = self.pool(torch.relu(self.conv2(x)))  # 12 → 8 → 4
        x = x.view(-1, 32 * 4 * 4)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)  # Return raw logits without applying softmax


# Training Function
def train_model(model, criterion, optimizer, x_train, y_train, epochs=100):
    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        y_pred = model(x_train)
        loss = criterion(y_pred, y_train)
        loss.backward()
        optimizer.step()
        if (epoch + 1) % 10 == 0:
            print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')


# Example Usage
if __name__ == "__main__":
    # Sample Data
    x_train = torch.randn(100, 1, 28, 28)  # Example for MNIST
    y_train = torch.randint(0, 10, (100,))

    # Plotting the input data
    for i in range(6):
        plt.subplot(2, 3, i + 1)
        plt.imshow(x_train[i].squeeze(), cmap='gray')
        plt.title(f'Label: {y_train[i].item()}')
        plt.axis('off')
    plt.show()

    model = ConvolutionalNeuralNetwork()

    # Plotting the (untrained) predictions
    y_pred = model(x_train).detach().numpy()
    plt.figure(figsize=(12, 6))
    for i in range(6):
        plt.subplot(2, 3, i + 1)
        plt.imshow(x_train[i].squeeze(), cmap='gray')
        plt.title(f'Predicted: {np.argmax(y_pred[i])}')
        plt.axis('off')
    plt.show()

    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    train_model(model, criterion, optimizer, x_train, y_train)
```
dataset_loader.py
ADDED
```python
import os
import numpy as np
import gzip
from PIL import Image
from torchvision import transforms

class CustomMNISTDataset:
    def __init__(self, dataset_path, transform=None):
        self.dataset_path = dataset_path
        self.transform = transform
        self.images, self.labels = self.load_dataset()

    def load_dataset(self):
        image_paths = []
        label_paths = []

        # Assuming the dataset consists of images and labels in the dataset path
        for file in os.listdir(self.dataset_path):
            if 'train-images-idx3-ubyte.gz' in file:
                image_paths.append(os.path.join(self.dataset_path, file))
            elif 'train-labels-idx1-ubyte.gz' in file:
                label_paths.append(os.path.join(self.dataset_path, file))

        if not image_paths or not label_paths:
            raise ValueError(f"❌ Missing image or label files in {self.dataset_path}")

        images = []
        labels = []

        # Assuming one image file and one label file
        for img_path, label_path in zip(image_paths, label_paths):
            images_data, labels_data = self.load_mnist_data(img_path, label_path)
            images.extend(images_data)
            labels.extend(labels_data)

        return images, labels

    def load_mnist_data(self, img_path, label_path):
        """Load MNIST data from .gz files."""
        with gzip.open(img_path, 'rb') as f:
            # Skip the magic number and metadata
            f.read(16)
            # Read the image data
            img_data = np.frombuffer(f.read(), dtype=np.uint8)
            img_data = img_data.reshape(-1, 28, 28)  # Reshape to 28x28 images

        with gzip.open(label_path, 'rb') as f:
            # Skip the magic number and metadata
            f.read(8)
            # Read the label data
            label_data = np.frombuffer(f.read(), dtype=np.uint8)

        images = [Image.fromarray(img) for img in img_data]  # Convert each image to a PIL Image

        # If you have any transformation, apply it here
        if self.transform:
            images = [self.transform(img) for img in images]

        return images, label_data

    def __len__(self):
        """Return the total number of images in the dataset."""
        return len(self.images)

    def __getitem__(self, idx):
        """Return a single image and its label at the given index."""
        image = self.images[idx]
        label = self.labels[idx]
        return image, label
```
deep_networks.md
ADDED

@@ -0,0 +1,356 @@
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Deep Neural Networks**—a cool way computers learn! 🧠💡

## 🤖 What is a Neural Network?
Imagine a brain made of tiny switches called **neurons**. These neurons work together to make smart decisions!

### 🟢 Input Layer
This is where we give the network information, like pictures or numbers.

### 🔵 Hidden Layers
These layers are like **magic helpers** that figure out patterns!
- More neurons = better learning 🤓
- Too many neurons = can be **confusing** (overfitting) 😵

### 🔴 Output Layer
This is where the network **gives us answers!** 🏆

---

## 🏗 Building a Deep Neural Network in PyTorch

We can **build a deep neural network** using PyTorch, a tool that helps computers learn. 🖥️

### 🛠 Layers of Our Network
1️⃣ **First Hidden Layer:** Has `H1` neurons.
2️⃣ **Second Hidden Layer:** Has `H2` neurons.
3️⃣ **Output Layer:** Decides the final answer! 🎯

---

## 🔄 How Does It Work?
1️⃣ **Start with an input (x).**
2️⃣ **Pass through each layer:**
   - Apply **math functions** (like `sigmoid`, `tanh`, or `ReLU`).
   - These help the network understand better! 🧩
3️⃣ **Get the final answer!** ✅

---

## 🎨 Different Activation Functions
Activation functions help the network **think better!** 🧠
- **Sigmoid** ➝ Good for small problems 🤏
- **Tanh** ➝ Works better for deeper networks 🌊
- **ReLU** ➝ Super strong for big tasks! 🚀

---
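As a quick sketch of how these three functions squash values (assuming PyTorch is available):

```python
import torch

z = torch.tensor([-2.0, 0.0, 2.0])
print(torch.sigmoid(z))  # values squashed into (0, 1)
print(torch.tanh(z))     # values squashed into (-1, 1)
print(torch.relu(z))     # negatives clipped to 0
```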
## 🔢 Example: Recognizing Handwritten Numbers
We train the network with **MNIST**, a dataset of handwritten numbers. 📝🔢
- **Input:** 784 pixels (28x28 images) 📸
- **Hidden Layers:** 50 neurons each 🤖
- **Output:** 10 neurons (digits 0-9) 🔟

---

## 🚀 Training the Network
We use **Stochastic Gradient Descent (SGD)** to teach the network! 📚
- **Loss Function:** Helps the network learn from mistakes. ❌➡✅
- **Validation Accuracy:** Checks how well the network is doing! 🎯

---

## 🏆 What We Learned
✅ Deep Neural Networks have **many hidden layers**.
✅ Different **activation functions** help improve performance.
✅ Adding layers can make the network **smarter**—as long as we train it carefully! 💡

---

🎉 **Great job!** Now, let's build and train our own deep neural networks! 🏗️🤖✨
-----------------------------------------------------------------------------------
🎵 **Music Playing**

👋 **Welcome!** Today, we’ll learn how to **build a deep neural network** in PyTorch using `nn.ModuleList`. 🧠💡

## 🤖 Why Use `nn.ModuleList`?
Instead of adding layers **one by one** (which takes a long time ⏳), we can **automate** the process! 🚀

---

## 🏗 Building the Neural Network

We create a **list** called `layers` 📋:
- **First item:** Input size (e.g., `2` features).
- **Second item:** Neurons in the **first hidden layer** (e.g., `3`).
- **Third item:** Neurons in the **second hidden layer** (e.g., `4`).
- **Fourth item:** Output size (number of classes, e.g., `3`).

---

## 🔄 Constructing the Network

### 🔹 Step 1: Create Layers
- We loop through the list, taking **two elements at a time**:
  - **First element:** Input size 🎯
  - **Second element:** Output size (number of neurons) 🧩

### 🔹 Step 2: Connecting Layers
- First **hidden layer** ➝ Input size = `2`, Neurons = `3`
- Second **hidden layer** ➝ Input size = `3`, Neurons = `4`
- **Output layer** ➝ Input size = `4`, Output size = `3`

---

## ⚡ Forward Function

We **pass data** through the network:
1️⃣ **Apply linear transformation** at each layer ➝ Makes calculations 🧮
2️⃣ **Apply activation function** (`ReLU`) ➝ Helps the network learn 📈
3️⃣ **For the last layer**, we apply only the **linear transformation**, leaving raw scores for classification 🎯

---

## 🎯 Training the Network

The **training process** is similar to before! We:
- Use a **dataset** 📊
- Try **different combinations** of neurons and layers 🤖
- See which setup gives the **best performance**! 🏆

---

🎉 **Awesome!** Now, let’s explore ways to make these networks even **better!** 🚀
-----------------------------------------------------------------------------------
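The construction described above can be sketched like this (the class name `Net` and the sizes are illustrative, assuming PyTorch):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, sizes):
        """sizes = [input, hidden1, hidden2, ..., output]"""
        super().__init__()
        self.hidden = nn.ModuleList()
        # Walk the list two elements at a time: (in_size, out_size) pairs
        for in_size, out_size in zip(sizes, sizes[1:]):
            self.hidden.append(nn.Linear(in_size, out_size))

    def forward(self, x):
        last = len(self.hidden) - 1
        for i, layer in enumerate(self.hidden):
            x = layer(x)
            if i < last:           # ReLU on every layer except the output
                x = torch.relu(x)
        return x                   # raw scores (logits) for classification

model = Net([2, 3, 4, 3])
out = model(torch.randn(5, 2))
print(out.shape)  # torch.Size([5, 3])
```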
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **weight initialization** in Neural Networks! 🧠⚡

## 🤔 Why Does Weight Initialization Matter?
If we **don’t** choose good starting weights, our neural network **won’t learn properly**! 🚨
Sometimes, **all neurons** in a layer get the **same weights**, which causes problems.

---

## 🚀 How PyTorch Handles Weights
PyTorch **automatically** picks starting weights, but we can also set them **ourselves**! 🔧
Let’s see what happens when we:
- Set **all weights to 1** and **bias to 0** ➝ ❌ **Bad idea!**
- Randomly choose weights from a **uniform distribution** ➝ ✅ **Better!**

---

## 🔄 The Problem with Random Weights
We use a **uniform distribution** (random values between -1 and 1). But:
- **Too small?** ➝ Weights don’t change much 🤏
- **Too large?** ➝ **Vanishing gradient** problem 😵

### 📉 What’s a Vanishing Gradient?
If weights are **too big**, activations **saturate**, and the **gradient shrinks toward zero**.
That means the network **stops learning**! 🚫

---

## 🛠 Fixing the Problem

### 🎯 Solution: Scale Weights Based on Neurons
We scale the weight range based on **how many input neurons** we have:
- **2 neurons?** ➝ Scale by **1/2**
- **4 neurons?** ➝ Scale by **1/4**
- **100 neurons?** ➝ Scale by **1/100**

This prevents the vanishing gradient issue! ✅

---

## 🔬 Different Weight Initialization Methods

### 🏗 **1. Default PyTorch Method**
- PyTorch **automatically** picks a range:
  - **Lower bound:** `-1 / sqrt(L_in)`
  - **Upper bound:** `+1 / sqrt(L_in)`

### 🔵 **2. Xavier Initialization**
- Best for **tanh** activation
- Uses the **number of input and output neurons**
- We apply `xavier_uniform_()` to set the weights

### 🔴 **3. He Initialization**
- Best for **ReLU** activation
- Uses the **He initialization method**
- We apply `kaiming_uniform_()` (PyTorch's name for He initialization) to set the weights

---

## 🏆 Which One is Best?
We compare:
✅ **PyTorch Default**
✅ **Xavier Method** (tanh)
✅ **He Method** (ReLU)

The **Xavier and He methods** help the network **learn faster**! 🚀

---

🎉 **Great job!** Now, let’s try different weight initializations and see what works best! 🏗️🔬
------------------------------------------------------------------------------------------------
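In PyTorch these initializers are applied to a layer's weight tensor in place; the layer sizes below are just examples. Note that He initialization is exposed as `nn.init.kaiming_uniform_`:

```python
import torch.nn as nn

linear_tanh = nn.Linear(784, 50)
nn.init.xavier_uniform_(linear_tanh.weight)  # Xavier: range based on fan-in and fan-out, suits tanh

linear_relu = nn.Linear(784, 50)
nn.init.kaiming_uniform_(linear_relu.weight, nonlinearity='relu')  # He: range based on fan-in, suits ReLU
print(linear_relu.weight.shape)  # torch.Size([50, 784])
```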
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Gradient Descent with Momentum**! 🚀🔄

## 🤔 What’s the Problem?
Sometimes, when training a neural network, the model can get **stuck**:
- **Saddle Points** ➝ Flat areas where learning stops 🏔️
- **Local Minima** ➝ Not the best solution, but we get trapped 😞

---

## 🏃‍♂️ What is Momentum?
Momentum helps the model **keep moving** even when it gets stuck! 💨
It’s like rolling a ball downhill:
- **Gradient (Force)** ➝ Tells us where to go 🏀
- **Momentum (Mass)** ➝ Helps us keep moving even on flat surfaces ⚡

---

## 🔄 How Does It Work?

### 🔹 Step 1: Compute Velocity
- New velocity (`v_k+1`) = Momentum term (𝜌) × old velocity (`v_k`) + gradient
- The **momentum term** (𝜌) controls how much of the past velocity we keep.

### 🔹 Step 2: Update Weights
- New weight (`w_k+1`) = Old weight (`w_k`) - Learning rate × new velocity (`v_k+1`)

The bigger the **momentum**, the harder it is to stop moving! 🏃‍♂️💨

---

## ⚠️ Why Does It Help?

### 🏔️ **Saddle Points**
- **Without Momentum** ➝ Model **stops** moving in flat areas ❌
- **With Momentum** ➝ Keeps moving **past** the flat spots ✅

### ⬇ **Local Minima**
- **Without Momentum** ➝ Gets **stuck** in a bad spot 😖
- **With Momentum** ➝ Pushes through and **finds a better solution!** 🎯

---

## 🏆 Picking the Right Momentum

- **Too Small?** ➝ Model gets **stuck** 😕
- **Too Large?** ➝ Model **overshoots** the best answer 🚀
- **Best Choice?** ➝ We test different values and pick what works! 🔬

---

## 🛠 Using Momentum in PyTorch
Just add the **momentum** value to the optimizer!

```python
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
```

In the lab, we test **different momentum values** on a dataset and see how they affect learning! 📊

---

🎉 **Great job!** Now, let’s experiment with momentum and see how it helps our model! 🏗️⚡
------------------------------------------------------------------------------------------
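The two update steps can be traced by hand on a toy one-dimensional loss (`w**2`, whose gradient is `2*w`); this is a sketch of the update rule, not PyTorch's internal code:

```python
rho, lr = 0.5, 0.1   # momentum term and learning rate
w, v = 1.0, 0.0      # weight and velocity

for _ in range(3):
    grad = 2 * w         # gradient of the toy loss w**2
    v = rho * v + grad   # Step 1: update velocity
    w = w - lr * v       # Step 2: update weight
print(w)
```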
🎵 **Music Playing**

👋 **Welcome!** Today, we’re learning about **Batch Normalization**! 🚀🔄

## 🤔 What’s the Problem?
When training a neural network, the activations (outputs) can vary a lot, making learning **slower** and **unstable**. 😖
Batch Normalization **fixes this** by:
✅ Making activations more consistent
✅ Helping the network learn faster
✅ Reducing problems like vanishing gradients

---

## 🔄 How Does Batch Normalization Work?

### 🏗 Step 1: Normalize Each Mini-Batch
For each neuron in a layer:
1️⃣ Compute the **mean** and **standard deviation** of its activations. 📊
2️⃣ Normalize the outputs using:
\[
z' = \frac{z - \text{mean}}{\text{std dev} + \epsilon}
\]
(We add a **small** value `ε` to avoid division by zero.)

### 🏗 Step 2: Scale and Shift
- Instead of leaving activations with zero mean and unit variance, we **scale** and **shift** them:
\[
z'' = \gamma \cdot z' + \beta
\]
- **γ (scale) and β (shift)** are **learned** during training! 🏋️‍♂️

---

## 🔬 Example: Normalizing Activations

- **First Mini-Batch (X1)** ➝ Compute mean & std for each neuron, normalize, then scale & shift
- **Second Mini-Batch (X2)** ➝ Repeat for the new batch! ♻
- **Next Layer** ➝ Apply batch normalization again! 🔄

### 🏆 Prediction Time
- During **training**, we compute the mean & std for **each batch**.
- During **testing**, we use the **population mean & std** (running averages) instead. 📊

---

## 🛠 Using Batch Normalization in PyTorch

```python
import torch.nn as nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(10, 3)   # First layer (10 inputs, 3 neurons)
        self.bn1 = nn.BatchNorm1d(3)  # Batch Norm for first layer
        self.fc2 = nn.Linear(3, 4)    # Second layer (3 inputs, 4 neurons)
        self.bn2 = nn.BatchNorm1d(4)  # Batch Norm for second layer

    def forward(self, x):
        x = self.bn1(self.fc1(x))  # Apply Batch Norm
        x = self.bn2(self.fc2(x))  # Apply Batch Norm again
        return x
```

- **Training?** Set the model to **train mode** 🏋️‍♂️
```python
model.train()
```
- **Predicting?** Use **evaluation mode** 📈
```python
model.eval()
```

---

## 🚀 Why Does Batch Normalization Work?

### ✅ Helps Gradient Descent Work Better
- Normalized data = **smoother** loss function 🎯
- Gradients point in the **right** direction = Faster learning! 🚀

### ✅ Reduces Vanishing Gradient Problem
- Sigmoid & Tanh activations suffer from small gradients 😢
- Normalization **keeps activations in a good range** 📊

### ✅ Allows Higher Learning Rates
- Networks can **train faster** without getting unstable ⏩

### ✅ Reduces Need for Dropout
- Some studies show **Batch Norm can replace Dropout** 🤯

---

🎉 **Great job!** Now, let’s try batch normalization in our own models! 🏗️📈
deep_networks.py
ADDED

@@ -0,0 +1,80 @@
```python
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt
import numpy as np

# 🔥 Deep Neural Network Model
class DeepNeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_sizes, output_size):
        super(DeepNeuralNetwork, self).__init__()
        layers = []
        in_size = input_size

        for hidden_size in hidden_sizes:
            layers.append(nn.Linear(in_size, hidden_size))
            layers.append(nn.ReLU())
            in_size = hidden_size

        layers.append(nn.Linear(in_size, output_size))
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)  # ✅ Apply the model layers properly


# 🔥 Training Function
def train_model(model, criterion, optimizer, x_train, y_train, epochs=100):
    model.train()
    for epoch in range(epochs):
        optimizer.zero_grad()

        # Forward pass
        y_pred = model(x_train)

        # Loss calculation
        loss = criterion(y_pred, y_train)

        # Backward pass
        loss.backward()
        optimizer.step()

        if (epoch + 1) % 10 == 0:
            print(f'Epoch [{epoch + 1}/{epochs}], Loss: {loss.item():.4f}')


# ✅ Example Usage
if __name__ == "__main__":
    # 🔥 Sample Data
    x_train = torch.randn(100, 10, requires_grad=True)  # ✅ Require gradient tracking
    y_train = torch.randint(0, 2, (100,), dtype=torch.long)  # ✅ Ensure LongTensor for CrossEntropyLoss

    # Plotting the input data
    plt.scatter(x_train[:, 0].detach().numpy(), x_train[:, 1].detach().numpy(), c=y_train.numpy(), cmap='viridis')
    plt.title('Deep Neural Network Input Data')
    plt.xlabel('Input Feature 1')
    plt.ylabel('Input Feature 2')
    plt.colorbar(label='Output Class')
    plt.show()

    # Initialize Model
    model = DeepNeuralNetwork(input_size=10, hidden_sizes=[20, 10], output_size=2)

    # Criterion and Optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    # Train the model
    train_model(model, criterion, optimizer, x_train, y_train, epochs=100)

    # ✅ Plotting the predictions with softmax
    model.eval()
    with torch.no_grad():
        y_pred = torch.softmax(model(x_train), dim=1).detach().numpy()

    plt.scatter(x_train[:, 0].detach().numpy(), x_train[:, 1].detach().numpy(), c=np.argmax(y_pred, axis=1), cmap='viridis')
    plt.title('Deep Neural Network Predictions')
    plt.xlabel('Input Feature 1')
    plt.ylabel('Input Feature 2')
    plt.colorbar(label='Predicted Class')
    plt.show()
```
final_project.md
ADDED

@@ -0,0 +1,29 @@
# 🧠 **Simple Summary of the Program**

1. **Loads and Prepares Data:**
   - Uses the **MNIST dataset**, which contains images of handwritten digits (0-9).
   - Resizes the images and converts them to tensors.
   - Creates a **data loader** to batch the images and shuffle them for training.

2. **Defines a CNN Model:**
   - The **FinalCNN** model processes the images through layers:
     - **Conv1:** Finds simple features like edges.
     - **Pool1:** Reduces the size to focus on important features.
     - **Conv2:** Finds more complex patterns.
     - **Pool2:** Reduces the size again.
     - **Flattening:** Converts the features into a single line of numbers.
     - **Fully Connected Layers:** Make predictions about which digit is in the image.

3. **Trains the Model:**
   - Uses **Cross-Entropy Loss** to measure how far the predictions are from the real digit labels.
   - Uses **Stochastic Gradient Descent (SGD)** to adjust the model parameters and make better predictions.
   - Runs the training for **32 epochs**, slowly improving the accuracy.

4. **Displays Predictions:**
   - Shows **6 sample images** with the model's predictions and the actual labels.
   - Prints the accuracy and loss for each epoch.

5. **GPU Acceleration:**
   - Uses **CUDA** if available, making the training faster by running on the GPU.

✅ This program is like a smart detective that learns to recognize handwritten numbers by studying lots of examples and gradually improving its guesses.
final_project.py
ADDED

@@ -0,0 +1,260 @@
| 1 |
+
import torch
|
| 2 |
+
import torch.nn as nn
|
| 3 |
+
import torch.optim as optim
|
| 4 |
+
from torchvision import datasets, transforms
|
| 5 |
+
from torch.utils.data import DataLoader
|
| 6 |
+
import matplotlib.pyplot as plt
|
| 7 |
+
from tqdm import tqdm
|
| 8 |
+
from dataset_loader import CustomMNISTDataset
|
| 9 |
+
import os
|
| 10 |
+
import matplotlib.font_manager as fm
|
| 11 |
+
# CNN Model
|
| 12 |
+
# CNN Model with output layer for 62 categories
|
| 13 |
+
class FinalCNN(nn.Module):
|
| 14 |
+
def __init__(self):
|
| 15 |
+
super(FinalCNN, self).__init__()
|
| 16 |
+
self.conv1 = nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=0)
|
| 17 |
+
self.conv2 = nn.Conv2d(16, 32, kernel_size=5, stride=1, padding=0)
|
| 18 |
+
self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
|
| 19 |
+
self.fc1 = nn.Linear(32 * 4 * 4, 120)
|
| 20 |
+
self.fc2 = nn.Linear(120, 84)
|
| 21 |
+
self.fc3 = nn.Linear(84, 62) # Output layer with 62 units for (0-9, a-z, A-Z)
|
| 22 |
+
|
| 23 |
+
def forward(self, x):
|
| 24 |
+
x = torch.relu(self.conv1(x))
|
| 25 |
+
x = self.pool(x)
|
| 26 |
+
x = torch.relu(self.conv2(x))
|
| 27 |
+
x = self.pool(x)
|
| 28 |
+
x = x.view(-1, 32 * 4 * 4)
|
| 29 |
+
x = torch.relu(self.fc1(x))
|
| 30 |
+
x = torch.relu(self.fc2(x))
|
| 31 |
+
x = self.fc3(x) # Final output
|
| 32 |
+
return x
|
| 33 |
+
def plot_loss_accuracy(losses, accuracies):
|
| 34 |
+
"""Plots Loss vs Accuracy on the same graph."""
|
| 35 |
+
plt.figure(figsize=(10, 6))
|
| 36 |
+
|
| 37 |
+
# Plot Loss
|
| 38 |
+
plt.plot(losses, color='red', label='Loss (Cost)', linestyle='-', marker='o')
|
| 39 |
+
|
| 40 |
+
# Plot Accuracy
|
| 41 |
+
plt.plot(accuracies, color='blue', label='Accuracy', linestyle='-', marker='x')
|
| 42 |
+
|
| 43 |
+
plt.title('Training Loss and Accuracy', fontsize=14)
|
| 44 |
+
plt.xlabel('Epochs', fontsize=12)
|
| 45 |
+
plt.ylabel('Value', fontsize=12)
|
| 46 |
+
plt.legend(loc='best')
|
| 47 |
+
plt.grid(True)
|
| 48 |
+
|
| 49 |
+
# Show the plot
|
| 50 |
+
plt.savefig("plot.svg")
|
| 51 |
+
|
| 52 |
+
# 🔥 Function to choose the dataset dynamically
|
| 53 |
+
def choose_dataset(dataset_name):
|
| 54 |
+
"""Choose and load a custom dataset dynamically."""
|
| 55 |
+
|
| 56 |
+
# ✅ Dynamic path generation
|
| 57 |
+
base_path = './data'
|
| 58 |
+
dataset_path = os.path.join(base_path, dataset_name, 'raw')
|
| 59 |
+
|
| 60 |
+
# Validate dataset path
|
| 61 |
+
if not os.path.exists(dataset_path):
|
| 62 |
+
raise ValueError(f"❌ Dataset {dataset_name} not found at {dataset_path}")
|
| 63 |
+
|
| 64 |
+
# ✅ Locate image and label files dynamically
|
| 65 |
+
image_file = None
|
| 66 |
+
label_file = None
|
| 67 |
+
|
| 68 |
+
for file in os.listdir(dataset_path):
|
| 69 |
+
if 'images' in file:
|
| 70 |
+
image_file = os.path.join(dataset_path, file)
|
| 71 |
+
elif 'labels' in file:
|
| 72 |
+
label_file = os.path.join(dataset_path, file)
|
| 73 |
+
|
| 74 |
+
# Ensure both image and label files are found
|
| 75 |
+
if not image_file or not label_file:
|
| 76 |
+
raise ValueError(f"❌ Missing image or label files in {dataset_path}")
|
| 77 |
+
|
| 78 |
+
transform = transforms.Compose([
|
| 79 |
+
transforms.ToTensor(),
|
| 80 |
+
transforms.Normalize((0.5,), (0.5,)) # Normalize between -1 and 1
|
| 81 |
+
])
|
| 82 |
+
|
| 83 |
+
# ✅ Load the custom dataset with file paths
|
| 84 |
+
dataset = CustomMNISTDataset(dataset_path=dataset_path, transform=transform)
|
| 85 |
+
|
| 86 |
+
return dataset
|
| 87 |
+
|
| 88 |
+
|
| 89 |
+
# Print activation details once
|
| 90 |
+
def print_activation_details(model, sample_batch):
|
| 91 |
+
"""Print activation map sizes once before training."""
|
| 92 |
+
with torch.no_grad():
|
| 93 |
+
x = sample_batch
|
| 94 |
+
print("\n--- CNN Activation Details (One-time) ---")
|
| 95 |
+
|
| 96 |
+
x = model.conv1(x)
|
| 97 |
+
print(f"Conv1: {x.shape}")
|
| 98 |
+
|
| 99 |
+
x = model.pool(x)
|
| 100 |
+
print(f"Pool1: {x.shape}")
|
| 101 |
+
|
| 102 |
+
x = model.conv2(x)
|
| 103 |
+
print(f"Conv2: {x.shape}")
|
| 104 |
+
|
| 105 |
+
x = model.pool(x)
|
| 106 |
+
print(f"Pool2: {x.shape}")
|
| 107 |
+
|
| 108 |
+
    x = x.view(-1, 32 * 4 * 4)
    print(f"Flattened: {x.shape}")

    x = model.fc1(x)
    print(f"FC1: {x.shape}")

    x = model.fc2(x)
    print(f"FC2: {x.shape}")

    x = model.fc3(x)
    print(f"Output (Logits): {x.shape}\n")


# Training Function
def train_final_model(model, criterion, optimizer, train_loader, epochs=256):
    losses = []
    accuracies = []

    # Print activation details once before training
    sample_batch, _ = next(iter(train_loader))
    print_activation_details(model, sample_batch)

    model.train()

    for epoch in range(epochs):
        epoch_loss = 0.0
        correct, total = 0, 0

        # tqdm progress bar
        with tqdm(train_loader, desc=f'Epoch {epoch + 1}/{epochs}', unit='batch') as t:
            for images, labels in t:
                optimizer.zero_grad()
                outputs = model(images)
                loss = criterion(outputs, labels)
                loss.backward()
                optimizer.step()

                # Update running metrics
                epoch_loss += loss.item()
                _, predicted = torch.max(outputs, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()

                t.set_postfix(loss=loss.item())

        # Store epoch loss and accuracy
        losses.append(epoch_loss / len(train_loader))
        accuracy = 100 * correct / total
        accuracies.append(accuracy)

        print(f"Epoch [{epoch+1}/{epochs}], Loss: {epoch_loss / len(train_loader):.4f}, Accuracy: {accuracy:.2f}%")

    # After training, plot the loss and accuracy curves
    plot_loss_accuracy(losses, accuracies)

    return losses, accuracies

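The per-batch bookkeeping in `train_final_model` (a running loss total plus a correct/total count) reduces to a few lines of arithmetic. A minimal standalone sketch on made-up "logits" — the values and names here are illustrative, not taken from the model:

```python
# Fake logits for four samples over three classes, plus the true labels.
outputs = [[2.0, 0.1, 0.3],
           [0.2, 1.5, 0.1],
           [0.1, 0.2, 3.0],
           [1.0, 0.9, 0.8]]
labels = [0, 1, 2, 1]

# Per-row argmax plays the role of torch.max(outputs, 1).
predicted = [row.index(max(row)) for row in outputs]
correct = sum(p == t for p, t in zip(predicted, labels))
accuracy = 100 * correct / len(labels)

print(predicted)  # [0, 1, 2, 0] -- last row predicts class 0, label is 1
print(accuracy)   # 75.0
```

The same correct/total counters accumulate across every batch of an epoch before the percentage is taken.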
# Dataset selection helpers

def get_dataset_options(base_path='./data'):
    """List all subdirectories in the data directory."""
    try:
        # Each subdirectory of ./data holds one generated dataset
        options = [folder for folder in os.listdir(base_path) if os.path.isdir(os.path.join(base_path, folder))]
        return options
    except FileNotFoundError:
        print(f"❌ Directory {base_path} not found!")
        return []

def number_to_char(number):
    """Map a class index back to its character. The ordering must match
    the label ordering used when the dataset was generated."""
    if 0 <= number <= 9:
        return str(number)       # 0-9
    elif 10 <= number <= 35:
        return chr(number + 87)  # a-z (10 -> 'a', 35 -> 'z')
    elif 36 <= number <= 61:
        return chr(number + 29)  # A-Z (36 -> 'A', 61 -> 'Z'); chr(number + 65) would map 36 to 'e'
    else:
        return ''

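The index-to-character mapping is easy to get wrong at the branch boundaries (note that `chr(36 + 65)` is `'e'`, not `'A'`; the uppercase branch needs an offset of 29). A quick sanity check of the intended mapping, inlined so it runs standalone:

```python
def number_to_char(number):
    # Intended mapping: 0-9 digits, 10-35 lowercase, 36-61 uppercase.
    if 0 <= number <= 9:
        return str(number)       # '0'-'9'
    elif 10 <= number <= 35:
        return chr(number + 87)  # 'a'-'z'
    elif 36 <= number <= 61:
        return chr(number + 29)  # 'A'-'Z'
    return ''

# Check the boundary cases of each branch, plus an out-of-range index.
print([number_to_char(n) for n in (0, 9, 10, 35, 36, 61, 62)])
# ['0', '9', 'a', 'z', 'A', 'Z', '']
```
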

def display_predictions(model, data_loader, output_name, num_samples=6, font_path='./Daemon.otf'):
    """Displays sample images with predicted labels."""
    model.eval()

    # Load custom font for the plot titles
    prop = fm.FontProperties(fname=font_path)

    images, labels = next(iter(data_loader))
    with torch.no_grad():
        outputs = model(images)
        _, predictions = torch.max(outputs, 1)

    # Display num_samples samples on a 2x3 grid
    plt.figure(figsize=(12, 6))

    for i in range(num_samples):
        plt.subplot(2, 3, i + 1)
        plt.imshow(images[i].squeeze(), cmap='gray')

        # Convert predicted and actual class indices to characters
        predicted_char = number_to_char(predictions[i].item())
        actual_char = number_to_char(labels[i].item())

        # Title shows predicted vs. actual, rendered in the custom font
        if predicted_char == actual_char:
            plt.title(f'{predicted_char} = {actual_char}', fontsize=84, fontproperties=prop)
        else:
            plt.title(f'{predicted_char} != {actual_char}', fontsize=84, fontproperties=prop)

        plt.axis('off')

    plt.savefig(output_name)


if __name__ == "__main__":
    # Choose Dataset
    dataset_options = get_dataset_options()

    if dataset_options:
        # Dynamically display dataset options
        print("Available datasets:")
        for i, option in enumerate(dataset_options, 1):
            print(f"{i}. {option}")

        # User input to choose a dataset
        dataset_index = int(input(f"Enter the number corresponding to the dataset (1-{len(dataset_options)}): ")) - 1

        # Ensure valid selection
        if 0 <= dataset_index < len(dataset_options):
            dataset_name = dataset_options[dataset_index]
            print(f"You selected: {dataset_name}")
        else:
            print("❌ Invalid selection.")
            dataset_name = None
    else:
        print("❌ No datasets found in the data folder.")
        dataset_name = None

    # Bail out rather than passing None into choose_dataset
    if dataset_name is None:
        raise SystemExit(1)

    train_dataset = choose_dataset(dataset_name)
    train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

    # Model, Criterion, and Optimizer
    model = FinalCNN()
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=0.005)

    # Sample predictions before training (untrained weights)
    display_predictions(model, train_loader, output_name="before.svg")

    # Train the Model
    losses, accuracies = train_final_model(model, criterion, optimizer, train_loader, epochs=256)

    # Sample predictions after training
    display_predictions(model, train_loader, output_name="after.svg")

font.py
ADDED
|
@@ -0,0 +1,138 @@
import os
import struct
import numpy as np
import torch
import gzip
from PIL import Image, ImageFont, ImageDraw
import cv2
import random
import string

# 📝 Define the HandwrittenFontDataset class
class HandwrittenFontDataset(torch.utils.data.Dataset):
    def __init__(self, font_path, num_samples):
        self.font_path = font_path
        self.num_samples = num_samples
        self.font = ImageFont.truetype(self.font_path, 32)  # Font size
        self.characters = string.digits + string.ascii_uppercase + string.ascii_lowercase

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        # Randomly choose a character
        char = random.choice(self.characters)

        # Create a blank grayscale image and draw the character on it
        img = Image.new('L', (64, 64), color=255)
        draw = ImageDraw.Draw(img)
        draw.text((10, 10), char, font=self.font, fill=0)

        # Convert to a numpy array and resize to 28x28 (MNIST format)
        img = np.array(img)
        img = preprocess_for_mnist(img)

        # Convert the character to an integer label (its index in self.characters)
        label = self.characters.index(char)

        return torch.tensor(img, dtype=torch.uint8), label

# 📄 Resize and preprocess images for MNIST format
def preprocess_for_mnist(img):
    """Resize image to 28x28 and keep pixels in the 0-255 range."""
    img = cv2.resize(img, (28, 28), interpolation=cv2.INTER_AREA)
    img = img.astype(np.uint8)  # Convert to unsigned byte
    return img

# 📄 Write images to idx3-ubyte format
def write_idx3_ubyte(images, file_path):
    """Write images to idx3-ubyte format."""
    with open(file_path, 'wb') as f:
        # Header: magic number (2051 = 0x00000803 for image files), count, rows, cols
        f.write(struct.pack(">IIII", 2051, len(images), 28, 28))

        # Write image data as unsigned bytes (each pixel in range [0, 255])
        for image in images:
            f.write(image.tobytes())

# 📄 Write labels to idx1-ubyte format
def write_idx1_ubyte(labels, file_path):
    """Write labels to idx1-ubyte format."""
    with open(file_path, 'wb') as f:
        # Header: magic number (2049 = 0x00000801 for label files) and count
        f.write(struct.pack(">II", 2049, len(labels)))

        # Write each label as a byte
        for label in labels:
            f.write(struct.pack("B", label))

# 📄 Compress file to .gz format
def compress_file(input_path, output_path):
    """Compress the idx3 and idx1 files to .gz format."""
    with open(input_path, 'rb') as f_in:
        with gzip.open(output_path, 'wb') as f_out:
            f_out.writelines(f_in)

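The idx headers written above can be verified with a small stdlib-only round trip: write two fake 28x28 images with `struct`, read the 16-byte big-endian header back, and check the magic number and dimensions (the file name is illustrative):

```python
import struct
import tempfile
import os

# Two fake 28x28 images, each flattened to 784 raw bytes.
fake_images = [bytes(range(256)) * 3 + bytes(16) for _ in range(2)]
assert all(len(img) == 784 for img in fake_images)

path = os.path.join(tempfile.mkdtemp(), "train-images-idx3-ubyte")
with open(path, "wb") as f:
    # Same header layout as write_idx3_ubyte: magic, count, rows, cols.
    f.write(struct.pack(">IIII", 2051, len(fake_images), 28, 28))
    for img in fake_images:
        f.write(img)

# Read the header and pixel payload back.
with open(path, "rb") as f:
    magic, count, rows, cols = struct.unpack(">IIII", f.read(16))
    pixels = f.read()

print(magic, count, rows, cols)  # 2051 2 28 28
print(len(pixels))               # 1568 (= 2 * 28 * 28)
```

The label file works the same way with an 8-byte `">II"` header (magic 2049, count) followed by one byte per label.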
# 📊 Save dataset in MNIST format
def save_mnist_format(images, labels, output_dir):
    """Save the dataset in MNIST format to the raw/ directory."""
    raw_dir = os.path.join(output_dir, "raw")
    os.makedirs(raw_dir, exist_ok=True)

    # Prepare file paths
    train_images_path = os.path.join(raw_dir, "train-images-idx3-ubyte")
    train_labels_path = os.path.join(raw_dir, "train-labels-idx1-ubyte")

    # Write uncompressed idx3 and idx1 files
    write_idx3_ubyte(images, train_images_path)
    write_idx1_ubyte(labels, train_labels_path)

    # Compress idx3 and idx1 files into .gz format
    compress_file(train_images_path, f"{train_images_path}.gz")
    compress_file(train_labels_path, f"{train_labels_path}.gz")

    print(f"Dataset saved in MNIST format at {raw_dir}")

# ✅ Generate and save the dataset
def create_mnist_dataset(font_path, num_samples=4096):
    """Generate a dataset from one font and save it in MNIST format."""
    # Get font name without extension
    font_name = os.path.splitext(os.path.basename(font_path))[0]
    output_dir = os.path.join("./data", font_name)

    # Ensure the directory exists
    os.makedirs(output_dir, exist_ok=True)

    dataset = HandwrittenFontDataset(font_path, num_samples)

    images = []
    labels = []

    for i in range(num_samples):
        img, label = dataset[i]
        images.append(img.numpy())
        labels.append(label)

    # Save in MNIST format
    save_mnist_format(images, labels, output_dir)

# 🔥 Example usage
def choose_font_and_create_dataset():
    # List all TTF and OTF files in the root directory
    font_files = [f for f in os.listdir("./") if f.endswith(".ttf") or f.endswith(".otf")]

    if not font_files:
        print("❌ No .ttf or .otf font files found in the current directory.")
        return

    # Display available fonts for the user to choose from
    print("Available fonts:")
    for i, font_file in enumerate(font_files):
        print(f"{i+1}. {font_file}")

    # Get user's choice
    choice = int(input(f"Choose a font (1-{len(font_files)}): "))
    chosen_font = font_files[choice - 1]

    print(f"Creating dataset using font: {chosen_font}")
    create_mnist_dataset(chosen_font)

# Run the font selection and dataset creation
choose_font_and_create_dataset()
|
install.bat
ADDED
|
@@ -0,0 +1,7 @@
@echo off
REM Create a virtual environment
python -m venv .pytorchenv
REM Activate the virtual environment
call .pytorchenv\Scripts\activate
REM Install required Python packages
pip install -r requirements.txt
|
install.sh
ADDED
|
@@ -0,0 +1,7 @@
#!/bin/bash
# Create a virtual environment
python -m venv .pytorchenv
# Activate the virtual environment
source .pytorchenv/bin/activate
# Install required Python packages
pip install -r requirements.txt
|
logistic_regression.py
ADDED
|
@@ -0,0 +1,65 @@
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
from torch.utils.data import DataLoader, TensorDataset


# Logistic Regression Model
class LogisticRegressionModel(nn.Module):
    def __init__(self, input_size):
        super(LogisticRegressionModel, self).__init__()
        self.linear = nn.Linear(input_size, 1)

    def forward(self, x):
        return torch.sigmoid(self.linear(x))

# Manual binary cross-entropy loss (predictions are clamped away from
# 0 and 1, where log() would return -inf)
def cross_entropy_loss(y_pred, y_true, eps=1e-7):
    y_pred = y_pred.clamp(eps, 1 - eps)
    return -torch.mean(y_true * torch.log(y_pred) + (1 - y_true) * torch.log(1 - y_pred))

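Why clamping matters: the same binary cross-entropy in plain Python shows that an unclamped prediction of exactly 0 or 1 would feed `log(0)`. The values below are illustrative:

```python
import math

def bce(y_pred, y_true, eps=1e-7):
    # Clamp the prediction into (eps, 1 - eps) before taking logs.
    p = min(max(y_pred, eps), 1 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# A confident correct prediction gives a small loss...
print(round(bce(0.9, 1), 4))  # 0.1054
# ...a confident wrong one a large loss...
print(round(bce(0.1, 1), 4))  # 2.3026
# ...and even y_pred == 0.0 stays finite thanks to the clamp.
print(bce(0.0, 1) == -math.log(1e-7))  # True
```

Without the clamp, `bce(0.0, 1)` would raise a `ValueError` from `math.log(0)` (or produce `-inf`/`nan` in tensor form), which is exactly the failure mode the `clamp` call in `cross_entropy_loss` guards against.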
# Training Function
def train_model(model, criterion, optimizer, x_train, y_train, epochs=100):
    print("Initial model parameters:", list(model.parameters()))
    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        y_pred = model(x_train)
        loss = criterion(y_pred, y_train)
        loss.backward()
        optimizer.step()

        # Log progress every 10 epochs
        if (epoch + 1) % 10 == 0:
            print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}, Learning Rate: {optimizer.param_groups[0]["lr"]}')

# Example Usage
if __name__ == "__main__":
    # Sample Data: 100 random 2-feature points with random binary labels
    x_data = torch.tensor(np.random.rand(100, 2), dtype=torch.float32)
    y_data = torch.tensor(np.random.randint(0, 2, (100, 1)), dtype=torch.float32)

    # Create Dataset and DataLoader (note: train_model below trains
    # full-batch on x_data; the loader is kept for mini-batch experiments)
    dataset = TensorDataset(x_data, y_data)
    train_loader = DataLoader(dataset, batch_size=10, shuffle=True)

    model = LogisticRegressionModel(input_size=2)

    # Initialize criterion and optimizer
    criterion = nn.BCELoss()
    optimizer = optim.SGD(model.parameters(), lr=0.01)

    # Call the training function
    train_model(model, criterion, optimizer, x_data, y_data, epochs=100)

    with torch.no_grad():
        sample_predictions = model(x_data)
        print("Sample predictions after training:", sample_predictions)
|
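The sigmoid-plus-BCE model above has a conveniently simple gradient: the loss gradient with respect to the linear weights is `X.T @ (p - y) / n`. A NumPy sketch on a linearly separable toy set mirrors the PyTorch training loop and can be used to check the gradient math (all data and names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy separable data: label is 1 when x0 + x1 > 1.
X = rng.random((200, 2))
y = (X.sum(axis=1) > 1.0).astype(float).reshape(-1, 1)

w = np.zeros((2, 1))
b = 0.0
lr = 0.5

for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid(linear), like the forward pass
    grad_w = X.T @ (p - y) / len(X)         # dL/dw for BCE + sigmoid
    grad_b = float((p - y).mean())          # dL/db
    w -= lr * grad_w
    b -= lr * grad_b

# Threshold the probabilities at 0.5 to get hard predictions.
preds = (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(float)
accuracy = float((preds == y).mean())
print("train accuracy:", accuracy)  # should be well above chance on separable data
```

The `optimizer.step()` call in the PyTorch version performs the same `w -= lr * grad` update; autograd just computes `grad_w` and `grad_b` for you.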
metrics.png
ADDED
|