🎵 **Music Playing**
👋 **Welcome!** Today, we’re learning about **Convolution** in Neural Networks! 🧠🖼️
## 🤔 What is Convolution?
Convolution helps computers **understand pictures** by looking at **patterns** instead of exact positions! 🖼️🔍
Imagine you have **two images** that look almost the same, but one is a little **moved**.
A computer comparing them pixel by pixel might think they are totally **different**! 😲
**Convolution helps with this problem:** the same kernel finds the same pattern wherever it appears! ✅
---
## 🛠️ How Convolution Works
We use something called a **kernel** (a small filter 🔲) that slides over an image.
It **checks different parts** of the picture and creates a new image called an **activation map**!
1️⃣ The **image** is a grid of numbers 🖼️
2️⃣ The **kernel** is a small grid 🔳 that moves across the image
3️⃣ It **multiplies** numbers in the image with the numbers in the kernel ✖️
4️⃣ The results are **added together** ➕
5️⃣ We move to the next spot and **repeat!** 🔄
6️⃣ The final result is the **activation map** 🎯
---
## 📏 How Big is the Activation Map?
The size of the **activation map** depends on:
- **M (image size)** 📏
- **K (kernel size)** 🔳
- **Stride** (how far the kernel moves) 👣
Formula (rounding the division down):
```
New size = (Image size - Kernel size) / Stride + 1
```
Example (stride = 1):
- **4×4 image** 📷
- **2×2 kernel** 🔳
- Activation map = **3×3** ✅ (because 4 - 2 + 1 = 3)
---
## 👣 What is Stride?
Stride is **how far** the kernel moves each time!
- **Stride = 1** ➝ Moves **one step** at a time 🐢
- **Stride = 2** ➝ Moves **two steps** at a time 🚶‍♂️
- **Bigger stride** = **Smaller** activation map! 📏
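To see how stride changes the size, here is a small sketch of the size rule `(image size - kernel size) // stride + 1` (the division rounds down). The helper name `activation_map_size` is an assumption for this example, and it ignores padding.

```python
def activation_map_size(image_size, kernel_size, stride=1):
    """Size of the activation map along one side, with no padding."""
    return (image_size - kernel_size) // stride + 1


size_stride_1 = activation_map_size(4, 2, stride=1)
size_stride_2 = activation_map_size(4, 2, stride=2)

print(size_stride_1)  # 3
print(size_stride_2)  # 2
```

With the same 4×4 image and 2×2 kernel, moving two steps at a time gives a 2×2 map instead of 3×3.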
---
## 🛑 What is Zero Padding?
Sometimes, the kernel **doesn’t fit** perfectly in the image. 😕
So, we **add extra rows and columns of zeros** around the image! 0️⃣0️⃣0️⃣
This makes sure the **kernel covers everything**! ✅
Formula:
```
New Image Size = Old Size + 2 × Padding
```
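A quick sketch of what zero padding does to a tiny image. The helper names `zero_pad` and `padded_activation_map_size` are made up for this example; the second one simply plugs the padded size into the usual size rule.

```python
def zero_pad(image, padding):
    """Return a copy of `image` with `padding` rows/columns of zeros on every side."""
    width = len(image[0]) + 2 * padding
    padded = [[0] * width for _ in range(padding)]                 # zero rows on top
    for row in image:
        padded.append([0] * padding + list(row) + [0] * padding)   # zeros left and right
    padded += [[0] * width for _ in range(padding)]                # zero rows at the bottom
    return padded


def padded_activation_map_size(image_size, kernel_size, stride=1, padding=0):
    """Activation map size once the image has grown by 2 * padding."""
    return (image_size + 2 * padding - kernel_size) // stride + 1


padded = zero_pad([[1, 2], [3, 4]], padding=1)
for row in padded:
    print(row)
# [0, 0, 0, 0]
# [0, 1, 2, 0]
# [0, 3, 4, 0]
# [0, 0, 0, 0]

print(padded_activation_map_size(4, 3, padding=1))  # 4
```

The last line shows a common choice: a 3×3 kernel with padding 1 keeps a 4×4 image at 4×4.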
---
## 🎨 What About Color Images?
For **black & white** images, we use **Conv2d** with **one input channel** (grayscale). 🌑
For **color images**, we use **three input channels** (Red, Green, Blue - RGB)! 🎨🌈
---
## ๐Ÿ† Summary
โœ… Convolution helps computers **find patterns** in images!
โœ… We use a **kernel** to create an **activation map**!
โœ… **Stride & padding** change how the convolution works!
โœ… This is how computers **"see"** images! ๐Ÿ‘€๐Ÿค–
---
๐ŸŽ‰ **Great job!** Now, letโ€™s try convolution in the lab! ๐Ÿ—๏ธ๐Ÿค–โœจ
-----------------------------------------------------------------
🎵 **Music Playing**
👋 **Welcome!** Today, we’re learning about **Activation Functions** and **Max Pooling**! 🚀🔢
## 🤖 What is an Activation Function?
Activation functions help a neural network **decide** what’s important! 🧠
They change the values in the activation map to **help the model learn better**.
---
## 🔥 Example: ReLU Activation Function
1️⃣ We take an **input image** 🖼️
2️⃣ We apply **convolution** to create an **activation map** 📊
3️⃣ We apply **ReLU (Rectified Linear Unit)**:
- **If a value is negative** ➝ Change it to **0** ❌
- **If a value is positive** ➝ Keep it ✅
### 🛠 Example Calculation
| Before ReLU | After ReLU |
|-------------|------------|
| -4 | 0 |
| 0 | 0 |
| 4 | 4 |
All **negative numbers** become **zero**! ✨
In PyTorch, we apply the ReLU function **after convolution**:
```python
import torch.nn.functional as F
output = F.relu(conv_output)
```
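To check the table above without PyTorch, here is a tiny plain-Python sketch of the same rule. The function name `relu` is just chosen for this example.

```python
def relu(value):
    """Keep positive values, turn negative values into 0."""
    return max(0, value)


before = [-4, 0, 4]
after = [relu(value) for value in before]
print(after)  # [0, 0, 4]
```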
---
## 🌊 What is Max Pooling?
Max Pooling helps the network **focus on important details** while making images **smaller**! 📏🔍
### 🏗 How It Works
1️⃣ We **divide** the image into small regions (e.g., **2×2** squares)
2️⃣ We **keep only the largest value** in each region
3️⃣ We **move the window** and repeat until we’ve covered the whole image
### 📊 Example: 2×2 Max Pooling
| Before Pooling (4×4) | After Pooling (2×2) |
|--------------|--------------|
| 1, 6, 2, 3 | **8**, **7** |
| 5, **8**, **7**, 4 | **9**, **7** |
| **9**, 2, 3, **7** | |
| 4, 1, 0, 5 | |
**Only the biggest number** in each 2×2 section is kept! ✅
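The pooling steps can also be sketched in plain Python. This is a minimal illustration with a made-up 4×4 grid and a hypothetical helper `max_pool2d`; it assumes the window moves by its own size (stride 2 for a 2×2 window).

```python
def max_pool2d(grid, size=2, stride=2):
    """Keep the largest value in each `size` x `size` window of `grid`."""
    pooled = []
    for top in range(0, len(grid) - size + 1, stride):
        pooled_row = []
        for left in range(0, len(grid[0]) - size + 1, stride):
            window = [grid[top + i][left + j] for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


grid = [
    [1, 6, 2, 3],
    [5, 8, 7, 4],
    [9, 2, 3, 7],
    [4, 1, 0, 5],
]
pooled = max_pool2d(grid)
print(pooled)  # [[8, 7], [9, 7]]
```

The 4×4 grid shrinks to 2×2, and each output value is the biggest number from one 2×2 corner of the input.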
---
## ๐Ÿ† Why Use Max Pooling?
โœ… **Reduces image size** โž Makes training faster! ๐Ÿš€
โœ… **Ignores small changes** in images โž More stable results! ๐Ÿ”„
โœ… **Helps find important features** in the picture! ๐Ÿ–ผ๏ธ
In PyTorch, we apply **Max Pooling** like this:
```python
import torch.nn.functional as F
output = F.max_pool2d(activation_map, kernel_size=2, stride=2)
```
---
🎉 **Great job!** Now, let’s try using activation functions and max pooling in our own models! 🏗️🤖✨
------------------------------------------------------------------------------------------------------
🎵 **Music Playing**
👋 **Welcome!** Today, we’re learning about **Convolution with Multiple Channels**! 🖼️🤖
## 🤔 What’s a Channel?
A **channel** is like a layer of an image! 🌈
- **Black & White Images** ➝ **1 channel** (grayscale) 🏳️
- **Color Images** ➝ **3 channels** (Red, Green, Blue - RGB) 🎨
Neural networks **see** images by looking at each of these channels! 👀
---
## 🎯 1. Multiple Output Channels
Usually, we use **one kernel** to create **one activation map** 📊
But what if we want to detect **different things** in an image? 🤔
- **Solution:** Use **multiple kernels**! Each kernel **finds different features**! 🔍
### 🔥 Example: Detecting Lines
1️⃣ A **vertical line kernel** finds **vertical edges** 📏
2️⃣ A **horizontal line kernel** finds **horizontal edges** 📏
**More kernels = More ways to see the image!** 👀✅
---
## 🎨 2. Multiple Input Channels
Color images have **3 channels** (Red, Green, Blue).
To process them, we use **a separate kernel for each channel**! 🎨
1️⃣ Apply a **Red kernel** to the Red part of the image 🔴
2️⃣ Apply a **Green kernel** to the Green part of the image 🟢
3️⃣ Apply a **Blue kernel** to the Blue part of the image 🔵
4️⃣ **Add the results together** to get one activation map!
This helps the neural network understand **colors and patterns**! 🌈
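Here is a rough plain-Python sketch of those four steps: one kernel per color channel, then the three results are added position by position. The channel values, the kernels, and the helper names are all invented for this example, and it assumes stride 1 with no padding.

```python
def convolve2d(channel, kernel):
    """Single-channel convolution (stride 1, no padding)."""
    k_h, k_w = len(kernel), len(kernel[0])
    return [
        [
            sum(channel[r + i][c + j] * kernel[i][j] for i in range(k_h) for j in range(k_w))
            for c in range(len(channel[0]) - k_w + 1)
        ]
        for r in range(len(channel) - k_h + 1)
    ]


def convolve_multi_channel(channels, kernels):
    """Convolve each channel with its own kernel, then add the maps together."""
    maps = [convolve2d(channel, kernel) for channel, kernel in zip(channels, kernels)]
    height, width = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) for c in range(width)] for r in range(height)]


red = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
green = [[2, 2, 2], [2, 2, 2], [2, 2, 2]]
blue = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

red_kernel = [[1, 0], [0, 1]]
green_kernel = [[1, 1], [1, 1]]
blue_kernel = [[0, 0], [0, 1]]

activation_map = convolve_multi_channel([red, green, blue], [red_kernel, green_kernel, blue_kernel])
print(activation_map)  # [[15, 16], [18, 19]]
```

Three input channels go in, but only **one** activation map comes out, because the per-channel results are summed.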
---
## 🔄 3. Multiple Input & Output Channels
Now, let’s **combine everything**! 🚀
- **Multiple input channels** (like RGB images)
- **Multiple output channels** (different filters detecting different things)
Each output channel gets its own **set of kernels**, one for each input channel.
We **apply the kernels, add the results**, and get multiple **activation maps**! 🎯
---
## ๐Ÿ— Example in PyTorch
```python
import torch.nn as nn
conv = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3)
```
This means:
✅ **3 input channels** (Red, Green, Blue)
✅ **5 output channels** (5 different filters detecting different things)
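One way to see the "set of kernels" idea is to look at the shapes PyTorch creates. This is a small sketch, assuming PyTorch is installed; the random 8×8 batch is just a stand-in for a real image.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3)

# 5 output channels, each with 3 kernels (one per input channel) of size 3x3
weight_shape = tuple(conv.weight.shape)
print(weight_shape)  # (5, 3, 3, 3)

# A made-up batch: 1 image, 3 channels, 8x8 pixels
images = torch.randn(1, 3, 8, 8)
activation_maps = conv(images)

output_shape = tuple(activation_maps.shape)
print(output_shape)  # (1, 5, 6, 6)
```

Each of the 5 activation maps is 6×6 because 8 - 3 + 1 = 6 (stride 1, no padding).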
---
## ๐Ÿ† Why is This Important?
โœ… Helps the neural network find **different patterns** ๐ŸŽจ
โœ… Works for **color images** and **complex features** ๐Ÿค–
โœ… Makes the network **more powerful**! ๐Ÿ’ช
---
๐ŸŽ‰ **Great job!** Now, letโ€™s try convolution with multiple channels in our own models! ๐Ÿ—๏ธ๐Ÿค–โœจ
-----------------------------------------------------------------------------------------------
🎵 **Music Playing**
👋 **Welcome!** Today, we’re building a **CNN for MNIST**! 🏗️🔢
MNIST is a dataset of **handwritten numbers (0-9)**. ✍️🖼️
---
## 🏗 CNN Structure
📏 **Image Size:** 16×16 (resized from the original 28×28 to make training faster)
🔄 **Layers:**
- **First Convolution Layer** ➝ 16 output channels
- **Second Convolution Layer** ➝ 32 output channels
- **Final Layer** ➝ 10 output neurons (one for each digit)
---
## 🛠 Building the CNN in PyTorch
### 📌 Step 1: Define the CNN
```python
import torch
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        # padding=2 with a 5x5 kernel keeps the height and width unchanged
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(kernel_size=2)  # Halves the height and width
        self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, padding=2)
        self.fc = nn.Linear(32 * 4 * 4, 10)  # Fully connected layer (512 inputs, 10 outputs)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # First layer: Conv + ReLU + Pool
        x = self.pool(torch.relu(self.conv2(x)))  # Second layer: Conv + ReLU + Pool
        x = x.view(x.size(0), -1)  # Flatten the 32x4x4 output to 1D (512 elements)
        x = self.fc(x)  # Fully connected layer for classification
        return x
```
---
## ๐Ÿ” Understanding the Output Shape
After **Max Pooling**, the image shrinks to **4ร—4 pixels**.
Since we have **32 channels**, the total output is:
```
4 ร— 4 ร— 32 = 512 elements
```
Each neuron in the final layer gets **512 inputs**, and since we have **10 digits (0-9)**, we use **10 neurons**.
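The 512 can be double-checked with a few lines of arithmetic. This is only a sketch: the helper names are made up, and the pooling helper assumes the pooling stride equals the kernel size (the default for `nn.MaxPool2d`).

```python
def conv_output_size(size, kernel_size, padding=0, stride=1):
    """Output size of a convolution along one side."""
    return (size + 2 * padding - kernel_size) // stride + 1


def pool_output_size(size, kernel_size):
    """Output size of max pooling when the stride equals the kernel size."""
    return size // kernel_size


size = 16                                      # 16x16 input image
size = conv_output_size(size, 5, padding=2)    # conv1 keeps it at 16
size = pool_output_size(size, 2)               # pool halves it to 8
size = conv_output_size(size, 5, padding=2)    # conv2 keeps it at 8
size = pool_output_size(size, 2)               # pool halves it to 4

flattened = 32 * size * size                   # 32 channels of 4x4
print(size, flattened)  # 4 512
```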
---
## 🔄 Forward Step
1️⃣ **Apply First Convolution Layer** ➝ Activation ➝ Max Pooling
2️⃣ **Apply Second Convolution Layer** ➝ Activation ➝ Max Pooling
3️⃣ **Flatten the Output (4×4×32 → 512)**
4️⃣ **Apply the Final Output Layer (10 Neurons for 10 Digits)**
---
## 🏋️‍♂️ Training the Model
Check the **lab** to see how we train the CNN using:
✅ **Backpropagation**
✅ **Stochastic Gradient Descent (SGD)**
✅ **Loss Function & Accuracy Check**
---
🎉 **Great job!** Now, let’s train our CNN to recognize handwritten digits! 🏗️🔢🤖
------------------------------------------------------------------------------------
🎵 **Music Playing**
👋 **Welcome!** Today, we’re learning about **Convolutional Neural Networks (CNNs)!** 🤖🖼️
## 🤔 What is a CNN?
A **Convolutional Neural Network (CNN)** is a special type of neural network that **understands images!** 🎨
It learns to find patterns, like:
✅ **Edges** (lines & shapes)
✅ **Textures** (smooth or rough areas)
✅ **Objects** (faces, animals, letters)
---
## 🏗 How Does a CNN Work?
A CNN is made of **three main steps**:
1️⃣ **Convolution Layer** 🖼️➝🔍
- Uses **kernels** (small filters) to **detect patterns** in an image
- Creates an **activation map** that highlights important features
2️⃣ **Pooling Layer** 🔄➝📏
- **Shrinks** the activation map to keep only the most important parts
- **Max Pooling** picks the **biggest** values in each small region
3️⃣ **Fully Connected Layer** 🏗️➝🎯
- The final layer makes a **decision** (like cat 🐱 or dog 🐶)
---
## 🎨 Example: Detecting Lines
We train a CNN to recognize **horizontal** and **vertical** lines:
1️⃣ **Input Image (X)**
2️⃣ **First Convolution Layer**
- Uses **two kernels** to create two **activation maps**
- Applies **ReLU** (activation function) to remove negative values
- Uses **Max Pooling** to make learning easier
3️⃣ **Second Convolution Layer**
- Takes **two input channels** from the first layer
- Uses **two new kernels** to create **one activation map**
- Again, applies **ReLU + Max Pooling**
4️⃣ **Flattening** ➝ Turns the 2D activation map into **1D data**
5️⃣ **Final Prediction** ➝ Uses a **fully connected layer** to decide:
- `0` = **Vertical Line**
- `1` = **Horizontal Line**
---
## 🔄 How to Build a CNN in PyTorch
### 🏗 CNN Constructor
```python
import torch
import torch.nn as nn

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=2, out_channels=1, kernel_size=3, padding=1)
        # 49 = 1 channel x 7 x 7, e.g. a 28x28 input halved twice by pooling (input size assumed)
        self.fc = nn.Linear(49, 2)  # Fully connected layer (49 inputs, 2 outputs)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # First layer: Conv + ReLU + Pool
        x = self.pool(torch.relu(self.conv2(x)))  # Second layer: Conv + ReLU + Pool
        x = x.view(x.size(0), -1)  # Flatten to 1D (49 elements per image)
        x = self.fc(x)  # Fully connected layer
        return x
```
---
## ๐Ÿ‹๏ธโ€โ™‚๏ธ Training the CNN
We train the CNN using **backpropagation** and **gradient descent**:
1๏ธโƒฃ **Load the dataset** (images of lines) ๐Ÿ“Š
2๏ธโƒฃ **Create a CNN model** ๐Ÿ—๏ธ
3๏ธโƒฃ **Define a loss function** (to measure mistakes) โŒ
4๏ธโƒฃ **Choose an optimizer** (to improve learning) ๐Ÿ”„
5๏ธโƒฃ **Train the model** until it **gets better**! ๐Ÿš€
As training progresses:
๐Ÿ“‰ **Loss goes down** โž Model makes fewer mistakes!
๐Ÿ“ˆ **Accuracy goes up** โž Model gets better at predictions!
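The five training steps might look roughly like this in PyTorch. This is a sketch under several assumptions, not the lab's actual code: the dataset is a made-up generator of 28×28 line images (`make_line_images`), the model uses the same layer sizes as the constructor in this lesson, and the learning rate and number of epochs are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


def make_line_images(count, size=28):
    """Hypothetical stand-in dataset: label 0 = vertical line, 1 = horizontal line."""
    images = torch.zeros(count, 1, size, size)
    labels = torch.randint(0, 2, (count,))
    positions = torch.randint(0, size, (count,)).tolist()
    for i, (label, position) in enumerate(zip(labels.tolist(), positions)):
        if label == 0:
            images[i, 0, :, position] = 1.0   # one bright column
        else:
            images[i, 0, position, :] = 1.0   # one bright row
    return images, labels


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 2, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(2, 1, kernel_size=3, padding=1)
        self.fc = nn.Linear(49, 2)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        return self.fc(x.flatten(start_dim=1))


images, labels = make_line_images(64)                      # 1. Load the dataset
model = CNN()                                              # 2. Create a CNN model
criterion = nn.CrossEntropyLoss()                          # 3. Define a loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)    # 4. Choose an optimizer

losses = []
for epoch in range(20):                                    # 5. Train the model
    optimizer.zero_grad()               # Clear old gradients
    outputs = model(images)             # Forward pass
    loss = criterion(outputs, labels)   # Measure mistakes
    loss.backward()                     # Backpropagation
    optimizer.step()                    # Gradient descent update
    losses.append(loss.item())

with torch.no_grad():
    accuracy = (model(images).argmax(dim=1) == labels).float().mean().item()
print(f"first loss: {losses[0]:.3f}, last loss: {losses[-1]:.3f}, accuracy: {accuracy:.2f}")
```

Printing the first and last loss is a simple way to watch the "loss goes down" behavior; how far it drops depends on the data, the learning rate, and the random starting weights.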
---
## ๐Ÿ† Why Use CNNs?
โœ… **Finds patterns** in images ๐Ÿ”
โœ… **Works with real-world data** (faces, animals, objects) ๐Ÿ–ผ๏ธ
โœ… **More efficient** than regular neural networks ๐Ÿ’ก
---
๐ŸŽ‰ **Great job!** Now, letโ€™s build and train our own CNN! ๐Ÿ—๏ธ๐Ÿค–โœจ
----------------------------------------------------------------------
🎵 **Music Playing**
👋 **Welcome!** Today, we’re learning how to use **Pretrained TorchVision Models**! 🤖🖼️
## 🤔 What is a Pretrained Model?
A **pretrained model** is a neural network that has already been **trained by experts** on a large dataset.
✅ **Saves time** (no need to train from scratch) ⏳
✅ **Often works better** (it starts from features learned on a large dataset) 🎯
✅ **We only train the final layer** for our own images! 🔄
---
## 🔄 Using ResNet18 (A Pretrained Model)
We will use **ResNet18**, a powerful model pretrained on **color images** from the ImageNet dataset. 🎨
It has **skip connections** (we won’t go into details, but they help learning).
We only **replace the last layer** to match our dataset! 🔁
---
## 🛠 Steps to Use a Pretrained Model
### 📌 Step 1: Load the Pretrained Model
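```python
import torchvision.models as models
# `weights=` replaces the deprecated `pretrained=True` (torchvision 0.13 and later)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # Load pretrained ResNet18
```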
### 📌 Step 2: Normalize Images (Required for ResNet18)
```python
import torchvision.transforms as transforms
transform = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize image
    transforms.ToTensor(),  # Convert to tensor
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])  # Normalize
])
```
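To get a feel for what `Normalize` does, here is a plain-Python sketch of the same arithmetic on a single pixel: subtract the channel mean, then divide by the channel standard deviation. The helper `normalize_pixel` and the sample pixels are made up for this example.

```python
MEAN = [0.485, 0.456, 0.406]   # per-channel means used above (R, G, B)
STD = [0.229, 0.224, 0.225]    # per-channel standard deviations used above


def normalize_pixel(rgb):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return [(value - mean) / std for value, mean, std in zip(rgb, MEAN, STD)]


average_pixel = normalize_pixel(MEAN)
print(average_pixel)  # [0.0, 0.0, 0.0]

mid_gray = normalize_pixel([0.5, 0.5, 0.5])
print([round(value, 3) for value in mid_gray])  # [0.066, 0.196, 0.418]
```

A pixel that matches the mean lands exactly on 0, so after normalizing, the values are centered around zero instead of sitting between 0 and 1.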
### 📌 Step 3: Prepare the Dataset
Create a **dataset object** for your own images with **training and testing data**. 📊
### 📌 Step 4: Replace the Output Layer
- The **last hidden layer** has **512 neurons**
- We create a **new output layer** for **our dataset**
Example: **If we have 7 classes**, we create a layer with **7 outputs**:
```python
import torch.nn as nn
for param in model.parameters():
    param.requires_grad = False  # Freeze pretrained layers
model.fc = nn.Linear(512, 7)  # Replace output layer (512 inputs → 7 outputs)
```
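To check that only the new layer will be trained, we can list the parameters that still need gradients. This sketch uses a tiny made-up model (`TinyModel`) as a stand-in for ResNet18 so it runs without downloading anything; only the `fc` name and the 512 ➝ 7 sizes mirror the real example.

```python
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for a pretrained network with an `fc` output layer."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(20, 512), nn.ReLU())
        self.fc = nn.Linear(512, 1000)

    def forward(self, x):
        return self.fc(self.backbone(x))


model = TinyModel()
for param in model.parameters():
    param.requires_grad = False      # Freeze every existing layer
model.fc = nn.Linear(512, 7)         # New layers are trainable by default

trainable_names = [name for name, param in model.named_parameters() if param.requires_grad]
trainable_count = sum(param.numel() for param in model.parameters() if param.requires_grad)

print(trainable_names)  # ['fc.weight', 'fc.bias']
print(trainable_count)  # 3591
```

The count is 512 × 7 weights plus 7 biases. The order matters: freezing first and replacing `fc` afterwards is what leaves the new layer trainable.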
---
## ๐Ÿ‹๏ธโ€โ™‚๏ธ Training the Model
### ๐Ÿ“Œ Step 5: Create Data Loaders
```python
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=15, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=10, shuffle=False)
```
### 📌 Step 6: Set Up Training
```python
import torch.nn as nn
import torch.optim as optim
criterion = nn.CrossEntropyLoss()  # Loss function
optimizer = optim.Adam(model.fc.parameters(), lr=0.001)  # Optimizer (only for last layer)
```
### 📌 Step 7: Train the Model
1️⃣ **Set model to training mode** 🏋️
```python
model.train()
```
2️⃣ Train for **20 epochs**
3️⃣ **Set model to evaluation mode** when predicting 📊
```python
model.eval()
```
---
## ๐Ÿ† Why Use Pretrained Models?
โœ… **Saves time** (no need to train from scratch)
โœ… **Works better** (pretrained on millions of images)
โœ… **We only change one layer** for our dataset!
---
๐ŸŽ‰ **Great job!** Now, try using a pretrained model for your own images! ๐Ÿ—๏ธ๐Ÿค–โœจ
---------------------------------------------------------------------------------
🎵 **Music Playing**
👋 **Welcome!** Today, we’re learning how to use **GPUs in PyTorch**! 🚀💻
## 🤔 Why Use a GPU?
A **Graphics Processing Unit (GPU)** can **train models MUCH faster** than a CPU, because it runs many calculations in parallel!
✅ Faster computation ⏩
✅ Better for large datasets 📊
✅ Helps train deep learning models efficiently 🤖
---
## 🔥 What is CUDA?
CUDA is a **parallel computing platform** made by **NVIDIA** that allows us to use **GPUs for AI tasks**. 🎮🚀
In **PyTorch**, we use **torch.cuda** to work with GPUs.
---
## 🛠 Step 1: Check if a GPU is Available
```python
import torch
# Check if a GPU is available
torch.cuda.is_available() # Returns True if a GPU is detected
```
---
## 🎯 Step 2: Set Up the GPU
```python
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```
- `"cuda:0"` = First available GPU 🎮
- `"cpu"` = Use the CPU if no GPU is found
---
## 🏗 Step 3: Sending Tensors to the GPU
In PyTorch, **data is stored in Tensors**.
To move data to the GPU, use `.to(device)`.
```python
tensor = torch.randn(3, 3) # Create a random tensor
tensor = tensor.to(device) # Move it to the GPU
```
✅ **Faster processing on the GPU!** ⚡
---
## 🔄 Step 4: Using a GPU with a CNN
You **don’t need to change** your CNN code! Just **move the model to the GPU** after creating it:
```python
model = CNN()  # Create CNN model
model.to(device)  # Move the model to the GPU
```
This **moves** all of the model’s parameters to the **GPU**, so every layer computes there! 🎮
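A quick way to confirm that the model and the data ended up in the same place is to print their devices. This is a small sketch that also runs on a machine without a GPU (it falls back to the CPU); the `nn.Linear` layer is only a stand-in for the CNN.

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2)                 # Hypothetical stand-in for the CNN
model.to(device)                        # Move the model's parameters
batch = torch.randn(3, 4).to(device)    # Move the data too

model_device = next(model.parameters()).device
print(model_device, batch.device)       # Both should name the same device

outputs = model(batch)                  # Works because model and data share a device
print(tuple(outputs.shape))  # (3, 2)
```

If the model is on the GPU but the data is still on the CPU (or the other way around), PyTorch raises an error instead of computing the output.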
---
## ๐Ÿ‹๏ธโ€โ™‚๏ธ Step 5: Training a Model on a GPU
Training is the same, but **you must send your data to the GPU**!
```python
for images, labels in train_loader:
images, labels = images.to(device), labels.to(device) # Move data to GPU
optimizer.zero_grad() # Clear gradients
outputs = model(images) # Forward pass (on GPU)
loss = criterion(outputs, labels) # Compute loss
loss.backward() # Backpropagation
optimizer.step() # Update weights
```
โœ… **The model trains much faster!** ๐Ÿš€
---
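## 🎯 Step 6: Testing the Model
For testing, the **images must be on the GPU**. The labels can stay on the CPU if we bring the predictions back to the CPU before comparing:
```python
model.eval()  # Evaluation mode
with torch.no_grad():  # No gradients are needed for testing
    for images, labels in test_loader:
        images = images.to(device)  # Move images to GPU
        outputs = model(images)  # Get predictions (on the GPU)
        predicted = outputs.argmax(dim=1).cpu()  # Back to the CPU to compare with labels
```
✅ **Skipping gradients with `torch.no_grad()` saves memory and speeds up testing!** ⚡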
---
## ๐Ÿ† Summary
โœ… **GPUs make training faster** ๐ŸŽฎ
โœ… Use **torch.cuda** to work with GPUs
โœ… Move **data & models** to the GPU with `.to(device)`
โœ… Training & testing are the same, but data **must be on the GPU**
---
๐ŸŽ‰ **Great job!** Now, try training a model using a GPU in PyTorch! ๐Ÿ—๏ธ๐Ÿš€