🎵 **Music Playing**  

👋 **Welcome!** Today, we’re learning about **Convolution** in Neural Networks! 🧠🖼️  

## 🤔 What is Convolution?  
Convolution helps computers **understand pictures** by looking at **patterns** instead of exact positions! 🖼️🔍  

Imagine you have **two images** that look almost the same, but one is a little **moved**.  
A computer might think they are totally **different**! 😲  
**Convolution fixes this problem!** ✅  

---

## 🛠️ How Convolution Works  

We use something called a **kernel** (a small filter 🔲) that slides over an image.  
It **checks different parts** of the picture and creates a new image called an **activation map**!  

1️⃣ The **image** is a grid of numbers 🖼️  
2️⃣ The **kernel** is a small grid 🔳 that moves across the image  
3️⃣ It **multiplies** numbers in the image with the numbers in the kernel ✖️  
4️⃣ The results are **added together** ➕  
5️⃣ We move to the next spot and **repeat!** 🔄  
6️⃣ The final result is the **activation map** 🎯  
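
The six steps can be sketched in plain Python. This is only an illustration, assuming a square image, a square kernel and no padding; the function name `convolve2d` and the example numbers are made up for this sketch.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` and return the activation map (no padding)."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    activation_map = []
    for row in range(0, out_size * stride, stride):
        out_row = []
        for col in range(0, out_size * stride, stride):
            # Multiply each image value by the matching kernel value, then add them up
            total = sum(
                image[row + i][col + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            out_row.append(total)
        activation_map.append(out_row)
    return activation_map


# Example values chosen only for illustration
image = [
    [1, 2, 0, 1],
    [0, 1, 3, 1],
    [2, 1, 0, 0],
    [1, 0, 1, 2],
]
kernel = [
    [1, 0],
    [0, 1],
]
activation_map = convolve2d(image, kernel)
print(activation_map)  # [[2, 5, 1], [1, 1, 3], [2, 2, 2]]
```

A 4×4 image and a 2×2 kernel give a 3×3 activation map here, because the kernel fits in three positions along each side.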

---

## 📏 How Big is the Activation Map?  

The size of the **activation map** depends on:  
- **M (image size)** 📏  
- **K (kernel size)** 🔳  
- **Stride** (how far the kernel moves) 👣  

Formula (round down if the division is not exact):  
```
New size = (Image size - Kernel size) / Stride + 1
```

Example (stride = 1):  
- **4×4 image** 📷  
- **2×2 kernel** 🔳  
- Activation map = **3×3** ✅  
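
Here is a small sketch of the size calculation, assuming a square image and kernel. It divides by the stride (rounding down) and also accepts a padding amount, which is the usual convention in libraries such as PyTorch; the function name `activation_map_size` is made up for this example.

```python
def activation_map_size(image_size, kernel_size, stride=1, padding=0):
    """Width/height of the activation map for a square image and kernel."""
    padded = image_size + 2 * padding          # zeros added on both sides
    return (padded - kernel_size) // stride + 1  # // rounds down


print(activation_map_size(4, 2))                       # 3
print(activation_map_size(4, 2, stride=2))             # 2
print(activation_map_size(5, 3, stride=2, padding=1))  # 3
```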

---

## 👣 What is Stride?  

Stride is **how far** the kernel moves each time!  
- **Stride = 1** ➝ Moves **one step** at a time 🐢  
- **Stride = 2** ➝ Moves **two steps** at a time 🚶‍♂️  
- **Bigger stride** = **Smaller** activation map! 📏  

---

## 🛑 What is Zero Padding?  

Sometimes, the kernel **doesn’t fit** perfectly in the image. 😕  
So, we **add extra rows and columns of zeros** around the image! 0️⃣0️⃣0️⃣  

This makes sure the **kernel covers everything**! ✅  

Formula:  
```
New Image Size = Old Size + 2 × Padding
```
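
To see what zero padding does to the grid of numbers, here is a minimal plain-Python sketch. The helper name `zero_pad` and the 2×2 example image are assumptions made for illustration.

```python
def zero_pad(image, padding):
    """Return a copy of `image` with `padding` rows/columns of zeros on every side."""
    width = len(image[0]) + 2 * padding
    padded = [[0] * width for _ in range(padding)]  # rows of zeros on top
    for row in image:
        # zeros on the left and right of each original row
        padded.append([0] * padding + list(row) + [0] * padding)
    padded += [[0] * width for _ in range(padding)]  # rows of zeros at the bottom
    return padded


image = [
    [1, 2],
    [3, 4],
]
padded = zero_pad(image, padding=1)
for row in padded:
    print(row)
# [0, 0, 0, 0]
# [0, 1, 2, 0]
# [0, 3, 4, 0]
# [0, 0, 0, 0]
```

The 2×2 image becomes 4×4, which matches `Old Size + 2 × Padding = 2 + 2 × 1`.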

---

## 🎨 What About Color Images?  

For **black & white** images, we use **Conv2D** with **one channel** (grayscale). 🌑  
For **color images**, we use **three channels** (Red, Green, Blue - RGB)! 🎨🌈  

---

## 🏆 Summary  

✅ Convolution helps computers **find patterns** in images!  
✅ We use a **kernel** to create an **activation map**!  
✅ **Stride & padding** change how the convolution works!  
✅ This is how computers **"see"** images! 👀🤖  

---

🎉 **Great job!** Now, let’s try convolution in the lab! 🏗️🤖✨  

-----------------------------------------------------------------

🎵 **Music Playing**  

👋 **Welcome!** Today, we’re learning about **Activation Functions** and **Max Pooling**! 🚀🔢  

## 🤖 What is an Activation Function?  

Activation functions help a neural network **decide** what’s important! 🧠  
They change the values in the activation map to **help the model learn better**.  

---

## 🔥 Example: ReLU Activation Function  

1️⃣ We take an **input image** 🖼️  
2️⃣ We apply **convolution** to create an **activation map** 📊  
3️⃣ We apply **ReLU (Rectified Linear Unit)**:  
   - **If a value is negative** ➝ Change it to **0** ❌  
   - **If a value is positive** ➝ Keep it ✅  

### 🛠 Example Calculation  

| Before ReLU  | After ReLU  |
|-------------|------------|
| -4  | 0  |
|  0  | 0  |
|  4  | 4  |

All **negative numbers** become **zero**! ✨  

In PyTorch, we apply the ReLU function **after convolution**:  

```python
import torch
import torch.nn.functional as F

# Example values standing in for the output of a convolution layer
conv_output = torch.tensor([[-4.0, 0.0, 4.0]])

# ReLU replaces negative values with 0 and keeps the rest
output = F.relu(conv_output)  # tensor([[0., 0., 4.]])
```

---

## 🌊 What is Max Pooling?  

Max Pooling helps the network **focus on important details** while making images **smaller**! 📏🔍  

### 🏗 How It Works  

1️⃣ We **divide** the image into small regions (e.g., **2×2** squares)  
2️⃣ We **keep only the largest value** in each region  
3️⃣ We **move the window** and repeat until we’ve covered the whole image  

### 📊 Example: 2×2 Max Pooling (stride 2)  

| Before Pooling | After Pooling |
|--------------|--------------|
| 1, 6, 2, 3 | **8**, **7** |
| 5, **8**, **7**, 4 | |
| 9, 2, 3, 7 | |

The top-left region (1, 6, 5, 8) gives **8**, and the top-right region (2, 3, 7, 4) gives **7**. The third row has no complete 2×2 region at stride 2, so it is skipped.  
**Only the biggest number** in each section is kept! ✅  
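
The pooling steps can be checked with a short plain-Python sketch. It uses the three input rows from the example and assumes that windows which do not fit completely are skipped (PyTorch's default behavior); the function name `max_pool2d` is reused here only for readability.

```python
def max_pool2d(activation_map, size=2, stride=2):
    """Keep the largest value in each size-by-size window; incomplete windows are skipped."""
    rows = (len(activation_map) - size) // stride + 1
    cols = (len(activation_map[0]) - size) // stride + 1
    pooled = []
    for r in range(rows):
        pooled_row = []
        for c in range(cols):
            # Collect the values inside the current window
            window = [
                activation_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


activation_map = [
    [1, 6, 2, 3],
    [5, 8, 7, 4],
    [9, 2, 3, 7],
]
print(max_pool2d(activation_map))            # [[8, 7]]
print(max_pool2d(activation_map, stride=1))  # [[8, 8, 7], [9, 8, 7]]
```

With stride 2 the windows do not overlap, so only two complete regions exist. With stride 1 the window moves one step at a time and the third row is used as well.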

---

## 🏆 Why Use Max Pooling?  

✅ **Reduces image size** ➝ Makes training faster! 🚀  
✅ **Ignores small changes** in images ➝ More stable results! 🔄  
✅ **Helps find important features** in the picture! 🖼️  

In PyTorch, we apply **Max Pooling** like this:  

```python
import torch
import torch.nn.functional as F

# Example activation map with shape (batch, channels, height, width)
activation_map = torch.rand(1, 1, 4, 4)

output = F.max_pool2d(activation_map, kernel_size=2, stride=2)  # shape (1, 1, 2, 2)
```

---

🎉 **Great job!** Now, let’s try using activation functions and max pooling in our own models! 🏗️🤖✨  

------------------------------------------------------------------------------------------------------
🎵 **Music Playing**  

👋 **Welcome!** Today, we’re learning about **Convolution with Multiple Channels**! 🖼️🤖  

## 🤔 What’s a Channel?  
A **channel** is like a layer of an image! 🌈  
- **Black & White Images** ➝ **1 channel** (grayscale) 🏳️  
- **Color Images** ➝ **3 channels** (Red, Green, Blue - RGB) 🎨  

Neural networks **see** images by looking at these channels separately! 👀  

---

## 🎯 1. Multiple Output Channels  

Usually, we use **one kernel** to create **one activation map** 📊  
But what if we want to detect **different things** in an image? 🤔  
- **Solution:** Use **multiple kernels**! Each kernel **finds different features**! 🔍  

### 🔥 Example: Detecting Lines  
1️⃣ A **vertical line kernel** finds **vertical edges** 📏  
2️⃣ A **horizontal line kernel** finds **horizontal edges** 📏  

**More kernels = More ways to see the image!** 👀✅  

---

## 🎨 2. Multiple Input Channels  

Color images have **3 channels** (Red, Green, Blue).  
To process them, we use **a separate kernel for each channel**! 🎨  

1️⃣ Apply a **Red kernel** to the Red part of the image 🔴  
2️⃣ Apply a **Green kernel** to the Green part of the image 🟢  
3️⃣ Apply a **Blue kernel** to the Blue part of the image 🔵  
4️⃣ **Add the results together** to get one activation map!  

This helps the neural network understand **colors and patterns**! 🌈  
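
The four steps can be sketched in plain Python. The tiny 3×3 "image", the kernel values and the helper names `convolve2d` and `convolve_rgb` are all made up for illustration; the sketch assumes stride 1 and no padding.

```python
def convolve2d(channel, kernel):
    """Stride-1 convolution of one square channel with one square kernel."""
    k = len(kernel)
    size = len(channel) - k + 1
    return [
        [
            sum(channel[r + i][c + j] * kernel[i][j] for i in range(k) for j in range(k))
            for c in range(size)
        ]
        for r in range(size)
    ]


def convolve_rgb(channels, kernels):
    """Convolve each channel with its own kernel, then add the maps element by element."""
    maps = [convolve2d(ch, k) for ch, k in zip(channels, kernels)]
    size = len(maps[0])
    return [[sum(m[r][c] for m in maps) for c in range(size)] for r in range(size)]


# Three 3×3 channels (illustrative values)
red = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
green = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
blue = [[2, 2, 2], [0, 0, 0], [0, 0, 0]]

# One 2×2 kernel per channel
red_kernel = [[1, 0], [0, 1]]
green_kernel = [[0, 1], [0, 1]]
blue_kernel = [[1, 1], [0, 0]]

activation_map = convolve_rgb([red, green, blue], [red_kernel, green_kernel, blue_kernel])
print(activation_map)  # [[8, 4], [2, 2]]
```

The three per-channel maps are `[[2, 0], [0, 2]]`, `[[2, 0], [2, 0]]` and `[[4, 4], [0, 0]]`, and adding them gives the single activation map.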

---

## 🔄 3. Multiple Input & Output Channels  

Now, let’s **combine everything**! 🚀  
- **Multiple input channels** (like RGB images)  
- **Multiple output channels** (different filters detecting different things)  

Each output channel gets its own **set of kernels**, one for each input channel.  
We **apply the kernels, add the results**, and get multiple **activation maps**! 🎯  

---

## 🏗 Example in PyTorch  

```python
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3)
```

This means:  
✅ **3 input channels** (Red, Green, Blue)  
✅ **5 output channels** (5 different filters detecting different things)  
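
To see how the kernels are organized, you can inspect the shapes of the layer's weights and of its output. This is a small sketch; the 8×8 input size is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3)

# One 3×3 kernel per (output channel, input channel) pair, plus one bias per output channel
print(conv.weight.shape)  # torch.Size([5, 3, 3, 3])
print(conv.bias.shape)    # torch.Size([5])

# A batch with one RGB image of 8×8 pixels (size chosen only for illustration)
x = torch.randn(1, 3, 8, 8)
y = conv(x)
print(y.shape)  # torch.Size([1, 5, 6, 6])
```

The output has 5 channels, and each side shrinks from 8 to 6 because `(8 - 3) + 1 = 6`.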

---

## 🏆 Why is This Important?  

✅ Helps the neural network find **different patterns** 🎨  
✅ Works for **color images** and **complex features** 🤖  
✅ Makes the network **more powerful**! 💪  

---

🎉 **Great job!** Now, let’s try convolution with multiple channels in our own models! 🏗️🤖✨  
-----------------------------------------------------------------------------------------------
🎵 **Music Playing**  

👋 **Welcome!** Today, we’re building a **CNN for MNIST**! 🏗️🔢  
MNIST is a dataset of **handwritten numbers (0-9)**. ✍️🖼️  

---

## 🏗 CNN Structure  

📏 **Image Size:** 16×16 (to make training faster)  
🔄 **Layers:**  
- **First Convolution Layer** ➝ 16 output channels  
- **Second Convolution Layer** ➝ 32 output channels  
- **Final Layer** ➝ 10 output neurons (one for each digit)  

---

## 🛠 Building the CNN in PyTorch  

### 📌 Step 1: Define the CNN  

```python
import torch.nn as nn
import torch.nn.functional as F


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        # padding=2 with a 5×5 kernel keeps the height and width unchanged
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, padding=2)
        self.pool = nn.MaxPool2d(kernel_size=2)  # Halves the height and width
        self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, padding=2)
        self.fc = nn.Linear(32 * 4 * 4, 10)  # Fully connected layer (512 inputs, 10 outputs)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # First layer: Conv + ReLU + Pool
        x = self.pool(F.relu(self.conv2(x)))  # Second layer: Conv + ReLU + Pool
        x = x.view(-1, 512)  # Flatten the 4x4x32 output to 1D (512 elements)
        x = self.fc(x)  # Fully connected layer for classification
        return x
```

---

## 🔍 Understanding the Output Shape  

The convolutions keep the image size (thanks to the padding), and each **Max Pooling** step halves it: **16×16 ➝ 8×8 ➝ 4×4 pixels**.  
Since we have **32 channels**, the total output is:  
```
4 × 4 × 32 = 512 elements
```
Each neuron in the final layer gets **512 inputs**, and since we have **10 digits (0-9)**, we use **10 neurons**.  
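
The shape arithmetic can be traced step by step with a tiny helper. This is a sketch using the layer settings from the CNN above (5×5 kernels with padding 2, 2×2 pooling with stride 2); the helper name `conv_size` is made up for this example.

```python
def conv_size(size, kernel_size, padding=0, stride=1):
    """Spatial size after a convolution or pooling step (the division rounds down)."""
    return (size + 2 * padding - kernel_size) // stride + 1


size = 16                                         # input image is 16×16
size = conv_size(size, kernel_size=5, padding=2)  # conv1 keeps 16×16
size = conv_size(size, kernel_size=2, stride=2)   # pool: 16 -> 8
size = conv_size(size, kernel_size=5, padding=2)  # conv2 keeps 8×8
size = conv_size(size, kernel_size=2, stride=2)   # pool: 8 -> 4

flattened = size * size * 32                      # 32 channels come out of conv2
print(size, flattened)  # 4 512
```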

---

## 🔄 Forward Step  

1️⃣ **Apply First Convolution Layer** ➝ Activation ➝ Max Pooling  
2️⃣ **Apply Second Convolution Layer** ➝ Activation ➝ Max Pooling  
3️⃣ **Flatten the Output (4×4×32 → 512)**  
4️⃣ **Apply the Final Output Layer (10 Neurons for 10 Digits)**  

---

## 🏋️‍♂️ Training the Model  

Check the **lab** to see how we train the CNN using:  
✅ **Backpropagation**  
✅ **Stochastic Gradient Descent (SGD)**  
✅ **Loss Function & Accuracy Check**  

---

🎉 **Great job!** Now, let’s train our CNN to recognize handwritten digits! 🏗️🔢🤖  
------------------------------------------------------------------------------------
🎵 **Music Playing**  

👋 **Welcome!** Today, we’re learning about **Convolutional Neural Networks (CNNs)!** 🤖🖼️  

## 🤔 What is a CNN?  
A **Convolutional Neural Network (CNN)** is a special type of neural network that **understands images!** 🎨  
It learns to find patterns, like:  
✅ **Edges** (lines & shapes)  
✅ **Textures** (smooth or rough areas)  
✅ **Objects** (faces, animals, letters)  

---

## 🏗 How Does a CNN Work?  

A CNN is made of **three main parts**:  

1️⃣ **Convolution Layer** 🖼️➝🔍  
   - Uses **kernels** (small filters) to **detect patterns** in an image  
   - Creates an **activation map** that highlights important features  

2️⃣ **Pooling Layer** 🔄➝📏  
   - **Shrinks** the activation map to keep only the most important parts  
   - **Max Pooling** picks the **biggest** value in each small region  

3️⃣ **Fully Connected Layer** 🏗️➝🎯  
   - The final layer makes a **decision** (like cat 🐱 or dog 🐶)  

---

## 🎨 Example: Detecting Lines  

We train a CNN to recognize **horizontal** and **vertical** lines:  

1️⃣ **Input Image (X)**  
2️⃣ **First Convolution Layer**  
   - Uses **two kernels** to create two **activation maps**  
   - Applies **ReLU** (activation function) to remove negative values  
   - Uses **Max Pooling** to make learning easier  

3️⃣ **Second Convolution Layer**  
   - Takes **two input channels** from the first layer  
   - Uses **two new kernels** to create **one activation map**  
   - Again, applies **ReLU + Max Pooling**  

4️⃣ **Flattening** ➝ Turns the 2D image into **1D data**  
5️⃣ **Final Prediction** ➝ Uses a **fully connected layer** to decide:  
   - `0` = **Vertical Line**  
   - `1` = **Horizontal Line**  

---

## 🔄 How to Build a CNN in PyTorch  

### 🏗 CNN Constructor  
```python
import torch.nn as nn
import torch.nn.functional as F


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=2, out_channels=1, kernel_size=3, padding=1)
        # 49 = 1 channel × 7 × 7, which fits e.g. 28×28 inputs (28 -> 14 -> 7); input size is an assumption
        self.fc = nn.Linear(49, 2)  # Fully connected layer (49 inputs, 2 outputs)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # First layer: Conv + ReLU + Pool
        x = self.pool(F.relu(self.conv2(x)))  # Second layer: Conv + ReLU + Pool
        x = x.view(-1, 49)  # Flatten to 1D
        x = self.fc(x)  # Fully connected layer
        return x
```

---

## 🏋️‍♂️ Training the CNN  

We train the CNN using **backpropagation** and **gradient descent**:  

1️⃣ **Load the dataset** (images of lines) 📊  
2️⃣ **Create a CNN model** 🏗️  
3️⃣ **Define a loss function** (to measure mistakes) ❌  
4️⃣ **Choose an optimizer** (to improve learning) 🔄  
5️⃣ **Train the model** until it **gets better**! 🚀  
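
The five steps might look like this in code. This is only a sketch: the synthetic 28×28 line images produced by the hypothetical helper `make_line_images`, the batch size, the learning rate and the number of epochs are all assumptions, not the settings used in the lab.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)


def make_line_images(n, size=28):
    """Synthetic stand-in dataset: label 0 = vertical line, 1 = horizontal line."""
    images = torch.zeros(n, 1, size, size)
    labels = torch.randint(0, 2, (n,))
    positions = torch.randint(0, size, (n,))
    for i in range(n):
        p = int(positions[i])
        if int(labels[i]) == 0:
            images[i, 0, :, p] = 1.0  # vertical line in column p
        else:
            images[i, 0, p, :] = 1.0  # horizontal line in row p
    return images, labels


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 2, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(2, 1, kernel_size=3, padding=1)
        self.fc = nn.Linear(49, 2)  # 28 -> 14 -> 7, so 7 * 7 * 1 = 49 inputs

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        return self.fc(x.view(x.size(0), -1))


# 1. Load the dataset
images, labels = make_line_images(200)
dataset = torch.utils.data.TensorDataset(images, labels)
loader = torch.utils.data.DataLoader(dataset, batch_size=20, shuffle=True)

# 2-4. Create the model, the loss function and the optimizer
model = CNN()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# 5. Train and record the average loss of each epoch
losses = []
for epoch in range(5):
    epoch_loss = 0.0
    for x, y in loader:
        optimizer.zero_grad()          # clear old gradients
        loss = criterion(model(x), y)  # measure mistakes
        loss.backward()                # backpropagation
        optimizer.step()               # gradient descent step
        epoch_loss += loss.item()
    losses.append(epoch_loss / len(loader))
    print(f"epoch {epoch + 1}: average loss {losses[-1]:.4f}")
```

Printing the average loss per epoch is a simple way to watch whether training is making progress.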

As training progresses:  
📉 **Loss goes down** ➝ Model makes fewer mistakes!  
📈 **Accuracy goes up** ➝ Model gets better at predictions!  

---

## 🏆 Why Use CNNs?  

✅ **Finds patterns** in images 🔍  
✅ **Works with real-world data** (faces, animals, objects) 🖼️  
✅ **More efficient** than regular neural networks 💡  

---

🎉 **Great job!** Now, let’s build and train our own CNN! 🏗️🤖✨  
------------------------------------------------------------------------------------
🎵 **Music Playing**  

👋 **Welcome!** Today, we’re learning how to use **Pretrained TorchVision Models**! 🤖🖼️  

## 🤔 What is a Pretrained Model?  

A **pretrained model** is a neural network that has already been **trained by experts** on a large dataset.  
✅ **Saves time** (no need to train from scratch) ⏳  
✅ **Usually works better** (it has already learned useful features) 🎯  
✅ **We only train the final layer** for our own images! 🔄  

---

## 🔄 Using ResNet18 (A Pretrained Model)  

We will use **ResNet18**, a powerful model pretrained on the **ImageNet** dataset of **color images**. 🎨  
It has **skip connections** (we won’t go into details, but they help with learning).  

We only **replace the last layer** to match our dataset! 🔁  

---

## 🛠 Steps to Use a Pretrained Model  

### 📌 Step 1: Load the Pretrained Model  
```python
import torchvision.models as models

# Load ResNet18 with pretrained ImageNet weights
# (torchvision versions before 0.13 use models.resnet18(pretrained=True) instead)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
```

### 📌 Step 2: Normalize Images (Required for ResNet18)  
```python
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize image
    transforms.ToTensor(),  # Convert to tensor
    # Normalize with the ImageNet channel means and standard deviations
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
```

### 📌 Step 3: Prepare the Dataset  
Create a **dataset object** for your own images with **training and testing data**. 📊  

### 📌 Step 4: Replace the Output Layer  
- The **last hidden layer** has **512 neurons**  
- We create a **new output layer** for **our dataset**  

Example: **If we have 7 classes**, we create a layer with **7 outputs**:  
```python
import torch.nn as nn

for param in model.parameters():
    param.requires_grad = False  # Freeze pretrained layers

model.fc = nn.Linear(512, 7)  # Replace output layer (512 inputs → 7 outputs)
```

---

## 🏋️‍♂️ Training the Model  

### 📌 Step 5: Create Data Loaders  
```python
import torch

# train_dataset and test_dataset are the dataset objects from Step 3
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=15, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=10, shuffle=False)
```

### 📌 Step 6: Set Up Training  
```python
import torch.nn as nn
import torch.optim as optim

criterion = nn.CrossEntropyLoss()  # Loss function
optimizer = optim.Adam(model.fc.parameters(), lr=0.001)  # Optimizer (only for last layer)
```

### 📌 Step 7: Train the Model  
1️⃣ **Set model to training mode** 🏋️  
```python
model.train()
```  
2️⃣ Train for **20 epochs**  
3️⃣ **Set model to evaluation mode** when predicting 📊  
```python
model.eval()
```  
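
Putting Step 7 together, a training and evaluation loop might look like the sketch below. The function name `fine_tune` is made up, and a small stand-in model with random data replaces ResNet18 and the real dataset so that the example runs without downloading anything; only the loop structure is the point.

```python
import torch
import torch.nn as nn
import torch.optim as optim


def fine_tune(model, train_loader, test_loader, criterion, optimizer, epochs=20):
    """Train for `epochs` epochs, then return the accuracy on the test set."""
    for _ in range(epochs):
        model.train()  # training mode
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

    model.eval()  # evaluation mode when predicting
    correct, total = 0, 0
    with torch.no_grad():  # gradients are not needed for predictions
        for images, labels in test_loader:
            predictions = model(images).argmax(dim=1)
            correct += (predictions == labels).sum().item()
            total += labels.size(0)
    return correct / total


# Stand-in objects (assumptions for illustration): a frozen "backbone"
# plus a trainable 7-class output layer, and random 8×8 color images.
torch.manual_seed(0)
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 512), nn.ReLU())
for param in backbone.parameters():
    param.requires_grad = False  # frozen, like the pretrained layers
head = nn.Linear(512, 7)
model = nn.Sequential(backbone, head)

data = torch.utils.data.TensorDataset(torch.randn(30, 3, 8, 8), torch.randint(0, 7, (30,)))
train_loader = torch.utils.data.DataLoader(data, batch_size=15, shuffle=True)
test_loader = torch.utils.data.DataLoader(data, batch_size=10, shuffle=False)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(head.parameters(), lr=0.001)  # only the new layer is trained
accuracy = fine_tune(model, train_loader, test_loader, criterion, optimizer, epochs=2)
print(f"accuracy: {accuracy:.2f}")
```

With the real ResNet18, you would pass the model from Step 4, the loaders from Step 5, and the loss and optimizer from Step 6 instead of the stand-ins.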

---

## 🏆 Why Use Pretrained Models?  

✅ **Saves time** (no need to train from scratch)  
✅ **Works better** (pretrained on over a million images)  
✅ **We only change one layer** for our dataset!  

---

🎉 **Great job!** Now, try using a pretrained model for your own images! 🏗️🤖✨  
---------------------------------------------------------------------------------
🎵 **Music Playing**  

👋 **Welcome!** Today, we’re learning how to use **GPUs in PyTorch**! 🚀💻  

## 🤔 Why Use a GPU?  
A **Graphics Processing Unit (GPU)** can **train models MUCH faster** than a CPU!  
✅ Faster computation ⏩  
✅ Better for large datasets 📊  
✅ Helps train deep learning models efficiently 🤖  

---

## 🔥 What is CUDA?  
CUDA is **NVIDIA’s platform** for running general-purpose code on GPUs, which lets us use **GPUs for AI tasks**. 🎮🚀  
In **PyTorch**, we use **torch.cuda** to work with GPUs.  

---

## 🛠 Step 1: Check if a GPU is Available  

```python
import torch

# Check if a GPU is available
print(torch.cuda.is_available())  # Prints True if a CUDA GPU is detected
```

---

## 🎯 Step 2: Set Up the GPU  

```python
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
```

- `"cuda:0"` = First available GPU 🎮  
- `"cpu"` = Use the CPU if no GPU is found  

---

## 🏗 Step 3: Sending Tensors to the GPU  

In PyTorch, **data is stored in Tensors**.  
To move data to the GPU, use `.to(device)`.  

```python
tensor = torch.randn(3, 3)  # Create a random tensor
tensor = tensor.to(device)  # Move it to the GPU (it stays on the CPU if no GPU was found)
```

✅ **Faster processing on the GPU!** ⚡  

---

## 🔄 Step 4: Using a GPU with a CNN  

You **don’t need to change** your CNN code! Just **move the model to the GPU** after creating it:  

```python
model = CNN()  # Create CNN model
model.to(device)  # Move the model to the GPU
```

This moves all of the model’s **parameters** to the GPU (as **CUDA tensors**) for GPU computation! 🎮  

---

## 🏋️‍♂️ Step 5: Training a Model on a GPU  

Training is the same, but **you must send your data to the GPU**!  

```python
for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)  # Move data to GPU
    optimizer.zero_grad()  # Clear gradients
    outputs = model(images)  # Forward pass (on GPU)
    loss = criterion(outputs, labels)  # Compute loss
    loss.backward()  # Backpropagation
    optimizer.step()  # Update weights
```

✅ **The model trains much faster!** 🚀  

---

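## 🎯 Step 6: Testing the Model  

For testing, you only need to move the **images** to the GPU. The labels can stay on the CPU, as long as you bring the predictions back with `.cpu()` before comparing them:  

```python
for images, labels in test_loader:
    images = images.to(device)  # Move images to GPU
    outputs = model(images)  # Get predictions (on the GPU)
    predicted = outputs.argmax(dim=1).cpu()  # Back to the CPU to compare with labels
```

✅ **Avoids copying the labels to the GPU!** ⚡  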
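
A fuller evaluation loop might look like the sketch below. The tiny stand-in model and the random test data are assumptions so that the example runs on its own (on the GPU if one is found, otherwise on the CPU); in practice you would use your trained CNN and your real `test_loader`.

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Stand-in model and data (hypothetical, just so the sketch runs end to end)
model = nn.Sequential(nn.Flatten(), nn.Linear(16 * 16, 10)).to(device)
test_data = torch.utils.data.TensorDataset(
    torch.randn(20, 1, 16, 16), torch.randint(0, 10, (20,))
)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=10)

model.eval()  # evaluation mode
correct = 0
with torch.no_grad():  # gradients are not needed for testing
    for images, labels in test_loader:
        outputs = model(images.to(device))       # predictions are computed on `device`
        predicted = outputs.argmax(dim=1).cpu()  # bring the predictions back to the CPU
        correct += (predicted == labels).sum().item()  # labels never left the CPU

accuracy = correct / len(test_data)
print(f"accuracy: {accuracy:.2f}")
```

Comparing tensors that live on different devices raises an error, which is why the predictions are moved back with `.cpu()` before the comparison.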

---

## 🏆 Summary  

✅ **GPUs make training faster** 🎮  
✅ Use **torch.cuda** to work with GPUs  
✅ Move **data & models** to the GPU with `.to(device)`  
✅ Training & testing are the same, but the data **must be on the same device as the model**  

---

🎉 **Great job!** Now, try training a model using a GPU in PyTorch! 🏗️🚀