LWWZH committed (verified) Β· Commit 32f07a2 Β· 1 Parent(s): 6688acd

Update README.md

Files changed (1): README.md (+112 βˆ’111)
---
license: mit
library_name: pytorch
tags:
- image-classification
- cifar10
- cnn
- computer-vision
- pytorch
- mini-vision
- mini-vision-series
metrics:
- accuracy
pipeline_tag: image-classification
datasets:
- uoft-cs/cifar10
---

# Mini-Vision-V1: CIFAR-10 CNN Classifier

![Model Size](https://img.shields.io/badge/Params-1.34M-blue) ![Accuracy](https://img.shields.io/badge/Accuracy-78%25-green)

Welcome to **Mini-Vision-V1**, the first model in the Mini-Vision series. This project demonstrates a robust implementation of a Convolutional Neural Network (CNN) for image classification on the CIFAR-10 dataset. It is designed to be lightweight, efficient, and easy to understand, making it well suited to beginners learning PyTorch.

## Model Description

Mini-Vision-V1 is a custom CNN with 4 convolutional blocks. It uses Batch Normalization and Dropout to prevent overfitting and stabilize training. With only **1.34M parameters**, it achieves competitive accuracy on the CIFAR-10 test set.

- **Dataset**: [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) (32x32 color images, 10 classes)
- **Framework**: PyTorch
- **Total Parameters**: 1.34M

## Model Architecture

The network consists of 4 convolutional blocks followed by a classifier head.

| Layer | Input Channels | Output Channels | Kernel Size | Stride | Padding | Activation | Other |
| :--- | :---: | :---: | :---: | :---: | :---: | :--- | :--- |
| **Conv Block 1** | 3 | 32 | 5 | 1 | 2 | ReLU | MaxPool(2), BatchNorm |
| **Conv Block 2** | 32 | 64 | 5 | 1 | 2 | ReLU | MaxPool(2), BatchNorm |
| **Conv Block 3** | 64 | 128 | 5 | 1 | 2 | ReLU | MaxPool(2), BatchNorm |
| **Conv Block 4** | 128 | 256 | 5 | 1 | 2 | ReLU | MaxPool(2), BatchNorm |
| **Flatten** | - | - | - | - | - | - | Output: 1024 (256 Γ— 2 Γ— 2) |
| **Linear 1** | 1024 | 256 | - | - | - | ReLU | Dropout(0.5) |
| **Linear 2** | 256 | 10 | - | - | - | - | - |
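
The table above maps directly onto a PyTorch module. The sketch below is an illustrative reconstruction, not the repository's `model.py`: the layer order inside each block (Conv β†’ ReLU β†’ MaxPool β†’ BatchNorm) follows the "Other" column, and the resulting parameter count matches the stated 1.34M.

```python
import torch
import torch.nn as nn


class MiniVisionV1(nn.Module):
    """Sketch of the 4-block CNN described in the table above."""

    def __init__(self, num_classes: int = 10):
        super().__init__()

        def block(c_in: int, c_out: int) -> nn.Sequential:
            # 5x5 conv with padding 2 keeps spatial size; MaxPool(2) halves it.
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=5, stride=1, padding=2),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
                nn.BatchNorm2d(c_out),
            )

        self.features = nn.Sequential(
            block(3, 32),     # 32x32 -> 16x16
            block(32, 64),    # 16x16 -> 8x8
            block(64, 128),   # 8x8  -> 4x4
            block(128, 256),  # 4x4  -> 2x2
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),               # 256 * 2 * 2 = 1024
            nn.Linear(1024, 256),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))
```

Four rounds of 2Γ— pooling shrink the 32Γ—32 input to 2Γ—2, which is why the flattened feature vector is 256 Γ— 2 Γ— 2 = 1024.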

## Training Strategy

The model was trained with standard CIFAR-10 practices to maximize performance within a small parameter budget.

- **Optimizer**: SGD (Momentum=0.9)
- **Initial Learning Rate**: 0.007
- **Scheduler**: StepLR (Step size=5, Gamma=0.5)
- **Loss Function**: CrossEntropyLoss
- **Batch Size**: 256
- **Epochs**: 100 total; best test accuracy reached at epoch 31
- **Data Augmentation**:
  - Random Crop (32x32 with padding=4)
  - Random Horizontal Flip

## Performance

The model achieved the following results on the CIFAR-10 test set:

| Metric | Value |
| :--- | :---: |
| **Test Accuracy** | **78%** |
| Parameters | 1.34M |

### Training Visualization (TensorBoard)

Below are the training and test curves visualized in TensorBoard.

#### 1. Training Loss

![Training Loss](assets/train_loss.png)
*(Recorded every step)*

#### 2. Test Loss

![Test Loss](assets/test_loss.png)
*(Recorded every epoch)*

## Quick Start

### Dependencies

- Python 3.x
- PyTorch
- Torchvision

Install everything at once with `pip install -r requirements.txt`.

### Inference

You can load the model and run inference on a single image using the **test.py** script.
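
A minimal sketch of single-image inference, in the spirit of `test.py` (the actual script may differ; the `predict` helper and the class-name tuple here are illustrative):

```python
import torch

# Standard CIFAR-10 label order.
CIFAR10_CLASSES = ("airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck")


def predict(model: torch.nn.Module, image: torch.Tensor) -> str:
    """Classify a single preprocessed 3x32x32 image tensor."""
    model.eval()  # disable Dropout / use BatchNorm running stats
    with torch.no_grad():
        logits = model(image.unsqueeze(0))  # add a batch dimension
        idx = int(logits.argmax(dim=1))
    return CIFAR10_CLASSES[idx]


# Typical usage (see test.py for the actual repository script):
# model = MiniVisionV1()
# model.load_state_dict(torch.load("Mini-Vision-V1.pth", map_location="cpu"))
# print(predict(model, preprocessed_image))
```

`map_location="cpu"` lets the checkpoint load on machines without a GPU.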

## File Structure

```
.
β”œβ”€β”€ model.py            # Model architecture definition
β”œβ”€β”€ train.py            # Training script
β”œβ”€β”€ test.py             # Inference script
β”œβ”€β”€ Mini-Vision-V1.pth  # Trained model weights
β”œβ”€β”€ Config.json
β”œβ”€β”€ README.md
└── assets
    β”œβ”€β”€ train_loss.png  # Visualized train loss graph
    └── test_loss.png   # Visualized test loss graph
```

## License

This project is licensed under the MIT License.