---
title: CNN
app_file: main.py
sdk: gradio
sdk_version: 5.35.0
license: mit
authors:
- Rahuletto
language:
- en
metrics:
- accuracy
pipeline_tag: image-classification
python_version: 3.12.10
datasets:
- uoft-cs/cifar10
tags:
- image
- classification
- cifar
- cnn
- spaces
---

# CNN with CIFAR-10

A PyTorch implementation of a Convolutional Neural Network (CNN) for image classification on the CIFAR-10 dataset, achieving **81.45% test accuracy**.

> Try it out! 
> https://huggingface.co/spaces/Rahuletto/CNN

## Architecture

![CNN Architecture](assets/architecture.png)

The CNN model consists of the following components:

### Convolutional Layers:
- **Conv1**: 3 → 32 channels, 3x3 kernel, padding=1
- **Conv2**: 32 → 64 channels, 3x3 kernel, padding=1  
- **Conv3**: 64 → 128 channels, 3x3 kernel, padding=1

### Others
- **Batch Normalization** after each convolutional layer
- **Max pooling** (`MaxPool2d`, 2x2 kernel, stride 2) for downsampling after each convolutional block
- **ReLU** activation functions
- **Fully Connected Layers**: 2048 → 512 → 10
- **Dropout** (50%) for regularization
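
To see where the 2048 input features of the first fully connected layer come from, the spatial sizes can be traced by hand. The sketch below uses the standard output-size formula for convolution and pooling and assumes 32x32 CIFAR-10 inputs with one 2x2 pooling step after each convolution; the helper names `conv_out` and `pool_out` are illustrative and not part of this repository.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial size after max pooling without padding."""
    return (size - kernel) // stride + 1


size = 32  # CIFAR-10 images are 32x32 pixels
channels = [32, 64, 128]  # output channels of Conv1, Conv2, Conv3

trace = []
for out_channels in channels:
    size = conv_out(size)  # 3x3 kernel with padding=1 keeps the size
    size = pool_out(size)  # 2x2 pooling with stride 2 halves the size
    trace.append((out_channels, size, size))

flattened = channels[-1] * size * size  # features fed to the first FC layer

print(trace)      # [(32, 16, 16), (64, 8, 8), (128, 4, 4)]
print(flattened)  # 2048
```

The convolutions themselves do not shrink the feature maps; all downsampling comes from the three pooling steps.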


## Getting Started

### Prerequisites
- Python 3.12+
- PyTorch 2.7.1+
- torchvision 0.22.1+

> [!TIP]
> This project was developed with `uv`, so it is best to use `uv` for project management.

### Installation

1. **Clone the repository:**
   ```bash
   git clone https://github.com/rahuletto/cnn
   cd cnn
   ```

2. **Create virtual environment:**
   ```bash
   python -m venv .venv
   source .venv/bin/activate  # Windows: .venv\Scripts\activate
   ```

3. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

### Running the model

Run the main file:

```bash
python main.py
```

You can then try the model in the Gradio interface:

https://github.com/user-attachments/assets/1f742c32-79bd-4d16-a74f-68c241f4a841

## Model Code
```py
import torch.nn as nn
import torch.nn.functional as F


class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        # The 3x3 convolutions with padding=1 keep the spatial size;
        # the 2x2 pooling applied after each block halves it.
        self.conv1 = nn.Conv2d(3, 32, 3, stride=1, padding=1)  # after pooling: 32x32 -> 16x16
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 64, 3, stride=1, padding=1)  # after pooling: 16x16 -> 8x8
        self.bn2 = nn.BatchNorm2d(64)
        self.conv3 = nn.Conv2d(64, 128, 3, stride=1, padding=1)  # after pooling: 8x8 -> 4x4
        self.bn3 = nn.BatchNorm2d(128)
        self.pool = nn.MaxPool2d(stride=2, kernel_size=2)
        self.fc1 = nn.Linear(128 * 4 * 4, 512)  # 128 channels x 4 x 4 = 2048 features
        self.fc2 = nn.Linear(512, 10)  # one output per CIFAR-10 class
        self.dropout = nn.Dropout(0.5)

    def forward(self, x):
        # Three blocks of conv -> batch norm -> ReLU -> max pool
        x = self.pool(F.relu(self.bn1(self.conv1(x))))
        x = self.pool(F.relu(self.bn2(self.conv2(x))))
        x = self.pool(F.relu(self.bn3(self.conv3(x))))
        x = x.view(x.size(0), -1)  # flatten to (batch, 2048)
        x = self.dropout(x)
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)  # raw logits, no softmax
        return x
```
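
As a rough size check, the trainable parameters of the model above can be counted by hand. This is a sketch that assumes PyTorch's defaults (bias terms enabled in `Conv2d` and `Linear`, a learnable scale and shift per channel in `BatchNorm2d`); the helper functions are illustrative and not part of the repository.

```python
def conv2d_params(c_in, c_out, k):
    """Weights (c_in * c_out * k * k) plus one bias per output channel."""
    return c_in * c_out * k * k + c_out


def batchnorm2d_params(channels):
    """One learnable scale and one learnable shift per channel."""
    return 2 * channels


def linear_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out


layers = {
    "conv1": conv2d_params(3, 32, 3),
    "bn1": batchnorm2d_params(32),
    "conv2": conv2d_params(32, 64, 3),
    "bn2": batchnorm2d_params(64),
    "conv3": conv2d_params(64, 128, 3),
    "bn3": batchnorm2d_params(128),
    "fc1": linear_params(128 * 4 * 4, 512),
    "fc2": linear_params(512, 10),
}

total = sum(layers.values())
for name, count in layers.items():
    print(f"{name:>5}: {count:>9,}")
print(f"total: {total:>9,}")  # total: 1,147,914
```

Under these assumptions, `fc1` alone accounts for 1,049,088 of the roughly 1.15 million parameters, so most of the model's capacity sits in the first fully connected layer.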


## Training Configuration

- **Optimizer**: Adam (lr=0.001)
- **Batch Size**: 64
- **Epochs**: 50

> The best model checkpoint was saved at epoch 49 with a validation loss of 0.6553.
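
One common way to arrive at a "best checkpoint" is to save the model whenever the validation loss improves. The sketch below shows only that bookkeeping; it is not taken from this repository's training script, the function name `track_best` is hypothetical, the loss values are made up for illustration, and epochs are assumed to be numbered from 1.

```python
def track_best(val_losses):
    """Return (best_epoch, best_loss, epochs_where_a_checkpoint_was_saved)."""
    best_loss = float("inf")
    best_epoch = None
    saved_at = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            # In a real training loop, the model weights would be written
            # to disk here, overwriting the previous best checkpoint.
            saved_at.append(epoch)
    return best_epoch, best_loss, saved_at


# Made-up validation losses for five epochs (illustration only).
example_losses = [1.20, 0.95, 0.97, 0.80, 0.82]
print(track_best(example_losses))  # (4, 0.8, [1, 2, 4])
```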


## Models
There are two model checkpoints in the `cnn/` folder:
- `model.pt`: trained with `BatchNorm2d`; reaches 81.45% accuracy on the CIFAR-10 dataset.
- `model-old.pt`: trained without fine-tuning; reaches 75% accuracy on the CIFAR-10 dataset.

### Accuracy:

Total Accuracy: `81.45%`

- **Airplane**: `84.60%`
- **Automobile**: `93.20%`
- **Bird**: `76.90%`
- **Cat**: `69.70%`
- **Deer**: `77.20%`
- **Dog**: `64.00%`
- **Frog**: `89.30%`
- **Horse**: `82.10%`
- **Ship**: `89.60%`
- **Truck**: `87.90%`
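
The overall figure can be cross-checked against the per-class numbers. The CIFAR-10 test set contains 1,000 images per class, so the overall accuracy equals the plain average of the ten per-class accuracies. A short sketch, assuming the values above were measured on the full test set:

```python
# Per-class accuracies (in percent) as listed above.
per_class = {
    "airplane": 84.60,
    "automobile": 93.20,
    "bird": 76.90,
    "cat": 69.70,
    "deer": 77.20,
    "dog": 64.00,
    "frog": 89.30,
    "horse": 82.10,
    "ship": 89.60,
    "truck": 87.90,
}

# With equally sized classes, the overall accuracy is the unweighted mean.
mean_accuracy = sum(per_class.values()) / len(per_class)
weakest = min(per_class, key=per_class.get)

print(f"{mean_accuracy:.2f}%")  # 81.45%
print(weakest)                  # dog
```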

![Accuracy Benchmark](assets/accuracy.png)

---

## References

- [CIFAR-10 Dataset](https://www.cs.toronto.edu/~kriz/cifar.html)
- [PyTorch Documentation](https://pytorch.org/docs/)
- [Convolutional Neural Networks for Visual Recognition (CS231n)](http://cs231n.stanford.edu/)
- [Deep Learning Book - Ian Goodfellow](https://www.deeplearningbook.org/)

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.