"""
QUICK START GUIDE - How to Run the Improved MNIST Classifier
===============================================================

Follow these steps to get started quickly!
"""

# STEP 1: INSTALLATION
# ====================

"""
1. Make sure you have Python 3.8+ installed
   Check with: python --version or python3 --version

2. Create a new folder for your project and put all the files there:
   - improved_mnist_classifier.py
   - config.yaml
   - requirements.txt
   - inference.py

3. Open terminal/command prompt in that folder
"""

# Windows:
# cd C:\path\to\your\folder

# Mac/Linux:
# cd /path/to/your/folder
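
"""
If you prefer to confirm the interpreter version from Python itself, a minimal
sketch follows. The helper name python_is_supported is made up for this guide;
only the 3.8 minimum comes from step 1 above.
"""

```python
import sys

MIN_VERSION = (3, 8)  # minimum version stated in this guide


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {current}: {status}")
```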

"""
4. Install required packages:
"""

# OPTION A - Using pip directly (recommended):
pip install torch torchvision numpy matplotlib seaborn tqdm scikit-learn tensorboard PyYAML Pillow

# OPTION B - Using requirements.txt:
pip install -r requirements.txt

# If you get permission errors, try:
pip install --user -r requirements.txt
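
"""
To verify the install, the sketch below reports which packages still cannot be
imported. Note that some pip names differ from their import names
(scikit-learn -> sklearn, PyYAML -> yaml, Pillow -> PIL). The function name
missing_packages is hypothetical and not part of the project files.
"""

```python
import importlib.util

# pip package name -> module name used in an `import` statement
REQUIRED = {
    "torch": "torch",
    "torchvision": "torchvision",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "tqdm": "tqdm",
    "scikit-learn": "sklearn",
    "tensorboard": "tensorboard",
    "PyYAML": "yaml",
    "Pillow": "PIL",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose modules cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", " ".join(missing))
    else:
        print("All required packages are importable.")
```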


# STEP 2: BASIC TRAINING (SIMPLEST WAY)
# ======================================

"""
Run this command to start training with default settings:
"""

# CPU only (slower, works everywhere):
python improved_mnist_classifier.py

# GPU (if you have NVIDIA GPU with CUDA):
python improved_mnist_classifier.py --use-gpu

# GPU with mixed precision (fastest):
python improved_mnist_classifier.py --use-gpu --use-amp
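
"""
Before passing --use-gpu, you can check whether PyTorch sees a CUDA device.
This is a sketch under the assumption that the script falls back to CPU when
no GPU is found; how improved_mnist_classifier.py actually picks its device is
not shown in this guide. pick_device is a hypothetical helper.
"""

```python
def pick_device():
    """Return "cuda" if PyTorch is installed and sees a GPU, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed yet, so only the CPU answer is possible.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Device that would be used:", pick_device())
```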


# STEP 3: MONITOR TRAINING (OPTIONAL)
# ====================================

"""
While training is running, open a NEW terminal window and run:
"""

tensorboard --logdir=./runs

"""
Then open your web browser and go to:
http://localhost:6006

You'll see real-time graphs of training progress!
"""


# STEP 4: CUSTOMIZED TRAINING
# ============================

"""
You can customize many settings:
"""

# Train for 30 epochs instead of 20:
python improved_mnist_classifier.py --epochs 30 --use-gpu

# Use larger batch size (faster but needs more memory):
python improved_mnist_classifier.py --batch-size 256 --use-gpu

# Try fully connected network instead of CNN:
python improved_mnist_classifier.py --model-type fc --use-gpu

# Change learning rate:
python improved_mnist_classifier.py --lr 0.0005 --use-gpu

# Combine multiple options:
python improved_mnist_classifier.py --epochs 25 --batch-size 256 --lr 0.001 --use-gpu --use-amp
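
"""
Changing --batch-size changes how many batches make up one epoch, which is the
counter you see in the progress bar. The sketch below assumes the last partial
batch is kept. For reference, the sample output in WHAT TO EXPECT shows 469
training batches and 79 validation batches, which matches 60,000 and 10,000
samples at batch size 128. With a 0.1 validation split taken from the 60,000
MNIST training images you would instead expect 422 and 47, so your counters
may differ from the sample.
"""

```python
import math


def batches_per_epoch(num_samples, batch_size):
    """Batches per epoch, assuming the final partial batch is not dropped."""
    return math.ceil(num_samples / batch_size)


if __name__ == "__main__":
    for batch_size in (64, 128, 256):
        print(batch_size, batches_per_epoch(60_000, batch_size))
```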


# STEP 5: AFTER TRAINING COMPLETES
# =================================

"""
Training will create several folders and files:

checkpoints/
  ├── best_model.pth              ← Your trained model
  ├── training.log                ← Training logs
  ├── training_history.json       ← Loss and accuracy data
  ├── classification_report.txt   ← Detailed metrics
  ├── training_curves.png         ← Training graphs
  ├── confusion_matrix.png        ← Error analysis
  └── predictions.png             ← Sample predictions

runs/                             ← TensorBoard logs
data/                             ← MNIST dataset (auto-downloaded)
"""


# STEP 6: MAKE PREDICTIONS ON YOUR OWN IMAGES
# ============================================

"""
Once training is done, use your model to recognize digits!

1. Create a grayscale image of a digit, ideally 28x28 (other sizes will be resized)
2. Run the inference script:
"""

# Predict a single image:
python inference.py --model-path checkpoints/best_model.pth --image-path my_digit.png --use-gpu

# This will show:
# - The predicted digit
# - Confidence score
# - Probability for all 10 digits
# - A visualization saved as prediction_visualization.png
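
"""
The confidence score and per-digit probabilities are typically derived from
the model's 10 raw outputs with a softmax. The sketch below illustrates that
idea in plain Python; it assumes inference.py works this way, which this guide
does not confirm. softmax and predict are illustrative names.
"""

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def predict(logits):
    """Return (digit, confidence, probabilities) for 10 raw scores."""
    probs = softmax(logits)
    digit = max(range(len(probs)), key=probs.__getitem__)
    return digit, probs[digit], probs


if __name__ == "__main__":
    # Made-up scores for digits 0-9; index 7 has the highest score.
    example = [0.1, -1.2, 0.3, 0.0, -0.5, 0.2, -2.0, 4.5, 0.4, 1.0]
    digit, confidence, probs = predict(example)
    print(f"Predicted digit: {digit} (confidence {confidence:.2%})")
```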


# FULL EXAMPLE SESSION
# =====================

"""
Here's a complete workflow from start to finish:
"""

# 1. Install packages
pip install torch torchvision numpy matplotlib seaborn tqdm scikit-learn tensorboard PyYAML Pillow

# 2. Train the model (this will take 5-10 minutes)
python improved_mnist_classifier.py --use-gpu --epochs 20

# 3. Look at the results
# - Open checkpoints/training_curves.png to see training progress
# - Open checkpoints/confusion_matrix.png to see which digits are confused
# - Open checkpoints/predictions.png to see sample predictions
# - Read checkpoints/classification_report.txt for detailed metrics
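
"""
To confirm that training produced everything listed in STEP 5, the sketch
below checks the save directory for the expected files. It assumes the default
--save-dir of ./checkpoints; missing_outputs is a hypothetical helper.
"""

```python
from pathlib import Path

# File names taken from the STEP 5 listing in this guide
EXPECTED_FILES = [
    "best_model.pth",
    "training.log",
    "training_history.json",
    "classification_report.txt",
    "training_curves.png",
    "confusion_matrix.png",
    "predictions.png",
]


def missing_outputs(save_dir="checkpoints", expected=EXPECTED_FILES):
    """Return the expected file names that are not present in save_dir."""
    root = Path(save_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    print("Missing files:", missing if missing else "none")
```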

# 4. Make predictions on new images
python inference.py --model-path checkpoints/best_model.pth --image-path my_digit.png


# TROUBLESHOOTING COMMON ISSUES
# ==============================

"""
Problem 1: "No module named 'torch'"
Solution: Install PyTorch first
"""
pip install torch torchvision

"""
Problem 2: "CUDA out of memory"
Solution: Reduce batch size
"""
python improved_mnist_classifier.py --batch-size 64 --use-gpu

"""
Problem 3: Slow on Windows with multiprocessing
Solution: Set num_workers to 0
"""
python improved_mnist_classifier.py --num-workers 0

"""
Problem 4: "RuntimeError: DataLoader worker"
Solution: Run without multiprocessing
"""
python improved_mnist_classifier.py --num-workers 0

"""
Problem 5: Can't see TensorBoard
Solution: Make sure you installed it and the port is not blocked
"""
pip install tensorboard
tensorboard --logdir=./runs --port 6007  # Try different port
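
"""
If you are unsure whether a port is already taken, a small check like the one
below can help before choosing a value for --port. It only tests whether a
local TCP bind succeeds; firewalls are a separate matter. port_is_free is a
hypothetical helper.
"""

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


if __name__ == "__main__":
    for candidate in (6006, 6007):
        state = "free" if port_is_free(candidate) else "in use"
        print(f"Port {candidate}: {state}")
```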

"""
Problem 6: Import errors
Solution: Make sure all files are in the same folder
"""
# Put these files together:
# - improved_mnist_classifier.py
# - inference.py
# - config.yaml
# - requirements.txt


# WHAT TO EXPECT
# ===============

"""
Training output will look like this:

Epoch 1/20 [Train]: 100%|████| 469/469 [00:15<00:00, Loss: 0.1234, Acc: 95.67%]
[Val]: 100%|████████████████| 79/79 [00:02<00:00, Loss: 0.0987, Acc: 97.23%]

Epoch 1/20 | LR: 0.001000
Train Loss: 0.1234, Acc: 95.67%
Val Loss: 0.0987, Acc: 97.23%
✓ New best model saved! Val Acc: 97.23%
----------------------------------------------------------------------

... (continues for all epochs) ...

Training complete! Time: 0:05:23
Best Val Acc: 99.34%

Final Test Accuracy: 99.28%

Files created:
- checkpoints/best_model.pth
- checkpoints/training_curves.png
- checkpoints/confusion_matrix.png
- checkpoints/predictions.png
"""


# COMPLETE COMMAND REFERENCE
# ===========================

"""
All available options:

--model-type {cnn,fc}           # Model architecture (default: cnn)
--dropout-rate FLOAT            # Dropout rate (default: 0.3)
--epochs INT                    # Number of training epochs (default: 20)
--batch-size INT                # Batch size (default: 128)
--lr FLOAT                      # Learning rate (default: 0.001)
--optimizer {adam,sgd,adamw}    # Optimizer (default: adamw)
--weight-decay FLOAT            # Weight decay (default: 0.0001)
--scheduler {cosine,onecycle,step}  # LR scheduler (default: onecycle)
--warmup-epochs INT             # Warmup epochs (default: 2)
--data-dir PATH                 # Data directory (default: ./data)
--val-split FLOAT               # Validation split (default: 0.1)
--num-workers INT               # Data loading workers (default: 4)
--early-stop-patience INT       # Early stopping patience (default: 7)
--use-amp                       # Use mixed precision training
--save-dir PATH                 # Save directory (default: ./checkpoints)
--log-dir PATH                  # TensorBoard logs (default: ./runs)
--save-freq INT                 # Save checkpoint frequency (default: 5)
--seed INT                      # Random seed (default: 42)
--use-gpu                       # Use GPU if available
"""


# EXAMPLES FOR DIFFERENT SCENARIOS
# =================================

# Example 1: I just want to see if it works (fastest test)
python improved_mnist_classifier.py --epochs 5

# Example 2: I want the best accuracy (recommended)
python improved_mnist_classifier.py --model-type cnn --epochs 20 --use-gpu

# Example 3: I want it as fast as possible
python improved_mnist_classifier.py --use-gpu --use-amp --batch-size 256

# Example 4: I have limited GPU memory
python improved_mnist_classifier.py --use-gpu --batch-size 64

# Example 5: I only have CPU (will be slower)
python improved_mnist_classifier.py --epochs 10 --num-workers 0

# Example 6: I want to experiment with different settings
python improved_mnist_classifier.py --model-type fc --lr 0.01 --optimizer sgd --epochs 15
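
"""
For readers who want to see how flags like these are usually declared, here is
an argparse sketch covering a subset of the options and defaults from the
command reference. It is an illustration only and is not the parser from
improved_mnist_classifier.py; build_parser is a hypothetical name.
"""

```python
import argparse


def build_parser():
    """Build a parser mirroring a subset of the documented options."""
    parser = argparse.ArgumentParser(description="MNIST training options (sketch)")
    parser.add_argument("--model-type", choices=["cnn", "fc"], default="cnn")
    parser.add_argument("--epochs", type=int, default=20)
    parser.add_argument("--batch-size", type=int, default=128)
    parser.add_argument("--lr", type=float, default=0.001)
    parser.add_argument("--optimizer", choices=["adam", "sgd", "adamw"], default="adamw")
    parser.add_argument("--num-workers", type=int, default=4)
    # Boolean switches: absent means False, present means True
    parser.add_argument("--use-amp", action="store_true")
    parser.add_argument("--use-gpu", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```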


# NEXT STEPS
# ==========

"""
After you successfully run training:

1. Compare your original model with the new CNN model
2. Try different hyperparameters (learning rate, batch size, epochs)
3. Create your own digit images and test the inference script
4. Look at the confusion matrix to see which digits are hardest
5. Check TensorBoard to understand training dynamics
6. Read COMPARISON.md to understand all the improvements
7. Modify the code to add your own ideas!
"""


# GETTING HELP
# ============

"""
If you run into issues:

1. Check the error message carefully
2. Make sure all required packages are installed
3. Try running with --num-workers 0 first
4. Check that all files are in the same directory
5. Read the README.md for detailed documentation
6. Read COMPARISON.md to understand the differences

Common first-time issues:
- Missing packages → pip install -r requirements.txt
- CUDA errors → Don't use --use-gpu, train on CPU first
- Multiprocessing errors → Add --num-workers 0
- Import errors → Check all files are in same folder
"""

print("Good luck with your training! 🚀")