deepshah23 committed on
Commit
fcf6391
·
verified ·
1 Parent(s): fd83cab

Update README.md

  1. README.md +147 -3
README.md CHANGED
@@ -1,3 +1,147 @@
1
- ---
2
- license: gpl-3.0
3
- ---
1
+ # Digit & Blank Image Classifier (PyTorch CNN)
2
+
3
+ A convolutional neural network that classifies handwritten digits from the **MNIST** and **EMNIST Digits** datasets and also detects **blank images** (unfilled boxes) as a distinct class, reaching 98.25% test accuracy. The model is trained with PyTorch and exported in TorchScript format (`.pt`) for portable inference.
4
+
5
+ ---
6
+
7
+ ## License & Attribution
8
+
9
+ This model is licensed under the **AGPL-3.0** license to comply with the [Plom Project](https://gitlab.com/plom/plom) licensing requirements.
10
+
11
+ ### Developed as part of the Plom Project
12
+
13
+ **Authors & Credits**:
14
+ - Model: **Deep Shah**, Undergraduate Research Assistant, UBC
15
+ - Supervision: **Prof. Andrew Rechnitzer** and **Prof. Colin B. MacDonald**
16
+ - Project: [The Plom Project GitLab](https://gitlab.com/plom/plom)
17
+
18
+ ---
19
+
20
+ ## Overview
21
+
22
+ - **Input**: 1×28×28 grayscale image
23
+ - **Output**: Integer class prediction:
24
+   - 0–9: Digits
25
+   - 10: Blank image
26
+ - **Architecture**: CNN with 3 convolutional layers (BatchNorm, ReLU, max pooling, dropout) followed by 2 fully connected layers
27
+ - **Model Format**: TorchScript (`.pt`)
28
+ - **Training Dataset**: Combined MNIST, EMNIST Digits, and 5,000 synthetic blank images
29
+
30
+ ---
31
+
32
+ ## Dataset Details
33
+
34
+ ### Datasets Used:
35
+
36
+ - **MNIST** – 28×28 handwritten digits (0–9), 60,000 training images
37
+ - **EMNIST Digits** – 28×28 digits extracted from handwritten characters, 240,000+ training samples
38
+ - **Blank Images** – 5,000 synthetic all-black 28×28 images, labeled as class `10` to simulate unfilled regions
39
+
40
+ ### Preprocessing:
41
+
42
+ - Normalized pixel values to [0, 1]
43
+ - Converted images to channel-first format (N, C, H, W)
44
+ - Combined and shuffled datasets
45
+
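The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the project's training script: the random tensors stand in for real MNIST/EMNIST images, the sample counts are tiny placeholders, and using the seed `42` here is an assumption.

```python
import torch

torch.manual_seed(42)  # assumption: reuse the seed listed in the training table

# Stand-in for raw uint8 digit images of shape (N, H, W);
# real data would come from MNIST / EMNIST Digits.
raw_digits = torch.randint(0, 256, (8, 28, 28), dtype=torch.uint8)
digit_labels = torch.randint(0, 10, (8,))

# Normalize pixel values to [0, 1] and add a channel axis -> (N, C, H, W).
digits = raw_digits.float().div(255.0).unsqueeze(1)

# Synthetic blanks: all-black images labeled as class 10.
blanks = torch.zeros(4, 1, 28, 28)
blank_labels = torch.full((4,), 10, dtype=torch.long)

# Combine, then shuffle images and labels with the same permutation.
images = torch.cat([digits, blanks])
labels = torch.cat([digit_labels, blank_labels])
perm = torch.randperm(images.shape[0])
images, labels = images[perm], labels[perm]

print(images.shape, float(images.min()), float(images.max()))
```

Applying one shared permutation keeps every image paired with its label after shuffling.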
46
+ ---
47
+
48
+ ## Data Augmentation
49
+
50
+ To improve generalization and robustness to handwriting variation:
51
+
52
+ - `RandomRotation(±10°)`
53
+ - `RandomAffine`: scale (0.9–1.1), translate (±10%)
54
+
55
+ These transformations simulate the noise and variation in handwriting seen in real student submissions.
56
+
57
+ ---
58
+
59
+ ## 🏗️ Model Architecture
60
+
61
+ ```
62
+ Input: (1, 28, 28)
63
+ ↓ Conv2D(1 → 32) + BatchNorm + ReLU
64
+ ↓ Conv2D(32 → 64) + BatchNorm + ReLU
65
+ ↓ MaxPool2d(2x2) + Dropout(0.2)
66
+ ↓ Conv2D(64 → 128) + BatchNorm + ReLU
67
+ ↓ MaxPool2d(2x2) + Dropout(0.2)
68
+ ↓ Flatten
69
+ ↓ Linear(128*7*7 → 128) + BatchNorm + ReLU + Dropout(0.1)
70
+ ↓ Linear(128 → 11)
71
+ → Output: class logits (digits 0–9, blank = 10)
72
+ ```
73
+
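The diagram above could correspond to a PyTorch module along these lines. This is a sketch under assumptions: the class name `DigitBlankCNN` is hypothetical, and the 3×3 kernels with `padding=1` are a guess chosen so that the two pooling steps reduce 28×28 to 7×7, which matches the `128*7*7` flatten size.

```python
import torch
from torch import nn


class DigitBlankCNN(nn.Module):  # hypothetical name, not from the repository
    def __init__(self, num_classes: int = 11):
        super().__init__()
        self.features = nn.Sequential(
            # kernel_size=3 / padding=1 are assumptions (keep 28x28 spatial size)
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 28x28 -> 14x14
            nn.Dropout(0.2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2),   # 14x14 -> 7x7
            nn.Dropout(0.2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 7 * 7, 128),
            nn.BatchNorm1d(128),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(128, num_classes),  # logits for digits 0-9 and blank (10)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


model = DigitBlankCNN().eval()
with torch.no_grad():
    logits = model(torch.zeros(2, 1, 28, 28))
print(logits.shape)
```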
74
+ ---
75
+
76
+ ## Training Configuration
77
+
78
+ | Hyperparameter | Value |
79
+ | -------------- | ------------------- |
80
+ | Optimizer | Adam (lr=0.001) |
81
+ | Loss Function | CrossEntropyLoss |
82
+ | Scheduler | ReduceLROnPlateau |
83
+ | Early Stopping | Patience = 5 |
84
+ | Epochs | Max 50 |
85
+ | Batch Size | 64 |
86
+ | Device | CPU or CUDA |
87
+ | Random Seed | 42 |
88
+
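To illustrate how "Patience = 5" and "Max 50" epochs interact, here is a small pure-Python sketch of a patience rule. Monitoring the validation loss and the helper name `epochs_until_stop` are assumptions, and the loss values are made up for the example.

```python
def epochs_until_stop(val_losses, patience=5, max_epochs=50):
    """Return the epoch (1-based) at which training would end.

    Stops once the loss has failed to improve for `patience` consecutive
    epochs, or when `max_epochs` (or the end of the list) is reached.
    """
    best = float("inf")
    epochs_without_improvement = 0
    last_epoch = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        last_epoch = epoch
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return last_epoch


# Made-up validation losses: the best value (0.25) occurs at epoch 3.
val_losses = [0.50, 0.30, 0.25, 0.26, 0.27, 0.25, 0.28, 0.26, 0.30, 0.40]
print(epochs_until_stop(val_losses))  # prints 8 (five epochs without improvement after epoch 3)
```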
89
+ ---
90
+
91
+ ## Evaluation Results
92
+
93
+ | Metric | Value |
94
+ | -------------------- | --------- |
95
+ | Test Accuracy | 98.25% |
96
+ | Blank Image Accuracy | 100.00% |
97
+ | TorchScript Export   | ✅ Yes    |
98
+
99
+ All 5,000 blank images were correctly classified.
100
+
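For reference, overall accuracy and blank-only accuracy can be computed from predictions and labels as sketched below. The helper `accuracy` is hypothetical and the five toy samples are illustrative only; they are not the evaluation data behind the table.

```python
def accuracy(preds, labels, only_class=None):
    """Fraction of correct predictions, optionally restricted to one true class."""
    pairs = [
        (p, y) for p, y in zip(preds, labels)
        if only_class is None or y == only_class
    ]
    if not pairs:
        raise ValueError("no samples to evaluate")
    return sum(p == y for p, y in pairs) / len(pairs)


# Toy data: class 10 marks a blank image.
labels = [3, 10, 7, 10, 1]
preds = [3, 10, 1, 10, 1]

print(accuracy(preds, labels))                 # 0.8 (4 of 5 correct)
print(accuracy(preds, labels, only_class=10))  # 1.0 (both blanks correct)
```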
101
+ ---
102
+
103
+ ## Inference Example (Python)
104
+
105
+ ```python
106
+ import torch
107
+
108
+ # Load TorchScript model
109
+ model = torch.jit.load("e_mnist_digit_blank_cnn_ts_v1.pt")
110
+ model.eval()
111
+
112
+ # Dummy input (1 image, 1 channel, 28x28) with values in [0, 1], matching the training preprocessing
113
+ img = torch.rand(1, 1, 28, 28)
114
+
115
+ # Predict
116
+ with torch.no_grad():
117
+     out = model(img)
118
+     predicted = out.argmax(dim=1).item()
119
+
120
+ print("Predicted class:", predicted)
121
+ ```
122
+
123
+ > 🔎 If the prediction is `10`, the model considers the image to be blank (no digit present).
124
+
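When several boxes are classified in sequence (for example, the digits of a student ID), the per-box class indices can be turned into a string, with blanks marked explicitly. The helper `decode_boxes` and the `_` marker are hypothetical and only illustrate how class `10` might be handled downstream.

```python
BLANK_CLASS = 10  # class index the model uses for blank images


def decode_boxes(predictions, blank_marker="_"):
    """Map per-box class indices (0-9 digits, 10 blank) to a string."""
    chars = []
    for p in predictions:
        if p == BLANK_CLASS:
            chars.append(blank_marker)
        elif 0 <= p <= 9:
            chars.append(str(p))
        else:
            raise ValueError(f"unexpected class index: {p}")
    return "".join(chars)


print(decode_boxes([1, 2, 10, 4]))  # prints 12_4
```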
125
+ ---
126
+
127
+ ## Included Files
128
+
129
+ - `train_digit_classifier.py`: Training script with full documentation
130
+ - `e_mnist_digit_blank_cnn_v6.pth`: Final trained model weights
131
+ - `e_mnist_digit_blank_cnn_ts_v1.pt`: TorchScript export for deployment
132
+ - `requirements.txt`: Required dependencies for training or inference
133
+
134
+ ---
135
+
136
+ ## Intended Use
137
+
138
+ This model was designed to support the Plom Project's student ID digit detection system, helping automatically identify handwritten digits (and detect blank/unfilled boxes) from scanned exam sheets.
139
+
140
+ It may also be adapted for other handwritten digit classification tasks or real-time blank field detection applications.
141
+
142
+ <!-- ---
143
+
144
+ ## Maintainer & Contact
145
+
146
+ - **Deep Shah**: [Hugging Face Profile](https://huggingface.co/deepshah23)
147
+ - For Plom inquiries: [The Plom Project GitLab](https://gitlab.com/plom/plom) -->