+# Digit & Blank Image Classifier (PyTorch CNN)
+A convolutional neural network that classifies handwritten digits from the **MNIST** and **EMNIST Digits** datasets and additionally detects **blank images** (unfilled boxes) as a distinct class, reaching 98.25% test accuracy. The model is trained with PyTorch and exported in TorchScript format (`.pt`) for portable inference.
+---
+## License & Attribution
+This model is licensed under the **AGPL-3.0** license to comply with the [Plom Project](https://gitlab.com/plom/plom) licensing requirements.
+### Developed as part of the Plom Project
+**Authors & Credits**:
+- Model: **Deep Shah**, Undergraduate Research Assistant, UBC
+- Supervision: **Prof. Andrew Rechnitzer** and **Prof. Colin B. MacDonald**
+- Project: [The Plom Project GitLab](https://gitlab.com/plom/plom)
+---
+## Overview
+- **Input**: 1×28×28 grayscale image
+- **Output**: Integer class prediction:
+  - 0–9: Digits
+  - 10: Blank image
+- **Architecture**: CNN with three convolutional layers (each followed by BatchNorm and ReLU), max pooling, dropout, and two fully connected layers
+- **Model Format**: TorchScript (`.pt`)
+- **Training Dataset**: Combined MNIST, EMNIST Digits, and 5,000 synthetic blank images
+---
+## Dataset Details
+### Datasets Used:
+- **MNIST** – 28×28 handwritten digits (0–9), 60,000 training images
+- **EMNIST Digits** – 28×28 digits extracted from handwritten characters, 240,000+ training samples
+- **Blank Images** – 5,000 synthetic all-black 28×28 images, labeled as class `10` to simulate unfilled regions
+### Preprocessing:
+- Normalized pixel values to [0, 1]
+- Converted images to channel-first format (N, C, H, W)
+- Combined and shuffled datasets
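The preprocessing steps above can be sketched with NumPy as follows. This is a minimal illustration, not the code from `train_digit_classifier.py`: the helper names `preprocess` and `combine_and_shuffle` are hypothetical, the raw images are assumed to be `uint8` arrays in the 0–255 range, and reusing seed 42 for the shuffle is an assumption. Random arrays stand in for the real MNIST and EMNIST data.

```python
import numpy as np


def preprocess(images: np.ndarray) -> np.ndarray:
    """Scale uint8 images of shape (N, 28, 28) to float32 in [0, 1] with shape (N, 1, 28, 28)."""
    scaled = images.astype(np.float32) / 255.0
    return scaled[:, np.newaxis, :, :]  # add the channel axis: (N, C, H, W)


def combine_and_shuffle(parts, seed=42):
    """Concatenate (images, labels) pairs and shuffle both with one shared permutation."""
    images = np.concatenate([x for x, _ in parts])
    labels = np.concatenate([y for _, y in parts])
    order = np.random.default_rng(seed).permutation(len(images))
    return images[order], labels[order]


# Stand-in data: random "digit" images plus all-black blanks labeled 10.
rng = np.random.default_rng(0)
digits = rng.integers(0, 256, size=(8, 28, 28), dtype=np.uint8)
digit_labels = rng.integers(0, 10, size=8)
blanks = np.zeros((4, 28, 28), dtype=np.uint8)
blank_labels = np.full(4, 10)

x, y = combine_and_shuffle(
    [(preprocess(digits), digit_labels), (preprocess(blanks), blank_labels)]
)
print(x.shape, x.dtype)
```

Shuffling images and labels with the same permutation keeps each blank image paired with its class `10` label.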
+---
+## Data Augmentation
+To improve generalization and robustness to handwriting variation:
+- `RandomRotation(±10°)`
+- `RandomAffine`: scale (0.9–1.1), translate (±10%)
+These transformations simulate the noise and handwriting variation found in real student submissions.
+---
+## 🏗️ Model Architecture
+```
+Input: (1, 28, 28)
+↓ Conv2d(1 → 32) + BatchNorm + ReLU
+↓ Conv2d(32 → 64) + BatchNorm + ReLU
+↓ MaxPool2d(2x2) + Dropout(0.2)
+↓ Conv2d(64 → 128) + BatchNorm + ReLU
+↓ MaxPool2d(2x2) + Dropout(0.2)
+↓ Flatten
+↓ Linear(128*7*7 → 128) + BatchNorm + ReLU + Dropout(0.1)
+↓ Linear(128 → 11)
+→ Output: class logits (digits 0–9, blank = 10)
+```
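The diagram above maps fairly directly onto a PyTorch module. The sketch below is an approximation rather than the released model: the class name `DigitBlankCNN` is hypothetical, and the 3×3 kernels with padding 1 are an assumption chosen so that the two pooling steps reduce 28×28 to 7×7, matching the `128*7*7` input of the first linear layer. Plain `nn.Dropout` is also assumed.

```python
import torch
from torch import nn


class DigitBlankCNN(nn.Module):
    """Hypothetical reconstruction of the architecture diagram (11 output classes)."""

    def __init__(self, num_classes: int = 11):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.MaxPool2d(2), nn.Dropout(0.2),    # 28x28 -> 14x14
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.BatchNorm2d(128), nn.ReLU(),
            nn.MaxPool2d(2), nn.Dropout(0.2),    # 14x14 -> 7x7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 7 * 7, 128), nn.BatchNorm1d(128), nn.ReLU(), nn.Dropout(0.1),
            nn.Linear(128, num_classes),         # logits for digits 0-9 and blank (10)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


model = DigitBlankCNN().eval()   # eval mode: dropout off, BatchNorm uses running stats
with torch.no_grad():
    logits = model(torch.rand(2, 1, 28, 28))
print(logits.shape)
```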
+---
+## Training Configuration
+| Hyperparameter | Value               |
+| -------------- | ------------------- |
+| Optimizer      | Adam (lr=0.001)     |
+| Loss Function  | CrossEntropyLoss    |
+| Scheduler      | ReduceLROnPlateau   |
+| Early Stopping | Patience = 5        |
+| Epochs         | Max 50              |
+| Batch Size     | 64                  |
+| Device         | CPU or CUDA         |
+| Random Seed    | 42                  |
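The settings in the table could be wired together roughly as follows. This is a sketch, not the project's training script: the function name `fit` is hypothetical, and monitoring validation loss for both `ReduceLROnPlateau` and early stopping is an assumption, since the table does not say which quantity is tracked. A tiny linear model and random tensors stand in for the real network and data so the example runs quickly.

```python
import math

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def fit(model, train_loader, val_loader, max_epochs=50, patience=5, device="cpu"):
    """Train with Adam and stop early when validation loss stops improving."""
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
    best_val, stale_epochs = float("inf"), 0

    for _ in range(max_epochs):
        model.train()
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            criterion(model(x), y).backward()
            optimizer.step()

        model.eval()
        total, count = 0.0, 0
        with torch.no_grad():
            for x, y in val_loader:
                x, y = x.to(device), y.to(device)
                total += criterion(model(x), y).item() * len(y)
                count += len(y)
        val_loss = total / count
        scheduler.step(val_loss)          # lower the learning rate on a plateau

        if val_loss < best_val:
            best_val, stale_epochs = val_loss, 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:  # early stopping
                break
    return best_val


torch.manual_seed(42)  # seed from the table, set before creating the model and data
data = TensorDataset(torch.rand(32, 1, 28, 28), torch.randint(0, 11, (32,)))
loader = DataLoader(data, batch_size=64)
tiny_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 11))  # stand-in model
best = fit(tiny_model, loader, loader, max_epochs=2)
print(math.isfinite(best))
```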
+---
+## Evaluation Results
+| Metric               | Value     |
+| -------------------- | --------- |
+| Test Accuracy        | 98.25%    |
+| Blank Image Accuracy | 100.00%   |
+| TorchScript Export   | ✅ Yes    |
+All 5,000 blank images were correctly classified.
+---
+## Inference Example (Python)
+```python
+import torch
+# Load TorchScript model
+model = torch.jit.load("e_mnist_digit_blank_cnn_ts_v1.pt")
+model.eval()
+# Dummy input (1 image, 1 channel, 28x28) with values in [0, 1], matching the training preprocessing
+img = torch.rand(1, 1, 28, 28)
+# Predict
+with torch.no_grad():
+    out = model(img)
+    predicted = out.argmax(dim=1).item()
+print("Predicted class:", predicted)
+```
+> 🔎 If the prediction is `10`, the model considers the image to be blank (no digits present).
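When several boxes are classified at once, it can help to turn the raw logits into readable labels. The helper below is a small illustrative sketch: `describe_prediction` and `BLANK_CLASS` are hypothetical names, and the logits are built by hand instead of coming from the model so the example runs without the `.pt` file.

```python
import torch

BLANK_CLASS = 10  # class index used for blank images


def describe_prediction(logits: torch.Tensor) -> list:
    """Map a batch of logits with shape (N, 11) to 'blank' or the digit as a string."""
    classes = logits.argmax(dim=1).tolist()
    return ["blank" if c == BLANK_CLASS else str(c) for c in classes]


# Hand-built logits: the first row favors class 10, the second row favors class 3.
logits = torch.zeros(2, 11)
logits[0, 10] = 5.0
logits[1, 3] = 5.0
labels = describe_prediction(logits)
print(labels)  # ['blank', '3']
```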
+---
+## Included Files
+- `train_digit_classifier.py`: Training script with full documentation
+- `e_mnist_digit_blank_cnn_v6.pth`: Final trained model weights
+- `e_mnist_digit_blank_cnn_ts_v1.pt`: TorchScript export for deployment
+- `requirements.txt`: Required dependencies for training or inference
+---
+## Intended Use
+This model was designed to support the Plom Project’s student ID digit detection system, helping automatically identify handwritten digits (and detect blank/unfilled boxes) from scanned exam sheets.
+It may also be adapted for other handwritten digit classification tasks or real-time blank field detection applications.
+<!-- ---
+## Maintainer & Contact
+- **Deep Shah** — [Hugging Face Profile](https://huggingface.co/deepshah23)
+- For Plom inquiries: [The Plom Project GitLab](https://gitlab.com/plom/plom) -->