# Model Card – Handwritten Digit Classifier (CNN)

A Convolutional Neural Network (CNN) trained on the MNIST dataset to classify handwritten digits (0–9) with high accuracy. Designed for real-time inference in a web-based drawing interface.
## Model Details

### Model Description

This model is a CNN trained from scratch on the MNIST benchmark dataset. It accepts 28×28 grayscale images of handwritten digits and outputs a probability distribution over 10 classes (digits 0–9). It is the backbone of the Digit Classifier web app.

- Developed by: Abdul Rafay
- Model type: Convolutional Neural Network (CNN)
- Language(s): N/A (computer vision — image input only)
- License: MIT
- Framework: PyTorch 2.0+
- Finetuned from: Trained from scratch (no pretrained base)
### Model Sources

- Demo: Hugging Face Space
## Uses

### Direct Use

This model can be used directly to classify 28×28 grayscale images of handwritten digits — no fine-tuning required. It is best suited for:

- Educational demos of deep learning and CNNs
- Handwritten digit recognition in controlled environments
- Integration into apps via the provided web UI or API
### Downstream Use

The model can be fine-tuned or adapted for:

- Multi-digit number recognition (e.g., street numbers, forms)
- Similar single-character classification tasks
- A transfer learning baseline for other image classification problems
### Out-of-Scope Use

This model is not suitable for:

- Recognizing letters, symbols, or non-digit characters
- Noisy, real-world document scans without preprocessing
- Multi-digit or multi-character sequences in a single image
- Safety-critical systems (e.g., medical or legal document processing)
## Bias, Risks, and Limitations

- Dataset bias: MNIST digits are clean, centered, and size-normalized. The model may underperform on digits written in non-Western styles, with extreme stroke widths, or at unusual orientations.
- Domain shift: Performance degrades on images that differ significantly from the MNIST distribution (e.g., photos of digits on paper, printed fonts).
- No uncertainty calibration: The model outputs softmax probabilities, which can appear confident even on out-of-distribution inputs.
### Recommendations

- Preprocess input images to 28×28 grayscale and center/normalize digits before inference.
- Do not rely on model confidence scores alone — add a rejection threshold for production use.
- Evaluate on your specific data distribution before deploying in any real-world scenario.
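A rejection threshold can be added as a thin wrapper around inference. The sketch below is illustrative — the function name and the 0.9 default are assumptions, not part of the released code; tune the threshold on your own data:

```python
import torch
import torch.nn.functional as F

def predict_with_rejection(model, tensor, threshold=0.9):
    """Classify a [1, 1, 28, 28] tensor, rejecting low-confidence inputs.

    Returns (digit, confidence); digit is None when the top softmax
    probability falls below `threshold` (an illustrative value).
    """
    with torch.no_grad():
        probs = F.softmax(model(tensor), dim=1)
    confidence, digit = probs.max(dim=1)
    if confidence.item() < threshold:
        return None, confidence.item()  # reject: e.g., ask the user to redraw
    return digit.item(), confidence.item()
```

In a web app, a rejected input can trigger a "please redraw" prompt instead of displaying a possibly wrong digit.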
## How to Get Started with the Model

```python
import torch
from torchvision import transforms
from PIL import Image

from model import Model  # your model definition

# Load model weights
model = Model()
model.load_state_dict(torch.load("model.pt", map_location="cpu"))
model.eval()

# Preprocess image: grayscale, resize, and normalize with MNIST statistics
transform = transforms.Compose([
    transforms.Grayscale(),
    transforms.Resize((28, 28)),
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])
img = Image.open("digit.png")
tensor = transform(img).unsqueeze(0)  # shape: [1, 1, 28, 28]

# Predict
with torch.no_grad():
    output = model(tensor)
prediction = output.argmax(dim=1).item()
print(f"Predicted digit: {prediction}")
```
## Training Details

### Training Data

- Dataset: MNIST — 70,000 grayscale images (60,000 train / 10,000 test)
- Input size: 28×28 pixels, single channel
- Classes: 10 (digits 0–9)
### Training Procedure

#### Preprocessing

- Images converted to tensors and normalized using the MNIST dataset mean (0.1307) and std (0.3081)
- Training augmentation: random rotation (±10°), random affine with translation (±10%), scale (0.9–1.1×), and shear (±5°)
- Test images: normalization only — no augmentation
#### Training Hyperparameters
| Parameter | Value |
|---|---|
| Optimizer | AdamW |
| Learning Rate | 3e-3 (max, OneCycleLR) |
| Weight Decay | 1e-4 |
| Batch Size | 64 |
| Epochs | 50 |
| Loss Function | CrossEntropyLoss |
| Label Smoothing | 0.1 |
| LR Scheduler | OneCycleLR (10% warmup, cosine anneal) |
| Dropout (conv) | 0.25 (Dropout2d) |
| Dropout (FC) | 0.25 |
| Random Seed | 23 |
| Training regime | fp32 |
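The optimizer, scheduler, and loss from the table can be wired together as below. This is a sketch: the stand-in model and the steps-per-epoch value (derived from 60,000 images at batch size 64) are assumptions for illustration:

```python
import torch
from torch import nn, optim

# Stand-in model for illustration; substitute the real CNN.
model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))

epochs = 50
steps_per_epoch = 60_000 // 64 + 1           # 938 batches per epoch

optimizer = optim.AdamW(model.parameters(), lr=3e-3, weight_decay=1e-4)
scheduler = optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=3e-3,
    epochs=epochs,
    steps_per_epoch=steps_per_epoch,
    pct_start=0.1,                           # 10% warmup
    anneal_strategy="cos",                   # cosine anneal
)
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

# Inside the training loop, the scheduler steps once per batch:
#   loss = criterion(model(x), y)
#   loss.backward()
#   optimizer.step()
#   scheduler.step()
#   optimizer.zero_grad()
```

Note that OneCycleLR starts well below `max_lr` and must be stepped per batch, not per epoch.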
#### Speeds, Sizes, Times

- Training time: ~10 minutes on a single GPU (NVIDIA T4, Google Colab)
- Model parameters: 160,842
- Inference speed: <50 ms per image (CPU)
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

Evaluated on the standard MNIST test split — 10,000 images not seen during training.

#### Factors

Evaluation was performed across all 10 digit classes. No disaggregation by subpopulation was conducted (MNIST does not include demographic metadata).

#### Metrics

- Accuracy — primary metric; the proportion of correctly classified digits
- Confusion matrix — to identify per-class error patterns
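A confusion matrix and per-class accuracies can be computed in a few lines of PyTorch; the helper names here are illustrative, not from the released code:

```python
import torch

def confusion_matrix(preds, labels, num_classes=10):
    """Count predictions into a matrix: rows = true digit, cols = predicted."""
    cm = torch.zeros(num_classes, num_classes, dtype=torch.long)
    for t, p in zip(labels, preds):
        cm[t, p] += 1
    return cm

def per_class_accuracy(cm):
    """Diagonal (correct) divided by row totals (all examples of that class)."""
    correct = cm.diag()
    total = cm.sum(dim=1)
    return correct.float() / total.clamp(min=1).float()
```

Off-diagonal entries directly reveal which digit pairs get confused (e.g., a large `cm[9, 4]` would mean 9s misread as 4s).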
### Results
| Metric | Value |
|---|---|
| Test Accuracy | 99.43% |
#### Per-Class Accuracy
| Digit | Correct | Errors | Accuracy |
|---|---|---|---|
| 0 | 980 | 0 | 100.0% |
| 1 | 1132 | 3 | 99.7% |
| 2 | 1025 | 7 | 99.3% |
| 3 | 1008 | 2 | 99.8% |
| 4 | 976 | 6 | 99.4% |
| 5 | 885 | 7 | 99.2% |
| 6 | 949 | 9 | 99.1% |
| 7 | 1020 | 8 | 99.2% |
| 8 | 968 | 6 | 99.4% |
| 9 | 1000 | 9 | 99.1% |
#### Summary

The model achieves 99.43% accuracy on the MNIST test set (57 errors out of 10,000). Digit 0 is classified perfectly. The most challenging classes are 6 and 9 (9 errors each), consistent with their visual similarity.
## Model Examination

The model's convolutional filters learn edge detectors and stroke patterns in early layers, which compose into digit-specific features in deeper layers. Standard CNN interpretability techniques (e.g., Grad-CAM) can be applied to visualize which regions most influence predictions.
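As an illustration, Grad-CAM can be implemented with forward/backward hooks on any convolutional layer. This is a generic sketch, not code shipped with the model:

```python
import torch
from torch import nn

def grad_cam(model, conv_layer, x, target_class=None):
    """Minimal Grad-CAM: heatmap over `conv_layer`'s spatial grid showing
    which regions drive the (predicted or given) class score."""
    acts, grads = {}, {}
    h1 = conv_layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
    h2 = conv_layer.register_full_backward_hook(
        lambda m, gi, go: grads.update(g=go[0]))
    try:
        logits = model(x)
        cls = logits.argmax(dim=1) if target_class is None else target_class
        logits[range(len(x)), cls].sum().backward()
    finally:
        h1.remove()
        h2.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # per-channel importance
    cam = torch.relu((weights * acts["a"]).sum(dim=1))   # (B, H, W) heatmap
    return cam / cam.amax(dim=(1, 2), keepdim=True).clamp(min=1e-8)
```

Upsampled to 28×28 and overlaid on the input, the heatmap highlights the strokes most responsible for a prediction.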
## Environmental Impact

Carbon emissions were estimated using the ML Impact Calculator.
| Factor | Value |
|---|---|
| Hardware Type | NVIDIA T4 GPU |
| Hours Used | ~0.2 hrs (10 min) |
| Cloud Provider | Google Colab |
| Compute Region | Singapore |
| Carbon Emitted | ~0.01 kg CO₂eq (est.) |
## Technical Specifications

### Model Architecture

The model uses 4 convolutional blocks followed by a compact fully connected head.
#### Convolutional Blocks

| Block | Layer | Output Shape | Details |
|---|---|---|---|
| Block 1 | Conv2d | (32, 28, 28) | 32 filters, 3×3, padding=1 |
| | BatchNorm2d | (32, 28, 28) | — |
| | ReLU | (32, 28, 28) | — |
| | MaxPool2d | (32, 14, 14) | 2×2 |
| | Dropout2d | (32, 14, 14) | p=0.25 |
| Block 2 | Conv2d | (64, 14, 14) | 64 filters, 3×3, padding=1 |
| | BatchNorm2d | (64, 14, 14) | — |
| | ReLU | (64, 14, 14) | — |
| | MaxPool2d | (64, 7, 7) | 2×2 |
| | Dropout2d | (64, 7, 7) | p=0.25 |
| Block 3 | Conv2d | (128, 7, 7) | 128 filters, 3×3, padding=1 |
| | BatchNorm2d | (128, 7, 7) | — |
| | ReLU | (128, 7, 7) | — |
| | MaxPool2d | (128, 3, 3) | 2×2 |
| | Dropout2d | (128, 3, 3) | p=0.25 |
| Block 4 | Conv2d | (256, 3, 3) | 256 filters, 1×1 kernel (no padding) |
| | BatchNorm2d | (256, 3, 3) | — |
| | ReLU | (256, 3, 3) | — |
| | MaxPool2d | (256, 1, 1) | 2×2 |
| | Dropout2d | (256, 1, 1) | p=0.25 |
#### Fully Connected Layers

| Layer | Output | Details |
|---|---|---|
| Flatten | 256 | 256 × 1 × 1 = 256 |
| Linear | 128 | + ReLU + Dropout(0.25) |
| Linear | 10 | Raw logits |

Total parameters: 160,842
#### Shape Flow

```
Input:   (B, 1, 28, 28)
Block 1: (B, 32, 14, 14)
Block 2: (B, 64, 7, 7)
Block 3: (B, 128, 3, 3)
Block 4: (B, 256, 1, 1)
Flatten: (B, 256)
FC1:     (B, 128)
Output:  (B, 10)
```
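The tables and shape flow above fully determine the architecture. The reconstruction below is illustrative — the released `model.py` may group and name layers differently — but it reproduces the stated parameter count of 160,842:

```python
import torch
from torch import nn

class DigitCNN(nn.Module):
    """Reconstruction of the architecture described above (illustrative)."""

    def __init__(self, p=0.25):
        super().__init__()

        def block(c_in, c_out, kernel, pad):
            # Conv -> BatchNorm -> ReLU -> MaxPool -> Dropout2d
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel, padding=pad),
                nn.BatchNorm2d(c_out),
                nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Dropout2d(p),
            )

        self.features = nn.Sequential(
            block(1, 32, 3, 1),     # -> (B, 32, 14, 14)
            block(32, 64, 3, 1),    # -> (B, 64, 7, 7)
            block(64, 128, 3, 1),   # -> (B, 128, 3, 3)
            block(128, 256, 1, 0),  # -> (B, 256, 1, 1)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),           # -> (B, 256)
            nn.Linear(256, 128),
            nn.ReLU(),
            nn.Dropout(p),
            nn.Linear(128, 10),     # raw logits
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```

Summing `p.numel()` over `DigitCNN().parameters()` yields 160,842, matching the table.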
### Compute Infrastructure

- Hardware: NVIDIA T4 GPU (Google Colab)
- Software: Python 3.10+, PyTorch 2.0+, torchvision
## Citation

If you use this model in your work, please cite:

BibTeX:

```bibtex
@misc{digit-classifier-2026,
  author    = {Abdul Rafay},
  title     = {Handwritten Digit Classifier (CNN on MNIST)},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/abdurafay19/Digit-Classifier}
}
```

APA:

Abdul Rafay. (2026). Handwritten Digit Classifier (CNN on MNIST). Hugging Face. https://huggingface.co/abdurafay19/Digit-Classifier
## Glossary

| Term | Definition |
|---|---|
| CNN | Convolutional Neural Network — a deep learning architecture suited to image data |
| MNIST | A benchmark dataset of 70,000 handwritten digit images |
| Softmax | Activation function that converts raw outputs to probabilities summing to 1 |
| Dropout | Regularization technique that randomly disables neurons during training |
| BatchNorm | Batch Normalization — normalizes layer activations to stabilize and speed up training |
| OneCycleLR | Learning rate schedule with warmup and cosine decay for faster convergence |
| Label Smoothing | Softens hard targets to reduce overconfidence and improve generalization |
| Grad-CAM | Gradient-weighted Class Activation Mapping — a model interpretability technique |
## Model Card Authors

Abdul Rafay — abdulrafay17wolf@gmail.com

## Model Card Contact

For questions or issues, open a GitHub issue at github.com/abdurafay19/Digit-Classifier or reach out via Hugging Face.