MNIST CNN Classifier

This is a compact convolutional MNIST digit classifier trained as part of the dlab deep-learning experimentation roadmap.

Each run selected its checkpoint by best validation loss and was then evaluated once on the held-out MNIST test set; this was repeated across five random seeds. The uploaded checkpoint comes from the seed-1 run, which achieved the lowest validation loss of the five.

Results

5-seed held-out test evaluation:

metric                  result
test accuracy           99.6140% ± 0.0802 pp
test loss               0.14677 ± 0.00272
best validation loss    0.14282 ± 0.00138

Per-seed held-out test results:

seed   test accuracy   test loss   best validation loss
1      99.6100%        0.14772     0.14124
2      99.6800%        0.14437     0.14317
3      99.4800%        0.15107     0.14403
4      99.6300%        0.14573     0.14150
5      99.6700%        0.14494     0.14418
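The summary figures above can be reproduced from the per-seed rows. The ± values appear to be sample standard deviations (this is an assumption; the accuracy column matches the reported 0.0802 pp exactly, while the loss columns differ in the last digit, likely because the summary was computed from unrounded values):

```python
from statistics import mean, stdev

# Per-seed held-out test results from the table above.
test_accuracy = [99.61, 99.68, 99.48, 99.63, 99.67]   # percent
test_loss = [0.14772, 0.14437, 0.15107, 0.14573, 0.14494]

acc_mean, acc_std = mean(test_accuracy), stdev(test_accuracy)
loss_mean, loss_std = mean(test_loss), stdev(test_loss)

print(f"test accuracy: {acc_mean:.4f}% ± {acc_std:.4f} pp")  # 99.6140% ± 0.0802 pp
print(f"test loss:     {loss_mean:.5f} ± {loss_std:.5f}")
```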

Architecture and training configuration

  • Dataset: MNIST
  • Model: CNN
  • Channels: 32, 64, 128
  • Dropout: 0.1
  • Optimizer: Adam
  • Learning rate: 0.001
  • Scheduler: OneCycleLR
  • Weight decay: 0.0001
  • Label smoothing: 0.02
  • Weight averaging: EMA
  • Batch size: 512
  • Train augmentation: random affine (rotation ±10°, translation up to 5%, scale 0.9-1.1)
  • Early stopping: validation loss, patience 8, min_delta 0.0005
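The listed hyperparameters can be sketched as a training setup. The exact layer arrangement (blocks per stage, pooling, classifier head) is not specified in this card, so the `MnistCnn` module below is a hypothetical layout that only pins down the listed values: channels 32/64/128, dropout 0.1, Adam at lr 0.001 with weight decay 0.0001, OneCycleLR, and label smoothing 0.02:

```python
import torch
import torch.nn as nn

class MnistCnn(nn.Module):
    """Hypothetical layout matching the listed channels (32, 64, 128) and dropout (0.1)."""
    def __init__(self, dropout: float = 0.1):
        super().__init__()
        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, 3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Dropout(dropout),
            )
        self.features = nn.Sequential(block(1, 32), block(32, 64), block(64, 128))
        self.head = nn.Linear(128 * 3 * 3, 10)  # 28 -> 14 -> 7 -> 3 after three 2x2 pools

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = MnistCnn()
criterion = nn.CrossEntropyLoss(label_smoothing=0.02)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
# OneCycleLR needs the total step count; with batch size 512, ~60000/512 ≈ 118
# steps per epoch. The epoch count here is illustrative, not from the card.
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer, max_lr=1e-3, steps_per_epoch=118, epochs=30
)
# EMA weight averaging and early stopping (listed above) are omitted from this sketch.

logits = model(torch.randn(2, 1, 28, 28))
loss = criterion(logits, torch.randint(0, 10, (2,)))
```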

Files

  • model.ckpt: PyTorch Lightning checkpoint from the validation-selected seed-1 run.
  • model.onnx: ONNX export of the checkpoint for inference.
  • config.yaml: resolved Hydra training config.
  • metrics.csv: training metrics from the uploaded checkpoint run.
  • metrics_summary.csv: compact 5-seed final evaluation summary.
  • metadata.json: compact metadata for inference and provenance.

Preprocessing

Inputs should be MNIST grayscale images converted to tensors and normalized with:

mean = [0.1307]
std = [0.3081]

The ONNX model expects float tensors shaped [batch, 1, 28, 28] under the input name images, and returns class logits under the output name logits.
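A minimal inference sketch under these conventions. The helper names are illustrative, and the `onnxruntime` lines are commented out so the snippet stands alone without the model file:

```python
import numpy as np

MEAN, STD = 0.1307, 0.3081

def preprocess(img_uint8: np.ndarray) -> np.ndarray:
    """Turn a 28x28 uint8 grayscale image into the [1, 1, 28, 28] float32 input."""
    x = img_uint8.astype(np.float32) / 255.0
    x = (x - MEAN) / STD
    return x[None, None, :, :]

# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx")
# logits = session.run(["logits"], {"images": preprocess(img)})[0]
# predicted_digit = int(logits.argmax(axis=1)[0])

batch = preprocess(np.zeros((28, 28), dtype=np.uint8))
```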

Provenance


Model tsilva/mnist-cnn-classifier, trained on the MNIST dataset.