MNIST MLP Classifier

This is a fully connected MNIST digit classifier trained as part of the dlab deep-learning experimentation roadmap.

The model is intentionally an MLP rather than a CNN. It is useful as a strong baseline for studying optimization, regularization, augmentation, seed variance, and the ceiling of non-convolutional models on MNIST.

Results

10-seed confirmation sweep:

Metric               Result
-------------------  ----------------------
Validation accuracy  99.3600% ± 0.0817 pp
Validation loss      0.15172 ± 0.00235
Test accuracy        99.4470% ± 0.0195 pp
Test loss            0.14746 ± 0.00034
Test errors          55.3 ± 1.95 / 10000
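The "± pp" figures above are the mean and sample standard deviation of each metric across the 10 seeds, reported in percentage points. A minimal sketch of that aggregation, using hypothetical per-seed accuracies (the actual per-seed values are not listed in this card):

```python
import numpy as np

# Hypothetical per-seed validation accuracies (percent), for illustration only.
accs = np.array([99.30, 99.42, 99.35, 99.38, 99.41,
                 99.28, 99.36, 99.40, 99.33, 99.37])

mean = accs.mean()
std = accs.std(ddof=1)  # sample std across seeds, in percentage points
print(f"{mean:.4f}% ± {std:.4f} pp")
```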

Architecture

  • Dataset: MNIST
  • Model: MLP
  • Hidden width: 1024
  • Hidden layers: 3
  • Activation: ReLU
  • Batch normalization: enabled
  • Dropout: 0.2
  • Optimizer: Adam
  • Learning rate: 0.001
  • Scheduler: OneCycleLR
  • Weight decay: 0.0001
  • Label smoothing: 0.02
  • Weight averaging: EMA
  • Train augmentation: random affine (rotation up to 10 degrees, translation up to 0.05, scale 0.9-1.1)
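A minimal PyTorch sketch of the architecture described above: three hidden layers of width 1024 with batch normalization, ReLU, and dropout 0.2. The layer ordering and class name are assumptions; the resolved config.yaml is the authoritative definition.

```python
import torch
import torch.nn as nn

class MNISTMLP(nn.Module):
    """Sketch of the MLP above; layer ordering within each block is assumed."""
    def __init__(self, width=1024, num_layers=3, dropout=0.2):
        super().__init__()
        layers, in_dim = [], 28 * 28
        for _ in range(num_layers):
            layers += [nn.Linear(in_dim, width),
                       nn.BatchNorm1d(width),
                       nn.ReLU(),
                       nn.Dropout(dropout)]
            in_dim = width
        layers.append(nn.Linear(in_dim, 10))  # 10 digit classes
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x.flatten(1))  # flatten [B, 1, 28, 28] -> [B, 784]

model = MNISTMLP().eval()
logits = model(torch.randn(2, 1, 28, 28))
```

Adam (lr 0.001, weight decay 0.0001), OneCycleLR, label smoothing 0.02, and EMA weight averaging are applied at training time and do not change the module definition itself.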

Files

  • model.ckpt: PyTorch Lightning checkpoint from the best run.
  • model.onnx: ONNX export of the EMA/current model state used for evaluation.
  • config.yaml: resolved Hydra training config.
  • metrics.csv: training metrics from the run.
  • metadata.json: compact metadata for inference and provenance.

Preprocessing

Inputs should be MNIST grayscale images converted to tensors and normalized with:

mean = [0.1307]
std = [0.3081]

The ONNX model expects float tensors shaped [batch, 1, 28, 28] under the input name images, and returns class logits under the output name logits.
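Putting the two paragraphs above together, a sketch of the preprocessing and the inference call. The normalization constants come from this card; the commented inference lines assume the onnxruntime package and the model.onnx file listed under Files.

```python
import numpy as np

MEAN, STD = 0.1307, 0.3081  # MNIST normalization constants from this card

def preprocess(img_u8: np.ndarray) -> np.ndarray:
    """Convert a 28x28 uint8 grayscale image into the float tensor the
    ONNX model expects: shape [1, 1, 28, 28], normalized with MEAN/STD."""
    x = img_u8.astype(np.float32) / 255.0
    x = (x - MEAN) / STD
    return x[np.newaxis, np.newaxis, :, :]

# Inference sketch (requires onnxruntime and model.onnx):
# import onnxruntime as ort
# session = ort.InferenceSession("model.onnx")
# logits = session.run(["logits"], {"images": preprocess(img)})[0]
# prediction = int(np.argmax(logits, axis=1)[0])
```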
