tags:
  - pytorch
  - mnist
  - image-classification
  - computer-vision
library_name: pytorch

MNIST Teacher Model

A convolutional neural network trained on the MNIST dataset for handwritten digit classification.

Model Description

This is a ConvNet model trained on MNIST with the following architecture:

  • Conv layer 1: 1 → 32 channels, 3x3 kernel
  • ReLU + MaxPool (2x2)
  • Conv layer 2: 32 → 64 channels, 3x3 kernel
  • ReLU + MaxPool (2x2)
  • Fully connected: 64 × 7 × 7 → 128 → 10 (output)
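The layer list above can be sketched as a PyTorch module. This is a reconstruction from the description, not the training script's actual class: the attribute names (`conv1`, `conv2`, `pool`, `fc1`, `fc2`) are hypothetical, `padding=1` is assumed because it is what makes the flattened size come out to 64 × 7 × 7 on 28x28 inputs, and the ReLU between the two fully connected layers is also an assumption.

```python
import torch
from torch import nn


class ConvNet(nn.Module):
    """Sketch of the architecture listed above; attribute names are hypothetical."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        # padding=1 is an assumption: it keeps 28x28 inputs at 28x28 after each
        # 3x3 convolution, so two 2x2 max pools reduce them to 7x7.
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(torch.relu(self.conv1(x)))  # (N, 32, 14, 14)
        x = self.pool(torch.relu(self.conv2(x)))  # (N, 64, 7, 7)
        x = torch.flatten(x, start_dim=1)         # (N, 3136)
        x = torch.relu(self.fc1(x))               # assumed activation between FC layers
        return self.fc2(x)                        # (N, 10) unnormalized class scores
```

Because `load_state_dict` matches parameters by name, the released weights will only load into a class whose attribute names match those used in the original training script, so treat this sketch as a guide to the shapes rather than a drop-in definition.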

Training Details

Training Hyperparameters

  • Batch size: 128
  • Epochs: 10
  • Learning rate: 0.001
  • Weight decay: 0.0001
  • Optimizer: AdamW
  • Training set size: 50,000
  • Validation set size: 10,000
  • Test set size: 10,000
  • Device: cuda
  • Seed: 42
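For reference, the hyperparameters above might be wired up roughly as follows. This is a minimal sketch under stated assumptions: the model card does not say how the validation set was obtained, so carving 10,000 images out of the standard 60,000-image MNIST training split with `random_split` is a guess, and the helper names `split_train_val` and `make_optimizer` are hypothetical. A small placeholder dataset stands in for the real images.

```python
import torch
from torch import nn
from torch.utils.data import Dataset, TensorDataset, random_split

SEED = 42
BATCH_SIZE = 128
EPOCHS = 10


def split_train_val(full_train: Dataset, seed: int = SEED):
    """Split a 60,000-item training set into 50,000 train / 10,000 validation items."""
    # A seeded generator makes the split reproducible (assumed use of the seed).
    generator = torch.Generator().manual_seed(seed)
    return random_split(full_train, [50_000, 10_000], generator=generator)


def make_optimizer(model: nn.Module) -> torch.optim.AdamW:
    """AdamW with the learning rate and weight decay listed above."""
    return torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)


# Placeholder data standing in for the 60,000 MNIST training images.
placeholder = TensorDataset(torch.arange(60_000))
train_set, val_set = split_train_val(placeholder)

# Any nn.Module works here; a tiny linear layer keeps the example light.
optimizer = make_optimizer(nn.Linear(4, 2))
```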

Results

  • Test Accuracy: 0.9887
  • Test Loss: 0.0402
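On the 10,000-image test set, an accuracy of 0.9887 corresponds to 9,887 correct predictions and 113 errors. Metrics of this kind could be computed with a loop like the one below. It is a sketch, not the original evaluation code: the function name `evaluate` is hypothetical, and mean cross-entropy is assumed as the loss because the model card does not name one.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


@torch.no_grad()
def evaluate(model: nn.Module, loader: DataLoader, device: str = "cpu"):
    """Return (mean loss, accuracy) over every example in loader."""
    model.eval()
    total_loss, correct, count = 0.0, 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        logits = model(images)
        # Sum per-example losses so the mean stays exact when the last batch is smaller.
        total_loss += nn.functional.cross_entropy(logits, labels, reduction="sum").item()
        correct += (logits.argmax(dim=1) == labels).sum().item()
        count += labels.numel()
    return total_loss / count, correct / count
```

Pass a `DataLoader` over the test split and the loaded model; the function returns the loss and accuracy as plain Python floats.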

Usage

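import torch

# Path to a local copy of the weights file from this repository
model_path = "model.pt"
# map_location="cpu" lets weights saved on a CUDA device load on machines without a GPU
state_dict = torch.load(model_path, map_location="cpu")

# Load into your ConvNet architecture
# (you'll need to define the ConvNet class from the training script)
model = ConvNet()
model.load_state_dict(state_dict)
model.eval()

# Make predictions; images is a float tensor of shape (N, 1, 28, 28)
with torch.no_grad():
    predictions = model(images)  # shape (N, 10): one score per digit class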
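The snippet above expects `images` to already be a tensor. One way to build such a batch from raw 28x28 pixel arrays is sketched here. The helper name `to_batch` is hypothetical, and scaling pixels to [0, 1] is an assumption: the model card does not state which normalization was applied during training, so match whatever the training script used.

```python
import numpy as np
import torch


def to_batch(digits: np.ndarray) -> torch.Tensor:
    """Convert uint8 images of shape (N, 28, 28) into a float tensor of shape (N, 1, 28, 28)."""
    if digits.ndim != 3 or digits.shape[1:] != (28, 28):
        raise ValueError(f"expected shape (N, 28, 28), got {digits.shape}")
    # Assumed preprocessing: scale 0..255 pixel values to the range [0, 1].
    batch = torch.from_numpy(digits.astype(np.float32)) / 255.0
    return batch.unsqueeze(1)  # add the single grayscale channel


# Four blank placeholder images, just to show the resulting shape.
images = to_batch(np.zeros((4, 28, 28), dtype=np.uint8))
```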

Dataset

The model was trained on the MNIST dataset, which contains 70,000 grayscale images of handwritten digits (0-9), each 28x28 pixels. Of these, 50,000 were used for training, 10,000 for validation, and 10,000 for testing.

Model Card Authors

Generated automatically during training.