MNIST Teacher Model
A convolutional neural network trained on the MNIST dataset for handwritten digit classification.
Model Description
This is a ConvNet model trained on MNIST with the following architecture:
- Conv layer 1: 1 → 32 channels, 3x3 kernel
- ReLU + MaxPool (2x2)
- Conv layer 2: 32 → 64 channels, 3x3 kernel
- ReLU + MaxPool (2x2)
- Fully connected: 64 × 7 × 7 → 128 → 10 (output)
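The layer list above can be sketched as a PyTorch module. This is a minimal sketch under stated assumptions, not the class from the training script. The attribute names (`conv1`, `conv2`, `fc1`, `fc2`) are hypothetical, so the keys stored in `model.pt` may not match them. `padding=1` is inferred from the 64 × 7 × 7 flatten size: without padding, a 28x28 input would shrink to 5x5 after the two conv and pool stages. The ReLU between the two fully connected layers is also an assumption.

```python
import torch
from torch import nn


class ConvNet(nn.Module):
    """Sketch of the listed architecture; layer names are hypothetical."""

    def __init__(self, num_classes: int = 10) -> None:
        super().__init__()
        # Assumption: padding=1 keeps 28x28 maps at 28x28, so two 2x2
        # poolings give the 7x7 maps implied by the 64 x 7 x 7 input size.
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(torch.relu(self.conv1(x)))  # (N, 32, 14, 14)
        x = self.pool(torch.relu(self.conv2(x)))  # (N, 64, 7, 7)
        x = torch.flatten(x, start_dim=1)         # (N, 3136)
        x = torch.relu(self.fc1(x))               # assumption: ReLU here
        return self.fc2(x)                        # (N, 10) raw logits
```

Under these assumptions the module has 421,642 trainable parameters, most of them (401,536) in the first fully connected layer.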
Training Details
Training Hyperparameters
- Batch size: 128
- Epochs: 10
- Learning rate: 0.001
- Weight decay: 0.0001
- Optimizer: AdamW
- Training set size: 50,000
- Validation set size: 10,000
- Test set size: 10,000
- Device: cuda
- Seed: 42
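The hyperparameters above map onto a PyTorch setup roughly as follows. This is a sketch, not the original training script: the placeholder `nn.Linear` model stands in for the real network, the cross-entropy loss is an assumption (the card does not name the loss), and the step count assumes the last partial batch is kept.

```python
import math

import torch
from torch import nn

torch.manual_seed(42)  # Seed: 42

# The card reports cuda; fall back to CPU when no GPU is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder model so the snippet runs on its own (hypothetical).
model = nn.Linear(28 * 28, 10).to(device)

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-3,            # Learning rate: 0.001
    weight_decay=1e-4,  # Weight decay: 0.0001
)
criterion = nn.CrossEntropyLoss()  # assumption: not stated in the card

batch_size, epochs, train_size = 128, 10, 50_000
steps_per_epoch = math.ceil(train_size / batch_size)  # 391 if partial batch kept
total_steps = steps_per_epoch * epochs                # 3910
```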
Results
- Test Accuracy: 0.9887
- Test Loss: 0.0402
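Taken at face value, an accuracy of 0.9887 on 10,000 test images corresponds to 9,887 correct predictions and 113 errors. The sketch below shows one common way such numbers are computed. It assumes the reported loss is the mean cross-entropy per sample, which the card does not confirm, and the `evaluate` helper is a hypothetical name.

```python
import torch
from torch import nn


@torch.no_grad()
def evaluate(model: nn.Module, loader, device: str = "cpu"):
    """Return (accuracy, mean cross-entropy loss) over all samples in loader."""
    model.eval()
    criterion = nn.CrossEntropyLoss(reduction="sum")  # sum, then divide once
    total_loss, correct, count = 0.0, 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        logits = model(images)
        total_loss += criterion(logits, labels).item()
        correct += (logits.argmax(dim=1) == labels).sum().item()
        count += labels.size(0)
    return correct / count, total_loss / count
```

Summing the loss and dividing by the sample count at the end avoids weighting a smaller final batch the same as a full one.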
Usage
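import torch

# Path to the downloaded checkpoint
model_path = "model.pt"
# map_location lets a checkpoint saved on cuda load on CPU-only machines
state_dict = torch.load(model_path, map_location="cpu")
# Load into your ConvNet architecture
# (you'll need to define the ConvNet class from the training script)
model = ConvNet()
model.load_state_dict(state_dict)
model.eval()
# Make predictions; images is a float tensor of shape (N, 1, 28, 28)
with torch.no_grad():
    predictions = model(images)  # raw scores, shape (N, 10)
    predicted_digits = predictions.argmax(dim=1)  # class index per image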
Dataset
The model was trained on the MNIST dataset, which contains 70,000 grayscale images of handwritten digits (0-9), each 28x28 pixels.
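MNIST ships as 60,000 training and 10,000 test images, so the 50,000 / 10,000 train and validation sizes reported above imply that the standard training split was divided further. How the split was actually made is not stated; the sketch below shows one plausible way, assuming a seeded `random_split`. A `TensorDataset` of indices stands in for the real data so the snippet runs without a download. With torchvision, `torchvision.datasets.MNIST(root, train=True, download=True)` would take its place.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in for the 60,000-image MNIST training split (hypothetical data).
full_train = TensorDataset(torch.arange(60_000))

# Assumption: a seeded random split; the card only reports the sizes.
generator = torch.Generator().manual_seed(42)
train_set, val_set = random_split(
    full_train, [50_000, 10_000], generator=generator
)
```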
Model Card Authors
Generated automatically during training.