MNIST Handwritten Digit Classifier

A classical machine learning approach to handwritten digit recognition using Logistic Regression on the MNIST dataset.

Model Description

This model classifies 28x28 grayscale images of handwritten digits (0-9) using a simple yet effective Logistic Regression classifier. The project serves as an introduction to image classification and the MNIST dataset.

Intended Uses

  • Educational: Learning image classification fundamentals
  • Benchmarking: Baseline for comparing more complex models
  • Research: Exploring classical ML on image data
  • Prototyping: Quick digit recognition experiments

Training Data

Dataset: ylecun/mnist

| Split | Images |
|-------|--------|
| Train | 60,000 |
| Test  | 10,000 |
| Total | 70,000 |

Data Characteristics

| Property | Value |
|----------|-------|
| Image Size | 28 x 28 pixels |
| Channels | 1 (Grayscale) |
| Classes | 10 (digits 0-9) |
| Pixel Range | 0-255 (raw), 0-1 (normalized) |
| Format | PNG/NumPy arrays |

Class Distribution

The dataset is relatively balanced across all 10 digit classes.
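
This balance can be checked directly from the label column. A minimal sketch (the `class_distribution` helper name is ours, not part of the project; in practice you would pass `y_train` from the loading code in the Usage section):

```python
import numpy as np

def class_distribution(labels):
    """Count occurrences of each digit class 0-9."""
    counts = np.bincount(np.asarray(labels), minlength=10)
    return {digit: int(n) for digit, n in enumerate(counts)}

# Demo on a tiny synthetic label array; in practice pass y_train
print(class_distribution([0, 1, 1, 7, 7, 7, 9]))
```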

Model Architecture

Preprocessing Pipeline

```
Raw Image (28x28, uint8)
    ↓
Normalize to [0, 1] (divide by 255)
    ↓
Flatten to vector (784 dimensions)
    ↓
Logistic Regression Classifier
    ↓
Softmax Probabilities (10 classes)
```
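
The first two pipeline steps amount to two NumPy operations. A minimal sketch (the `preprocess` helper name is ours, for illustration):

```python
import numpy as np

def preprocess(images):
    """Map uint8 images of shape (N, 28, 28) to float32 vectors of shape (N, 784) in [0, 1]."""
    x = np.asarray(images, dtype='float32') / 255.0
    return x.reshape(len(x), -1)

# Example: a batch of two blank images
batch = np.zeros((2, 28, 28), dtype='uint8')
print(preprocess(batch).shape)  # (2, 784)
```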

Classifier Configuration

```python
LogisticRegression(
    max_iter=100,
    solver='lbfgs',
    multi_class='multinomial',
    n_jobs=-1
)
```

| Parameter | Value | Description |
|-----------|-------|-------------|
| `max_iter` | 100 | Maximum iterations for convergence |
| `solver` | lbfgs | L-BFGS optimization algorithm |
| `multi_class` | multinomial | True multiclass (not OvR) |
| `n_jobs` | -1 | Use all CPU cores |

Performance

Test Set Results

| Metric | Score |
|--------|-------|
| Accuracy | ~92% |
| Macro F1 | ~92% |
| Macro Precision | ~92% |
| Macro Recall | ~92% |

Per-Class Performance

| Digit | Precision | Recall | F1-Score |
|-------|-----------|--------|----------|
| 0 | ~0.95 | ~0.97 | ~0.96 |
| 1 | ~0.95 | ~0.97 | ~0.96 |
| 2 | ~0.91 | ~0.89 | ~0.90 |
| 3 | ~0.89 | ~0.90 | ~0.90 |
| 4 | ~0.92 | ~0.92 | ~0.92 |
| 5 | ~0.88 | ~0.87 | ~0.87 |
| 6 | ~0.94 | ~0.95 | ~0.94 |
| 7 | ~0.93 | ~0.91 | ~0.92 |
| 8 | ~0.88 | ~0.87 | ~0.88 |
| 9 | ~0.89 | ~0.90 | ~0.90 |

Note: performance varies slightly between runs.

Common Confusion Pairs

  • 4 ↔ 9 (similar upper loops)
  • 3 ↔ 8 (curved shapes)
  • 5 ↔ 3 (similar strokes)
  • 7 ↔ 1 (vertical strokes)

Usage

Installation

```bash
pip install scikit-learn pandas numpy matplotlib seaborn pillow huggingface_hub
```

(`huggingface_hub` is needed for the `hf://` dataset paths used below.)

Load and Preprocess Data

```python
import pandas as pd
import numpy as np
from io import BytesIO
from PIL import Image

# Load from Hugging Face
df_train = pd.read_parquet("hf://datasets/ylecun/mnist/mnist/train-00000-of-00001.parquet")
df_test = pd.read_parquet("hf://datasets/ylecun/mnist/mnist/test-00000-of-00001.parquet")

def extract_image(row):
    """Extract a row's image as a 28x28 numpy array."""
    img_data = row['image']
    if isinstance(img_data, dict) and 'bytes' in img_data:
        # Parquet stores images as encoded bytes
        return np.array(Image.open(BytesIO(img_data['bytes'])))
    if isinstance(img_data, Image.Image):
        return np.array(img_data)
    return np.array(img_data)

# Prepare data
X_train = np.array([extract_image(row) for _, row in df_train.iterrows()])
y_train = df_train['label'].values

# Normalize to [0, 1] and flatten to (n_samples, 784)
X_train_flat = X_train.astype('float32').reshape(-1, 784) / 255.0
```

Train Model

```python
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(
    max_iter=100,
    solver='lbfgs',
    multi_class='multinomial',  # already the default for lbfgs; deprecated in scikit-learn >= 1.5
    n_jobs=-1
)
model.fit(X_train_flat, y_train)
```
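
The Inference section loads `mnist_model.pkl`, but the card never shows it being written. A hedged sketch of the evaluation-and-save step (the helper name and choice of metrics are ours):

```python
import joblib
from sklearn.metrics import accuracy_score, classification_report

def evaluate_and_save(model, X_test, y_test, path='mnist_model.pkl'):
    """Print test metrics for a fitted classifier and persist it with joblib."""
    y_pred = model.predict(X_test)
    acc = accuracy_score(y_test, y_pred)
    print(f"Accuracy: {acc:.4f}")
    print(classification_report(y_test, y_pred, digits=3))
    joblib.dump(model, path)
    return acc

# Usage after training: evaluate_and_save(model, X_test_flat, y_test)
```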

Inference

```python
import joblib
import numpy as np
from PIL import Image

# Load model
model = joblib.load('mnist_model.pkl')

# Predict single image
def predict_digit(image):
    """
    image: 28x28 numpy array or PIL Image
    returns: (predicted digit 0-9, per-class probability array)
    """
    if isinstance(image, Image.Image):
        image = np.array(image)

    # Preprocess: normalize and flatten, matching the training pipeline
    image_flat = image.astype('float32').reshape(1, 784) / 255.0

    # Predict
    prediction = model.predict(image_flat)[0]
    probabilities = model.predict_proba(image_flat)[0]

    return prediction, probabilities

# Example
digit, probs = predict_digit(test_image)
print(f"Predicted: {digit} (confidence: {probs[digit]:.2%})")
```

Visualization

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Confusion matrix (X_test_flat / y_test prepared the same way as the training data)
y_pred = model.predict(X_test_flat)
cm = confusion_matrix(y_test, y_pred)

plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=range(10), yticklabels=range(10))
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix - MNIST')
plt.show()
```

Average Digit Visualization

```python
# Compute and plot the mean image per digit
fig, axes = plt.subplots(2, 5, figsize=(12, 5))
for digit in range(10):
    ax = axes[digit // 5, digit % 5]
    mask = y_train == digit
    mean_img = X_train[mask].mean(axis=0)
    ax.imshow(mean_img, cmap='hot')
    ax.set_title(f'Digit: {digit}')
    ax.axis('off')
plt.tight_layout()
plt.show()
```

Limitations

  • Simple Model: Logistic Regression doesn't capture spatial relationships
  • No Data Augmentation: Sensitive to rotation, scaling, translation
  • Grayscale Only: Won't work with color images
  • Fixed Size: Requires exactly 28x28 input
  • Clean Data: Struggles with noisy or poorly centered digits
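
Two of these limitations (fixed size, grayscale only) can be worked around with a small input shim. A hedged sketch using Pillow (the helper name is ours; note that plain resizing does not re-center the digit, so the centering limitation remains):

```python
import numpy as np
from PIL import Image

def to_mnist_input(img):
    """Coerce an arbitrary PIL image to the flat 28x28 grayscale
    vector the classifier expects. Does NOT re-center the digit."""
    img = img.convert('L').resize((28, 28))     # grayscale, then downscale
    arr = np.asarray(img, dtype='float32') / 255.0
    return arr.reshape(1, 784)

# Example with a synthetic RGB image
rgb = Image.new('RGB', (120, 80), color=(255, 255, 255))
print(to_mnist_input(rgb).shape)  # (1, 784)
```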

Comparison with Other Approaches

| Model | MNIST Accuracy |
|-------|----------------|
| Logistic Regression | ~92% |
| Random Forest | ~97% |
| SVM (RBF kernel) | ~98% |
| MLP (2 hidden layers) | ~98% |
| CNN (LeNet-5) | ~99% |
| Modern CNNs | ~99.7% |
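
The relative ordering of the first two rows can be reproduced in seconds on scikit-learn's built-in 8x8 digits set, a much smaller stand-in for MNIST (the absolute scores will not match the table's numbers):

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Small 8x8 digits set as a fast proxy for MNIST
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X / 16.0, y, test_size=0.25, random_state=0)

scores = {}
for name, clf in [('LogisticRegression', LogisticRegression(max_iter=1000)),
                  ('RandomForest', RandomForestClassifier(n_estimators=200, random_state=0))]:
    scores[name] = clf.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {scores[name]:.3f}")
```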

Technical Specifications

Dependencies

```
scikit-learn>=1.0.0
pandas>=1.3.0
numpy>=1.20.0
matplotlib>=3.4.0
seaborn>=0.11.0
pillow>=8.0.0
huggingface_hub>=0.20.0
```

Hardware Requirements

| Task | Resource | Requirement |
|------|----------|-------------|
| Training | CPU | ~2-5 min |
| Inference | CPU | < 1 ms per image |
| Memory | RAM | ~500 MB |
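
The per-image inference figure can be sanity-checked with a quick timing loop. A hedged sketch on random data (the numbers depend entirely on your hardware and BLAS build; the throwaway model here is fit on noise just so `predict` can run):

```python
import time
import numpy as np
from sklearn.linear_model import LogisticRegression

# Fit a throwaway model on random data just to time predict()
rng = np.random.default_rng(0)
X = rng.random((2000, 784)).astype('float32')
y = rng.integers(0, 10, size=2000)
model = LogisticRegression(max_iter=20).fit(X[:500], y[:500])

start = time.perf_counter()
model.predict(X)
per_image_ms = (time.perf_counter() - start) / len(X) * 1000
print(f"~{per_image_ms:.4f} ms per image (batched)")
```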

Files

```
MNIST/
β”œβ”€β”€ README_HF.md             # This model card
β”œβ”€β”€ mnist_exploration.ipynb  # Full exploration notebook
β”œβ”€β”€ mnist_model.pkl          # Trained model (generated)
└── figures/                 # Visualizations (generated)
```

Citation

```bibtex
@article{lecun1998mnist,
  title={Gradient-based learning applied to document recognition},
  author={LeCun, Yann and Bottou, L{\'e}on and Bengio, Yoshua and Haffner, Patrick},
  journal={Proceedings of the IEEE},
  volume={86},
  number={11},
  pages={2278--2324},
  year={1998}
}

@misc{mnist_hf,
  title={MNIST Dataset},
  author={LeCun, Yann and Cortes, Corinna and Burges, Christopher J.C.},
  howpublished={Hugging Face Datasets},
  url={https://huggingface.co/datasets/ylecun/mnist}
}
```

License

MIT License

Acknowledgments

  • Yann LeCun for creating MNIST
  • Scikit-learn team for the ML library
  • Hugging Face for dataset hosting

Next Steps

For better performance, consider:

  1. More Complex Models: SVM, Random Forest, Neural Networks
  2. Deep Learning: CNNs with PyTorch/TensorFlow
  3. Data Augmentation: Rotation, scaling, elastic deformations
  4. Feature Engineering: HOG, SIFT features
  5. Ensemble Methods: Combine multiple classifiers
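
As a taste of item 3, translation augmentation needs nothing beyond NumPy. A minimal sketch (the `shift_image` function name is ours):

```python
import numpy as np

def shift_image(img, dx, dy):
    """Translate a 2D image by (dx, dy) pixels, zero-filling the exposed border."""
    shifted = np.roll(img, (dy, dx), axis=(0, 1))  # np.roll returns a copy, safe to mutate
    if dy > 0:
        shifted[:dy, :] = 0
    elif dy < 0:
        shifted[dy:, :] = 0
    if dx > 0:
        shifted[:, :dx] = 0
    elif dx < 0:
        shifted[:, dx:] = 0
    return shifted

# Example: move a single bright pixel 2 right and 3 down
img = np.zeros((28, 28), dtype='float32')
img[10, 10] = 1.0
print(np.argwhere(shift_image(img, 2, 3) == 1.0))  # [[13 12]]
```

Applying small shifts like this to each training image (and training on the enlarged set) is the simplest form of the augmentation suggested above.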