# MNIST Handwritten Digit Classifier
A classical machine learning approach to handwritten digit recognition using Logistic Regression on the MNIST dataset.
## Model Description
This model classifies 28x28 grayscale images of handwritten digits (0-9) using a simple yet effective Logistic Regression classifier. The project serves as an introduction to image classification and the MNIST dataset.
## Intended Uses
- Educational: Learning image classification fundamentals
- Benchmarking: Baseline for comparing more complex models
- Research: Exploring classical ML on image data
- Prototyping: Quick digit recognition experiments
## Training Data
**Dataset:** [`ylecun/mnist`](https://huggingface.co/datasets/ylecun/mnist)
| Split | Images |
|---|---|
| Train | 60,000 |
| Test | 10,000 |
| Total | 70,000 |
### Data Characteristics
| Property | Value |
|---|---|
| Image Size | 28 x 28 pixels |
| Channels | 1 (Grayscale) |
| Classes | 10 (digits 0-9) |
| Pixel Range | 0-255 (raw), 0-1 (normalized) |
| Format | PNG/NumPy arrays |
### Class Distribution
The dataset is approximately balanced: each of the 10 digit classes accounts for roughly 9-11% of the examples.
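The per-class counts can be verified with a one-liner; the small array below is a toy stand-in for `df_train['label'].values`:

```python
import numpy as np

# Toy stand-in for df_train['label'].values
y_train = np.array([0, 1, 1, 2, 7, 7, 7, 9])

# Count examples per digit class (minlength=10 keeps absent classes at 0)
counts = np.bincount(y_train, minlength=10)
for digit, n in enumerate(counts):
    print(f"digit {digit}: {n} images")
```

On the real training labels, each count lands near 6,000.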
## Model Architecture
### Preprocessing Pipeline
```
Raw Image (28x28, uint8)
          ↓
Normalize to [0, 1] (divide by 255)
          ↓
Flatten to vector (784 dimensions)
          ↓
Logistic Regression Classifier
          ↓
Softmax Probabilities (10 classes)
```
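The normalize-and-flatten steps above amount to a few lines of NumPy; here is a sketch using a random toy image in place of a real MNIST sample:

```python
import numpy as np

# Toy stand-in for a raw MNIST image: 28x28 uint8 values in [0, 255]
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# 1. Normalize to [0, 1]
normalized = raw.astype("float32") / 255.0

# 2. Flatten to a 784-dimensional feature vector (batch of 1)
flat = normalized.reshape(1, 784)

print(flat.shape)
```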
### Classifier Configuration
```python
LogisticRegression(
    max_iter=100,
    solver='lbfgs',
    multi_class='multinomial',
    n_jobs=-1
)
```
| Parameter | Value | Description |
|---|---|---|
| max_iter | 100 | Maximum iterations for convergence |
| solver | lbfgs | L-BFGS optimization algorithm |
| multi_class | multinomial | True multiclass (not OvR) |
| n_jobs | -1 | Use all CPU cores |

*Note: `multi_class` is deprecated since scikit-learn 1.5; with `solver='lbfgs'`, multinomial loss is already the default, so the parameter can simply be omitted on newer versions.*
## Performance
### Test Set Results
| Metric | Score |
|---|---|
| Accuracy | ~92% |
| Macro F1 | ~92% |
| Macro Precision | ~92% |
| Macro Recall | ~92% |
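These aggregate scores come from scikit-learn's standard metric functions; a minimal sketch on toy labels standing in for `y_test` and the model's predictions:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Toy stand-ins for y_test and model.predict(X_test_flat)
y_true = np.array([0, 1, 2, 2, 3, 4])
y_pred = np.array([0, 1, 2, 3, 3, 4])

print(f"Accuracy:        {accuracy_score(y_true, y_pred):.3f}")
print(f"Macro F1:        {f1_score(y_true, y_pred, average='macro'):.3f}")
print(f"Macro Precision: {precision_score(y_true, y_pred, average='macro'):.3f}")
print(f"Macro Recall:    {recall_score(y_true, y_pred, average='macro'):.3f}")
```

`average='macro'` weights every digit class equally, which is appropriate here because the classes are roughly balanced.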
### Per-Class Performance
| Digit | Precision | Recall | F1-Score |
|---|---|---|---|
| 0 | ~0.95 | ~0.97 | ~0.96 |
| 1 | ~0.95 | ~0.97 | ~0.96 |
| 2 | ~0.91 | ~0.89 | ~0.90 |
| 3 | ~0.89 | ~0.90 | ~0.90 |
| 4 | ~0.92 | ~0.92 | ~0.92 |
| 5 | ~0.88 | ~0.87 | ~0.87 |
| 6 | ~0.94 | ~0.95 | ~0.94 |
| 7 | ~0.93 | ~0.91 | ~0.92 |
| 8 | ~0.88 | ~0.87 | ~0.88 |
| 9 | ~0.89 | ~0.90 | ~0.90 |
*Note: performance varies slightly between runs.*
### Common Confusion Pairs
- 4 ↔ 9 (similar upper loops)
- 3 ↔ 8 (curved shapes)
- 5 ↔ 3 (similar strokes)
- 7 ↔ 1 (vertical strokes)
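Pairs like these can be read off the confusion matrix programmatically. A sketch using toy labels chosen to illustrate the 4/9 confusion (the real matrix comes from the test set):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels engineered so that 4 -> 9 is the most common error
y_true = np.array([4, 4, 4, 9, 9, 9, 3, 8])
y_pred = np.array([4, 9, 9, 9, 4, 9, 8, 3])

cm = confusion_matrix(y_true, y_pred, labels=range(10))

# Zero the diagonal, then rank the remaining cells by error count
off_diag = cm.copy()
np.fill_diagonal(off_diag, 0)
order = np.argsort(off_diag, axis=None)[::-1]
pairs = np.dstack(np.unravel_index(order, cm.shape))[0]
for true_d, pred_d in pairs[:3]:
    print(f"{true_d} -> {pred_d}: {off_diag[true_d, pred_d]} errors")
```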
## Usage
### Installation
```bash
pip install scikit-learn pandas numpy matplotlib seaborn pillow
```
### Load and Preprocess Data
```python
import pandas as pd
import numpy as np
from PIL import Image

# Load from Hugging Face
df_train = pd.read_parquet("hf://datasets/ylecun/mnist/mnist/train-00000-of-00001.parquet")
df_test = pd.read_parquet("hf://datasets/ylecun/mnist/mnist/test-00000-of-00001.parquet")

def extract_image(row):
    """Extract image as a numpy array."""
    img_data = row['image']
    if isinstance(img_data, dict) and 'bytes' in img_data:
        from io import BytesIO
        img = Image.open(BytesIO(img_data['bytes']))
        return np.array(img)
    elif isinstance(img_data, Image.Image):
        return np.array(img_data)
    return np.array(img_data)

# Prepare data
X_train = np.array([extract_image(row) for _, row in df_train.iterrows()])
y_train = df_train['label'].values

# Normalize and flatten
X_train_flat = X_train.astype('float32').reshape(-1, 784) / 255.0
```
### Train Model
```python
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(
    max_iter=100,
    solver='lbfgs',
    multi_class='multinomial',
    n_jobs=-1
)
model.fit(X_train_flat, y_train)
```
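The inference section loads the classifier from `mnist_model.pkl`, so the fitted model needs to be persisted first, e.g. with `joblib`. A minimal sketch on synthetic data (the real features have shape 60000 x 784):

```python
import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny synthetic stand-in for the flattened, normalized MNIST features
rng = np.random.default_rng(0)
X = rng.random((20, 784)).astype("float32")
y = rng.integers(0, 10, size=20)

model = LogisticRegression(max_iter=100)
model.fit(X, y)

# Persist the fitted model, then verify the round trip
joblib.dump(model, "mnist_model.pkl")
restored = joblib.load("mnist_model.pkl")
print((restored.predict(X) == model.predict(X)).all())
```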
### Inference
```python
import joblib
import numpy as np
from PIL import Image

# Load model
model = joblib.load('mnist_model.pkl')

# Predict single image
def predict_digit(image):
    """
    image: 28x28 numpy array or PIL Image
    returns: (predicted digit 0-9, array of 10 class probabilities)
    """
    if isinstance(image, Image.Image):
        image = np.array(image)
    # Preprocess
    image_flat = image.astype('float32').reshape(1, 784) / 255.0
    # Predict
    prediction = model.predict(image_flat)[0]
    probabilities = model.predict_proba(image_flat)[0]
    return prediction, probabilities

# Example
digit, probs = predict_digit(test_image)
print(f"Predicted: {digit} (confidence: {probs[digit]:.2%})")
```
### Visualization
```python
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix
import seaborn as sns

# Confusion Matrix
y_pred = model.predict(X_test_flat)
cm = confusion_matrix(y_test, y_pred)

plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=range(10), yticklabels=range(10))
plt.xlabel('Predicted')
plt.ylabel('True')
plt.title('Confusion Matrix - MNIST')
plt.show()
```
### Average Digit Visualization
```python
import matplotlib.pyplot as plt

# Compute mean image per digit
fig, axes = plt.subplots(2, 5, figsize=(12, 5))
for digit in range(10):
    ax = axes[digit // 5, digit % 5]
    mask = y_train == digit
    mean_img = X_train[mask].mean(axis=0)
    ax.imshow(mean_img, cmap='hot')
    ax.set_title(f'Digit: {digit}')
    ax.axis('off')
plt.tight_layout()
plt.show()
```
## Limitations
- Simple Model: Logistic Regression doesn't capture spatial relationships
- No Data Augmentation: Sensitive to rotation, scaling, translation
- Grayscale Only: Won't work with color images
- Fixed Size: Requires exactly 28x28 input
- Clean Data: Struggles with noisy or poorly centered digits
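Because the model requires exactly 28x28 grayscale input, arbitrary images must be resized and converted before prediction. A minimal sketch with Pillow (the oversized image here is a blank stand-in for a real photo):

```python
import numpy as np
from PIL import Image

# Toy "wrong-size" input; real inputs must be reduced to 28x28 before prediction
big = Image.fromarray(np.zeros((64, 48), dtype=np.uint8), mode="L")

# Convert to grayscale and resize with antialiasing to match the training data
small = big.convert("L").resize((28, 28), resample=Image.LANCZOS)
arr = np.array(small)
print(arr.shape)
```

Note that resizing alone does not fix centering or contrast; MNIST digits are size-normalized and centered, so heavily off-center inputs will still degrade accuracy.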
## Comparison with Other Approaches
| Model | MNIST Accuracy |
|---|---|
| Logistic Regression | ~92% |
| Random Forest | ~97% |
| SVM (RBF kernel) | ~98% |
| MLP (2 hidden layers) | ~98% |
| CNN (LeNet-5) | ~99% |
| Modern CNNs | ~99.7% |
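For instance, the MLP row can be reproduced with scikit-learn's `MLPClassifier`. The sketch below runs on synthetic data; the hidden layer sizes are illustrative, not tuned:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in features; on real MNIST an MLP like this reaches ~98%
rng = np.random.default_rng(0)
X = rng.random((100, 784)).astype("float32")
y = rng.integers(0, 10, size=100)

# Two hidden layers, matching the comparison table's "MLP (2 hidden layers)" row
mlp = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=50, random_state=0)
mlp.fit(X, y)
preds = mlp.predict(X[:5])
print(preds.shape)
```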
## Technical Specifications
### Dependencies
```text
scikit-learn>=1.0.0
pandas>=1.3.0
numpy>=1.20.0
matplotlib>=3.4.0
seaborn>=0.11.0
pillow>=8.0.0
```
### Hardware Requirements
| Task | Resource | Requirement |
|---|---|---|
| Training | CPU | ~2-5 min |
| Inference | CPU | < 1 ms per image |
| Memory | RAM | ~500 MB |
## Files
```text
MNIST/
├── README_HF.md             # This model card
├── mnist_exploration.ipynb  # Full exploration notebook
├── mnist_model.pkl          # Trained model (generated)
└── figures/                 # Visualizations (generated)
```
## Citation
```bibtex
@article{lecun1998mnist,
  title={Gradient-based learning applied to document recognition},
  author={LeCun, Yann and Bottou, L{\'e}on and Bengio, Yoshua and Haffner, Patrick},
  journal={Proceedings of the IEEE},
  volume={86},
  number={11},
  pages={2278--2324},
  year={1998}
}

@misc{mnist_hf,
  title={MNIST Dataset},
  author={LeCun, Yann and Cortes, Corinna and Burges, Christopher J.C.},
  howpublished={Hugging Face Datasets},
  url={https://huggingface.co/datasets/ylecun/mnist}
}
```
## License
MIT License
## Acknowledgments
- Yann LeCun for creating MNIST
- Scikit-learn team for the ML library
- Hugging Face for dataset hosting
## Next Steps
For better performance, consider:
- More Complex Models: SVM, Random Forest, Neural Networks
- Deep Learning: CNNs with PyTorch/TensorFlow
- Data Augmentation: Rotation, scaling, elastic deformations
- Feature Engineering: HOG, SIFT features
- Ensemble Methods: Combine multiple classifiers
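As a starting point for augmentation, small rotations can be generated with Pillow alone (already a dependency). The `augment_rotations` helper below is a hypothetical sketch, not part of this project:

```python
import numpy as np
from PIL import Image

def augment_rotations(img_array, angles=(-10, 10)):
    """Return rotated copies of a 28x28 uint8 digit image (simple augmentation sketch)."""
    img = Image.fromarray(img_array, mode="L")
    return [np.array(img.rotate(a, resample=Image.BILINEAR)) for a in angles]

# Toy image; in practice apply this to X_train before normalizing and flattening
digit = np.zeros((28, 28), dtype=np.uint8)
digit[8:20, 12:16] = 255  # a crude vertical stroke

augmented = augment_rotations(digit)
print(len(augmented), augmented[0].shape)
```

Keep rotation angles small: MNIST digits are orientation-sensitive (a rotated 6 starts to look like a 9).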