---
language: en
tags:
- pytorch
- computer-vision
- image-classification
- mnist
- digit-recognition
- cnn
license: mit
datasets:
- mnist
metrics:
- accuracy
model-index:
- name: mnist-cnn-classifier
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: MNIST
      type: mnist
    metrics:
    - type: accuracy
      value: 99.60
      name: Test Accuracy
    - type: accuracy
      value: 99.27
      name: Validation Accuracy
---

# MNIST CNN Classifier

A production-ready Convolutional Neural Network for handwritten digit recognition, achieving **99.60% accuracy** on the MNIST test set.

## Model Description

This model is a CNN with four convolutional layers and three fully connected layers, using batch normalization and dropout for robust digit classification. It is designed for production use and ships with training, evaluation, and inference pipelines.

**Key Features:**

- 🎯 **99.60% test accuracy** on MNIST
- 🏗️ **CNN Architecture**: 4 convolutional layers + 3 fully connected layers
- ⚡ **Fast Inference**: ~5 ms per image on CPU
- 📦 **Lightweight**: Only 271K parameters
- 🔧 **Production Ready**: Complete preprocessing and error handling

## Model Architecture

```
ConvNet(
  - Conv Block 1: Conv2d(1→32) + BatchNorm + ReLU + Conv2d(32→64) + BatchNorm + ReLU + MaxPool + Dropout
  - Conv Block 2: Conv2d(64→128) + BatchNorm + ReLU + Conv2d(128→128) + BatchNorm + ReLU + MaxPool + Dropout
  - FC Block 1: Linear(6272→256) + BatchNorm + ReLU + Dropout
  - FC Block 2: Linear(256→128) + BatchNorm + ReLU + Dropout
  - Output: Linear(128→10)
)
```

**Total Parameters:** 271,114

## Training Details

### Training Data

- **Dataset**: MNIST (60,000 training images, 10,000 test images)
- **Split**: 54,000 train / 6,000 validation (both drawn from the 60,000 training images) / 10,000 test
- **Augmentation**: Random rotation (±10°), affine transforms, random erasing

### Training Hyperparameters

- **Optimizer**: AdamW
- **Learning Rate**: 0.001 with OneCycleLR scheduler
- **Batch Size**: 128
- **Epochs**: 20 (early stopping after 17)
- **Weight Decay**: 0.0001
- **Dropout**: 0.3
- **Gradient Clipping**: 1.0

### Training Results

| Metric | Value |
|--------|-------|
| Training Accuracy | 98.74% |
| Validation Accuracy | 99.27% |
| Test Accuracy | **99.60%** |
| Training Time | ~85 minutes (CPU) |

### Per-Class Performance

| Digit | Precision | Recall | F1-Score | Support |
|-------|-----------|--------|----------|---------|
| 0 | 1.00 | 1.00 | 1.00 | 980 |
| 1 | 1.00 | 1.00 | 1.00 | 1135 |
| 2 | 0.99 | 1.00 | 0.99 | 1032 |
| 3 | 0.99 | 1.00 | 1.00 | 1010 |
| 4 | 1.00 | 1.00 | 1.00 | 982 |
| 5 | 1.00 | 0.99 | 0.99 | 892 |
| 6 | 1.00 | 0.99 | 1.00 | 958 |
| 7 | 0.99 | 0.99 | 0.99 | 1028 |
| 8 | 1.00 | 1.00 | 1.00 | 974 |
| 9 | 1.00 | 0.99 | 1.00 | 1009 |

## Usage

### Installation

```bash
pip install torch torchvision pillow numpy
```

### Quick Start

```python
import torch
from PIL import Image
from torchvision import transforms

# Load model
# Assumes best_model.pth stores the full pickled model (not just a state_dict),
# so the ConvNet class definition must be importable. PyTorch 2.6+ defaults to
# weights_only=True, which rejects pickled models; only disable it for trusted files.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = torch.load('best_model.pth', map_location=device, weights_only=False)
model.eval()

# Preprocess image: resize to 28x28, convert to one channel,
# then normalize with the MNIST mean and standard deviation
transform = transforms.Compose([
    transforms.Resize((28, 28)),
    transforms.Grayscale(),
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load and predict
image = Image.open('digit.png')
image_tensor = transform(image).unsqueeze(0).to(device)  # add batch dimension

with torch.no_grad():
    output = model(image_tensor)
    prediction = output.argmax(dim=1).item()
    confidence = torch.softmax(output, dim=1).max().item()

print(f"Predicted digit: {prediction} (confidence: {confidence:.2%})")
```

### Using the Inference Script

```bash
# Single image
python inference.py --model-path best_model.pth --image-path digit.png

# Batch inference
python inference.py --model-path best_model.pth --image-dir ./images/
```

## Training Your Own Model

```bash
# Install requirements
pip install -r requirements.txt

# Train with default settings
python improved_mnist_classifier.py --use-gpu

# Train with custom settings
python improved_mnist_classifier.py \
    --epochs 20 \
    --batch-size 128 \
    --lr 0.001 \
    --use-gpu \
    --use-amp
```

## Limitations and Biases

- **Domain**: Only works for handwritten digits (0-9), not letters or symbols
- **Image Format**: Expects 28×28 grayscale images; other sizes and color modes are resized and converted during preprocessing
- **Background**: Trained on white/light digits on a dark background (MNIST format); dark-on-light inputs should be inverted first
- **Quality**: Performance may degrade on very blurry or distorted digits
- **Real-world**: May need fine-tuning for specific use cases (checks, forms, etc.)

## Ethical Considerations

This model is designed for digit recognition and should not be used for:

- Automated decision-making without human oversight
- Privacy-sensitive applications without proper consent
- High-stakes scenarios without validation on domain-specific data

## Citation

If you use this model, please cite:

```bibtex
@misc{mnist-cnn-classifier,
  author = {Your Name},
  title = {MNIST CNN Classifier: Production-Ready Digit Recognition},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/your-username/mnist-cnn-classifier}}
}
```

## Model Card Authors

- **Your Name** - [GitHub](https://github.com/your-username) | [LinkedIn](https://linkedin.com/in/your-profile)

## License

MIT License - See LICENSE file for details

## Acknowledgments

- MNIST dataset: LeCun et al.
- PyTorch framework
- Hugging Face for hosting
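
## Appendix: Parameter Count Sketch

The layer list under Model Architecture can be turned into a rough parameter count. The sketch below is plain Python arithmetic, not the author's code, and it rests on assumptions the card does not state: 3×3 convolution kernels, a bias term on every `Conv2d` and `Linear` layer, and affine BatchNorm (one scale and one shift per channel). The helper names are hypothetical. The input width 6272 of the first fully connected layer equals 128 × 7 × 7, which is what two 2×2 max-pools on a 28×28 input would give if the convolutions preserve spatial size.

```python
# Parameter-count sketch for the layer list in "Model Architecture".
# Assumptions (not stated in the card): 3x3 kernels, biases on all Conv2d and
# Linear layers, affine BatchNorm. Helper names are illustrative only.

def conv2d_params(in_ch: int, out_ch: int, kernel: int = 3) -> int:
    """Weights (out_ch * in_ch * k * k) plus one bias per output channel."""
    return out_ch * in_ch * kernel * kernel + out_ch


def linear_params(in_features: int, out_features: int) -> int:
    """Weight matrix plus one bias per output feature."""
    return in_features * out_features + out_features


def batchnorm_params(channels: int) -> int:
    """One learnable scale and one learnable shift per channel."""
    return 2 * channels


# Each entry is a layer plus the BatchNorm that follows it (the output layer has none).
layers = {
    "conv 1->32": conv2d_params(1, 32) + batchnorm_params(32),
    "conv 32->64": conv2d_params(32, 64) + batchnorm_params(64),
    "conv 64->128": conv2d_params(64, 128) + batchnorm_params(128),
    "conv 128->128": conv2d_params(128, 128) + batchnorm_params(128),
    "fc 6272->256": linear_params(6272, 256) + batchnorm_params(256),
    "fc 256->128": linear_params(256, 128) + batchnorm_params(128),
    "fc 128->10": linear_params(128, 10),
}

total = sum(layers.values())

for name, count in layers.items():
    print(f"{name:<16}{count:>12,}")
print(f"{'total':<16}{total:>12,}")
```

Under these assumptions the total is 1,881,802, of which the `Linear(6272→256)` block alone accounts for about 1.6M. That is well above the reported 271,114, so the actual checkpoint presumably differs from this sketch in kernel size, layer widths, or pooling before the first fully connected layer; counting `p.numel()` over `model.parameters()` on the loaded model is the reliable check.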