🧠 MyTorch Elite MNIST Model

Curated by Aryan Singh Chandel (Shiro) at Rustamji Institute of Technology (RJIT).

This repository hosts the Elite MyTorch MNIST Model, which achieved a remarkable 98.59% accuracy on the MNIST handwritten digit classification task.

The model is built entirely from scratch using NumPy, replicating core deep learning functionality. It incorporates advanced features such as the AdamW optimizer, Batch Normalization, Dropout, and Label Smoothing to achieve strong performance within a custom framework.

🚀 Model Architecture

This is a multi-layer perceptron (MLP) with a "Diamond Architecture": 784 (Input) -> 1024 -> 2048 (Expansion) -> 512 (Bottleneck) -> 10 (Output)
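The diamond layer widths above can be sketched as a plain NumPy forward pass. This is an illustrative reimplementation, not the actual MyTorch code; the variable names and the Kaiming-style initialization scale are assumptions based on the component list below.

```python
import numpy as np

# Hypothetical sketch of the diamond architecture's forward pass.
# Layer sizes follow the README: 784 -> 1024 -> 2048 -> 512 -> 10.
rng = np.random.default_rng(0)
sizes = [784, 1024, 2048, 512, 10]

# Kaiming-style init: std = sqrt(2 / fan_in), suited to ReLU layers.
weights = [rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
           for fan_in, fan_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(fan_out) for fan_out in sizes[1:]]

def forward(x):
    for i, (W, b) in enumerate(zip(weights, biases)):
        x = x @ W + b
        if i < len(weights) - 1:      # no ReLU on the output logits
            x = np.maximum(x, 0.0)
    return x

logits = forward(rng.normal(size=(32, 784)))  # a batch of 32 flattened images
print(logits.shape)  # (32, 10)
```

The real model also interleaves Batch Normalization and Dropout between these layers, which this shape-check sketch omits.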

Key components:

  • Linear Layers with Kaiming Initialization
  • Batch Normalization
  • ReLU Activation Functions
  • Dropout for regularization

⚡ Training Details

  • Optimizer: AdamW with Weight Decay
  • Loss Function: Cross-Entropy Loss with Label Smoothing
  • Learning Rate Scheduler: StepLR
  • Epochs: 15
  • Batch Size: 256
  • Data Augmentation: Ultra-light Gaussian noise
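The label-smoothed cross-entropy loss used above can be sketched as follows. The smoothing factor of 0.1 is an assumed illustrative value, and the function name is hypothetical, not the MyTorch API.

```python
import numpy as np

def smoothed_cross_entropy(logits, targets, smoothing=0.1):
    # Soft targets: (1 - smoothing) on the true class, with the remaining
    # smoothing mass spread uniformly over all classes.
    n, k = logits.shape
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    soft = np.full((n, k), smoothing / k)
    soft[np.arange(n), targets] += 1.0 - smoothing
    return -(soft * log_probs).sum(axis=1).mean()

# Sanity check: uniform logits give loss log(10) regardless of smoothing.
logits = np.zeros((4, 10))
targets = np.array([0, 1, 2, 3])
loss = smoothed_cross_entropy(logits, targets)
print(round(loss, 4))  # 2.3026
```

Smoothing keeps the model from driving its output probabilities toward exactly 0 or 1, which tends to improve calibration and generalization.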

📊 Performance

  • Validation Accuracy: 98.59%
  • Confusion Matrix: (See visuals/final_heatmap.png in the main GitHub repo)

📦 Files

  • best_model.pkl: The serialized Python pickle file containing the trained MyTorch model instance.

💡 Usage

To load and use this model in your MyTorch project:

```python
import pickle
import numpy as np

# The MyTorch modules must be importable so pickle can reconstruct the model,
# e.g. from mytorch.nn.sequential import Sequential (and the other layers used).

# Load the trained model
with open('best_model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)

# Example inference (assuming X_test is your preprocessed test data,
# flattened to shape (N, 784) and normalized the same way as during training):
# predictions = loaded_model(X_test)
# print(np.argmax(predictions, axis=1))
```

🎓 Citation

If you use this model in your research, please cite: Chandel, A. S. (2026). MyTorch: Deep Learning from Scratch at RJIT.
