---
language:
- en
license: mit
library_name: numpy
pipeline_tag: image-classification
tags:
- mnist
- mytorch
- deep-learning
- from-scratch
- numpy-only
---

# 🧠 MyTorch Elite MNIST Model

Curated by **Aryan Singh Chandel (Shiro)** at **Rustamji Institute of Technology (RJIT)**.

This repository hosts the **Elite MyTorch MNIST Model**, which achieves **98.59% accuracy** on the MNIST handwritten-digit classification task.

The model is built entirely from scratch in NumPy, replicating core deep learning functionality. It incorporates the AdamW optimizer, Batch Normalization, Dropout, and Label Smoothing to reach strong performance within a fully custom framework.

## 🚀 Model Architecture
This is a multi-layer perceptron (MLP) with a "Diamond Architecture":
`784 (Input) -> 1024 -> 2048 (Expansion) -> 512 (Bottleneck) -> 10 (Output)`

Key components:
- **Linear Layers** with Kaiming initialization
- **Batch Normalization**
- **ReLU** activations
- **Dropout** for regularization
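
The layer sizes above can be sketched as a plain-NumPy forward pass. This is an illustrative sketch, not the actual MyTorch API; BatchNorm and Dropout are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [784, 1024, 2048, 512, 10]  # diamond: expand, then bottleneck

# Kaiming (He) initialization: std = sqrt(2 / fan_in), as used by the linear layers
weights = [rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n))
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """ReLU on every hidden layer, raw logits at the output."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ W + b, 0.0)       # Linear -> ReLU
    return x @ weights[-1] + biases[-1]      # final Linear (logits)

logits = forward(rng.normal(size=(4, 784)))  # batch of 4 fake images
print(logits.shape)  # (4, 10)
```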

## ⚡ Training Details
- **Optimizer:** AdamW with weight decay
- **Loss Function:** Cross-Entropy Loss with Label Smoothing
- **Learning Rate Scheduler:** StepLR
- **Epochs:** 15
- **Batch Size:** 256
- **Data Augmentation:** ultra-light Gaussian noise
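
The label-smoothed cross-entropy objective can be sketched in NumPy as follows. The smoothing factor `eps=0.1` here is an illustrative default, not necessarily the value used in training:

```python
import numpy as np

def smoothed_cross_entropy(logits, targets, eps=0.1, num_classes=10):
    """Cross-entropy against label-smoothed targets."""
    # numerically stable log-softmax
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    # smooth the one-hot targets: 1 - eps on the true class,
    # eps spread evenly over the remaining classes
    smooth = np.full((len(targets), num_classes), eps / (num_classes - 1))
    smooth[np.arange(len(targets)), targets] = 1.0 - eps
    return -(smooth * log_probs).sum(axis=1).mean()

# With uniform logits the loss equals log(10) for any smoothing factor
loss = smoothed_cross_entropy(np.zeros((2, 10)), np.array([3, 7]))
print(loss)
```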

## 📊 Performance
- **Validation Accuracy:** 98.59%
- **Confusion Matrix:** see `visuals/final_heatmap.png` in the main GitHub repo

## 📦 Files
- `best_model.pkl`: a serialized Python pickle containing the trained MyTorch model instance.

## 💡 Usage
To load and use this model in your MyTorch project:

```python
import pickle
import numpy as np

# MyTorch must be installed or on your Python path so that
# unpickling can resolve the model's classes, e.g.:
# from mytorch.nn.sequential import Sequential  # (and other modules)

# Load the model (only unpickle files from sources you trust)
with open('best_model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)

# Example inference (assuming X_test is your preprocessed test data)
# predictions = loaded_model(X_test)
# print(np.argmax(predictions, axis=1))
```
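
Since the network expects 784-dimensional inputs, test images need to be flattened first. A common MNIST convention (flatten 28x28 images and scale to [0, 1]) is sketched below; match whatever normalization the model was actually trained with:

```python
import numpy as np

def preprocess(images):
    """Flatten 28x28 uint8 images to float32 vectors in [0, 1].
    Illustrative helper, not part of MyTorch itself."""
    x = np.asarray(images, dtype=np.float32).reshape(len(images), 784)
    return x / 255.0

batch = preprocess(np.zeros((2, 28, 28), dtype=np.uint8))
print(batch.shape, batch.dtype)  # (2, 784) float32
```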

## 🎓 Citation
If you use this model in your research, please attribute:

> Chandel, A. S. (2026). *MyTorch: Deep Learning from Scratch at RJIT*.