---
license: mit
language:
- en
library_name: transformers
tags:
- pytorch
- computer-vision
- image-classification
- mnist
- cnn
- custom-model
metrics:
- accuracy
---

# MNIST CNN Classifier

A custom convolutional neural network (CNN) for MNIST digit classification, built with PyTorch and compatible with Hugging Face Transformers.

## Model Description

This model implements a CNN architecture designed for MNIST handwritten digit recognition. It achieves over 98% accuracy on the MNIST test set and is fully compatible with the Hugging Face Transformers ecosystem.

## Model Architecture

- **Input**: 1x28x28 grayscale images (MNIST digits)
- **Architecture**:
  - 2 convolutional blocks (each with 2 conv layers + batch norm + ReLU + max pool + dropout)
  - 2 fully connected layers (with batch norm and dropout)
  - Output layer: 10 classes (digits 0-9)
- **Parameters**: ~1.68M trainable parameters
- **Activation**: ReLU
- **Normalization**: Batch normalization
- **Regularization**: Dropout (0.25 for conv layers, 0.5 for fully connected layers)

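The architecture above can be sketched in PyTorch. This is an illustrative reconstruction, not the repository's `modeling_mnist_cnn.py`: the channel widths (32/64) and the 512-unit hidden layer are assumptions, chosen so the parameter count lands near the stated ~1.68M.

```python
import torch
import torch.nn as nn

class MnistCnnSketch(nn.Module):
    """Illustrative MNIST CNN: 2 conv blocks + 2 fully connected layers."""

    def __init__(self, num_classes: int = 10):
        super().__init__()

        def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
            # Two conv layers with batch norm + ReLU, then max pool + dropout
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),   # 28 -> 14, then 14 -> 7
                nn.Dropout(0.25),
            )

        self.features = nn.Sequential(conv_block(1, 32), conv_block(32, 64))
        self.classifier = nn.Sequential(
            nn.Flatten(),                  # 64 * 7 * 7 = 3136 features
            nn.Linear(64 * 7 * 7, 512),
            nn.BatchNorm1d(512),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes),   # logits for digits 0-9
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = MnistCnnSketch()
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"{n_params:,} trainable parameters")  # roughly 1.68M with these widths
```

With these assumed widths the count works out to about 1.68M parameters, consistent with the figure above.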
## Training Details

- **Dataset**: MNIST (60,000 training samples, 10,000 test samples)
- **Optimizer**: Adam (lr=0.001)
- **Loss Function**: Cross-entropy loss
- **Batch Size**: 64
- **Epochs**: 10
- **Learning Rate Scheduling**: ReduceLROnPlateau
- **Data Augmentation**: None (basic MNIST preprocessing only)
- **Normalization**: MNIST standard (mean=0.1307, std=0.3081)

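The settings above can be sketched as a training loop. This is a minimal illustration, not the script used to train the released weights; the model and the `DataLoader`s are assumed to come from elsewhere (e.g. `torchvision.datasets.MNIST` with `batch_size=64`).

```python
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, epochs=10, lr=1e-3, device="cpu"):
    """Adam + cross-entropy + ReduceLROnPlateau, matching the settings above."""
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")

    for epoch in range(epochs):
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()

        # Validation loss drives the learning-rate schedule
        model.eval()
        val_loss, n_batches = 0.0, 0
        with torch.no_grad():
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                val_loss += criterion(model(images), labels).item()
                n_batches += 1
        val_loss /= max(n_batches, 1)
        scheduler.step(val_loss)
        print(f"epoch {epoch + 1}: val loss {val_loss:.4f}")
    return val_loss
```

Ten epochs of this loop over normalized MNIST batches reproduce the reported setup.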
## Performance

- **Test Accuracy**: >98%
- **Training Time**: ~5 minutes on a single GPU
- **Model Size**: ~6.7MB (saved weights)

## Usage

### Using Hugging Face Transformers

Because the model ships its own architecture code (`modeling_mnist_cnn.py`), loading it through the Auto classes requires `trust_remote_code=True`:

```python
from transformers import AutoModel, AutoImageProcessor
import torch
from PIL import Image

# Load model and processor (custom code requires trust_remote_code=True)
model = AutoModel.from_pretrained("your-username/mnist-cnn-classifier", trust_remote_code=True)
processor = AutoImageProcessor.from_pretrained("your-username/mnist-cnn-classifier", trust_remote_code=True)

# Prepare image
image = Image.open("path/to/mnist_digit.png").convert("L")  # convert to grayscale
inputs = processor(images=image, return_tensors="pt")

# Forward pass
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

predicted_class = torch.argmax(predictions, dim=-1).item()
confidence = predictions[0][predicted_class].item()

print(f"Predicted digit: {predicted_class} (confidence: {confidence:.4f})")
```

### Using PyTorch Directly

```python
import torch
from PIL import Image
from torchvision import transforms

from modeling_mnist_cnn import MnistCNN
from configuration_mnist_cnn import MnistCnnConfig

# Load configuration and model
config = MnistCnnConfig()
model = MnistCNN(config)
model.load_state_dict(torch.load("best_model.pth", map_location="cpu"))
model.eval()

# Define transforms (MNIST mean/std)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,)),
])

# Load and preprocess image
image = Image.open("digit.png").convert("L")
input_tensor = transform(image).unsqueeze(0)  # add batch dimension

# Predict
with torch.no_grad():
    output = model(input_tensor)
    prediction = torch.argmax(output, dim=1).item()

print(f"Predicted digit: {prediction}")
```

## Intended Use

This model is designed for:
- Educational purposes and learning computer vision
- Benchmarking and comparison with other MNIST models
- Testing deployment pipelines
- Demonstrating custom model integration with Hugging Face

## Limitations

- Trained only on the MNIST dataset (handwritten digits 0-9)
- Not suitable for general character recognition
- Performance may vary on writing styles not represented in MNIST
- Input must be 28x28 grayscale images

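Since the model only accepts 28x28 grayscale input, arbitrary images need to be resampled first. One way to do this with PIL and the MNIST normalization constants (the function name is illustrative, not part of the repository):

```python
import numpy as np
import torch
from PIL import Image

MNIST_MEAN, MNIST_STD = 0.1307, 0.3081

def to_mnist_tensor(image: Image.Image) -> torch.Tensor:
    """Convert any PIL image to a normalized 1x1x28x28 MNIST-style tensor."""
    image = image.convert("L").resize((28, 28))          # grayscale, 28x28
    arr = np.asarray(image, dtype=np.float32) / 255.0    # scale to [0, 1]
    arr = (arr - MNIST_MEAN) / MNIST_STD                 # MNIST normalization
    return torch.from_numpy(arr).unsqueeze(0).unsqueeze(0)  # channel + batch dims
```

Note that MNIST digits are light strokes on a dark background, so photos of dark ink on white paper may also need inversion (e.g. `PIL.ImageOps.invert`) before prediction.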
## Ethical Considerations

This model was trained on a standard academic dataset and poses no significant ethical concerns. It should be used responsibly for educational and research purposes.

## Training Data

The model was trained on the MNIST dataset, which is freely available for academic and research use. The dataset consists of:
- 60,000 training images
- 10,000 test images
- 28x28 pixel grayscale handwritten digits (0-9)

## Technical Details

- **Framework**: PyTorch
- **Transformers Compatibility**: Yes
- **AutoClass Support**: Yes
- **Supported Tasks**: Image classification
- **Input Format**: Images (`PIL.Image.Image`)
- **Output Format**: Class labels (0-9)

## Model Files

- `pytorch_model.bin`: Trained model weights
- `config.json`: Model configuration
- `preprocessor_config.json`: Image preprocessing configuration
- `modeling_mnist_cnn.py`: Model architecture definition
- `configuration_mnist_cnn.py`: Configuration class

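When working with the architecture files locally, the custom classes can also be registered with the Transformers Auto API instead of relying on `trust_remote_code`. The sketch below uses a minimal inline stand-in for the configuration class (the repository's real class lives in `configuration_mnist_cnn.py`); `AutoModel.register(MnistCnnConfig, MnistCNN)` would do the same for the model class.

```python
from transformers import AutoConfig, PretrainedConfig

class MnistCnnConfig(PretrainedConfig):
    """Minimal stand-in for the repository's configuration class."""
    model_type = "mnist_cnn"

    def __init__(self, num_classes: int = 10, **kwargs):
        self.num_classes = num_classes
        super().__init__(**kwargs)

# Register the custom model type so AutoConfig can resolve it
AutoConfig.register("mnist_cnn", MnistCnnConfig)

config = AutoConfig.for_model("mnist_cnn", num_classes=10)
print(config.model_type, config.num_classes)
```

After registration, the Auto classes resolve `"mnist_cnn"` like any built-in model type.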
## Citation

If you use this model in your research, please cite:

```bibtex
@misc{mnist-cnn-classifier,
  title={MNIST CNN Classifier},
  author={Your Name},
  year={2024},
  url={https://huggingface.co/your-username/mnist-cnn-classifier}
}
```

## License

This model is released under the MIT License.