Update README and add initial model
README.md CHANGED
@@ -1,3 +1,87 @@
 ---
-license:
+license: mit
+datasets:
+- mnist
+metrics:
+- accuracy
 ---
+# Model Card for mnistvit
+
+A vision transformer (ViT) trained on MNIST with a PyTorch-only implementation,
+achieving 99.65% test set accuracy.
+
+## Model Details
+
+### Model Description
+
+The model is a vision transformer, as described in the original
+Dosovitskiy et al., ICLR 2021 paper.
+
+- **Developed by:** Arno Onken
+- **Model type:** Vision Transformer
+- **License:** MIT
+
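+To make the architecture concrete, the sketch below shows the patch-embedding
+step at the core of a ViT. The sizes are illustrative placeholders, not the
+hyperparameters used by mnistvit:
+
+```
+import torch
+import torch.nn as nn
+
+class PatchEmbedding(nn.Module):
+    """Split an image into patches and project each patch to an embedding."""
+
+    def __init__(self, patch_size=7, in_channels=1, embed_dim=64):
+        super().__init__()
+        # A strided convolution extracts and projects non-overlapping patches.
+        self.proj = nn.Conv2d(in_channels, embed_dim,
+                              kernel_size=patch_size, stride=patch_size)
+
+    def forward(self, x):
+        x = self.proj(x)                     # (B, embed_dim, 4, 4) for 28x28 MNIST
+        return x.flatten(2).transpose(1, 2)  # (B, 16, embed_dim): one token per patch
+
+tokens = PatchEmbedding()(torch.randn(1, 1, 28, 28))
+print(tokens.shape)  # torch.Size([1, 16, 64])
+```
+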
+### Model Sources
+
+- **Python Package Index:**
+  [https://pypi.org/project/mnistvit/](https://pypi.org/project/mnistvit/)
+- **Paper:** [Dosovitskiy et al., ICLR 2021](https://openreview.net/forum?id=YicbFdNTTy)
+
+## Uses
+
+The model is intended to be used for learning about vision transformers. It is small
+and trained on MNIST as a simple and well-understood dataset. Together with the
+mnistvit package code, the importance of various hyperparameters can be explored.
+
+## How to Get Started with the Model
+
+Install the mnistvit package, which provides code for training and running the model:
+
+```
+pip install mnistvit
+```
+
+Place the `model.pt` file from this repository in a directory of your choice and run
+Python from that directory.
+
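+If you prefer to work with the checkpoint directly in Python, the following is a
+minimal sketch; it assumes the file stores a fully pickled model, so the mnistvit
+package must be installed for unpickling to succeed:
+
+```
+import torch
+
+# Assumption: model.pt is a pickled module, not a bare state dict, so
+# weights_only=False is required on recent PyTorch versions.
+model = torch.load("model.pt", map_location="cpu", weights_only=False)
+model.eval()
+```
+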
+To evaluate the test set accuracy and loss of the model stored in `model.pt`:
+```
+python -m mnistvit --use-accuracy --use-loss
+```
+
+Individual images can be classified as well. To predict the class of a digit image
+stored in a file `sample.jpg`:
+```
+python -m mnistvit --image-file sample.jpg
+```
+
+## Training Details
+
+### Training Data
+
+This model was trained on the 60,000 training set images of the
+[MNIST](https://huggingface.co/datasets/ylecun/mnist/) dataset. Data augmentation was
+used in the form of random rotations, translations, and scaling, as detailed in the
+`mnistvit.preprocess` module.
+
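+Such an augmentation pipeline might look as follows in torchvision; the ranges
+are hypothetical placeholders rather than the values used in
+`mnistvit.preprocess`:
+
+```
+from torchvision import transforms
+
+# Illustrative augmentation: random rotation, translation, and scaling.
+# The ranges are placeholders, not the values used by mnistvit.
+augment = transforms.Compose([
+    transforms.RandomAffine(degrees=15, translate=(0.1, 0.1), scale=(0.9, 1.1)),
+    transforms.ToTensor(),
+    transforms.Normalize((0.1307,), (0.3081,)),  # standard MNIST statistics
+])
+```
+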
+### Training Procedure
+
+- **Training regime:** fp32
+
+Hyperparameters were obtained from an 80:20 training/validation split of the
+original MNIST training set, running Ray Tune with Optuna as detailed in the
+`mnistvit.tune` module. The resulting parameters were then set as defaults in
+the `mnistvit.train` module.
+
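+The overall shape of such a search is sketched below with a dummy objective and a
+hypothetical search space; the actual objective and parameters live in
+`mnistvit.tune`:
+
+```
+from ray import tune
+from ray.tune.search.optuna import OptunaSearch
+
+def objective(config):
+    # Dummy objective standing in for training on the 80% split and scoring
+    # on the 20% validation split; returns a fake accuracy for illustration.
+    return {"accuracy": 1.0 - config["learning_rate"]}
+
+tuner = tune.Tuner(
+    objective,
+    param_space={
+        # Hypothetical search space, not the one used by mnistvit.
+        "learning_rate": tune.loguniform(1e-4, 1e-1),
+        "embed_dim": tune.choice([32, 64, 128]),
+    },
+    tune_config=tune.TuneConfig(
+        search_alg=OptunaSearch(),  # Optuna proposes the next configurations
+        metric="accuracy",
+        mode="max",
+        num_samples=50,
+    ),
+)
+results = tuner.fit()
+print(results.get_best_result().config)
+```
+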
+## Evaluation
+
+### Testing Data
+
+This model was evaluated on the 10,000 test set images of the
+[MNIST](https://huggingface.co/datasets/ylecun/mnist/) dataset.
+
+### Results
+
+Test set accuracy: 99.65%
+
+Test set cross-entropy loss: 0.011

model.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cf45c0f6be01dba4df12f028f1a2a3013764c1ff00453d2fee52a92b6fac6527
+size 44466002