sdtemple/color-prediction-model

@@ -1,9 +1,71 @@
 ---
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
 ---
 This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
 - Code: [More Information Needed]
 - Paper: [More Information Needed]

 ---
+license: mit
+datasets:
+- sdtemple/colored-shapes
+language:
+- en
+metrics:
+- precision
+- recall
+- roc_auc
+- accuracy
+pipeline_tag: image-classification
 tags:
 - model_hub_mixin
 - pytorch_model_hub_mixin
+- tutorial
 ---
+This model predicts the color (one of 8) of a single shape (circle, rectangle, diamond, or triangle) in a 224 x 224 x 3 image.
+It is part of a how-to tutorial on fitting PyTorch models.
+The model is trained on 2,000 examples for each color and shape combination (64,000 samples in total), simulated according to [https://github.com/sdtemple/zootopia3](https://github.com/sdtemple/zootopia3).
+The model is evaluated on the dataset [https://huggingface.co/datasets/sdtemple/colored-shapes](https://huggingface.co/datasets/sdtemple/colored-shapes), whose simulated shapes are slightly smaller than those in the training data (out of distribution). The metrics below can vary by a few percentage points depending on the random seed.
+- Accuracy: 97%
+- Min precision (red): 91%
+- Max precision (multiple): 100%
+- Min recall (multiple): 95%
+- Max recall (multiple): 100%
+- AUROC (all): >= 99.90%
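The per-color precision and recall figures above are one-vs-rest quantities: precision for a color is the share of predictions of that color that are correct, and recall is the share of true examples of that color that are found. A minimal sketch of that computation in plain Python follows. The function name `per_class_precision_recall` and the toy labels are hypothetical; they are not taken from the evaluation code or from the dataset.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} using one-vs-rest counts."""
    true_pos = Counter()    # correct predictions per label
    pred_count = Counter()  # how often each label was predicted
    true_count = Counter()  # how often each label truly occurs
    for truth, pred in zip(y_true, y_pred):
        true_count[truth] += 1
        pred_count[pred] += 1
        if truth == pred:
            true_pos[truth] += 1

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # Guard against division by zero for labels never predicted or never present.
        precision = true_pos[label] / pred_count[label] if pred_count[label] else 0.0
        recall = true_pos[label] / true_count[label] if true_count[label] else 0.0
        scores[label] = (precision, recall)
    return scores


# Toy labels for illustration only (hypothetical, not model output).
y_true = ["red", "red", "blue", "blue", "green", "green"]
y_pred = ["red", "blue", "blue", "blue", "green", "red"]

scores = per_class_precision_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

for label, (precision, recall) in scores.items():
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
print(f"accuracy={accuracy:.2f}")
```

The minimum and maximum entries in the list would then be the smallest and largest values across the per-color scores.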
+The model architecture is shown below. In light experimentation, I found that having multiple convolutions was important and that too many parameters led to noisy per-epoch validation losses.
+```
+MyCNN(
+  (conv_block): Sequential(
+    (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+    (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
+    (3): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+    (4): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
+    (6): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+    (7): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+    (8): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
+    (9): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+    (10): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+    (11): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
+    (12): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
+    (13): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+    (14): AvgPool2d(kernel_size=2, stride=2, padding=0)
+  )
+  (linear_block): Sequential(
+    (0): Linear(in_features=784, out_features=16, bias=True)
+    (1): BatchNorm1d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+    (2): ReLU()
+    (3): Dropout(p=0.2, inplace=False)
+    (4): Linear(in_features=16, out_features=16, bias=True)
+    (5): BatchNorm1d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
+    (6): ReLU()
+    (7): Dropout(p=0.2, inplace=False)
+  )
+  (output_block): Linear(in_features=16, out_features=4, bias=True)
+)
+```
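The `in_features=784` of the first linear layer can be checked from the printout above: each 3x3 convolution with padding 1 keeps the spatial size, and each of the five 2x2 pools halves it, so a 224 x 224 input ends at 7 x 7 with 16 channels. A minimal sketch of that arithmetic in plain Python follows, using the standard output-size formula for convolution and pooling layers. It also counts trainable parameters for the layers exactly as printed. The helper names (`out_size`, `conv_params`, `linear_params`, `bn_params`) are hypothetical and not part of the tutorial code.

```python
def out_size(size, kernel, stride, padding=0):
    """Spatial output size of a conv or pool layer (dilation 1, floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


# Trace the spatial size through the five conv + pool stages of conv_block.
size = 224
trace = [size]
for _ in range(5):
    size = out_size(size, kernel=3, stride=1, padding=1)  # Conv2d 3x3: size unchanged
    size = out_size(size, kernel=2, stride=2)             # 2x2 pool: size halved
    trace.append(size)

flat_features = 16 * size * size  # channels * height * width after flattening


def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k Conv2d."""
    return (k * k * c_in + 1) * c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a Linear layer."""
    return (n_in + 1) * n_out


def bn_params(channels):
    """Trainable scale and shift of a batch norm layer (affine=True)."""
    return 2 * channels


total_params = (
    conv_params(3, 16) + 4 * conv_params(16, 16) + 5 * bn_params(16)  # conv_block
    + linear_params(784, 16) + bn_params(16)                          # linear_block
    + linear_params(16, 16) + bn_params(16)
    + linear_params(16, 4)                                            # output_block
)

print(trace)          # [224, 112, 56, 28, 14, 7]
print(flat_features)  # 784
print(total_params)   # 22852
```

By this count the network has about 22.9k trainable parameters, more than half of them in the first linear layer.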
 This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
 - Code: [More Information Needed]
 - Paper: [More Information Needed]