Update README.md
README.md
CHANGED
@@ -16,8 +16,6 @@ This repository provides Vision Transformer (ViT) models fine-tuned to detect ma
 | `vit_small_patch32_224` | AugReg + IN21k + IN1k | 32×32 | ~22M |
 | `vit_large_patch16_224` | ImageNet-21k | 16×16 | ~304M |
 | `vit_large_patch32_224` | ImageNet-21k | 32×32 | ~304M |
-| `vit_huge_patch14_224` | ImageNet-21k | 14×14 | ~632M |
-
 
 ---
 
@@ -58,6 +56,20 @@ They are well-suited for deployment in digital forensics, content moderation, an
 
 ---
 
+## 📊 Results
+
+| Vision Transformer Model | Train Accuracy | Validation Accuracy |
+|-------------------------------|----------------|---------------------|
+| **vit_large_patch16_224** | 94.89% | **91.22%** |
+| vit_large_patch32_224 | 91.31% | 89.23% |
+| vit_tiny_patch16_224 | 92.41% | 89.20% |
+| vit_small_patch32_224 | 91.58% | 88.38% |
+| vit_small_patch16_224 | 84.72% | 87.68% |
+| vit_base_patch16_224 | 90.65% | 85.36% |
+| vit_base_patch32_224 | 79.54% | 79.54% |
+
+---
+
 ## 📄 License
 
 This model is licensed under the [CreativeML OpenRAIL-M License](https://huggingface.co/spaces/CompVis/stable-diffusion-license).