---
datasets:
- ILSVRC/imagenet-1k
---

# Vision Transformer (ViT) trained using DINOv2 on ImageNet-1K only

Reproduction of the ViT-L/16 results from the [DINOv2 repo](https://github.com/facebookresearch/dinov2/blob/main/dinov2/configs/train/vitl16_short.yaml), which uses only ImageNet-1K at 224x224 resolution.

The [original](https://huggingface.co/facebook/dinov2-large) work uses the much larger LVD142M dataset and distills a larger model (g/14) into an L/14 model.

### How to use
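As a minimal sketch of the 224x224 input preprocessing these weights expect — assuming the standard ImageNet mean/std that the DINOv2 pipeline uses (verify against the training config); `preprocess` is an illustrative helper, not part of any released API:

```python
import numpy as np

# Standard ImageNet normalization statistics (assumption: this checkpoint
# follows the usual DINOv2 preprocessing).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image: np.ndarray) -> np.ndarray:
    """Normalize an HxWx3 uint8 image (already resized to 224x224) to CHW float32."""
    x = image.astype(np.float32) / 255.0      # scale to [0, 1]
    x = (x - IMAGENET_MEAN) / IMAGENET_STD    # channel-wise normalization
    return np.transpose(x, (2, 0, 1))         # HWC -> CHW, as the model expects

# Example: a dummy 224x224 RGB image, batched to shape (1, 3, 224, 224)
dummy = np.zeros((224, 224, 3), dtype=np.uint8)
batch = preprocess(dummy)[None]
```

The actual forward pass depends on how the checkpoint is packaged (e.g. loaded through `torch.hub` with the DINOv2 repo, or through a Transformers image processor), so only the tensor layout above is assumed here.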