---
license: mit
tags:
  - kornia
  - image-classification
  - backbone
---

# kornia/tiny_vit

Pretrained weights for **TinyViT**,
used as the encoder backbone in
[`kornia.models.SegmentAnything`](https://kornia.readthedocs.io/en/latest/models.html)
(MobileSAM) and available via
[`kornia.models.TinyViT`](https://kornia.readthedocs.io/en/latest/models.html).

TinyViT (ECCV 2022) is a family of small, efficient Vision Transformers pretrained
with fast knowledge distillation from large teacher models on ImageNet-22K.

**Original repo:** [microsoft/Cream/TinyViT](https://github.com/microsoft/Cream/tree/main/TinyViT)

## Weights

| File | Params | Pre-training | Fine-tuning (resolution) |
|------|--------|--------------|--------------------------|
| `tiny_vit_5m_22k_distill.pth` | 5M | ImageNet-22K | — |
| `tiny_vit_5m_22kto1k_distill.pth` | 5M | ImageNet-22K | ImageNet-1K (224) |
| `tiny_vit_11m_22k_distill.pth` | 11M | ImageNet-22K | — |
| `tiny_vit_11m_22kto1k_distill.pth` | 11M | ImageNet-22K | ImageNet-1K (224) |
| `tiny_vit_21m_22k_distill.pth` | 21M | ImageNet-22K | — |
| `tiny_vit_21m_22kto1k_distill.pth` | 21M | ImageNet-22K | ImageNet-1K (224) |
| `tiny_vit_21m_22kto1k_384_distill.pth` | 21M | ImageNet-22K | ImageNet-1K (384) |
| `tiny_vit_21m_22kto1k_512_distill.pth` | 21M | ImageNet-22K | ImageNet-1K (512) |
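The checkpoint filenames follow a consistent pattern: variant size (`5m`/`11m`/`21m`), pretraining stage (`22k` or `22kto1k`), an optional fine-tuning resolution, and a `_distill` suffix. A minimal helper sketch of that convention (the function `checkpoint_name` is hypothetical, not part of kornia; the download/load calls in the trailing comment are assumptions about the `huggingface_hub` and kornia APIs):

```python
def checkpoint_name(variant: str, finetuned_1k: bool = True, resolution: int = 224) -> str:
    """Build a checkpoint filename matching the table above.

    `variant` is "5m", "11m", or "21m". Only the 21M model ships
    384/512 fine-tuned weights. Illustrative helper only.
    """
    if resolution != 224 and (variant != "21m" or not finetuned_1k):
        raise ValueError("384/512 checkpoints exist only for the 21M 22kto1k model")
    stage = "22kto1k" if finetuned_1k else "22k"
    res = f"_{resolution}" if resolution != 224 else ""
    return f"tiny_vit_{variant}_{stage}{res}_distill.pth"


# To fetch and load a checkpoint (assumes `huggingface_hub` and `torch`
# are installed; exact kornia model-construction API may differ):
#   from huggingface_hub import hf_hub_download
#   import torch
#   path = hf_hub_download("kornia/tiny_vit", checkpoint_name("5m"))
#   state_dict = torch.load(path, map_location="cpu")
```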

## Citation

```bibtex
@inproceedings{wu2022tinyvit,
    title     = {{TinyViT}: Fast Pretraining Distillation for Small Vision Transformers},
    author    = {Wu, Kan and Zhang, Jinnian and Peng, Houwen and Liu, Mengchen
                 and Xiao, Bin and Fu, Jianlong and Yuan, Lu},
    booktitle = {ECCV},
    year      = {2022}
}
```