---
license: apache-2.0
tags:
- self-supervised learning
- vision
- SiT
inference: false
---
# Model description
SiT (Self-supervised vIsion Transformer) is a self-supervised learning model that combines masked image modeling with contrastive learning. This checkpoint was pre-trained on ImageNet-1K.
# Model Sources
- Repository: https://github.com/Sara-Ahmed/SiT
- Paper: https://arxiv.org/abs/2104.03602
# Model Card Authors
Sara Atito, Muhammad Awais, Josef Kittler
# How to use
```python
# modeling_sit.py is the modeling file from the SiT repository
# (https://github.com/Sara-Ahmed/SiT); it must be on your Python path.
from modeling_sit import ViTSiTForPreTraining

# Reload the pre-trained weights from the Hub.
model = ViTSiTForPreTraining.from_pretrained("erow/SiT")
```
# BibTeX entry and citation info
```bibtex
@inproceedings{atito2023sit,
  title={SiT is all you need},
  author={Atito, Sara and Awais, Muhammed and Nandam, Srinivasa and Kittler, Josef},
  booktitle={2023 IEEE International Conference on Image Processing (ICIP)},
  pages={2125--2129},
  year={2023},
  organization={IEEE}
}
```