---
library_name: litert
base_model: timm/beit_base_patch16_224.in22k_ft_in22k_in1k
tags:
- vision
- image-classification
datasets:
- imagenet-1k
---

# beit_base_patch16_224
A timm image classification model converted to the LiteRT (formerly TensorFlow Lite) format.
- Source architecture: beit_base_patch16_224
- File: model.tflite
## Model Details
- Model Type: Image classification / feature backbone
- Model Stats:
  - Params (M): 86.5
  - GMACs: 17.6
  - Activations (M): 23.9
  - Image size: 224 x 224
- Papers:
  - BEiT: BERT Pre-Training of Image Transformers: https://arxiv.org/abs/2106.08254
  - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale: https://arxiv.org/abs/2010.11929v2
- Dataset: ImageNet-1k
- Pretrain Dataset: ImageNet-22k
- Original: https://github.com/microsoft/unilm/tree/master/beit
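
## Usage

A minimal inference sketch using the `ai-edge-litert` Python package. The normalization constants below are timm's standard ImageNet defaults and are an assumption here; verify them against the source model's preprocessing config, as is the `model.tflite` path and the NHWC float32 input layout.

```python
import numpy as np

# timm's default ImageNet normalization -- an assumption; check the
# source model's pretrained_cfg for the exact values.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(image: np.ndarray) -> np.ndarray:
    """Normalize a 224x224x3 float image in [0, 1] and add a batch dim."""
    image = (image.astype(np.float32) - IMAGENET_MEAN) / IMAGENET_STD
    return np.expand_dims(image, axis=0)  # shape: (1, 224, 224, 3)


def classify(model_path: str, image: np.ndarray) -> np.ndarray:
    """Run one image through the LiteRT interpreter; returns the logits."""
    # pip install ai-edge-litert
    from ai_edge_litert.interpreter import Interpreter

    interpreter = Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]
    interpreter.set_tensor(inp["index"], preprocess(image))
    interpreter.invoke()
    return interpreter.get_tensor(out["index"])
```

For example, `classify("model.tflite", img)` on a 224x224 RGB image scaled to `[0, 1]` returns a logits array whose top indices can be mapped to ImageNet-1k class labels with `np.argsort(logits[0])[::-1][:5]`.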
## Citation

```bibtex
@article{bao2021beit,
  title={Beit: Bert pre-training of image transformers},
  author={Bao, Hangbo and Dong, Li and Piao, Songhao and Wei, Furu},
  journal={arXiv preprint arXiv:2106.08254},
  year={2021}
}
```

```bibtex
@article{dosovitskiy2020vit,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
  journal={ICLR},
  year={2021}
}
```

```bibtex
@misc{rw2019timm,
  author = {Ross Wightman},
  title = {PyTorch Image Models},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  doi = {10.5281/zenodo.4414861},
  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}
```