Add LiteRT converted beit_base_patch16_224
- README.md +61 -0
- model.tflite +3 -0
README.md
ADDED
@@ -0,0 +1,61 @@
+---
+library_name: litert
+base_model: timm/beit_base_patch16_224.in22k_ft_in22k_in1k
+tags:
+- vision
+- image-classification
+datasets:
+- imagenet-1k
+---
+
+# beit_base_patch16_224
+
+Converted TIMM image classification model for LiteRT.
+
+- Source architecture: beit_base_patch16_224
+- File: model.tflite
+
+## Model Details
+
+- **Model Type:** Image classification / feature backbone
+- **Model Stats:**
+  - Params (M): 86.5
+  - GMACs: 17.6
+  - Activations (M): 23.9
+  - Image size: 224 x 224
+- **Papers:**
+  - BEiT: BERT Pre-Training of Image Transformers: https://arxiv.org/abs/2106.08254
+  - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale: https://arxiv.org/abs/2010.11929v2
+- **Dataset:** ImageNet-1k
+- **Pretrain Dataset:** ImageNet-22k
+- **Original:** https://github.com/microsoft/unilm/tree/master/beit
+
+## Citation
+
+```bibtex
+@article{bao2021beit,
+  title={Beit: Bert pre-training of image transformers},
+  author={Bao, Hangbo and Dong, Li and Piao, Songhao and Wei, Furu},
+  journal={arXiv preprint arXiv:2106.08254},
+  year={2021}
+}
+```
+```bibtex
+@article{dosovitskiy2020vit,
+  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
+  author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
+  journal={ICLR},
+  year={2021}
+}
+```
+```bibtex
+@misc{rw2019timm,
+  author = {Ross Wightman},
+  title = {PyTorch Image Models},
+  year = {2019},
+  publisher = {GitHub},
+  journal = {GitHub repository},
+  doi = {10.5281/zenodo.4414861},
+  howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
+}
+```
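A minimal sketch of how the added `model.tflite` might be run with the LiteRT Python interpreter. Several things here are assumptions, not statements from the README: that the `ai-edge-litert`, `numpy`, and `Pillow` packages are installed, that inputs are normalized with mean and std of 0.5 per channel (the usual timm BEiT configuration), and that the input layout can be inferred from the shape the interpreter reports. The names `softmax`, `top_k`, and `classify` are hypothetical helpers written for this illustration.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return the k (class_index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def classify(model_path, image_path, k=5):
    """Run one image through the model and return the top-k predictions."""
    # Third-party imports are kept local so the helpers above work without them.
    import numpy as np
    from PIL import Image
    from ai_edge_litert.interpreter import Interpreter

    interpreter = Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    # Image size 224 x 224 comes from the model card.
    # Simplification: a plain resize, with no center crop.
    image = Image.open(image_path).convert("RGB").resize((224, 224))
    pixels = np.asarray(image, dtype=np.float32) / 255.0
    # Assumption: mean = std = 0.5 per channel, as in timm's BEiT config.
    pixels = (pixels - 0.5) / 0.5

    # Assumption: a reported shape of (1, 3, 224, 224) means the converter
    # kept PyTorch's NCHW layout; otherwise the HWC array is used as-is.
    if tuple(inp["shape"][1:]) == (3, 224, 224):
        pixels = pixels.transpose(2, 0, 1)
    batch = np.expand_dims(pixels, axis=0).astype(inp["dtype"])

    interpreter.set_tensor(inp["index"], batch)
    interpreter.invoke()
    logits = interpreter.get_tensor(out["index"])[0]
    return top_k(softmax(logits.tolist()), k)
```

Calling `classify("model.tflite", "cat.jpg")`, where the image path is a placeholder, would return index and probability pairs. Mapping those indices to ImageNet-1k label names needs a separate label file, which this commit does not add.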
model.tflite
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:75edc0de29257443e04c8cbc910ffac126bd3af276951e93eec580cc9cc217da
+size 350019648
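The three added lines are a Git LFS pointer, not the model weights: they record the SHA-256 digest and byte size of the real file. A downloaded copy can be checked against those values. The sketch below uses only the Python standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path in the usage note is a placeholder.

```python
import hashlib
import os

# Pointer contents as added in this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:75edc0de29257443e04c8cbc910ffac126bd3af276951e93eec580cc9cc217da
size 350019648
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a pointer into (sha256_hex, size_bytes)."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line)
    algorithm, digest = fields["oid"].split(":", 1)
    if algorithm != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    return digest, int(fields["size"])


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest the pointer records."""
    digest, size = parse_lfs_pointer(pointer_text)
    if os.path.getsize(path) != size:
        return False
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Hash in 1 MiB chunks so the file is never held in memory all at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == digest
```

With the weights downloaded, `matches_pointer("model.tflite", POINTER)` should return `True` for an intact copy. As a rough sanity check, 350019648 bytes spread over the 86.5 M parameters listed in the README is about 4.05 bytes per parameter, which is consistent with unquantized float32 weights, although the README does not state the precision.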