pvt_v2_b2

Converted TIMM image classification model for LiteRT.

  • Source architecture: pvt_v2_b2
  • Source checkpoint: timm/pvt_v2_b2.in1k
  • File: model.tflite
  • Input: float32 tensor in NCHW layout, shape [1, 3, 224, 224]
  • Output: ImageNet-1K logits, shape [1, 1000]
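The input and output contract above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the reference pipeline for this model: the helper names (`to_nchw_input`, `top_k`, `run_model`) are hypothetical, the normalization constants are the standard ImageNet statistics and should be confirmed against the source checkpoint's data config, and inference uses the `Interpreter` class from the `ai_edge_litert` package rather than the CompiledModel API used in the smoke test. Resizing and cropping to 224x224 are left to the caller.

```python
import numpy as np

# Assumed preprocessing constants (standard ImageNet statistics); confirm
# them against the source checkpoint's data config before relying on them.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_nchw_input(image_hwc_uint8: np.ndarray) -> np.ndarray:
    """Convert a 224x224 RGB uint8 image (HWC) to a [1, 3, 224, 224] float32 tensor."""
    if image_hwc_uint8.shape != (224, 224, 3):
        raise ValueError(f"expected (224, 224, 3), got {image_hwc_uint8.shape}")
    scaled = image_hwc_uint8.astype(np.float32) / 255.0    # scale to [0, 1]
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD   # broadcasts over the channel-last axis
    chw = np.transpose(normalized, (2, 0, 1))              # HWC -> CHW
    return np.ascontiguousarray(chw[np.newaxis, ...], dtype=np.float32)


def top_k(logits: np.ndarray, k: int = 5):
    """Return (class_index, probability) pairs for the k largest logits."""
    scores = logits.reshape(-1).astype(np.float64)
    exp = np.exp(scores - scores.max())                    # numerically stable softmax
    probs = exp / exp.sum()
    best = np.argsort(probs)[::-1][:k]
    return [(int(i), float(probs[i])) for i in best]


def run_model(model_path: str, input_tensor: np.ndarray) -> np.ndarray:
    """Run one forward pass on CPU and return the [1, 1000] logits."""
    # Imported lazily so the helpers above work without LiteRT installed.
    from ai_edge_litert.interpreter import Interpreter

    interpreter = Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]
    interpreter.set_tensor(input_index, input_tensor)
    interpreter.invoke()
    return interpreter.get_tensor(output_index)
```

With an RGB image already resized to 224x224 as a uint8 array `image`, a call such as `top_k(run_model("model.tflite", to_nchw_input(image)))` would return the five most likely ImageNet-1K class indices with their softmax probabilities.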

Runtime Status

  • CPU smoke test: passed with LiteRT CompiledModel.
  • GPU delegation: currently blocked. The model contains rank-5 tensor patterns that the GPU backend does not yet handle, mostly in RESHAPE, TRANSPOSE, and related window/attention operations. The model is published as CPU-ready while GPU support is being improved.
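One way to see where the rank-5 tensors mentioned above occur is to list every tensor in the model whose rank is 5 or higher. The sketch below is illustrative only: `find_high_rank_tensors` and `load_tensor_details` are hypothetical helper names, the loader assumes the `Interpreter` class from the `ai_edge_litert` package, and the result names tensors rather than the operations that produce them. Whether a given tensor actually prevents delegation depends on the GPU backend.

```python
def find_high_rank_tensors(tensor_details, min_rank=5):
    """Return (name, shape) for every tensor whose rank is at least min_rank.

    tensor_details is a list of dicts with "name" and "shape" keys, in the
    form returned by Interpreter.get_tensor_details().
    """
    hits = []
    for detail in tensor_details:
        shape = [int(d) for d in detail["shape"]]
        if len(shape) >= min_rank:
            hits.append((detail["name"], shape))
    return hits


def load_tensor_details(model_path):
    """Read tensor metadata from a .tflite file (assumes ai_edge_litert is installed)."""
    # Imported lazily so find_high_rank_tensors can be used on its own.
    from ai_edge_litert.interpreter import Interpreter

    interpreter = Interpreter(model_path=model_path)
    return interpreter.get_tensor_details()
```

For example, `find_high_rank_tensors(load_tensor_details("model.tflite"))` would list the candidate tensors to inspect in a model viewer.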

Model Details

Citation

@article{wang2021pvtv2,
  title={{PVT} v2: Improved baselines with pyramid vision transformer},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Fan, Deng-Ping and Song, Kaitao and Liang, Ding and Lu, Tong and Luo, Ping and Shao, Ling},
  journal={Computational Visual Media},
  volume={8},
  number={3},
  pages={1--10},
  year={2022},
  publisher={Springer}
}