---
library_name: pytorch
---

TopFormer introduces a lightweight token–pyramid transformer that progressively merges local and global representations, achieving strong accuracy–efficiency trade-offs for mobile and edge vision tasks.
Original paper: [TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation](https://arxiv.org/abs/2204.05525)
# TopFormer-B
This model uses the TopFormer-Base variant, which balances representational capacity and computational efficiency through a hierarchical token pyramid. It is well suited for on-device semantic segmentation and as an efficient backbone for downstream tasks where low latency and power efficiency are critical.
Model configuration:
- Reference implementation: [TopFormer](https://github.com/hustvl/TopFormer)
- Original weights: [TopFormer-B_512x512_4x8_160k](https://drive.google.com/file/d/1m7CxYKWAyJzl5W3cj1vwsW4DfqAb_rqz/view?usp=sharing)
- Input resolution: 3x512x512 (CHW)
- Dataset: ADE20K
- Supported Cooper versions:
  - Cooper SDK: 2.5.2
  - Cooper Foundry: 2.2
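
Given the 3x512x512 input resolution listed above, a typical preprocessing step resizes and normalizes an RGB image into CHW layout before inference. The sketch below is an assumption, not the official Cooper SDK pipeline: the normalization constants are the standard ImageNet mean/std commonly used by TopFormer training recipes; verify them against the deployed model's configuration.

```python
import numpy as np
from PIL import Image

# Assumed ImageNet normalization constants; confirm against the actual model config.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def preprocess(image: Image.Image) -> np.ndarray:
    """Resize to 512x512, scale to [0, 1], normalize, and return a CHW float32 array."""
    img = image.convert("RGB").resize((512, 512), Image.BILINEAR)
    arr = np.asarray(img, dtype=np.float32) / 255.0  # HWC, values in [0, 1]
    arr = (arr - MEAN) / STD                         # channel-wise normalization
    return arr.transpose(2, 0, 1)                    # HWC -> CHW (3x512x512)

if __name__ == "__main__":
    dummy = Image.new("RGB", (1920, 1080))  # placeholder frame
    x = preprocess(dummy)
    print(x.shape)  # (3, 512, 512)
```

The resulting tensor matches the 3x512x512 shape expected by the compiled `.bin` models below.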

| Model | Device | Model Link |
| :-----: | :-----: | :-----: |
| TopFormer-B | N1-655 | [Model_Link](https://huggingface.co/Ambarella/TopFormer/blob/main/n1-655_topformer_base.bin) |
| TopFormer-B | CV72 | [Model_Link](https://huggingface.co/Ambarella/TopFormer/blob/main/cv72_topformer_base.bin) |
| TopFormer-B | CV75 | [Model_Link](https://huggingface.co/Ambarella/TopFormer/blob/main/cv75_topformer_base.bin) |