metadata
library_name: pytorch
TopFormer introduces a lightweight token–pyramid transformer that progressively merges local and global representations, achieving strong accuracy–efficiency trade-offs for mobile and edge vision tasks.
Original paper: TopFormer: Token Pyramid Transformer for Mobile Vision
TopFormer-B
This model uses the TopFormer-Base (TopFormer-B) variant, which balances representational capacity and computational efficiency through a hierarchical token pyramid. The weights are trained on ADE20K, so the model is suited to on-device semantic segmentation, and it can also serve as an efficient backbone for downstream tasks where low latency and power efficiency are critical.
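To make the hierarchical token pyramid more concrete, the sketch below computes the spatial size of each pyramid scale for a given input size. The function name `pyramid_token_shapes`, the strides `(4, 8, 16, 32)`, and the pooling stride `64` are assumptions based on the general description in the TopFormer paper. They are not read from this checkpoint or from the Cooper toolchain.

```python
def pyramid_token_shapes(height, width, strides=(4, 8, 16, 32), pool_stride=64):
    """Spatial sizes of each pyramid scale and of the pooled token grid.

    The default strides and pool_stride are illustrative assumptions,
    not values read from the released checkpoint.
    """
    if height % pool_stride or width % pool_stride:
        raise ValueError("height and width must be multiples of pool_stride")
    # One (h, w) feature-map size per pyramid scale.
    scales = [(height // s, width // s) for s in strides]
    # In this sketch, every scale is pooled to one shared, much smaller grid.
    pooled = (height // pool_stride, width // pool_stride)
    return scales, pooled


scales, pooled = pyramid_token_shapes(512, 512)
print(scales)  # [(128, 128), (64, 64), (32, 32), (16, 16)]
print(pooled)  # (8, 8)
```

Under these assumptions, a 512x512 input yields an 8x8 grid, so the global attention stage would operate on only 64 token positions, which is where most of the efficiency comes from.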
Model Configuration:
- Reference implementation: TopFormer
- Original weights: TopFormer-B_512x512_4x8_160k
- Resolution: 3x512x512
- Dataset: ADE20K
- Supported Cooper versions:
  - Cooper SDK: [2.5.2]
  - Cooper Foundry: [2.2]
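
When allocating input buffers on a device, it can help to turn the `3x512x512` resolution listed above into a shape and a byte count. The sketch below does this in plain Python. The helper names `parse_resolution` and `input_buffer_bytes` are hypothetical, and the defaults (batch size 1, 4 bytes per element as for float32) are assumptions, because this card does not state the input data type of the compiled model.

```python
def parse_resolution(spec):
    """Parse a 'CxHxW' string such as '3x512x512' into a tuple of ints."""
    dims = tuple(int(part) for part in spec.lower().split("x"))
    if len(dims) != 3 or any(d <= 0 for d in dims):
        raise ValueError(f"expected 'CxHxW' with positive sizes, got {spec!r}")
    return dims


def input_buffer_bytes(spec, bytes_per_element=4, batch_size=1):
    """Size in bytes of one dense input buffer (assumed float32, batch 1)."""
    channels, height, width = parse_resolution(spec)
    return batch_size * channels * height * width * bytes_per_element


shape = parse_resolution("3x512x512")
print(shape)                            # (3, 512, 512)
print(input_buffer_bytes("3x512x512"))  # 3145728
```

With these defaults, one input occupies 3 MiB. Passing `bytes_per_element=1` gives the size for an 8-bit input instead.
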
| Model | Device | Model Link |
|---|---|---|
| TopFormer-B | N1-655 | Model_Link |
| TopFormer-B | CV72 | Model_Link |
| TopFormer-B | CV75 | Model_Link |
