---
library_name: pytorch
---
TopFormer introduces a lightweight token-pyramid transformer that progressively merges local and global representations, achieving strong accuracy–efficiency trade-offs for mobile and edge vision tasks.

Original paper: [TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation](https://arxiv.org/abs/2204.05525)
# TopFormer-B

This model uses the TopFormer-Base variant, which balances representational capacity and computational efficiency through a hierarchical token pyramid. It is well suited for on-device semantic segmentation and as an efficient backbone for downstream tasks where low latency and power efficiency are critical.
Model Configuration:
- Reference implementation: [TopFormer](https://github.com/hustvl/TopFormer)
- Original weights: [TopFormer-B_512x512_4x8_160k](https://drive.google.com/file/d/1m7CxYKWAyJzl5W3cj1vwsW4DfqAb_rqz/view?usp=sharing)
- Input resolution: 3x512x512
- Dataset: ADE20K
- Supported Cooper versions:
  - Cooper SDK: 2.5.2
  - Cooper Foundry: 2.2
| Model | Device | Model Link |
| :-----: | :-----: | :-----: |
| TopFormer-B | N1-655 | [Model_Link](https://huggingface.co/Ambarella/TopFormer/blob/main/n1-655_topformer_base.bin) |
| TopFormer-B | CV72 | [Model_Link](https://huggingface.co/Ambarella/TopFormer/blob/main/cv72_topformer_base.bin) |
| TopFormer-B | CV75 | [Model_Link](https://huggingface.co/Ambarella/TopFormer/blob/main/cv75_topformer_base.bin) |
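Whichever device binary you pick, the input must match the 3x512x512 layout listed in the configuration above. The sketch below shows one way to turn an RGB image array into that layout; the ImageNet mean/std values and the nearest-neighbour resize are assumptions for illustration, not taken from this repository, so check them against the reference implementation's preprocessing config before deploying.

```python
import numpy as np

# ImageNet per-channel mean/std in RGB order (assumed; verify against the
# reference implementation's data pipeline before use).
MEAN = np.array([123.675, 116.28, 103.53], dtype=np.float32)
STD = np.array([58.395, 57.12, 57.375], dtype=np.float32)

def preprocess(image_hwc: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 RGB image into a 1x3x512x512 float32 tensor."""
    h, w, _ = image_hwc.shape
    # Nearest-neighbour resize to 512x512 (a production pipeline would
    # typically use bilinear interpolation via OpenCV or PIL instead).
    rows = np.arange(512) * h // 512
    cols = np.arange(512) * w // 512
    resized = image_hwc[rows][:, cols].astype(np.float32)
    normalized = (resized - MEAN) / STD      # per-channel normalization
    chw = normalized.transpose(2, 0, 1)      # HWC -> CHW
    return chw[np.newaxis, ...]              # add batch dim -> 1x3x512x512

dummy = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
print(preprocess(dummy).shape)  # (1, 3, 512, 512)
```

The resulting array can then be fed to the converted model through whatever runtime the Cooper SDK exposes for your target device.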