---
library_name: pytorch
---
Swin Transformer V2 extends the original Swin Transformer with scaled cosine attention, a log-spaced continuous relative position bias, and residual post-normalization. These changes stabilize training as model capacity grows and improve transfer to higher-resolution inputs.

Original paper: [Swin Transformer V2: Scaling Up Capacity and Resolution](https://arxiv.org/abs/2111.09883)
# Swin Transformer V2-Tiny

This model uses the Swin Transformer V2-Tiny variant, a compact hierarchical transformer that applies shifted window self-attention for efficient computation. It is well suited for high-resolution image classification and as a backbone for dense vision tasks such as detection and segmentation.
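To make the hierarchical, window-based layout concrete, the sketch below computes the feature-map size of each stage for a 256x256 input. The hyperparameters (patch size 4, embedding width 96, depths 2/2/6/2, heads 3/6/12/24, window size 8) are assumed from the torchvision `swin_v2_t` definition. The helper `stage_shapes` is a hypothetical name used only for illustration.

```python
from dataclasses import dataclass

# Assumed hyperparameters of torchvision's swin_v2_t (not stated in this card).
PATCH_SIZE = 4
EMBED_DIM = 96
DEPTHS = (2, 2, 6, 2)
NUM_HEADS = (3, 6, 12, 24)
WINDOW_SIZE = 8


@dataclass
class StageShape:
    stage: int
    height: int
    width: int
    channels: int
    windows: int  # number of attention windows covering the feature map
    blocks: int
    heads: int


def stage_shapes(height: int = 256, width: int = 256) -> list:
    """Return the per-stage feature-map layout for a given input size.

    Simplification: assumes the input size divides evenly at every stage,
    which holds for 256x256.
    """
    h, w = height // PATCH_SIZE, width // PATCH_SIZE  # patch embedding
    c = EMBED_DIM
    shapes = []
    for i, (depth, heads) in enumerate(zip(DEPTHS, NUM_HEADS)):
        if i > 0:
            # Patch merging halves the resolution and doubles the channels.
            h, w, c = h // 2, w // 2, c * 2
        windows = ((h + WINDOW_SIZE - 1) // WINDOW_SIZE) * (
            (w + WINDOW_SIZE - 1) // WINDOW_SIZE
        )
        shapes.append(StageShape(i + 1, h, w, c, windows, depth, heads))
    return shapes


if __name__ == "__main__":
    for shape in stage_shapes():
        print(shape)
```

Under these assumptions the last stage is a single 8x8 window with 768 channels. That coarse feature map, together with the earlier stages, is what a detection or segmentation head would typically consume.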

Model Configuration:
- Reference implementation: [torchvision.models.swin_v2_t](https://pytorch.org/vision/0.20/models/swin_transformer.html)
- Original weights: [Swin_V2_T_Weights.IMAGENET1K_V1](https://download.pytorch.org/models/swin_v2_t-b137f0e2.pth)
- Input resolution: 3x256x256 (channels x height x width)
- Supported Cooper versions:
  - Cooper SDK: [2.5.2]
  - Cooper Foundry: [2.2]

| Model | Device | Model Link |
| :-----: | :-----: | :-----: |
| SwinV2-Tiny | N1-655 | [Model_Link](https://huggingface.co/Ambarella/SwinV2/blob/main/n1-655_swin_tiny_v2.bin) |
| SwinV2-Tiny | CV72 | [Model_Link](https://huggingface.co/Ambarella/SwinV2/blob/main/cv72_swin_tiny_v2.bin) |
| SwinV2-Tiny | CV75 | [Model_Link](https://huggingface.co/Ambarella/SwinV2/blob/main/cv75_swin_tiny_v2.bin) |