---
library_name: pytorch
---
Swin Transformer V2 extends the original Swin Transformer with improved attention scaling and positional encoding, enabling stable training and strong performance on very large and high-resolution vision datasets.
Original paper: *Swin Transformer V2: Scaling Up Capacity and Resolution*
Swin Transformer V2-Tiny
This model uses the Swin Transformer V2-Tiny variant, a compact hierarchical transformer that applies shifted window self-attention for efficient computation. It is well suited for high-resolution image classification and as a backbone for dense vision tasks such as detection and segmentation.
Model Configuration:
- Reference implementation: `torchvision.models.swin_v2_t`
- Original weights: `Swin_V2_T_Weights.IMAGENET1K_V1`
- Input resolution: 3x256x256 (CxHxW)
- Supported Cooper versions:
  - Cooper SDK: [2.5.2]
  - Cooper Foundry: [2.2]
| Model | Device | Model Link |
|---|---|---|
| SwinV2-Tiny | N1-655 | Model_Link |
| SwinV2-Tiny | CV72 | Model_Link |
| SwinV2-Tiny | CV75 | Model_Link |
