metadata
library_name: pytorch
TinyCLIP is a compact vision–language model that compresses CLIP through knowledge distillation, enabling efficient image–text representation learning with significantly lower compute and memory requirements.
Original paper: TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance, Wu et al., 2023
TinyCLIP-ViT8M16
This model is the ViT8M16 variant of TinyCLIP, optimized for efficient image–text embedding generation while preserving strong zero-shot classification and retrieval performance. It is well suited to image retrieval, zero-shot classification, multimodal search, and vision–language deployments on edge devices.
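As an illustration of how the image and text embeddings are typically combined for zero-shot classification, the sketch below scores one image embedding against several prompt embeddings using cosine similarity followed by a softmax. It is a minimal sketch, not the reference implementation: the 4-dimensional toy embeddings, the prompt strings, the helper names `l2_normalize` and `zero_shot_probs`, and the `logit_scale` of 100 (a common CLIP-style temperature) are all assumptions made for illustration. Real embeddings would come from the image and text encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Return softmax probabilities of one image embedding over several text embeddings.

    logit_scale is an assumed CLIP-style temperature, not a value taken from this model.
    """
    img = l2_normalize(image_emb)
    logits = [
        logit_scale * sum(a * b for a, b in zip(img, l2_normalize(txt)))
        for txt in text_embs
    ]
    # Subtract the maximum logit before exponentiating for numerical stability.
    top = max(logits)
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy embeddings standing in for encoder outputs (hypothetical values).
image_emb = [0.9, 0.1, 0.0, 0.2]
text_embs = {
    "a photo of a cat": [1.0, 0.0, 0.0, 0.1],
    "a photo of a dog": [0.0, 1.0, 0.0, 0.0],
    "a photo of a car": [0.0, 0.0, 1.0, 0.0],
}

probs = zero_shot_probs(image_emb, list(text_embs.values()))
best_label = max(zip(text_embs, probs), key=lambda pair: pair[1])[0]
print(best_label)
```

The same similarity scores can be used for retrieval by ranking a gallery of image embeddings against a single text embedding instead of applying the softmax.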
Model Configuration:
- Reference implementation: Official TinyCLIP source code
- Original weights: TinyCLIP-ViT8M16
- Input resolution: 3x224x224 (channels x height x width)
- Supported Cooper versions:
  - Cooper SDK: [2.5.4]
  - Cooper Foundry: [2.3]
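The 3x224x224 resolution listed above implies a channels-first image layout. The following is a rough sketch of how an RGB image already resized to 224x224 might be converted into that layout with NumPy. It assumes the normalization constants published with the original OpenAI CLIP preprocessing and assumes that normalization is applied outside the compiled model; neither is confirmed by this model card, and the function name `to_model_input` is hypothetical.

```python
import numpy as np

# Assumption: per-channel mean and std from the original OpenAI CLIP preprocessing.
CLIP_MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
CLIP_STD = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)


def to_model_input(image_hwc_uint8):
    """Convert a 224x224 RGB uint8 image (H, W, C) to a float32 batch of shape (1, 3, 224, 224)."""
    if image_hwc_uint8.shape != (224, 224, 3):
        raise ValueError(f"expected shape (224, 224, 3), got {image_hwc_uint8.shape}")
    x = image_hwc_uint8.astype(np.float32) / 255.0  # scale pixel values to [0, 1]
    x = (x - CLIP_MEAN) / CLIP_STD                  # normalize each channel (broadcast over H, W)
    x = np.transpose(x, (2, 0, 1))                  # HWC -> CHW
    return x[np.newaxis, ...]                       # add a leading batch dimension


# A blank image stands in for a real, already resized photo.
dummy = np.zeros((224, 224, 3), dtype=np.uint8)
batch = to_model_input(dummy)
print(batch.shape)
```

If the deployed encoder already includes normalization, the mean and std step should be skipped.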

| Model | Device | Model Link |
|---|---|---|
| TinyCLIP-ViT8M16 Image Encoder | N1-655 | Model_Link |
| TinyCLIP-ViT8M16 Text Encoder | N1-655 | Model_Link |
| TinyCLIP-ViT8M16 Image Encoder | CV7 | Model_Link |
| TinyCLIP-ViT8M16 Text Encoder | CV7 | Model_Link |
| TinyCLIP-ViT8M16 Image Encoder | CV72 | Model_Link |
| TinyCLIP-ViT8M16 Text Encoder | CV72 | Model_Link |
| TinyCLIP-ViT8M16 Image Encoder | CV75 | Model_Link |
| TinyCLIP-ViT8M16 Text Encoder | CV75 | Model_Link |
