| --- |
| library_name: pytorch |
| --- |
| |
|  |
|
|
| TinyCLIP is a compact vision–language model that compresses CLIP through knowledge distillation, enabling efficient image–text representation learning with significantly lower compute and memory requirements. |
|
|
| Original paper: [TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance, Wu et al., 2023](https://arxiv.org/abs/2309.12314) |
|
|
| # TinyCLIP-ViT8M16 |
|
|
| This model card covers the ViT-8M/16 variant of TinyCLIP (checkpoint `TinyCLIP-ViT-8M-16-Text-3M-YFCC15M`), which is optimized for efficient image–text embedding generation while retaining zero-shot classification and retrieval capability. It is well suited to image retrieval, zero-shot classification, multimodal search, and edge vision–language deployments. |
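As an illustration of the zero-shot classification use case, the sketch below scores one image embedding against several text embeddings with cosine similarity and a softmax. It is a minimal, dependency-free example, not the reference implementation: the function names, the toy 3-dimensional embeddings, and the `logit_scale` of 100 (the value commonly used by CLIP-style models) are assumptions, and real embeddings would come from the image and text encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Return one probability per text prompt for a single image embedding.

    logit_scale is an assumed temperature; CLIP-style models typically learn it.
    """
    img = l2_normalize(image_emb)
    logits = []
    for text_emb in text_embs:
        txt = l2_normalize(text_emb)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(logit_scale * cosine)
    # Numerically stable softmax over the prompts.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy embeddings standing in for encoder outputs (hypothetical values).
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
text_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_embedding = [0.9, 0.1, 0.0]

probs = zero_shot_probs(image_embedding, text_embeddings)
best_label = labels[probs.index(max(probs))]
```

The same similarity scores can be used for retrieval by ranking a gallery of image embeddings against one text embedding instead of applying a softmax.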
|
|
| Model Configuration: |
| - Reference implementation: [Official TinyCLIP source code](https://github.com/microsoft/Cream/tree/main/TinyCLIP) |
| - Original weights: [TinyCLIP-ViT8M16](https://github.com/wkcn/TinyCLIP-model-zoo/releases/download/checkpoints/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M.pt) |
| - Resolution: 3x224x224 |
| - Supported Cooper versions: |
|   - Cooper SDK: [2.5.4] |
|   - Cooper Foundry: [2.3] |
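The 3x224x224 resolution above implies a channels-first input tensor. The sketch below shows one way to turn a 224x224 RGB image stored as height x width x channel `uint8` values into that layout, using the channel mean and standard deviation from the original OpenAI CLIP preprocessing. Whether the compiled Cooper binaries expect normalized floats or raw pixels is not stated in this card, so treat the normalization step and the helper name `hwc_uint8_to_chw_float` as assumptions to verify against the Cooper documentation.

```python
# Channel statistics from the original OpenAI CLIP preprocessing.
CLIP_MEAN = (0.48145466, 0.4578275, 0.40821073)
CLIP_STD = (0.26862954, 0.26130258, 0.27577711)


def hwc_uint8_to_chw_float(pixels, size=224):
    """Convert a size x size list of (R, G, B) uint8 pixels to a 3 x size x size float list.

    Assumes the image has already been resized and center-cropped to size x size.
    """
    if len(pixels) != size or any(len(row) != size for row in pixels):
        raise ValueError(f"expected a {size}x{size} image")
    chw = []
    for channel in range(3):
        plane = [
            [(px[channel] / 255.0 - CLIP_MEAN[channel]) / CLIP_STD[channel] for px in row]
            for row in pixels
        ]
        chw.append(plane)
    return chw


# Hypothetical solid-color image used only to exercise the function.
image = [[(255, 0, 128)] * 224 for _ in range(224)]
tensor = hwc_uint8_to_chw_float(image)
shape = (len(tensor), len(tensor[0]), len(tensor[0][0]))
```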
|
|
| Model | Device | Model Link |
| :-----: | :-----: | :-----: |
| TinyCLIP-ViT8M16 Image Encoder | N1-655 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/n1-655_tinyclip_vit8m16_image_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Text Encoder | N1-655 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/n1-655_tinyclip_vit8m16_text_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Image Encoder | CV7 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv7_tinyclip_vit8m16_image_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Text Encoder | CV7 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv7_tinyclip_vit8m16_text_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Image Encoder | CV72 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv72_tinyclip_vit8m16_image_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Text Encoder | CV72 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv72_tinyclip_vit8m16_text_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Image Encoder | CV75 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv75_tinyclip_vit8m16_image_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Text Encoder | CV75 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv75_tinyclip_vit8m16_text_encoder_act16.bin) |
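The links in the table follow a regular naming pattern, so a small helper can build them per device and encoder. The sketch below is illustrative only: `artifact_filename` and `artifact_url` are hypothetical helper names, and the pattern is inferred from the eight links listed here. On the Hugging Face Hub, `blob` URLs open the file's web page while `resolve` URLs point at the raw file, so the helper defaults to `resolve` for direct downloads. It performs no network access and does not check that a file exists or is publicly accessible.

```python
BASE_URL = "https://huggingface.co/Ambarella/TinyCLIP"
DEVICES = ("n1-655", "cv7", "cv72", "cv75")  # devices listed in the table
ENCODERS = ("image", "text")


def artifact_filename(device, encoder):
    """Build the binary file name for a device/encoder pair (pattern inferred from the table)."""
    device = device.lower()
    if device not in DEVICES:
        raise ValueError(f"unknown device: {device}")
    if encoder not in ENCODERS:
        raise ValueError(f"unknown encoder: {encoder}")
    return f"{device}_tinyclip_vit8m16_{encoder}_encoder_act16.bin"


def artifact_url(device, encoder, view="resolve"):
    """Return the Hub URL; view='blob' is the web page, view='resolve' is the raw file."""
    if view not in ("blob", "resolve"):
        raise ValueError("view must be 'blob' or 'resolve'")
    return f"{BASE_URL}/{view}/main/{artifact_filename(device, encoder)}"


page_url = artifact_url("CV72", "text", view="blob")
download_url = artifact_url("N1-655", "image")
```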
|
|