---
library_name: pytorch
---
![tinyclip_logo](resource/TinyCLIP.png)
TinyCLIP is a compact vision–language model that compresses CLIP through knowledge distillation, enabling efficient image–text representation learning with significantly lower compute and memory requirements.
Original paper: [TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance, Wu et al., 2023](https://arxiv.org/abs/2309.12314)
# TinyCLIP-ViT8M16
This release packages the TinyCLIP-ViT8M16 variant, which is optimized for efficient image–text embedding generation while preserving strong zero-shot classification and retrieval performance. It is well suited to image retrieval, zero-shot classification, multimodal search, and edge vision-language deployments.
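As a rough illustration of zero-shot classification with image and text embeddings, the sketch below scores one image embedding against a set of text-prompt embeddings using cosine similarity followed by a softmax. It is a minimal sketch, not the reference pipeline: the function names are hypothetical, the 3-dimensional vectors are toy stand-ins for real encoder outputs, and the `logit_scale` of 100 is an assumption borrowed from the usual CLIP convention rather than a value confirmed for these model files.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Return softmax probabilities of each text prompt for one image embedding.

    logit_scale is an assumed temperature (CLIP-style), not a documented value.
    """
    img = l2_normalize(image_emb)
    logits = []
    for text_emb in text_embs:
        txt = l2_normalize(text_emb)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(logit_scale * cosine)
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Toy embeddings standing in for image-encoder and text-encoder outputs.
image_emb = [0.9, 0.1, 0.0]
text_embs = {
    "a photo of a cat": [1.0, 0.0, 0.0],
    "a photo of a dog": [0.0, 1.0, 0.0],
}

labels = list(text_embs)
probs = zero_shot_probs(image_emb, [text_embs[label] for label in labels])
best = labels[probs.index(max(probs))]
print(best)
```

In a real deployment, the image embedding would come from the image encoder and the prompt embeddings from the text encoder. The text embeddings can be computed once and cached, since they depend only on the label set.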
Model Configuration:
- Reference implementation: [Official TinyCLIP source code](https://github.com/microsoft/Cream/tree/main/TinyCLIP)
- Original weights: [TinyCLIP-ViT8M16](https://github.com/wkcn/TinyCLIP-model-zoo/releases/download/checkpoints/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M.pt)
- Input resolution: 3x224x224 (channels x height x width)
- Supported Cooper versions:
  - Cooper SDK: [2.5.4]
  - Cooper Foundry: [2.3]
| Model | Device | Model Link |
| :-----: | :-----: | :-----: |
| TinyCLIP-ViT8M16 Image Encoder | N1-655 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/n1-655_tinyclip_vit8m16_image_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Text Encoder | N1-655 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/n1-655_tinyclip_vit8m16_text_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Image Encoder | CV7 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv7_tinyclip_vit8m16_image_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Text Encoder | CV7 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv7_tinyclip_vit8m16_text_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Image Encoder | CV72 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv72_tinyclip_vit8m16_image_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Text Encoder | CV72 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv72_tinyclip_vit8m16_text_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Image Encoder | CV75 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv75_tinyclip_vit8m16_image_encoder_act16.bin) |
| TinyCLIP-ViT8M16 Text Encoder | CV75 | [Model_Link](https://huggingface.co/Ambarella/TinyCLIP/blob/main/cv75_tinyclip_vit8m16_text_encoder_act16.bin) |
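
The file names in the table follow a regular pattern (`<device>_tinyclip_vit8m16_<image|text>_encoder_act16.bin`), so a link can be built from a device and an encoder type. The sketch below assumes that pattern holds for every row. The helper names `model_filename` and `model_url` are hypothetical and not part of any SDK. The default `blob` view matches the links in the table, while `resolve` is the Hugging Face Hub path for fetching the raw file.

```python
BASE_URL = "https://huggingface.co/Ambarella/TinyCLIP"
DEVICES = ("n1-655", "cv7", "cv72", "cv75")  # devices listed in the table
ENCODERS = ("image", "text")


def model_filename(device, encoder):
    """Build the binary file name for a device/encoder pair (hypothetical helper)."""
    device = device.lower()
    encoder = encoder.lower()
    if device not in DEVICES:
        raise ValueError(f"unknown device: {device!r}")
    if encoder not in ENCODERS:
        raise ValueError(f"unknown encoder: {encoder!r}")
    return f"{device}_tinyclip_vit8m16_{encoder}_encoder_act16.bin"


def model_url(device, encoder, view="blob"):
    """Return the Hub URL; view='blob' is the file page, 'resolve' the raw file."""
    if view not in ("blob", "resolve"):
        raise ValueError(f"unknown view: {view!r}")
    return f"{BASE_URL}/{view}/main/{model_filename(device, encoder)}"


# Print the raw-download URL for both CV72 encoders.
for encoder in ENCODERS:
    print(model_url("CV72", encoder, view="resolve"))
```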