cooper_robot committed on
Commit
d659e57
·
1 Parent(s): 979f6ca

Add release note for v1.1.0

Files changed (2)
  1. README.md +30 -0
  2. resource/LongCLIP.png +3 -0
README.md ADDED
@@ -0,0 +1,30 @@
+ ---
+ library_name: pytorch
+ ---
+
+ ![longclip_logo](resource/LongCLIP.png)
+
+ LongCLIP extends the CLIP vision–language framework to support significantly longer text inputs, enabling richer contextual understanding while preserving strong image–text alignment.
+
+ Original paper: [Long-CLIP: Unlocking Long-Text Capability in CLIP, Zhang et al., 2024](https://arxiv.org/abs/2403.15378)
+
+ # LongCLIP-B16
+
+ This model uses the LongCLIP B/16 variant, which is based on a ViT-Base backbone with 16×16 image patches and enhanced long-text encoding capacity. It is well suited for vision–language applications such as image retrieval, zero-shot classification, and multimodal reasoning where long textual prompts or descriptions are important.
+
+ Model Configuration:
+ - Reference implementation: [LongCLIP-B16](https://github.com/beichenzbc/Long-CLIP)
+ - Original weights: [LongCLIP-B16](https://huggingface.co/BeichenZhang/LongCLIP-B/blob/main/longclip-B.pt)
+ - Resolution: 3x224x224
+ - Supported Cooper versions:
+   - Cooper SDK: [2.5.2]
+   - Cooper Foundry: [2.2]
+
+ | Model | Device | Model Link |
+ | :-----: | :-----: | :-----: |
+ | LongCLIP-B16 Image Encoder | N1-655 | [Model_Link](https://huggingface.co/Ambarella/LongCLIP/blob/main/n1-655_longclip_base_patch16_image_encoder.bin) |
+ | LongCLIP-B16 Text Encoder | N1-655 | [Model_Link](https://huggingface.co/Ambarella/LongCLIP/blob/main/n1-655_longclip_base_patch16_text_encoder.bin) |
+ | LongCLIP-B16 Image Encoder | CV72 | [Model_Link](https://huggingface.co/Ambarella/LongCLIP/blob/main/cv72_longclip_base_patch16_image_encoder.bin) |
+ | LongCLIP-B16 Text Encoder | CV72 | [Model_Link](https://huggingface.co/Ambarella/LongCLIP/blob/main/cv72_longclip_base_patch16_text_encoder.bin) |
+ | LongCLIP-B16 Image Encoder | CV75 | [Model_Link](https://huggingface.co/Ambarella/LongCLIP/blob/main/cv75_longclip_base_patch16_image_encoder.bin) |
+ | LongCLIP-B16 Text Encoder | CV75 | [Model_Link](https://huggingface.co/Ambarella/LongCLIP/blob/main/cv75_longclip_base_patch16_text_encoder.bin) |
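
The README ships the image encoder and text encoder as separate binaries and names zero-shot classification as a target use case. As a rough illustration of how the two outputs are typically combined in CLIP-style models, the sketch below scores one image embedding against several text embeddings using cosine similarity followed by a softmax. It is not part of this commit and does not use the Cooper SDK: the function names, the toy 4-dimensional vectors, and the `logit_scale` of 100 (a value commonly seen in CLIP-style models) are all assumptions for illustration, and real embeddings would come from running the compiled encoders on the target device.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def zero_shot_probs(image_emb, text_embs, logit_scale=100.0):
    """Softmax over scaled cosine similarities between one image and N texts.

    logit_scale is an assumed temperature, not a value taken from the README.
    """
    img = l2_normalize(image_emb)
    logits = []
    for text_emb in text_embs:
        txt = l2_normalize(text_emb)
        logits.append(logit_scale * sum(a * b for a, b in zip(img, txt)))
    # Subtract the max logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vectors standing in for real encoder outputs (hypothetical values).
image_embedding = [0.9, 0.1, 0.0, 0.2]
text_embeddings = [
    [0.8, 0.2, 0.1, 0.1],  # hypothetical prompt A
    [0.0, 0.9, 0.3, 0.0],  # hypothetical prompt B
]

probs = zero_shot_probs(image_embedding, text_embeddings)
best = max(range(len(probs)), key=probs.__getitem__)
print(best, [round(p, 3) for p in probs])
```

For image retrieval the same similarity can be computed in the other direction, ranking many image embeddings against a single text embedding.
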
resource/LongCLIP.png ADDED

Git LFS Details

  • SHA256: 361a2087d07b38eafdf908168b471305864673f6c50f819324bba56abcb0d812
  • Pointer size: 132 Bytes
  • Size of remote file: 1.21 MB