Add files using upload-large-folder tool
README.md CHANGED

@@ -14,6 +14,7 @@ tags:
 ## What’s different vs standard CLIP
 - Based on CLIP ViT-B/32 (from [laion/CLIP-ViT-B-32-laion2B-s34B-b79K](https://huggingface.co/laion/CLIP-ViT-B-32-laion2B-s34B-b79K))
 - **Longer text context**: `max_position_embeddings=512` (vs the usual 77).
+- Train data: 1.67M image-caption pairs, captions regenerated by Qwen2.5-VL-72B (512-token max length), images sampled from LAION-2B.
 ## Usage
 
 ```python
@@ -43,9 +44,6 @@ with torch.no_grad():
 print(probs[0].tolist())
 ```
 
-## Training data
-- Train data: 1.67M image-caption pairs, captions regenerated by Qwen2.5-VL-72B (512-token max length), images sampled from LAION-2B.
-
 ## Zero-shot classification example
 
 ```python
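The 512-token text context noted in the diff implies a position-embedding table much larger than stock CLIP's 77 slots. A common way to initialize the extra slots when stretching a pretrained table (an assumption for illustration here, not this model's documented recipe) is linear interpolation along the position axis; a minimal numpy sketch:

```python
import numpy as np

def extend_position_embeddings(table: np.ndarray, new_len: int) -> np.ndarray:
    """Stretch a (old_len, dim) position-embedding table to (new_len, dim)
    by linearly interpolating each embedding dimension along positions."""
    old_len, dim = table.shape
    old_x = np.linspace(0.0, 1.0, old_len)
    new_x = np.linspace(0.0, 1.0, new_len)
    return np.stack(
        [np.interp(new_x, old_x, table[:, d]) for d in range(dim)], axis=1
    )

# Stand-in for a pretrained CLIP text table: 77 positions, 512-dim embeddings.
old_table = np.random.randn(77, 512).astype(np.float32)
new_table = extend_position_embeddings(old_table, 512)
print(new_table.shape)  # (512, 512)
```

The first and last rows of the stretched table match the original endpoints exactly; intermediate rows are blends of their neighbors.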
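The zero-shot classification block is truncated in this diff, but the logic behind the `print(probs[0].tolist())` line reduces to cosine similarity between L2-normalized image and text embeddings, scaled by CLIP's learned logit scale (about 100), followed by a softmax over the class prompts. A self-contained numpy sketch with random stand-in embeddings (the real ones come from the model's image and text towers; all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
image_emb = rng.normal(size=(1, 512))  # one image embedding
text_embs = rng.normal(size=(3, 512))  # one embedding per class prompt

def l2_normalize(x: np.ndarray, axis: int = -1) -> np.ndarray:
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

# Cosine similarities scaled by ~100 (CLIP's typical learned logit scale),
# then a numerically stable softmax over the class prompts.
logits = 100.0 * l2_normalize(image_emb) @ l2_normalize(text_embs).T
exp = np.exp(logits - logits.max())
probs = exp / exp.sum()

print(probs[0].tolist())  # one probability per class prompt, summing to 1
```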