Unleash KO-CLIP
README.md
CHANGED
@@ -1,5 +1,10 @@
 ---
 license: mit
+datasets:
+- SPRIGHT-T2I/spright_coco
+- zer0int/CLIP-KO-Adversarial-Train-Typo-Attack
+base_model:
+- openai/clip-vit-base-patch16
 ---
 # CLIP-KO: Knocking Out Typographic Attacks in CLIP 💪🤖
 ### Less vulnerability, much better performance! 🤗
@@ -133,4 +138,10 @@ No more artifacts in attention heatmaps!
 | Flickr8k | Img-Text Cos Sim (mean) ↑ | 0.3359 | **0.3368** |
 | Flickr8k | Img-Text Cos Sim (std) | 0.0409 | 0.0619 |
 | Flickr8k | Text-Text Cos Sim (mean) | 0.8021 | 0.7356 |
-| Flickr8k | Text-Text Cos Sim (std) | 0.0857 | 0.1407 |
+| Flickr8k | Text-Text Cos Sim (std) | 0.0857 | 0.1407 |
+
+Attention head max salience visualization (code is on my GitHub!)
+
+Reading words.
+
+
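The Flickr8k rows in the diff report a mean and a standard deviation of cosine similarity, but the evaluation script itself is not part of this change. As a rough sketch of how such a statistic could be computed, assuming it is taken over aligned embedding pairs (image/caption or caption/caption) and assuming a population standard deviation, something like the following would do. The function names and the toy vectors are hypothetical, not taken from the repository.

```python
import math
from statistics import mean, pstdev


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def paired_cos_sim_stats(left, right):
    """Mean and population std of cosine similarity over aligned pairs.

    Assumption: left[i] and right[i] belong together, e.g. an image
    embedding and the embedding of its caption.
    """
    if not left or len(left) != len(right):
        raise ValueError("need two non-empty lists of equal length")
    sims = [cosine_similarity(a, b) for a, b in zip(left, right)]
    return mean(sims), pstdev(sims)


# Toy stand-ins for real CLIP embeddings (hypothetical values).
image_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_embeddings = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]

sim_mean, sim_std = paired_cos_sim_stats(image_embeddings, text_embeddings)
print(f"Cos Sim (mean): {sim_mean:.4f}  Cos Sim (std): {sim_std:.4f}")
```

In practice the two lists would hold the model's image and text embeddings for the same Flickr8k samples. Whether the reported numbers use a population or a sample standard deviation is not stated in the diff, so `pstdev` here is only one plausible choice.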