---
language: en
tags:
- fashion
- clip
- multimodal
- image-search
- text-search
- embeddings
- contrastive-learning
license: mit
datasets:
- custom
metrics:
- accuracy
- cosine-similarity
library_name: transformers
---

# GAP-CLIP: Guaranteed Attribute Positioning in CLIP Embeddings

This model is part of the GAP-CLIP project for fashion search with guaranteed attribute positioning.

## Model Description

GAP-CLIP is a multimodal fashion search model whose embedding is partitioned into three dedicated subspaces:
- **Color embeddings** (16 dimensions): Specialized for color representation
- **Hierarchy embeddings** (64 dimensions): Specialized for category classification
- **General CLIP embeddings** (432 dimensions): General visual-semantic understanding

**Total embedding size**: 512 dimensions
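
In code, this layout corresponds to the index ranges below. This is a convenience sketch; the slice constants are illustrative names, not part of the model's API:

```python
# Subspace index ranges within the 512-dim embedding (0-based, end-exclusive)
COLOR = slice(0, 16)       # 16 color dimensions
HIERARCHY = slice(16, 80)  # 64 hierarchy dimensions
GENERAL = slice(80, 512)   # 432 general CLIP dimensions
```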

## Quick Start

```python
from transformers import CLIPProcessor, CLIPModel
import torch

# Load model and processor
model = CLIPModel.from_pretrained("Leacb4/gap-clip")
processor = CLIPProcessor.from_pretrained("laion/CLIP-ViT-B-32-laion2B-s34B-b79K")

# Encode text
text = "red dress"
inputs = processor(text=[text], return_tensors="pt", padding=True)
with torch.no_grad():
    text_features = model.get_text_features(**inputs)

# Extract subspaces
color_emb = text_features[:, :16]        # color dimensions
hierarchy_emb = text_features[:, 16:80]  # hierarchy dimensions
general_emb = text_features[:, 80:]      # general CLIP dimensions
```
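
Image embeddings can be sliced the same way. The snippet below continues from the Quick Start above and is a minimal sketch only: `dress.jpg` is a placeholder path, and scoring by per-subspace cosine similarity is an assumption about usage, not a documented part of the model.

```python
from PIL import Image
import torch
import torch.nn.functional as F

# Encode an image ("dress.jpg" is a placeholder path)
image = Image.open("dress.jpg")
image_inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    image_features = model.get_image_features(**image_inputs)

# Per-subspace cosine similarity between the text query and the image
# (assumes the same 16/64/432 layout holds for image embeddings)
color_sim = F.cosine_similarity(color_emb, image_features[:, :16])
hierarchy_sim = F.cosine_similarity(hierarchy_emb, image_features[:, 16:80])
general_sim = F.cosine_similarity(general_emb, image_features[:, 80:])

print(f"color: {color_sim.item():.3f}, "
      f"hierarchy: {hierarchy_sim.item():.3f}, "
      f"general: {general_sim.item():.3f}")
```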

## Citation

```bibtex
@misc{gap-clip-2024,
  title={GAP-CLIP: Guaranteed Attribute Positioning in CLIP Embeddings for Fashion Search},
  author={Sarfati, Lea Attia},
  year={2024},
  url={https://huggingface.co/Leacb4/gap-clip}
}
```

## License

MIT License. See the LICENSE file for details.