---
license: mit
tags:
  - clip
  - visual-search
  - image-retrieval
  - fashion
library_name: clip
datasets:
  - deepfashion
pipeline_tag: feature-extraction
model-index:
  - name: StyleFinder
    results:
      - task:
          type: image-retrieval
          name: Fashion Visual Search
        dataset:
          type: deepfashion
          name: DeepFashion In-shop Clothes Retrieval
        metrics:
          - type: recall@1
            value: 53.95
            name: Rank-1 Accuracy (RN50)
          - type: map
            value: 0.4265
            name: mAP (RN50)
          - type: recall@1
            value: 46.24
            name: Rank-1 Accuracy (ViT-B/16)
          - type: map
            value: 0.3481
            name: mAP (ViT-B/16)
---

# 👗 StyleFinder – AI-Powered Fashion Visual Search

**StyleFinder** is an image retrieval system built on [CLIP](https://openai.com/research/clip) and fine-tuned on the DeepFashion In-shop Clothes Retrieval dataset. Given an uploaded query image, it retrieves visually similar fashion items using either a zero-shot or a fine-tuned CLIP variant.

---

## 🧠 Supported Models

| Model         | Stage        | Description                                  |
|---------------|--------------|----------------------------------------------|
| ViT-B/16      | Stage 3 v4   | Best fine-tuned transformer-based model      |
| RN50          | Stage 3 v3   | Best fine-tuned CNN-based model              |
| ViT-B/16      | Zero-shot    | Official OpenAI pretrained CLIP              |
| RN50          | Zero-shot    | Official OpenAI pretrained CLIP              |

---

## 📊 Evaluation Results

| Metric     | ViT-B/16 (v4) | RN50 (v3) |
|------------|---------------|-----------|
| Rank-1     | 46.24%        | **53.95%** |
| mAP        | 0.3481        | **0.4265** |
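
For readers unfamiliar with the two metrics in the table, the sketch below shows one common way to compute Rank-1 accuracy and mAP from a query-by-gallery similarity matrix. It is not the evaluation script used for StyleFinder. The function name `rank1_and_map`, the toy similarity values, and the item IDs are all assumptions made for illustration.

```python
import numpy as np

def rank1_and_map(sim, query_ids, gallery_ids):
    """Hypothetical helper: Rank-1 accuracy and mAP for a retrieval run.

    sim is a (num_queries, num_gallery) matrix where higher means more similar.
    """
    sim = np.asarray(sim, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)

    # Gallery indices for each query, best match first.
    order = np.argsort(-sim, axis=1)
    # hits[i, k] is True when the k-th ranked gallery item shares query i's ID.
    hits = gallery_ids[order] == query_ids[:, None]

    # Rank-1: fraction of queries whose top result is a correct match.
    rank1 = hits[:, 0].mean()

    average_precisions = []
    for row in hits:
        num_relevant = row.sum()
        if num_relevant == 0:
            continue  # a query with no matching gallery item has no defined AP
        precision_at_k = np.cumsum(row) / (np.arange(row.size) + 1)
        # Average the precision values at the ranks where a match occurs.
        average_precisions.append((precision_at_k * row).sum() / num_relevant)

    return float(rank1), float(np.mean(average_precisions))

# Toy data (made up): two queries of item 1, gallery holding items 1, 2, 1.
toy_sim = [[0.9, 0.1, 0.8],
           [0.2, 0.7, 0.6]]
rank1, mean_ap = rank1_and_map(toy_sim, query_ids=[1, 1], gallery_ids=[1, 2, 1])
print(f"Rank-1: {rank1:.2%}, mAP: {mean_ap:.4f}")
```

In this toy run the first query ranks both correct items at the top, while the second ranks a wrong item first. That gives a Rank-1 of 50% and a mean AP of about 0.79.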

---

## 🖼️ Precomputed Gallery Features

Gallery embeddings are stored as `.pt` files for fast cosine similarity search.

| File Name                              | Description                       |
|----------------------------------------|-----------------------------------|
| `vitb16_stage3_v4_gallery.pt`          | Fine-tuned ViT-B/16 gallery       |
| `rn50_stage3_v3_gallery.pt`            | Fine-tuned RN50 gallery           |
| `vitb16_zeroshot_gallery.pt`           | Official CLIP ViT-B/16 gallery    |
| `rn50_zeroshot_gallery.pt`             | Official CLIP RN50 gallery        |

These are stored in the `gallery_features/` directory and can be loaded with `load_gallery_features()`.
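
As a rough illustration of cosine similarity search over precomputed embeddings, the sketch below ranks gallery rows against one query vector. The on-disk layout of the `.pt` files and the return type of `load_gallery_features()` are not documented here, so the code assumes a plain 2-D array with one embedding per row. A PyTorch tensor would first need converting with `.cpu().numpy()`. The helper `top_k_cosine` and the toy vectors are hypothetical.

```python
import numpy as np

def top_k_cosine(query, gallery, k=5):
    """Hypothetical helper: indices and scores of the k most similar gallery rows."""
    query = np.asarray(query, dtype=np.float32)
    gallery = np.asarray(gallery, dtype=np.float32)

    # L2-normalise so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)

    scores = g @ q                      # one cosine score per gallery row
    k = min(k, scores.shape[0])         # guard against k larger than the gallery
    top = np.argsort(-scores)[:k]       # best match first
    return top, scores[top]

# Toy 2-D embeddings (made up); real CLIP features have many more dimensions.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([2.0, 0.1])
indices, scores = top_k_cosine(query, gallery, k=2)
print(indices, scores)
```

Normalising the gallery once and caching the result avoids repeating that work on every query, which is the main benefit of storing features ahead of time.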

---

## ⚙️ How to Use

### 🔹 Load a Model

```python
from model_loader import load_model
model, preprocess = load_model(arch="vitb16", stage="stage3")  # arch: "vitb16" or "rn50"; stage: "stage3" or "zeroshot"