---
license: mit
tags:
- clip
- visual-search
- image-retrieval
- fashion
library_name: clip
datasets:
- deepfashion
pipeline_tag: feature-extraction
model-index:
- name: StyleFinder
  results:
  - task:
      type: image-retrieval
      name: Fashion Visual Search
    dataset:
      type: deepfashion
      name: DeepFashion In-shop Clothes Retrieval
    metrics:
    - type: recall@1
      value: 53.95
      name: Rank-1 Accuracy (RN50)
    - type: map
      value: 0.4265
      name: mAP (RN50)
    - type: recall@1
      value: 46.24
      name: Rank-1 Accuracy (ViT-B/16)
    - type: map
      value: 0.3481
      name: mAP (ViT-B/16)
---

# StyleFinder: AI-Powered Fashion Visual Search

**StyleFinder** is a deep-learning image retrieval system built on [CLIP](https://openai.com/research/clip) and fine-tuned on the DeepFashion In-shop Clothes Retrieval dataset. Given an uploaded query image, it retrieves visually similar fashion items using either zero-shot or fine-tuned CLIP variants.

---

## Supported Models

| Model    | Stage      | Description                             |
|----------|------------|-----------------------------------------|
| ViT-B/16 | Stage 3 v4 | Best fine-tuned transformer-based model |
| RN50     | Stage 3 v3 | Best fine-tuned CNN-based model         |
| ViT-B/16 | Zero-shot  | Official OpenAI pretrained CLIP         |
| RN50     | Zero-shot  | Official OpenAI pretrained CLIP         |

---

## Evaluation Results

| Metric | ViT-B/16 (v4) | RN50 (v3)  |
|--------|---------------|------------|
| Rank-1 | 46.24%        | **53.95%** |
| mAP    | 0.3481        | **0.4265** |

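
For readers unfamiliar with the two metrics in the table, the sketch below computes Rank-1 accuracy and mean average precision (mAP) from ranked retrieval results. It follows the standard definitions and is not taken from the StyleFinder evaluation code, whose exact protocol is not described here. The function name `rank1_and_map` and the toy item IDs are hypothetical. Both values are returned as fractions; the table above reports Rank-1 as a percentage and mAP as a fraction.

```python
def rank1_and_map(ranked_ids, query_ids):
    """Compute Rank-1 accuracy and mAP from ranked retrieval results.

    ranked_ids: one list per query, holding gallery item IDs sorted from
                most to least similar.
    query_ids:  the ground-truth item ID of each query.
    (Hypothetical helper, assuming the standard metric definitions.)
    """
    hits_at_1 = 0
    average_precisions = []
    for ranking, target in zip(ranked_ids, query_ids):
        # Rank-1: is the top-ranked gallery image the same item as the query?
        if ranking and ranking[0] == target:
            hits_at_1 += 1
        # Average precision: mean of precision@k over every rank k that
        # holds a correct match.
        matches, precisions = 0, []
        for k, gallery_id in enumerate(ranking, start=1):
            if gallery_id == target:
                matches += 1
                precisions.append(matches / k)
        average_precisions.append(
            sum(precisions) / len(precisions) if precisions else 0.0
        )
    n = len(query_ids)
    return hits_at_1 / n, sum(average_precisions) / n


# Toy example with made-up item IDs (not DeepFashion data).
rankings = [["a", "b", "a"], ["c", "b", "b"]]
targets = ["a", "b"]
rank1, mean_ap = rank1_and_map(rankings, targets)
```
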

---

## Precomputed Gallery Features

Gallery embeddings are stored as `.pt` files for fast cosine similarity search.

| File Name                     | Description                    |
|-------------------------------|--------------------------------|
| `vitb16_stage3_v4_gallery.pt` | Fine-tuned ViT-B/16 gallery    |
| `rn50_stage3_v3_gallery.pt`   | Fine-tuned RN50 gallery        |
| `vitb16_zeroshot_gallery.pt`  | Official CLIP ViT-B/16 gallery |
| `rn50_zeroshot_gallery.pt`    | Official CLIP RN50 gallery     |

These are stored in the `gallery_features/` directory and can be loaded with `load_gallery_features()`.
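
To illustrate what a cosine similarity search over gallery features involves, here is a minimal, dependency-free sketch of the ranking step. The internal layout of the `.pt` files and the signature of `load_gallery_features()` are not documented here, so the example uses plain Python lists in place of real embeddings. The helpers `l2_normalize` and `top_k_similar` are hypothetical names, not part of this repository. With PyTorch tensors, the same idea is usually written as a matrix product between L2-normalized query and gallery features.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def top_k_similar(query, gallery, k=5):
    """Return the k (gallery index, cosine similarity) pairs with the highest scores."""
    q = l2_normalize(query)
    scores = []
    for idx, feat in enumerate(gallery):
        g = l2_normalize(feat)
        scores.append((idx, sum(a * b for a, b in zip(q, g))))
    # Highest similarity first.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Toy 3-dimensional embeddings; real CLIP features have hundreds of dimensions.
gallery = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
results = top_k_similar([1.0, 0.1, 0.0], gallery, k=2)
```
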

---

## How to Use

### Load a Model

```python
from model_loader import load_model

model, preprocess = load_model(arch="vitb16", stage="stage3")  # or rn50 / zeroshot