Babu Pallam committed on
Commit 561827a · 1 Parent(s): c12b347

Add model card metadata for Hugging Face

Files changed (1)
  1. README.md +49 -25
README.md CHANGED
@@ -1,28 +1,3 @@
- # StyleFinder – Fashion Visual Search with CLIP
-
- This repository includes two fine-tuned CLIP models for image-based fashion retrieval:
-
- | Model     | Stage      | Rank-1 | mAP    |
- |-----------|------------|--------|--------|
- | ViT-B/16  | Stage 3 v4 | 46.24% | 0.3481 |
- | ResNet-50 | Stage 3 v3 | 53.95% | 0.4265 |
-
- ---
-
- ## 🧠 Model Details
-
- - **ViT-B/16 (Transformer-based, 512-dim):** Jointly fine-tuned using SupCon + ArcFace + BNNeck.
- - **RN50 (CNN-based, 1024-dim):** Fine-tuned with prompt-structured Stage 3 configuration.
- - Dataset: [DeepFashion – In-shop Clothes Retrieval](https://mmlab.ie.cuhk.edu.hk/projects/DeepFashion/InShopRetrieval.html)
-
- ---
-
- ## 📦 How to Use
-
- ```python
- from model_loader import load_model
- model = load_model("vitb16")  # or "rn50"
-
  ---
  license: mit
  tags:
@@ -58,3 +33,52 @@ model-index:
  value: 0.3481
  name: mAP (ViT-B/16)
  ---
+
+ # 👗 StyleFinder – AI-Powered Fashion Visual Search
+
+ **StyleFinder** is a deep learning-based image retrieval system fine-tuned on the DeepFashion In-shop Clothes dataset using [CLIP](https://openai.com/research/clip). It enables users to upload an image and retrieve visually similar fashion items using both zero-shot and fine-tuned CLIP variants.
+
+ ---
+
+ ## 🧠 Supported Models
+
+ | Model    | Stage      | Description                             |
+ |----------|------------|-----------------------------------------|
+ | ViT-B/16 | Stage 3 v4 | Best fine-tuned transformer-based model |
+ | RN50     | Stage 3 v3 | Best fine-tuned CNN-based model         |
+ | ViT-B/16 | Zero-shot  | Official OpenAI pretrained CLIP         |
+ | RN50     | Zero-shot  | Official OpenAI pretrained CLIP         |
+
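+ The two zero-shot rows are the unmodified OpenAI checkpoints. As a minimal sketch (assuming the official `clip` package, which is not part of this repo), they can be loaded directly:
+
+ ```python
+ import clip   # official OpenAI package: pip install git+https://github.com/openai/CLIP.git
+ import torch
+
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ # "ViT-B/16" and "RN50" match the two zero-shot rows above
+ model, preprocess = clip.load("ViT-B/16", device=device)
+ model.eval()  # retrieval only; no fine-tuning here
+ ```
+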
+ ---
+
+ ## 📊 Evaluation Results
+
+ | Metric | ViT-B/16 (v4) | RN50 (v3)  |
+ |--------|---------------|------------|
+ | Rank-1 | 46.24%        | **53.95%** |
+ | mAP    | 0.3481        | **0.4265** |
+
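+ For reference, Rank-1 is the fraction of queries whose single best gallery match shares the query's item ID, and mAP averages precision over the full ranking. A minimal sketch of Rank-1 over a cosine-similarity matrix (variable names are illustrative, not from this repo):
+
+ ```python
+ import torch
+ import torch.nn.functional as F
+
+ def rank1_accuracy(query_feats, gallery_feats, query_ids, gallery_ids):
+     """query_feats: (Q, dim), gallery_feats: (N, dim), ids: integer tensors."""
+     q = F.normalize(query_feats, dim=1)   # L2-normalise so dot product = cosine similarity
+     g = F.normalize(gallery_feats, dim=1)
+     sims = q @ g.T                        # (Q, N) similarity matrix
+     best = sims.argmax(dim=1)             # index of top gallery match per query
+     return (gallery_ids[best] == query_ids).float().mean().item()
+ ```
+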
+ ---
+
+ ## 🖼️ Precomputed Gallery Features
+
+ Gallery embeddings are stored as `.pt` files for fast cosine similarity search.
+
+ | File Name                     | Description                    |
+ |-------------------------------|--------------------------------|
+ | `vitb16_stage3_v4_gallery.pt` | Fine-tuned ViT-B/16 gallery    |
+ | `rn50_stage3_v3_gallery.pt`   | Fine-tuned RN50 gallery        |
+ | `vitb16_zeroshot_gallery.pt`  | Official CLIP ViT-B/16 gallery |
+ | `rn50_zeroshot_gallery.pt`    | Official CLIP RN50 gallery     |
+
+ These are stored in the `gallery_features/` directory and can be loaded with `load_gallery_features()`, as sketched below.
+
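+ A minimal retrieval sketch, assuming `load_gallery_features()` returns image paths plus an `(N, dim)` feature tensor (that return shape and the import path are assumptions, not confirmed by this card):
+
+ ```python
+ import torch
+ import torch.nn.functional as F
+ from model_loader import load_gallery_features  # helper named above; module path assumed
+
+ paths, gallery_feats = load_gallery_features("gallery_features/vitb16_stage3_v4_gallery.pt")
+
+ # Placeholder query embedding; in practice use model.encode_image(...) as in How to Use
+ query_feat = torch.randn(gallery_feats.shape[1])
+
+ gallery_norm = F.normalize(gallery_feats, dim=1)   # cosine similarity = normalised dot product
+ query_norm = F.normalize(query_feat, dim=0)
+ scores, indices = (gallery_norm @ query_norm).topk(5)
+ top_matches = [paths[i] for i in indices]          # five most similar gallery images
+ ```
+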
+ ---
+
+ ## ⚙️ How to Use
+
+ ### 🔹 Load a Model
+
+ ```python
+ from model_loader import load_model
+
+ model, preprocess = load_model(arch="vitb16", stage="stage3")  # or "rn50" / "zeroshot"
+ ```
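+
+ A follow-up usage sketch, assuming the returned model keeps CLIP's `encode_image` interface (`query.jpg` is a placeholder path):
+
+ ```python
+ import torch
+ from PIL import Image
+
+ # Embed the uploaded query image with the matching preprocessing transform
+ image = preprocess(Image.open("query.jpg")).unsqueeze(0)
+ with torch.no_grad():
+     query_feat = model.encode_image(image).squeeze(0)
+
+ # query_feat can now be scored against the precomputed gallery features above
+ ```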