---
license: apache-2.0
language:
- en
base_model:
- facebook/dinov3-vitl16-pretrain-lvd1689m
pipeline_tag: image-feature-extraction
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
# GR-Lite: Fashion Image Retrieval Model |
|
|
|
|
|
GR-Lite is a lightweight fashion image retrieval model fine-tuned from [DINOv3-ViT-L/16](https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m). It extracts 1024-dimensional embeddings optimized for fashion product search and retrieval tasks. |
|
|
|
|
|
GR-Lite achieves state-of-the-art (SOTA) performance on LookBench and other fashion retrieval benchmarks. See the [paper](https://arxiv.org/abs/2601.14706) for detailed performance metrics and comparisons.
|
|
|
|
|
|
|
|
## Resources |
|
|
|
|
|
- **Project Site**: [LookBench-Web](https://serendipityoneinc.github.io/look-bench-page/)
- **Paper**: [LookBench: A Comprehensive Benchmark for Fashion Image Retrieval](https://arxiv.org/abs/2601.14706)
- **Benchmark Dataset**: [LookBench on Hugging Face](https://huggingface.co/datasets/srpone/look-bench)
- **Code & Examples**: [look-bench Code](https://github.com/SerendipityOneInc/look-bench)
|
|
|
|
|
## Usage |
|
|
|
|
|
### Installation |
|
|
|
|
|
```bash
pip install torch huggingface_hub
```
|
|
|
|
|
For full benchmarking capabilities: |
|
|
```bash
pip install look-bench
```
|
|
|
|
|
### Loading the Model |
|
|
|
|
|
```python
import torch
from huggingface_hub import hf_hub_download
from PIL import Image

# Download the model checkpoint
model_path = hf_hub_download(
    repo_id="srpone/gr-lite",
    filename="gr_lite.pt"
)

# Load the model (the checkpoint is a full pickled module, so
# weights_only=False is required on PyTorch >= 2.6)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.load(model_path, map_location=device, weights_only=False)
model.eval()

print(f"Model loaded successfully on {device}")
```
|
|
|
|
|
### Feature Extraction |
|
|
|
|
|
```python
# Load an image
image = Image.open("path/to/your/image.jpg").convert("RGB")

# Extract features using the model's search method
with torch.no_grad():
    _, embeddings = model.search(image_paths=[image], feature_dim=1024)

# Convert to numpy if needed
if isinstance(embeddings, torch.Tensor):
    embeddings = embeddings.cpu().numpy()

print(f"Feature shape: {embeddings.shape}")  # (1, 1024)
```
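For retrieval, feature vectors are typically L2-normalized so that cosine similarity reduces to a plain dot product. A minimal, self-contained sketch of that step, using dummy arrays in place of GR-Lite output:

```python
import numpy as np

# Dummy batch of embeddings standing in for model.search output
# (shape: n_images x 1024, matching GR-Lite's feature dimension).
embeddings = np.random.default_rng(0).normal(size=(4, 1024))

# L2-normalize each row so cosine similarity becomes a dot product.
norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
embeddings_norm = embeddings / norms

# Every row now has unit length.
print(np.allclose(np.linalg.norm(embeddings_norm, axis=1), 1.0))  # True
```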
|
|
|
|
|
|
|
|
### Using with LookBench Dataset |
|
|
|
|
|
```python
import numpy as np
from datasets import load_dataset

# Load LookBench dataset
dataset = load_dataset("srpone/look-bench", "real_studio_flat")

# Get query and gallery images
query_image = dataset['query'][0]['image']
gallery_image = dataset['gallery'][0]['image']

# Extract features
with torch.no_grad():
    _, query_feat = model.search(image_paths=[query_image], feature_dim=1024)
    _, gallery_feat = model.search(image_paths=[gallery_image], feature_dim=1024)

# Compute cosine similarity between the L2-normalized features
query_norm = query_feat / np.linalg.norm(query_feat)
gallery_norm = gallery_feat / np.linalg.norm(gallery_feat)
similarity = np.dot(query_norm, gallery_norm.T)
print(f"Similarity: {similarity[0][0]:.4f}")
```
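The single-pair comparison above extends naturally to ranking a whole gallery: normalize every embedding, score with one matrix product, and sort. A self-contained sketch with synthetic 1024-dimensional vectors standing in for `model.search` output (one gallery item is deliberately made a near-duplicate of the query so the expected ranking is known):

```python
import numpy as np

# Toy query/gallery embeddings in place of GR-Lite features.
rng = np.random.default_rng(42)
query = rng.normal(size=(1, 1024))
gallery = rng.normal(size=(10, 1024))
gallery[2] = query[0] + 0.01 * rng.normal(size=1024)  # near-duplicate of the query

# Normalize, score with a single matrix product, and rank.
query_n = query / np.linalg.norm(query, axis=1, keepdims=True)
gallery_n = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
scores = (query_n @ gallery_n.T).ravel()  # shape: (10,), one cosine score per item
ranking = np.argsort(-scores)             # best match first

print(ranking[0])  # 2 -- the near-duplicate ranks first
```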
|
|
|
|
|
## Benchmark Performance |
|
|
|
|
|
GR-Lite is evaluated on the **LookBench** benchmark, which includes: |
|
|
|
|
|
- **Real Studio Flat**: Flat-lay product photos (Easy difficulty) |
|
|
- **AI-Gen Studio**: AI-generated lifestyle images (Medium difficulty) |
|
|
- **Real Streetlook**: Street fashion photos (Hard difficulty) |
|
|
- **AI-Gen Streetlook**: AI-generated street outfits (Hard difficulty) |
|
|
|
|
|
For detailed performance metrics, please refer to: |
|
|
- Paper: https://arxiv.org/abs/2601.14706 |
|
|
- Benchmark: https://huggingface.co/datasets/srpone/look-bench |
|
|
|
|
|
## Evaluation |
|
|
|
|
|
Use the `look-bench` package to evaluate on LookBench: |
|
|
|
|
|
```python
from look_bench import evaluate_model

# Evaluate on all configs
results = evaluate_model(
    model=model,
    model_name="gr-lite",
    dataset_configs=["real_studio_flat", "aigen_studio", "real_streetlook", "aigen_streetlook"]
)

print(results)
```
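Retrieval benchmarks such as LookBench are typically scored with metrics like Recall@K (does the ground-truth gallery item appear among a query's top-K results?). The sketch below is an illustrative reimplementation over dummy normalized features, not the actual `look-bench` scoring code:

```python
import numpy as np

def recall_at_k(query_feats, gallery_feats, gt_indices, k=1):
    """Fraction of queries whose ground-truth gallery item is in the top-k."""
    scores = query_feats @ gallery_feats.T     # cosine scores (inputs pre-normalized)
    topk = np.argsort(-scores, axis=1)[:, :k]  # top-k gallery indices per query
    hits = [gt in row for gt, row in zip(gt_indices, topk)]
    return float(np.mean(hits))

# Identity case: query i matches gallery item i exactly, so Recall@1 is 1.0.
rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 1024))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
print(recall_at_k(feats, feats, gt_indices=list(range(5)), k=1))  # 1.0
```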
|
|
|
|
|
## Model Card Authors |
|
|
|
|
|
Gensmo AI Team |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model in your research, please cite: |
|
|
|
|
|
```bibtex
@article{gao2026lookbench,
  title={LookBench: A Live and Holistic Open Benchmark for Fashion Image Retrieval},
  author={Chao Gao and Siqiao Xue and Yimin Peng and Jiwen Fu and Tingyi Gu and Shanshan Li and Fan Zhou},
  journal={arXiv preprint arXiv:2601.14706},
  year={2026},
  url={https://arxiv.org/abs/2601.14706}
}
```
|
|
|