---
license: apache-2.0
language:
- en
base_model:
- facebook/dinov3-vitl16-pretrain-lvd1689m
pipeline_tag: image-feature-extraction
---

# GR-Lite: Fashion Image Retrieval Model

GR-Lite is a lightweight fashion image retrieval model fine-tuned from [DINOv3-ViT-L/16](https://huggingface.co/facebook/dinov3-vitl16-pretrain-lvd1689m). It extracts 1024-dimensional embeddings optimized for fashion product search and retrieval tasks.

GR-Lite achieves state-of-the-art (SOTA) performance on LookBench and other fashion retrieval benchmarks. See the [paper](https://arxiv.org/abs/2601.14706) for detailed performance metrics and comparisons.

## Resources

- 🌐 **Project Site**: [LookBench-Web](https://serendipityoneinc.github.io/look-bench-page/)
- 📄 **Paper**: [LookBench: A Comprehensive Benchmark for Fashion Image Retrieval](https://arxiv.org/abs/2601.14706)
- 🗃️ **Benchmark Dataset**: [LookBench on Hugging Face](https://huggingface.co/datasets/srpone/look-bench)
- 💻 **Code & Examples**: [look-bench Code](https://github.com/SerendipityOneInc/look-bench)

## Usage

### Installation

```bash
pip install torch huggingface_hub pillow
```

For full benchmarking capabilities:

```bash
pip install look-bench
```

### Loading the Model

```python
import torch
from huggingface_hub import hf_hub_download
from PIL import Image

# Download the model checkpoint from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="srpone/gr-lite",
    filename="gr_lite.pt"
)

# Load the model onto the GPU if one is available, otherwise the CPU.
# The checkpoint is loaded as a full model object, so PyTorch >= 2.6
# needs weights_only=False; only do this for checkpoints you trust.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.load(model_path, map_location=device, weights_only=False)
model.eval()

print(f"Model loaded successfully on {device}")
```

### Feature Extraction

```python
# Load an image and force three-channel RGB
image = Image.open("path/to/your/image.jpg").convert("RGB")

# Extract features using the model's search method
with torch.no_grad():
    _, embeddings = model.search(image_paths=[image], feature_dim=1024)

# Convert to numpy if needed
if isinstance(embeddings, torch.Tensor):
    embeddings = embeddings.cpu().numpy()

print(f"Feature shape: {embeddings.shape}")  # (1, 1024)
```

### Using with LookBench Dataset

This example continues from the loaded `model` above and additionally requires `datasets` and `numpy` (`pip install datasets numpy`).

```python
import numpy as np
from datasets import load_dataset

# Load one LookBench config
dataset = load_dataset("srpone/look-bench", "real_studio_flat")

# Get one query image and one gallery image
query_image = dataset['query'][0]['image']
gallery_image = dataset['gallery'][0]['image']

# Extract features
with torch.no_grad():
    _, query_feat = model.search(image_paths=[query_image], feature_dim=256)
    _, gallery_feat = model.search(image_paths=[gallery_image], feature_dim=256)

# Convert to numpy if needed, as in the feature extraction example
if isinstance(query_feat, torch.Tensor):
    query_feat = query_feat.cpu().numpy()
if isinstance(gallery_feat, torch.Tensor):
    gallery_feat = gallery_feat.cpu().numpy()

# Compute cosine similarity between the L2-normalized embeddings
query_norm = query_feat / np.linalg.norm(query_feat)
gallery_norm = gallery_feat / np.linalg.norm(gallery_feat)
similarity = np.dot(query_norm, gallery_norm.T)

print(f"Similarity: {similarity[0][0]:.4f}")
```

## Benchmark Performance

GR-Lite is evaluated on the **LookBench** benchmark, which includes:

- **Real Studio Flat**: Flat-lay product photos (Easy difficulty)
- **AI-Gen Studio**: AI-generated lifestyle images (Medium difficulty)
- **Real Streetlook**: Street fashion photos (Hard difficulty)
- **AI-Gen Streetlook**: AI-generated street outfits (Hard difficulty)

For detailed performance metrics, please refer to:

- Paper: https://arxiv.org/abs/2601.14706
- Benchmark: https://huggingface.co/datasets/srpone/look-bench

## Evaluation

Use the `look-bench` package to evaluate on LookBench:

```python
from look_bench import evaluate_model

# Evaluate on all four LookBench configs
results = evaluate_model(
    model=model,
    model_name="gr-lite",
    dataset_configs=["real_studio_flat", "aigen_studio", "real_streetlook", "aigen_streetlook"]
)

print(results)
```

## Model Card Authors

Gensmo AI Team

## Citation

If you use this model in your research, please cite:

```bibtex
@article{gao2026lookbench,
  title={LookBench: A Live and Holistic Open Benchmark for Fashion Image Retrieval},
  author={Chao Gao and Siqiao Xue and Yimin Peng and Jiwen Fu and Tingyi Gu and Shanshan Li and Fan Zhou},
  year={2026},
  url={https://arxiv.org/abs/2601.14706},
  journal={arXiv preprint arXiv:2601.14706},
}
```
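
## Appendix: Gallery Ranking Sketch

The pairwise cosine similarity in the LookBench usage example extends naturally to ranking a whole gallery for each query. The following is a minimal sketch under stated assumptions, not the `look-bench` evaluation code: embeddings are assumed to be already extracted into NumPy arrays of shape `(N, D)`, the function name `rank_gallery` is hypothetical, and the random arrays stand in for real GR-Lite embeddings.

```python
import numpy as np


def rank_gallery(query_feats, gallery_feats, top_k=5):
    """Return indices and cosine scores of the top_k gallery items per query.

    Hypothetical helper for illustration; not part of look-bench.
    """
    # L2-normalize each row so the dot product equals cosine similarity
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = q @ g.T  # shape: (num_queries, num_gallery)
    # Sort each row by descending similarity and keep the first top_k columns
    order = np.argsort(-sims, axis=1)[:, :top_k]
    scores = np.take_along_axis(sims, order, axis=1)
    return order, scores


# Random stand-ins for real embeddings (placeholder data, not model output)
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 1024)).astype(np.float32)
# Build a query as a slightly perturbed copy of gallery item 42
query = gallery[42:43] + 0.01 * rng.normal(size=(1, 1024)).astype(np.float32)

indices, scores = rank_gallery(query, gallery, top_k=5)
print(indices[0], scores[0])
```

Normalizing once per row and using a single matrix product keeps the ranking step independent of how the embeddings were produced, so the same helper could be applied to 256-dimensional or 1024-dimensional features.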