---
sdk: gradio
---
👗 Ven: AI-Powered Community Fashion Marketplace
Ven is a community-driven platform designed to foster a circular economy by enabling neighbors to swap and share clothing items. Ven uses machine learning to organize the community inventory automatically and to recommend similar items from a photo or a text description.

See our presentation here: https://youtu.be/znXkEe9WqvQ

🌟 Mission
Our goal is to reduce textile waste and strengthen community ties. Ven allows users to discover local fashion gems using AI that "understands" style, texture, and garment types without requiring manual tagging from users.

🚀 Features
Multimodal Search: Find clothes by uploading a photo or typing a description (e.g., "vintage denim jacket").
Automatic Categorization: Inventory is automatically grouped into 6 distinct style clusters using K-Means.
Smart Matching: Get the Top 3 most similar items from the community inventory based on semantic similarity.
🧠 Technical Workflow
1. Data & EDA
We used a curated subset of 5,050 images from the Fashionpedia dataset. Our Exploratory Data Analysis (EDA) revealed high visual complexity, with an average of about 7 items per image, which called for a robust visual model.


![image](https://cdn-uploads.huggingface.co/production/uploads/69107cc5435394bfe93e0a2a/2e-ssSl8aaf8beSafQBHC.png)


2. Embeddings (The AI Brain)
To represent fashion items mathematically, we used the CLIP (Contrastive Language-Image Pre-training) model (clip-ViT-B-32).
Vector Space: Every item is converted into a normalized 512-dimensional vector.
Normalization: We apply $L_2$ normalization to ensure similarity is calculated based on aesthetic direction rather than image quality:

$$\hat{v} = \frac{v}{\lVert v \rVert_2}$$

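As a minimal sketch of the normalization step, the snippet below scales each embedding row to unit length with NumPy. The random array stands in for real CLIP embeddings, and the helper name `l2_normalize` and the `eps` guard are illustrative assumptions, not part of the project code.

```python
import numpy as np


def l2_normalize(embeddings: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row to unit L2 norm; eps guards against division by zero."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.maximum(norms, eps)


# Random stand-ins for CLIP embeddings (4 items, 512 dimensions each).
rng = np.random.default_rng(0)
raw = rng.normal(size=(4, 512))
unit = l2_normalize(raw)

# For unit-length rows, the dot product equals the cosine similarity.
cos_01 = float(unit[0] @ unit[1])
```

Once every vector has unit length, comparing two items reduces to a single dot product, which is why normalizing up front keeps the later similarity search cheap.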
3. Clustering & Visualization
Algorithm: K-Means clustering was applied to group similar styles.
Dimensionality Reduction: We used t-SNE to project the high-dimensional data into a 2D "Style Map" for visual validation of cluster coherence.
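The clustering and projection steps can be sketched with scikit-learn roughly as follows. This is an illustration under stated assumptions: the embeddings are random placeholders rather than the real inventory, and the `n_init`, `perplexity`, `init`, and `random_state` values are arbitrary choices for the example, not the settings used in the project. Only the cluster count of 6 comes from the feature description.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

# Placeholder data: 120 unit-length vectors standing in for CLIP embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(120, 512))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# Group the items into 6 style clusters.
kmeans = KMeans(n_clusters=6, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(embeddings)

# Project to 2D for a "Style Map"; perplexity must be smaller than the sample count.
tsne = TSNE(n_components=2, perplexity=15, init="pca", random_state=0)
style_map = tsne.fit_transform(embeddings)
```

Plotting `style_map` colored by `cluster_ids` gives a quick visual check of whether items in the same cluster land near each other. t-SNE distorts global distances, so the map is best read as a sanity check rather than a measurement.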
4. Efficient Retrieval
To keep the app responsive, we store the pre-computed embeddings in a Parquet file. The Space loads the inventory from this file at startup instead of re-embedding the images, then performs cosine similarity matching in milliseconds.
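A minimal sketch of the matching step, assuming the inventory embeddings are already in memory as a NumPy array (loading them from the Parquet file is omitted here). The function name `top_k_similar` and the synthetic data are hypothetical and only illustrate the top-3 cosine similarity lookup.

```python
import numpy as np


def top_k_similar(query: np.ndarray, inventory: np.ndarray, k: int = 3) -> list[tuple[int, float]]:
    """Return (row index, cosine similarity) for the k closest inventory items."""
    q = query / np.linalg.norm(query)
    inv = inventory / np.linalg.norm(inventory, axis=1, keepdims=True)
    scores = inv @ q  # cosine similarity of the query against every item
    best = np.argsort(-scores)[:k]  # indices sorted by descending score
    return [(int(i), float(scores[i])) for i in best]


# Synthetic inventory of 100 items; the query is a lightly perturbed copy of item 42.
rng = np.random.default_rng(0)
inventory = rng.normal(size=(100, 512))
query = inventory[42] + 0.01 * rng.normal(size=512)

matches = top_k_similar(query, inventory, k=3)
```

Because the whole inventory is scored with one matrix-vector product, a few thousand 512-dimensional items is a small workload, which is consistent with a brute-force search being sufficient at this scale.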

![image](https://cdn-uploads.huggingface.co/production/uploads/69107cc5435394bfe93e0a2a/scP37cE8tglAMNWvBOe3G.png)

🛠️ Built With
Gradio: For the interactive web interface.
Hugging Face Datasets: For seamless data streaming.
Sentence-Transformers: For CLIP embedding generation.
Scikit-Learn: For Clustering (K-Means) and Similarity metrics.


👥 Authors
Gal Cohen & Matan Yehuda - Students at Reichman University.