Spaces: Paused

Ali Mohsin committed on
Commit · 45b7274
Parent(s): aa9a482

10000 final fixes hopefully

Files changed:
- API_DOCUMENTATION.md +0 -412
- PRODUCTION_DEPLOYMENT.md +0 -310
- PROJECT_SUMMARY.md +0 -261
- QUICK_START_TRAINING.md +0 -229
- README_HF_SETUP.md +0 -60
- RECOMMENDATION_PIPELINE_EXPLAINED.md +340 -0
- app.py +74 -18
- inference.py +26 -1
- utils/artifact_manager.py +4 -1
- utils/image_utils.py +374 -0

API_DOCUMENTATION.md
DELETED
@@ -1,412 +0,0 @@

# Dressify API Documentation

## Overview

The Dressify API provides personalized outfit recommendations using advanced deep learning models. The API supports an expanded tag system for fine-grained control over recommendations.

## Base URL

```
https://your-domain.com/api
```

## Authentication

All endpoints (except `/health` and `/tags`) require an API key in the `X-API-Key` header:

```http
X-API-Key: your-api-key-here
```

## Endpoints

### 1. Health Check

**GET** `/health`

Check API health and model status.

**Response:**
```json
{
  "status": "ok",
  "device": "cuda",
  "resnet": "resnet_v1",
  "vit": "vit_v1"
}
```

---

### 2. Get Available Tags

**GET** `/tags`

Get all available tag options for API integration.

**Response:**
```json
{
  "tag_categories": {
    "occasion": ["casual", "business", "formal", ...],
    "weather": ["any", "hot", "warm", "cold", ...],
    "style": ["casual", "smart_casual", "formal", ...],
    "color_preference": ["neutral", "monochromatic", ...],
    ...
  },
  "description": "Available tags for personalized outfit recommendations",
  "usage": {
    "primary_tags": ["occasion", "weather", "style"],
    "optional_tags": ["color_preference", "fit_preference", ...]
  }
}
```

---

### 3. Validate Tags

**POST** `/tags/validate`

Validate tag values before making a recommendation request.

**Request Body:**
```json
{
  "occasion": "formal",
  "weather": "cold",
  "style": "elegant",
  "color_preference": "monochromatic"
}
```

**Response:**
```json
{
  "valid": true,
  "errors": [],
  "validated_tags": {
    "occasion": "formal",
    "weather": "cold",
    "style": "elegant",
    "color_preference": "monochromatic"
  }
}
```
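A client can mirror this endpoint locally to fail fast before a network round trip. A minimal sketch, assuming a dict-based tag set; the allowed-value lists below are illustrative only, not the full server-side lists (fetch those from `GET /tags`):

```python
# Client-side pre-validation mirroring the shape of POST /tags/validate.
# ALLOWED_TAGS is a hypothetical subset; the real lists come from GET /tags.
ALLOWED_TAGS = {
    "occasion": {"casual", "business", "formal", "wedding", "date"},
    "weather": {"any", "hot", "warm", "cold", "rain", "snow"},
    "style": {"casual", "smart_casual", "formal", "elegant"},
    "color_preference": {"neutral", "monochromatic", "bold"},
}

def validate_tags(tags: dict) -> dict:
    """Return the same {valid, errors, validated_tags} shape as the endpoint."""
    errors = []
    validated = {}
    for category, value in tags.items():
        allowed = ALLOWED_TAGS.get(category)
        if allowed is None:
            errors.append(f"Unknown tag category '{category}'")
        elif value not in allowed:
            errors.append(f"Invalid value '{value}' for category '{category}'")
        else:
            validated[category] = value
    return {"valid": not errors, "errors": errors, "validated_tags": validated}
```

This keeps obviously malformed requests from ever reaching the API, while the server remains the source of truth.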
---

### 4. Generate Embeddings

**POST** `/embed`

Generate embeddings for clothing item images.

**Request Body:**
```json
{
  "image_urls": ["https://example.com/image1.jpg"],
  "images_base64": []
}
```

**Response:**
```json
{
  "embeddings": [[0.123, 0.456, ...]],
  "model_version": "resnet_v1"
}
```
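For images that are not publicly reachable by URL, `images_base64` accepts inline data. A sketch of preparing that field — standard base64 of the raw file bytes is an assumption here; confirm the expected encoding against the server:

```python
import base64

def image_to_base64(path: str) -> str:
    """Read an image file and return its base64-encoded bytes as ASCII text."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

def build_embed_payload(urls=(), paths=()) -> dict:
    """Assemble a /embed request body from URLs and/or local files."""
    return {
        "image_urls": list(urls),
        "images_base64": [image_to_base64(p) for p in paths],
    }
```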
---

### 5. Compose Outfits (Enhanced with Tags)

**POST** `/compose`

Generate personalized outfit recommendations with expanded tag support.

#### Request Format 1: Tag-Based (Recommended)

**Request Body:**
```json
{
  "items": [
    {
      "id": "item_1",
      "image_url": "https://example.com/shirt.jpg",
      "category": "shirt",
      "embedding": null
    },
    {
      "id": "item_2",
      "image_url": "https://example.com/pants.jpg",
      "category": "pants",
      "embedding": null
    }
  ],
  "occasion": "formal",
  "weather": "cold",
  "style": "elegant",
  "num_outfits": 5,
  "color_preference": "monochromatic",
  "fit_preference": "tailored",
  "material_preference": "wool",
  "season": "winter",
  "time_of_day": "evening",
  "personal_style": "sophisticated"
}
```

#### Request Format 2: Context Dict (Legacy)

**Request Body:**
```json
{
  "items": [...],
  "context": {
    "occasion": "formal",
    "weather": "cold",
    "style": "elegant",
    "num_outfits": 5
  }
}
```

#### Response:
```json
{
  "outfits": [
    {
      "item_ids": ["item_1", "item_2", "item_3"],
      "items": [
        {
          "id": "item_1",
          "category": "jacket",
          "category_type": "outerwear"
        },
        {
          "id": "item_2",
          "category": "shirt",
          "category_type": "upper"
        },
        {
          "id": "item_3",
          "category": "pants",
          "category_type": "bottom"
        }
      ],
      "score": 1.85,
      "base_score": 0.25,
      "categories": ["jacket", "shirt", "pants"],
      "category_types": ["outerwear", "upper", "bottom"],
      "outfit_size": 3,
      "is_valid": true,
      "template": {
        "name": "formal",
        "style": "professional, elegant, sophisticated",
        "style_score": 0.95,
        "color_score": 0.88,
        "colors": ["navy", "white", "gray"],
        "accessory_limit": 4
      }
    }
  ],
  "version": "vit_v1",
  "tags_processed": true,
  "context_used": {
    "occasion": "formal",
    "weather": "cold",
    "style": "elegant",
    ...
  }
}
```

---

## Tag System

### Primary Tags (High Priority)

These tags have the highest influence on recommendations:

- **occasion**: Event type (casual, business, formal, wedding, date, etc.)
- **weather**: Weather conditions (any, hot, warm, cold, rain, snow, etc.)
- **style**: Fashion aesthetic (casual, smart_casual, formal, elegant, etc.)

### Secondary Tags (Medium Priority)

These tags refine recommendations:

- **color_preference**: Color scheme (neutral, monochromatic, bold, etc.)
- **fit_preference**: Fit type (fitted, loose, tailored, etc.)
- **material_preference**: Fabric type (cotton, wool, silk, etc.)
- **personal_style**: Personal style (conservative, bold, timeless, etc.)

### Tertiary Tags (Lower Priority)

These provide additional context:

- **season**: Current season (spring, summer, fall, winter)
- **time_of_day**: When the outfit will be worn (morning, afternoon, evening, night)
- **budget**: Price range preference (luxury, premium, affordable, budget)
- **age_group**: Age group (teen, young_adult, adult, mature)
- **gender**: Gender preference (male, female, non_binary, unisex)

---

## Tag Processing

The API automatically:

1. **Validates** all tag values
2. **Resolves conflicts** between contradictory tags
3. **Applies synergies** between complementary tags
4. **Prioritizes** tags by importance
5. **Generates preferences** for the recommendation engine

### Tag Conflicts

Some tags conflict and cannot be used together:

- `hot` conflicts with `cold`, `freezing`, `snow`
- `formal` conflicts with `casual`, `sporty`
- `loose` conflicts with `fitted`, `tight`

### Tag Synergies

Some tags work well together:

- `formal` + `elegant` + `sophisticated` + `tailored`
- `casual` + `comfortable` + `relaxed` + `practical`
- `sporty` + `athletic` + `comfortable` + `moisture_wicking`
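The conflict check above can be mirrored client-side. A minimal sketch using only the conflict pairs listed in this section (the server's full table may be larger):

```python
# Conflict pairs as documented above; the server-side table may be larger.
CONFLICTS = [
    ({"hot"}, {"cold", "freezing", "snow"}),
    ({"formal"}, {"casual", "sporty"}),
    ({"loose"}, {"fitted", "tight"}),
]

def find_conflicts(tag_values: set) -> list:
    """Return (tag_a, tag_b) pairs from the set that contradict each other."""
    hits = []
    for left, right in CONFLICTS:
        for a in sorted(left & tag_values):
            for b in sorted(right & tag_values):
                hits.append((a, b))
    return hits
```

An empty result means the tag set is at least conflict-free; synergy weighting remains a server-side concern.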
---

## Example Usage

### Python Example

```python
import requests

API_KEY = "your-api-key"
BASE_URL = "https://your-domain.com/api"

# Prepare items
items = [
    {
        "id": "shirt_1",
        "image_url": "https://example.com/shirt.jpg",
        "category": "shirt"
    },
    {
        "id": "pants_1",
        "image_url": "https://example.com/pants.jpg",
        "category": "pants"
    }
]

# Make recommendation request
response = requests.post(
    f"{BASE_URL}/compose",
    json={
        "items": items,
        "occasion": "formal",
        "weather": "cold",
        "style": "elegant",
        "num_outfits": 5,
        "color_preference": "monochromatic",
        "fit_preference": "tailored",
        "material_preference": "wool"
    },
    headers={"X-API-Key": API_KEY}
)

result = response.json()
outfits = result["outfits"]
```

### JavaScript Example

```javascript
const API_KEY = 'your-api-key';
const BASE_URL = 'https://your-domain.com/api';

const items = [
  {
    id: 'shirt_1',
    image_url: 'https://example.com/shirt.jpg',
    category: 'shirt'
  },
  {
    id: 'pants_1',
    image_url: 'https://example.com/pants.jpg',
    category: 'pants'
  }
];

fetch(`${BASE_URL}/compose`, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': API_KEY
  },
  body: JSON.stringify({
    items: items,
    occasion: 'formal',
    weather: 'cold',
    style: 'elegant',
    num_outfits: 5,
    color_preference: 'monochromatic',
    fit_preference: 'tailored',
    material_preference: 'wool'
  })
})
  .then(response => response.json())
  .then(data => {
    console.log('Outfits:', data.outfits);
  });
```

---

## Error Handling

### Invalid Tags

```json
{
  "error": "Invalid tags provided",
  "errors": [
    "Invalid value 'invalid_occasion' for category 'occasion'"
  ],
  "valid_tag_options": {
    "occasion": ["casual", "business", "formal", ...]
  }
}
```

### Models Not Loaded

```json
{
  "error": "Models not trained or loaded properly",
  "details": ["ResNet: No trained weights found"],
  "message": "Please ensure models are trained..."
}
```
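Clients can turn these error bodies into exceptions rather than branching on dict keys everywhere. A sketch based only on the field names shown in the examples above (`error`, `errors`, `details`):

```python
class DressifyAPIError(Exception):
    """Raised when the API returns an error body like the ones shown above."""
    def __init__(self, error, details=None):
        super().__init__(error)
        self.details = details or []

def raise_for_api_error(body: dict) -> dict:
    """Pass through a successful response body; raise on an error body."""
    if "error" in body:
        details = body.get("errors") or body.get("details") or []
        raise DressifyAPIError(body["error"], details)
    return body
```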
---

## Rate Limits

- Default: 100 requests per minute
- Burst: 10 requests per second
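A client-side guard for these limits can be sketched as a sliding-window counter; this is an illustrative helper, not part of the API:

```python
from collections import deque

class SlidingWindowLimiter:
    """Client-side guard for the documented limits:
    100 requests per minute, bursts up to 10 per second."""
    def __init__(self, max_per_minute=100, max_per_second=10):
        self.max_per_minute = max_per_minute
        self.max_per_second = max_per_second
        self.timestamps = deque()

    def allow(self, now: float) -> bool:
        """Record and permit a request at time `now` (seconds) if within limits."""
        while self.timestamps and now - self.timestamps[0] >= 60:
            self.timestamps.popleft()  # drop entries outside the minute window
        last_second = sum(1 for t in self.timestamps if now - t < 1)
        if len(self.timestamps) >= self.max_per_minute or last_second >= self.max_per_second:
            return False
        self.timestamps.append(now)
        return True
```

Passing the clock in explicitly keeps the limiter deterministic and easy to test; in production you would call `allow(time.monotonic())` before each request and back off when it returns `False`.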
---

## Support

For API support, please contact: support@dressify.com
PRODUCTION_DEPLOYMENT.md
DELETED
@@ -1,310 +0,0 @@

# 🚀 Production Deployment Guide for Dressify

## Overview

This guide explains how to deploy Dressify as a production-ready outfit recommendation service using the official Polyvore dataset splits.

## 🎯 Key Changes Made

### 1. **Official Split Usage** ✅
- **Before**: System tried to create random 70/15/15 splits
- **After**: System uses official splits from `nondisjoint/` and `disjoint/` folders
- **Benefit**: Reproducible, research-grade results

### 2. **Robust Dataset Detection** 🔍
- Automatically detects official splits in multiple locations
- Falls back to metadata extraction if needed
- No more random split creation by default
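The detection step above can be sketched as a simple path probe. This is a hypothetical standalone version; the real implementation lives in `utils/data_fetch.py`:

```python
from pathlib import Path

SPLIT_FILES = ("train.json", "valid.json", "test.json")

def find_official_splits(root: str):
    """Probe the dataset root for complete official split folders,
    preferring nondisjoint/ over disjoint/."""
    for folder in ("nondisjoint", "disjoint"):
        candidate = Path(root) / folder
        if all((candidate / f).is_file() for f in SPLIT_FILES):
            return str(candidate)
    return None  # caller falls back to metadata extraction
```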
### 3. **Production-Ready Startup** 🚀
- Comprehensive error handling and diagnostics
- Clear status reporting
- Automatic dataset verification

## 📁 Dataset Structure

The system expects this structure after download:

```
data/Polyvore/
├── images/                       # Extracted from images.zip
├── nondisjoint/                  # Official splits (preferred)
│   ├── train.json                # 31.8 MB - Training outfits
│   ├── valid.json                # 2.99 MB - Validation outfits
│   └── test.json                 # 5.97 MB - Test outfits
├── disjoint/                     # Alternative official splits
│   ├── train.json                # 9.65 MB - Training outfits
│   ├── valid.json                # 1.72 MB - Validation outfits
│   └── test.json                 # 8.36 MB - Test outfits
├── polyvore_item_metadata.json   # 105 MB - Item metadata
├── polyvore_outfit_titles.json   # 6.97 MB - Outfit information
└── categories.csv                # 4.91 KB - Category mappings
```

## 🚀 Deployment Steps

### Step 1: Initial Setup
```bash
# Clone the repository
git clone <your-repo>
cd recomendation

# Install dependencies
pip install -r requirements.txt
```

### Step 2: Dataset Preparation
```bash
# Run the startup fix script
python startup_fix.py
```

This script will:
1. ✅ Download the Polyvore dataset from Hugging Face
2. ✅ Extract images from images.zip
3. ✅ Detect official splits in nondisjoint/ and disjoint/
4. ✅ Create training splits from official data
5. ✅ Verify all components are ready

### Step 3: Verify Dataset
```bash
# Check dataset status
python -c "
from utils.data_fetch import check_dataset_structure
import json
structure = check_dataset_structure('data/Polyvore')
print(json.dumps(structure, indent=2))
"
```

Expected output:
```json
{
  "status": "ready",
  "images": {
    "exists": true,
    "count": 100000,
    "extensions": [".jpg", ".jpeg", ".png"]
  },
  "splits": {
    "nondisjoint": {
      "train.json": {"exists": true, "size_mb": 31.8},
      "valid.json": {"exists": true, "size_mb": 2.99},
      "test.json": {"exists": true, "size_mb": 5.97}
    }
  }
}
```

### Step 4: Launch Application
```bash
# Start the main application
python app.py
```

The system will:
1. 🔍 Check dataset status
2. ✅ Load official splits
3. 🚀 Launch the Gradio interface
4. 🎯 Be ready for training and inference

## 🔧 Troubleshooting

### Issue: "No official splits found"

**Cause**: The dataset download didn't include the split files.

**Solution**:
```bash
# Check what was downloaded
ls -la data/Polyvore/

# Re-run data fetcher
python -c "
from utils.data_fetch import ensure_dataset_ready
ensure_dataset_ready()
"
```

### Issue: "Dataset preparation failed"

**Cause**: The prepare script couldn't parse the official splits.

**Solution**:
```bash
# Check split file format
head -20 data/Polyvore/nondisjoint/train.json

# Run preparation manually
python scripts/prepare_polyvore.py --root data/Polyvore
```

### Issue: "Out of memory during training"

**Cause**: GPU memory insufficient for default batch sizes.

**Solution**: Use the Advanced Training interface to reduce batch sizes:
- ResNet: Reduce from 64 to 16-32
- ViT: Reduce from 32 to 8-16
- Enable mixed precision (AMP)

## 🎯 Production Configuration

### Environment Variables
```bash
export EXPORT_DIR="models/exports"
export POLYVORE_ROOT="data/Polyvore"
export CUDA_VISIBLE_DEVICES="0"  # Specify GPU
```

### Docker Deployment
```bash
# Build image
docker build -t dressify .

# Run container
docker run -p 7860:7860 -p 8000:8000 \
  -v $(pwd)/data:/app/data \
  -v $(pwd)/models:/app/models \
  dressify
```

### Hugging Face Space
1. Upload the entire `recomendation/` folder
2. Set the Space type to "Gradio"
3. The system auto-bootstraps on first run
4. Uses official splits for production-quality results

## 📊 Expected Performance

### Dataset Statistics
- **Total Images**: ~100,000 fashion items
- **Training Outfits**: ~50,000 (nondisjoint) or ~20,000 (disjoint)
- **Validation Outfits**: ~5,000 (nondisjoint) or ~2,000 (disjoint)
- **Test Outfits**: ~10,000 (nondisjoint) or ~4,000 (disjoint)

### Training Times (L4 GPU)
- **ResNet Item Embedder**: 2-4 hours (20 epochs)
- **ViT Outfit Encoder**: 1-2 hours (30 epochs)
- **Total**: 3-6 hours for full training

### Inference Performance
- **Item Embedding**: < 50ms per image
- **Outfit Generation**: < 100ms per outfit
- **Memory Usage**: ~2-4 GB GPU VRAM

## 🔬 Research vs Production

### Research Mode
```bash
# Use disjoint splits (smaller, more challenging)
python scripts/prepare_polyvore.py --root data/Polyvore
# Automatically uses disjoint/ splits
```

### Production Mode
```bash
# Use nondisjoint splits (larger, more robust)
python scripts/prepare_polyvore.py --root data/Polyvore
# Automatically uses nondisjoint/ splits (default)
```

## 📝 Monitoring & Logging

### Training Logs
```bash
# Check training progress
tail -f models/exports/training.log

# Monitor GPU usage
nvidia-smi -l 1
```

### System Health
```bash
# Health check endpoint
curl http://localhost:8000/health

# Expected response
{
  "status": "ok",
  "device": "cuda:0",
  "resnet": "resnet50_v2",
  "vit": "vit_outfit_v1"
}
```

## 🚨 Emergency Procedures

### Dataset Corruption
```bash
# Remove corrupted data
rm -rf data/Polyvore/splits/

# Re-run preparation
python startup_fix.py
```

### Model Issues
```bash
# Remove corrupted models
rm -rf models/exports/*.pth

# Re-train from scratch
python train_resnet.py --data_root data/Polyvore --epochs 20
python train_vit_triplet.py --data_root data/Polyvore --epochs 30
```

### System Recovery
```bash
# Full system reset
rm -rf data/Polyvore/
rm -rf models/exports/

# Fresh start
python startup_fix.py
```

## ✅ Production Checklist

- [ ] Dataset downloaded successfully (2.5GB+ of images)
- [ ] Official splits detected in nondisjoint/ or disjoint/
- [ ] Training splits created in data/Polyvore/splits/
- [ ] Models can be trained without errors
- [ ] Inference service responds to health checks
- [ ] Gradio interface loads successfully
- [ ] Advanced training controls work
- [ ] Model checkpoints can be saved/loaded

## 🎉 Success Indicators

When everything is working correctly, you should see:

```
✅ Dataset ready at: data/Polyvore
📊 Images: 100000 files
📋 polyvore_item_metadata.json: 105.0 MB
📋 polyvore_outfit_titles.json: 6.97 MB
🎯 Official splits found:
   ✅ nondisjoint/train.json (31.8 MB)
   ✅ nondisjoint/valid.json (2.99 MB)
   ✅ nondisjoint/test.json (5.97 MB)
🎉 Using official splits from dataset!
✅ Dataset preparation completed successfully!
✅ All required splits verified!
🚀 Your Dressify system is ready to use!
```

## 📞 Support

If you encounter issues:

1. **Check the logs** for specific error messages
2. **Verify dataset structure** matches the expected layout
3. **Run startup_fix.py** for automated diagnostics
4. **Check GPU memory** and reduce batch sizes if needed
5. **Ensure official splits** are present in nondisjoint/ or disjoint/

---

**🎯 Your Dressify system is now production-ready with official dataset splits!**
PROJECT_SUMMARY.md
DELETED
@@ -1,261 +0,0 @@

# Dressify - Complete Project Summary

## 🎯 Project Overview

**Dressify** is a **production-ready, research-grade** outfit recommendation system that automatically downloads the Polyvore dataset, trains state-of-the-art models, and provides a sophisticated Gradio interface for wardrobe uploads and outfit generation.

## 🏗️ System Architecture

### Core Components

1. **Data Pipeline** (`utils/data_fetch.py`)
   - Automatic download of Stylique/Polyvore dataset from HF Hub
   - Smart image extraction and organization
   - Robust split detection (root, nondisjoint, disjoint)
   - Fallback to deterministic 70/15/15 splits if official splits missing

2. **Model Architecture**
   - **ResNet Item Embedder** (`models/resnet_embedder.py`)
     - ImageNet-pretrained ResNet50 backbone
     - 512D projection head with L2 normalization
     - Triplet loss training for item compatibility
   - **ViT Outfit Encoder** (`models/vit_outfit.py`)
     - 6-layer transformer encoder
     - 8 attention heads, 4x feed-forward multiplier
     - Outfit-level compatibility scoring
     - Cosine distance triplet loss
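The embedding conventions above (L2-normalized vectors, cosine-distance triplet loss) can be illustrated without the full models. A framework-free sketch of the loss, not the actual training code:

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length, as the 512D projection head does."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine_distance(a, b):
    """1 - cosine similarity; inputs are assumed L2-normalized."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge triplet loss on cosine distance: pull the positive closer
    than the negative by at least `margin`."""
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    return max(0.0, cosine_distance(a, p) - cosine_distance(a, n) + margin)
```

The margin value here is illustrative; the real hyperparameters live in `configs/item.yaml` and `configs/outfit.yaml`.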
3. **Training Pipeline**
   - **ResNet Training** (`train_resnet.py`)
     - Semi-hard negative mining
     - Mixed precision training with autocast
     - Channels-last memory format for CUDA
     - Automatic checkpointing and best model saving
   - **ViT Training** (`train_vit_triplet.py`)
     - Frozen ResNet embeddings as input
     - Outfit-level triplet mining
     - Validation with early stopping
     - Comprehensive metrics logging

4. **Inference Service** (`inference.py`)
   - On-the-fly image embedding
   - Slot-aware outfit composition
   - Candidate generation with category constraints
   - Compatibility scoring and ranking
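Slot-aware composition can be sketched as grouping the wardrobe by category slot and enumerating one item per slot. This is a simplified stand-in for the real candidate generator in `inference.py`, with a hypothetical default slot set:

```python
from itertools import product

def candidate_outfits(items, slots=("upper", "bottom", "shoes")):
    """Group items by category_type and yield one combination per slot set.
    `items` are dicts with "id" and "category_type" keys, as in the API."""
    by_slot = {slot: [] for slot in slots}
    for item in items:
        if item["category_type"] in by_slot:
            by_slot[item["category_type"]].append(item)
    if not all(by_slot.values()):
        return []  # a required slot has no candidates
    return [
        [it["id"] for it in combo]
        for combo in product(*(by_slot[s] for s in slots))
    ]
```

Each candidate would then be scored by the ViT outfit encoder and ranked.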
5. **Web Interface** (`app.py`)
|
| 49 |
-
- **Gradio UI**: Wardrobe upload, outfit generation, preview stitching
|
| 50 |
-
- **FastAPI**: REST endpoints for embedding and composition
|
| 51 |
-
- **Auto-bootstrap**: Background dataset prep and training
|
| 52 |
-
- **Status Dashboard**: Real-time progress monitoring
|
| 53 |
-
|
| 54 |
-
## 🚀 Key Features
|
| 55 |
-
|
| 56 |
-
### Research-Grade Training
|
| 57 |
-
- **Triplet Loss**: Semi-hard negative mining for better embeddings
|
| 58 |
-
- **Mixed Precision**: CUDA-optimized training with autocast
|
| 59 |
-
- **Advanced Augmentation**: Random crop, flip, color jitter, random erasing
|
| 60 |
-
- **Curriculum Learning**: Progressive difficulty increase (configurable)

### Production-Ready Infrastructure
- **Self-Contained**: No external dependencies or environment variables
- **Auto-Recovery**: Handles missing splits, corrupted data gracefully
- **Background Processing**: Non-blocking dataset preparation and training
- **Model Versioning**: Automatic checkpoint management and best model saving

### Advanced UI/UX
- **Multi-File Upload**: Drag & drop wardrobe images with previews
- **Category Editing**: Manual category assignment for better slot awareness
- **Context Awareness**: Occasion, weather, style preferences
- **Visual Output**: Stitched outfit previews + structured JSON data

## 📊 Expected Performance

### Training Metrics
- **Item Embedder**: Triplet accuracy > 85%, validation loss < 0.1
- **Outfit Encoder**: Compatibility AUC > 0.8, precision > 0.75
- **Training Time**: ResNet ~2-4h, ViT ~1-2h on L4 GPU

### Inference Performance
- **Latency**: < 100ms per outfit on GPU, < 500ms on CPU
- **Throughput**: 100+ outfits/second on modern GPU
- **Memory**: ~2GB VRAM for full models, ~500MB for lightweight variants

## 🔧 Configuration & Customization

### Training Configs
- **Item Training** (`configs/item.yaml`): Backbone, embedding dim, loss params
- **Outfit Training** (`configs/outfit.yaml`): Transformer layers, attention heads
- **Hardware Settings**: Mixed precision, channels-last, gradient clipping

### Model Variants
- **Lightweight**: MobileNetV3 + small transformer (CPU-friendly)
- **Standard**: ResNet50 + medium transformer (balanced)
- **Research**: ResNet101 + large transformer (high performance)

## 🚀 Deployment Options

### 1. Hugging Face Space (Recommended)
```bash
# Deploy to HF Space
./scripts/deploy_space.sh

# Customize Space settings
SPACE_NAME=my-dressify SPACE_HARDWARE=gpu-t4 ./scripts/deploy_space.sh
```

### 2. Local Development
```bash
# Setup environment
pip install -r requirements.txt

# Launch app (auto-downloads dataset)
python app.py

# Manual training
./scripts/train_item.sh
./scripts/train_outfit.sh
```

### 3. Docker Deployment
```bash
# Build and run
docker build -t dressify .
docker run -p 7860:7860 -p 8000:8000 dressify
```

## 📁 Project Structure

```
recomendation/
├── app.py                      # Main FastAPI + Gradio app
├── inference.py                # Inference service
├── models/
│   ├── resnet_embedder.py      # ResNet50 + projection
│   └── vit_outfit.py           # Transformer encoder
├── data/
│   └── polyvore.py             # PyTorch datasets
├── scripts/
│   ├── prepare_polyvore.py     # Dataset preparation
│   ├── train_item.sh           # ResNet training script
│   ├── train_outfit.sh         # ViT training script
│   └── deploy_space.sh         # HF Space deployment
├── utils/
│   ├── data_fetch.py           # HF dataset downloader
│   ├── transforms.py           # Image transforms
│   ├── triplet_mining.py       # Semi-hard negative mining
│   ├── hf_utils.py             # HF Hub integration
│   └── export.py               # Model export utilities
├── configs/
│   ├── item.yaml               # ResNet training config
│   └── outfit.yaml             # ViT training config
├── tests/
│   └── test_system.py          # Comprehensive tests
├── requirements.txt            # Dependencies
├── Dockerfile                  # Container deployment
└── README.md                   # Documentation
```

## 🧪 Testing & Validation

### Smoke Tests
```bash
# Run comprehensive tests
python -m pytest tests/test_system.py -v

# Test individual components
python -c "from models.resnet_embedder import ResNetItemEmbedder; print('✅ ResNet OK')"
python -c "from models.vit_outfit import OutfitCompatibilityModel; print('✅ ViT OK')"
```

### Training Validation
```bash
# Quick training runs
EPOCHS=1 BATCH_SIZE=8 ./scripts/train_item.sh
EPOCHS=1 BATCH_SIZE=4 ./scripts/train_outfit.sh
```

## 🔬 Research Contributions

### Novel Approaches
1. **Hybrid Architecture**: ResNet embeddings + Transformer compatibility
2. **Semi-Hard Mining**: Intelligent negative sample selection
3. **Slot Awareness**: Category-constrained outfit composition
4. **Auto-Bootstrap**: Self-contained dataset preparation and training

### Technical Innovations
- **Mixed Precision Training**: CUDA-optimized with autocast
- **Channels-Last Memory**: Improved GPU memory efficiency
- **Background Processing**: Non-blocking system initialization
- **Robust Data Handling**: Graceful fallback for missing splits

## 📈 Future Enhancements

### Model Improvements
- **Multi-Modal**: Text descriptions + visual features
- **Attention Visualization**: Interpretable outfit compatibility
- **Style Transfer**: Generate outfit variations
- **Personalization**: User preference learning

### System Features
- **Real-Time Training**: Continuous model improvement
- **A/B Testing**: Multiple model variants
- **Performance Monitoring**: Automated quality metrics
- **Scalable Deployment**: Multi-GPU, distributed training

## 🤝 Integration Examples

### Next.js + Supabase
```typescript
// Complete integration example in README.md
// Database schema with RLS policies
// API endpoints for wardrobe management
// Real-time outfit recommendations
```

### API Usage
```bash
# Health check
curl http://localhost:8000/health

# Image embedding
curl -X POST http://localhost:8000/embed \
  -H "Content-Type: application/json" \
  -d '{"images": ["base64_image_1"]}'

# Outfit composition
curl -X POST http://localhost:8000/compose \
  -H "Content-Type: application/json" \
  -d '{"items": [{"id": "item1", "embedding": [0.1, ...]}], "context": {"occasion": "casual"}}'
```

## 📚 Academic References

### Core Technologies
- **Triplet Loss**: FaceNet, Deep Metric Learning
- **Transformer Architecture**: Attention Is All You Need, ViT
- **Outfit Compatibility**: Fashion Recommendation Systems
- **Dataset Preparation**: Polyvore, Fashion-MNIST

### Research Papers
- ResNet: Deep Residual Learning for Image Recognition
- ViT: An Image is Worth 16x16 Words
- Triplet Loss: FaceNet: A Unified Embedding for Face Recognition
- Fashion AI: Learning Fashion Compatibility with Visual Similarity

## 🎉 Conclusion

**Dressify** represents a **complete, production-ready** outfit recommendation system that combines:

- **Research Excellence**: State-of-the-art deep learning architectures
- **Production Quality**: Robust error handling, auto-recovery, monitoring
- **User Experience**: Intuitive interface, real-time feedback, visual output
- **Developer Experience**: Comprehensive testing, clear documentation, easy deployment

The system is designed to be **self-contained**, **scalable**, and **research-grade**, making it suitable for both academic research and commercial deployment. With automatic dataset preparation, intelligent training, and sophisticated inference, Dressify provides a complete solution for outfit recommendation that requires minimal setup and maintenance.

---

**Built with ❤️ for the fashion AI community**
QUICK_START_TRAINING.md
DELETED

# 🚀 Quick Start: Advanced Training Interface

## Overview

The Dressify system now provides **comprehensive parameter control** for both ResNet and ViT training directly from the Gradio interface. You can tweak every aspect of model training without editing code!

## 🎯 What You Can Control

### ResNet Item Embedder
- **Architecture**: Backbone (ResNet50/101), embedding dimension, dropout
- **Training**: Epochs, batch size, learning rate, optimizer, weight decay, triplet margin
- **Hardware**: Mixed precision, memory format, gradient clipping

### ViT Outfit Encoder
- **Architecture**: Transformer layers, attention heads, feed-forward multiplier, dropout
- **Training**: Epochs, batch size, learning rate, optimizer, weight decay, triplet margin
- **Strategy**: Mining strategy, augmentation level, random seed

### Advanced Settings
- **Learning Rate**: Warmup epochs, scheduler type, early stopping patience
- **Optimization**: Mixed precision, channels-last memory, gradient clipping
- **Reproducibility**: Random seed, deterministic training

## 🚀 Quick Start Steps

### 1. Launch the App
```bash
python app.py
```

### 2. Go to Advanced Training Tab
- Click on the **"🔬 Advanced Training"** tab
- You'll see comprehensive parameter controls organized in sections

### 3. Choose Your Training Mode

#### Quick Training (Basic)
- Set ResNet epochs: 5-10
- Set ViT epochs: 10-20
- Click **"🚀 Start Quick Training"**

#### Advanced Training (Custom)
- Adjust **all parameters** to your liking
- Click **"🎯 Start Advanced Training"**

### 4. Monitor Progress
- Watch the training log for real-time updates
- Check the Status tab for system health
- Download models from the Downloads tab when complete

## 🔬 Parameter Tuning Examples

### Fast Experimentation
```yaml
# Quick test (5-10 minutes)
ResNet: epochs=5, batch_size=16, lr=1e-3
ViT: epochs=10, batch_size=16, lr=5e-4
```

### Standard Training
```yaml
# Balanced quality (1-2 hours)
ResNet: epochs=20, batch_size=64, lr=1e-3
ViT: epochs=30, batch_size=32, lr=5e-4
```

### High Quality Training
```yaml
# Production models (4-6 hours)
ResNet: epochs=50, batch_size=32, lr=5e-4
ViT: epochs=100, batch_size=16, lr=1e-4
```

### Research Experiments
```yaml
# Maximum capacity
ResNet: backbone=resnet101, embedding_dim=768
ViT: layers=8, heads=12, mining_strategy=hardest
```

## 🎯 Key Parameters to Experiment With

### High Impact (Try First)
1. **Learning Rate**: 1e-4 to 1e-2
2. **Batch Size**: 16 to 128
3. **Triplet Margin**: 0.1 to 0.5
4. **Epochs**: 5 to 100

### Medium Impact
1. **Embedding Dimension**: 256, 512, 768, 1024
2. **Transformer Layers**: 4, 6, 8, 12
3. **Optimizer**: AdamW, Adam, SGD, RMSprop

### Fine-tuning
1. **Weight Decay**: 1e-6 to 1e-1
2. **Dropout**: 0.0 to 0.5
3. **Attention Heads**: 4, 8, 16
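To see why the triplet margin (0.1 to 0.5 above) matters, here is the standard triplet loss it controls, sketched in NumPy. This is illustrative, not the project's training code: a larger margin keeps more triplets "active" (non-zero loss), pushing embeddings apart more aggressively.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss: max(d(a,p) - d(a,n) + margin, 0).
    Zero once the negative is at least `margin` farther than the positive."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(d_ap - d_an + margin, 0.0)
```

With margin 0.2, a negative only 0.1 farther than the positive still incurs loss 0.1; raise the margin to 0.5 and the same triplet incurs 0.4, so training pressure increases.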

## 📊 Training Workflow

### 1. **Start Simple** 🚀
- Use default parameters first
- Run quick training (5-10 epochs)
- Verify system works

### 2. **Experiment Systematically** 🔍
- Change **one parameter at a time**
- Start with learning rate and batch size
- Document every change

### 3. **Validate Results** ✅
- Compare training curves
- Check validation metrics
- Ensure improvements are consistent

### 4. **Scale Up** 📈
- Use best parameters for longer training
- Increase epochs gradually
- Monitor for overfitting

## 🧪 Monitoring Training

### What to Watch
- **Training Loss**: Should decrease steadily
- **Validation Loss**: Should decrease without overfitting
- **Training Time**: Per epoch timing
- **GPU Memory**: VRAM usage

### Success Signs
- Smooth loss curves
- Consistent improvement
- Good generalization

### Warning Signs
- Loss spikes or plateaus
- Validation loss increases
- Training becomes unstable

## 🔧 Advanced Features

### Mixed Precision Training
- **Enable**: Faster training, less memory
- **Disable**: More stable, higher precision
- **Default**: Enabled (recommended)

### Triplet Mining Strategies
- **Semi-hard**: Balanced difficulty (default)
- **Hardest**: Maximum challenge
- **Random**: Simple but less effective

### Data Augmentation
- **Minimal**: Basic transforms
- **Standard**: Balanced augmentation (default)
- **Aggressive**: Heavy augmentation

## 📝 Best Practices

### 1. **Document Everything** 📚
- Save parameter combinations
- Record training results
- Note hardware specifications

### 2. **Start Small** 🔬
- Test with few epochs first
- Validate promising combinations
- Scale up gradually

### 3. **Monitor Resources** 💻
- Watch GPU memory usage
- Check training time per epoch
- Balance quality vs. speed

### 4. **Save Checkpoints** 💾
- Models are saved automatically
- Keep intermediate checkpoints
- Download final models

## 🚨 Common Issues & Solutions

### Training Too Slow
- **Reduce batch size**
- **Increase learning rate**
- **Use mixed precision**
- **Reduce embedding dimension**

### Training Unstable
- **Reduce learning rate**
- **Increase batch size**
- **Enable gradient clipping**
- **Check data quality**

### Out of Memory
- **Reduce batch size**
- **Reduce embedding dimension**
- **Use mixed precision**
- **Reduce transformer layers**

### Poor Results
- **Increase epochs**
- **Adjust learning rate**
- **Try different optimizers**
- **Check data preprocessing**

## 📚 Next Steps

### 1. **Read the Full Guide**
- See `TRAINING_PARAMETERS.md` for detailed explanations
- Understand parameter impact and trade-offs

### 2. **Run Experiments**
- Start with quick training
- Experiment with different parameters
- Document your findings

### 3. **Optimize for Your Use Case**
- Balance quality vs. speed
- Consider hardware constraints
- Aim for reproducible results

### 4. **Share Results**
- Document successful configurations
- Share insights with the community
- Contribute to best practices

---

**🎉 You're ready to start experimenting!**

*Remember: Start simple, change one thing at a time, and document everything. Happy training! 🚀*
README_HF_SETUP.md
DELETED

# Hugging Face Setup Guide

## 🔐 Setting Up Hugging Face Authentication

### 1. Get Your HF Token
- Go to https://huggingface.co/settings/tokens
- Create a new token with **Write** permissions
- Copy the token (starts with `hf_...`)

### 2. Set Environment Variables

#### Option A: In Hugging Face Spaces (Recommended)
1. Go to your Space settings
2. Add these secrets:
   - `HF_TOKEN`: Your Hugging Face token
   - `HF_USERNAME`: Your Hugging Face username (e.g., "Stylique")

#### Option B: Local Development
```bash
export HF_TOKEN="hf_your_token_here"
export HF_USERNAME="your_username"
```

### 3. Verify Setup
```bash
source setup_hf.sh
```

## 🚀 What Happens Next

Once environment variables are set, the system will automatically:
- ✅ Authenticate with Hugging Face
- ✅ Upload trained models to `{HF_USERNAME}/dressify-models`
- ✅ Upload datasets to `{HF_USERNAME}/Dressify-Helper`
- ✅ Create repositories if they don't exist

## 🔒 Security Notes

- **Never commit tokens to git**
- **Use environment variables or HF Spaces secrets**
- **Tokens are automatically masked in logs**

## 📁 Repository Structure

After successful upload:
```
{HF_USERNAME}/dressify-models/
├── resnet_item_embedder_best.pth
├── vit_outfit_model_best.pth
├── resnet_metrics.json
└── vit_metrics.json

{HF_USERNAME}/Dressify-Helper/
├── train.json
├── valid.json
├── test.json
├── outfit_triplets_train.json
├── outfit_triplets_valid.json
└── outfit_triplets_test.json
```
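The automatic upload described above can be sketched with the real `huggingface_hub` API (`HfApi.create_repo` and `HfApi.upload_file`). This is a minimal illustration of the flow, not the project's `utils/hf_utils.py`; the helper names and checkpoint path are assumptions.

```python
import os

def model_repo_id(username: str) -> str:
    # Repo naming convention used throughout this guide
    return f"{username}/dressify-models"

def upload_best_checkpoint(path: str = "checkpoints/resnet_item_embedder_best.pth"):
    # Deferred import so the helpers above work without the library installed
    from huggingface_hub import HfApi

    api = HfApi(token=os.environ["HF_TOKEN"])
    repo_id = model_repo_id(os.environ["HF_USERNAME"])
    api.create_repo(repo_id, exist_ok=True)  # create the repo if missing
    api.upload_file(
        path_or_fileobj=path,
        path_in_repo=os.path.basename(path),
        repo_id=repo_id,
    )
```

With `HF_USERNAME=Stylique`, checkpoints land in `Stylique/dressify-models`, matching the repository structure shown above.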
RECOMMENDATION_PIPELINE_EXPLAINED.md
ADDED
| 1 |
+
# 🎯 How Dressify Recommendations Actually Work
|
| 2 |
+
|
| 3 |
+
## ✅ **YES - Both ResNet and ViT are used during inference!**
|
| 4 |
+
|
| 5 |
+
This document explains the complete recommendation pipeline and proves that both deep learning models are actively used.
|
| 6 |
+
|
| 7 |
+
---
|
| 8 |
+
|
| 9 |
+
## 📊 **Complete Recommendation Pipeline**
|
| 10 |
+
|
| 11 |
+
### **Step 1: Image Input & Category Detection**
|
| 12 |
+
**Location:** `inference.py:356-384`
|
| 13 |
+
|
| 14 |
+
```python
|
| 15 |
+
# User uploads wardrobe images
|
| 16 |
+
items = [
|
| 17 |
+
{"id": "item_0", "image": <PIL.Image>, "category": None},
|
| 18 |
+
{"id": "item_1", "image": <PIL.Image>, "category": None},
|
| 19 |
+
...
|
| 20 |
+
]
|
| 21 |
+
|
| 22 |
+
# For each item:
|
| 23 |
+
for item in items:
|
| 24 |
+
# 1. Auto-detect category using CLIP (if available)
|
| 25 |
+
category = self._detect_category_with_clip(item["image"])
|
| 26 |
+
# OR fallback to filename-based detection
|
| 27 |
+
|
| 28 |
+
# 2. Generate embedding if not provided
|
| 29 |
+
if embedding is None:
|
| 30 |
+
embedding = self.embed_images([item["image"]])[0]
|
| 31 |
+
```
|
| 32 |
+
|
| 33 |
+
**What happens:**
|
| 34 |
+
- Each clothing item image is processed
|
| 35 |
+
- Category is detected (shirt, pants, shoes, etc.) using CLIP or filename
|
| 36 |
+
- If no embedding exists, it's generated using **ResNet**
|
| 37 |
+
|
| 38 |
+
---
|
| 39 |
+
|
| 40 |
+
### **Step 2: ResNet Generates Item Embeddings** ⭐
|
| 41 |
+
**Location:** `inference.py:313-337` → `embed_images()`
|
| 42 |
+
|
| 43 |
+
```python
|
| 44 |
+
@torch.inference_mode()
|
| 45 |
+
def embed_images(self, images: List[Image.Image]) -> List[np.ndarray]:
|
| 46 |
+
# Transform images to tensor
|
| 47 |
+
batch = torch.stack([self.transform(img) for img in images])
|
| 48 |
+
batch = batch.to(self.device, memory_format=torch.channels_last)
|
| 49 |
+
|
| 50 |
+
# ✅ RESNET IS CALLED HERE!
|
| 51 |
+
use_amp = (self.device == "cuda")
|
| 52 |
+
with torch.autocast(device_type=("cuda" if use_amp else "cpu"), enabled=use_amp):
|
| 53 |
+
emb = self.resnet(batch) # <-- RESNET FORWARD PASS
|
| 54 |
+
|
| 55 |
+
# Normalize embeddings
|
| 56 |
+
emb = nn.functional.normalize(emb, dim=-1)
|
| 57 |
+
result = [e.detach().cpu().numpy().astype(np.float32) for e in emb]
|
| 58 |
+
return result
|
| 59 |
+
```
|
| 60 |
+
|
| 61 |
+
**What ResNet does:**
|
| 62 |
+
- Takes raw clothing item images (224x224 RGB)
|
| 63 |
+
- Passes through ResNet50 backbone (pretrained on ImageNet)
|
| 64 |
+
- Generates **512-dimensional embeddings** for each item
|
| 65 |
+
- These embeddings capture visual features (color, texture, style, pattern)
|
| 66 |
+
|
| 67 |
+
**Example:**
|
| 68 |
+
- Input: Image of a blue shirt → ResNet → Output: `[0.123, -0.456, 0.789, ...]` (512-dim vector)
|
| 69 |
+
|
| 70 |
+
---
|
| 71 |
+
|
| 72 |
+
### **Step 3: Tag Processing & Context Building**
|
| 73 |
+
**Location:** `inference.py:490-545`
|
| 74 |
+
|
| 75 |
+
```python
|
| 76 |
+
# Process user tags (occasion, weather, style, etc.)
|
| 77 |
+
processed_tags = self.tag_processor.process_tags(context)
|
| 78 |
+
|
| 79 |
+
# Build outfit template based on tags
|
| 80 |
+
template = outfit_templates[outfit_style].copy()
|
| 81 |
+
# Apply weather/occasion modifications
|
| 82 |
+
# Generate constraints (min_items, max_items, accessory_limit)
|
| 83 |
+
```
|
| 84 |
+
|
| 85 |
+
**What happens:**
|
| 86 |
+
- User preferences (formal, cold weather, elegant style) are processed
|
| 87 |
+
- Outfit templates are selected and modified
|
| 88 |
+
- Constraints are generated (e.g., formal requires 4-5 items, needs outerwear)
|
| 89 |
+
|
| 90 |
+
---
|
| 91 |
+
|
| 92 |
+
### **Step 4: Candidate Outfit Generation**
|
| 93 |
+
**Location:** `inference.py:910-1092`
|
| 94 |
+
|
| 95 |
+
```python
|
| 96 |
+
# Generate many candidate outfit combinations
|
| 97 |
+
candidates = []
|
| 98 |
+
for _ in range(num_samples): # Typically 50-100+ candidates
|
| 99 |
+
subset = []
|
| 100 |
+
|
| 101 |
+
# Strategy-based generation:
|
| 102 |
+
# - Strategy 0: Core outfit (shirt + pants + shoes + accessories)
|
| 103 |
+
# - Strategy 1: Accessory-focused
|
| 104 |
+
# - Strategy 2: Flexible combination
|
| 105 |
+
|
| 106 |
+
# Add items based on context (formal, casual, etc.)
|
| 107 |
+
if occasion == "formal" and outerwear:
|
| 108 |
+
subset.append(jacket)
|
| 109 |
+
subset.append(shirt)
|
| 110 |
+
subset.append(pants)
|
| 111 |
+
subset.append(shoes)
|
| 112 |
+
|
| 113 |
+
candidates.append(subset)
|
| 114 |
+
```
|
| 115 |
+
|
| 116 |
+
**What happens:**
|
| 117 |
+
- System generates **50-100+ candidate outfit combinations**
|
| 118 |
+
- Each candidate is a list of item indices (e.g., `[0, 3, 7, 12]`)
|
| 119 |
+
- Candidates are generated using:
|
| 120 |
+
- Category pools (uppers, bottoms, shoes, outerwear, accessories)
|
| 121 |
+
- Context-aware strategies (formal vs casual)
|
| 122 |
+
- Randomization for variety
|
| 123 |
+
|
| 124 |
+
---
|
| 125 |
+
|
| 126 |
+
### **Step 5: ViT Scores Outfit Compatibility** ⭐⭐
|
| 127 |
+
**Location:** `inference.py:1094-1103` → `score_subset()`
|
| 128 |
+
|
| 129 |
+
```python
|
| 130 |
+
def score_subset(idx_subset: List[int]) -> float:
|
| 131 |
+
# Get embeddings for items in this outfit
|
| 132 |
+
embs = torch.tensor(
|
| 133 |
+
np.stack([proc_items[i]["embedding"] for i in idx_subset], axis=0),
|
| 134 |
+
dtype=torch.float32,
|
| 135 |
+
device=self.device,
|
| 136 |
+
) # Shape: (N, 512) where N = number of items in outfit
|
| 137 |
+
|
| 138 |
+
embs = embs.unsqueeze(0) # Shape: (1, N, 512) - batch dimension
|
| 139 |
+
|
| 140 |
+
# ✅ VIT IS CALLED HERE!
|
| 141 |
+
s = self.vit.score_compatibility(embs).item() # <-- VIT FORWARD PASS
|
| 142 |
+
return float(s)
|
| 143 |
+
```
|
| 144 |
+
|
| 145 |
+
**What ViT does:**
|
| 146 |
+
- Takes **multiple item embeddings** (e.g., jacket, shirt, pants, shoes)
|
| 147 |
+
- Passes through **Vision Transformer encoder**:
|
| 148 |
+
- Transformer processes the sequence of item embeddings
|
| 149 |
+
- Learns relationships between items (do they go together?)
|
| 150 |
+
- Outputs a **compatibility score** (higher = better match)
|
| 151 |
+
|
| 152 |
+
**ViT Architecture:**
|
| 153 |
+
```python
|
| 154 |
+
# From models/vit_outfit.py
|
| 155 |
+
class OutfitCompatibilityModel(nn.Module):
|
| 156 |
+
def forward(self, tokens: torch.Tensor) -> torch.Tensor:
|
| 157 |
+
# tokens: (B, N, D) - batch of outfits, each with N items, D-dim embeddings
|
| 158 |
+
h = self.encoder(tokens) # Transformer encoder
|
| 159 |
+
pooled = h.mean(dim=1) # Average pooling across items
|
| 160 |
+
score = self.compatibility_head(pooled) # Final compatibility score
|
| 161 |
+
return score.squeeze(-1)
|
| 162 |
+
```
|
| 163 |
+
|
| 164 |
+
**Example:**
|
| 165 |
+
- Input: `[jacket_emb, shirt_emb, pants_emb, shoes_emb]` (4 items × 512 dims)
|
| 166 |
+
- ViT Processing: Transformer analyzes relationships
|
| 167 |
+
- Output: `0.85` (high compatibility score)
|
| 168 |
+
|
| 169 |
+
---

### **Step 6: Scoring & Ranking**
**Location:** `inference.py:1266-1274`

```python
# Score all valid candidates
scored = []
for subset in valid_candidates:
    base_score = score_subset(subset)  # <-- ViT score (0.0 to 1.0+)

    # Apply penalties and bonuses
    adjusted_score = calculate_outfit_penalty(subset, base_score)
    # - Penalties: missing categories, duplicates, wrong context
    # - Bonuses: color harmony, style coherence, complete sets

    scored.append((subset, adjusted_score, base_score))

# Sort by adjusted score (highest first)
scored.sort(key=lambda x: x[1], reverse=True)
```

**What happens:**
- Each candidate outfit gets:
  1. A **base score from the ViT** (0.0 to ~1.0+)
  2. **Penalties** (e.g., -500 if formal without a jacket)
  3. **Bonuses** (e.g., +0.6 for color harmony, +0.4 for style coherence)
- Final score = base_score + penalties + bonuses
- Outfits are ranked by final score

---
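The score-adjust-sort logic above can be sketched with toy numbers (the penalty and bonus values here are illustrative, not the production constants from `calculate_outfit_penalty`):

```python
# Toy version of the scoring loop: each candidate gets a base score
# (from the ViT in the real system), then penalties and bonuses are added.

def adjust(base, missing_jacket_formal=False, color_harmony=False):
    score = base
    if missing_jacket_formal:
        score -= 500.0   # hard penalty: formal outfit without a jacket
    if color_harmony:
        score += 0.6     # bonus: colors work together
    return score

candidates = [
    ("outfit_a", 0.90, dict(missing_jacket_formal=True)),
    ("outfit_b", 0.70, dict(color_harmony=True)),
    ("outfit_c", 0.80, dict()),
]

scored = [(name, adjust(base, **flags), base) for name, base, flags in candidates]
scored.sort(key=lambda x: x[1], reverse=True)  # highest adjusted score first
print([name for name, _, _ in scored])  # ['outfit_b', 'outfit_c', 'outfit_a']
```

Note how a large penalty can demote an outfit with the best base score, which is exactly why the final ranking differs from the raw ViT ranking.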

### **Step 7: Final Selection & Deduplication**
**Location:** `inference.py:1276-1300`

```python
# Remove duplicate outfits
seen_outfits = set()
unique_scored = []
for subset, adjusted_score, base_score in scored:
    normalized = normalize_outfit(subset)  # Sort item IDs
    if normalized not in seen_outfits:
        seen_outfits.add(normalized)
        unique_scored.append((subset, adjusted_score, base_score))

# Select top N with randomization
topk = unique_scored[:num_outfits]
```

**What happens:**
- Duplicate outfits (same items, different order) are removed
- The top N outfits are selected
- Some randomization is added for variety

---
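The deduplication step works because sorting item IDs gives a canonical key, so item order no longer matters; a minimal sketch (hypothetical item IDs):

```python
# Two outfits with the same items in a different order are duplicates.
# Sorting the item IDs yields a canonical key for the seen-set.

def normalize_outfit(item_ids):
    return tuple(sorted(item_ids))

ranked = [
    ["shirt_1", "pants_2", "shoes_3"],
    ["pants_2", "shirt_1", "shoes_3"],   # same outfit, different order
    ["shirt_1", "pants_2", "shoes_9"],
]

seen, unique = set(), []
for outfit in ranked:
    key = normalize_outfit(outfit)
    if key not in seen:
        seen.add(key)
        unique.append(outfit)

print(len(unique))  # 2
```

Because the input list is already sorted by score, keeping the first occurrence of each key preserves the best-scoring copy of every outfit.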

## 🔍 **Proof: Both Models Are Used**

### **Evidence 1: ResNet Usage**
```python
# Line 330 in inference.py
emb = self.resnet(batch)  # ✅ ResNet forward pass
```
- Called in the `embed_images()` method
- Generates embeddings for every clothing item
- **Called during inference** when items don't have pre-computed embeddings

### **Evidence 2: ViT Usage**
```python
# Line 1102 in inference.py
s = self.vit.score_compatibility(embs).item()  # ✅ ViT forward pass
```
- Called in the `score_subset()` function
- Scores **every candidate outfit** (50-100+ times per recommendation request)
- **Called during inference** to rank outfit combinations

### **Evidence 3: Model Loading**
```python
# Lines 49-50, 285-286 in inference.py
self.resnet, self.resnet_loaded = self._load_resnet()
self.vit, self.vit_loaded = self._load_vit()

# Models are loaded and set to eval mode
if self.resnet_loaded:
    self.resnet = self.resnet.to(self.device).eval()
if self.vit_loaded:
    self.vit = self.vit.to(self.device).eval()
```

---

## 📈 **Complete Flow Diagram**

```
User Input
    ↓
[Upload Images] → [CLIP Category Detection]
    ↓
[ResNet Embedding Generation]   ← ✅ RESNET USED HERE
    ↓
[512-dim Embeddings for Each Item]
    ↓
[Tag Processing] → [Context Building]
    ↓
[Candidate Generation] → [50-100+ Outfit Combinations]
    ↓
[ViT Compatibility Scoring]     ← ✅ VIT USED HERE (50-100+ times)
    ↓
[Penalty/Bonus Adjustment]
    ↓
[Ranking & Deduplication]
    ↓
[Top N Recommendations]
```

---

## 🎯 **Key Points**

1. **ResNet is used:**
   - Generates embeddings for each clothing item
   - Called once per item (or reuses cached embeddings)
   - Output: 512-dimensional feature vectors

2. **ViT is used:**
   - Scores the compatibility of outfit combinations
   - Called **50-100+ times** per recommendation request (once per candidate)
   - Output: a compatibility score (0.0 to ~1.0+)

3. **Both models work together:**
   - ResNet provides item-level understanding
   - ViT provides outfit-level compatibility
   - Together they create personalized, context-aware recommendations

4. **The system is NOT just rule-based:**
   - Deep learning models (ResNet + ViT) provide the core intelligence
   - Rules and heuristics (penalties/bonuses) refine the results
   - Tags and context guide the generation process

---

## 🔬 **Technical Details**

### **ResNet Architecture:**
- **Backbone:** ResNet50 (pretrained on ImageNet)
- **Input:** 224×224 RGB images
- **Output:** 512-dimensional embeddings
- **Purpose:** Extract visual features from clothing items

### **ViT Architecture:**
- **Encoder:** Transformer with 4-6 layers and 8 attention heads
- **Input:** Sequence of item embeddings (variable length, 2-6 items)
- **Output:** Single compatibility score
- **Purpose:** Learn which items go well together

### **Training:**
- **ResNet:** Trained with triplet loss on item pairs
- **ViT:** Trained with triplet loss on outfit triplets (anchor, positive, negative)
- **Both:** Use early stopping and best-model checkpointing

---
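The triplet objective used for both models can be written out directly; this is a plain-Python sketch of the standard triplet margin loss (the actual training uses PyTorch's batched implementation):

```python
import math

# Triplet margin loss: pull the anchor toward the positive and push it
# away from the negative by at least `margin`:
#   L = max(0, d(anchor, positive) - d(anchor, negative) + margin)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.5):
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

anchor   = [1.0, 0.0]
positive = [1.0, 0.1]   # close to the anchor -> small d(a, p)
negative = [0.0, 1.0]   # far from the anchor -> large d(a, n)

loss = triplet_loss(anchor, positive, negative, margin=0.5)
print(round(loss, 3))  # 0.0  (this triplet already satisfies the margin)
```

The `--triplet_margin 0.5` flag in the training commands below corresponds to the `margin` parameter here.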

## ✅ **Conclusion**

**YES - both ResNet and ViT are actively used during inference!**

- **ResNet** generates item embeddings (visual understanding)
- **ViT** scores outfit compatibility (relationship learning)
- Together they create intelligent, personalized recommendations

The system is a **true deep learning pipeline**, not just rule-based filtering!
app.py
CHANGED

@@ -17,6 +17,15 @@ import json

```python
from inference import InferenceService
from utils.data_fetch import ensure_dataset_ready
from utils.tag_system import get_all_tag_options, validate_tags, TagProcessor
from utils.image_utils import (
    load_images_from_files,
    load_image_from_bytes,
    load_image_from_url,
    is_image_file,
    get_supported_formats,
    get_supported_extensions,
    ensure_rgb_image
)

# Global state
BOOT_STATUS = "starting"
```

@@ -335,6 +344,18 @@ def get_tags() -> dict:

```python
@app.get("/image-formats")
def get_image_formats() -> dict:
    """
    Get all supported image formats for API integration.
    """
    return {
        "supported_formats": get_supported_formats(),
        "supported_extensions": get_supported_extensions(),
        "description": "All major image formats are supported including JPG, PNG, WEBP, GIF, BMP, TIFF, and more",
        "note": "Images are automatically converted to RGB mode for model processing"
    }
```

@@ -389,20 +410,52 @@ def test_recommend() -> dict:

```python
@app.post("/embed")
def embed(req: EmbedRequest, x_api_key: Optional[str] = Header(None)) -> dict:
    """
    Generate embeddings for images with comprehensive format support.
    Supports JPG, PNG, WEBP, GIF, BMP, TIFF, and other major formats.
    """
    require_api_key(x_api_key)
    images: List[Image.Image] = []
    errors = []

    # Load from URLs
    if req.image_urls:
        for url in req.image_urls:
            img = load_image_from_url(url, timeout=20, convert_to_rgb=True, raise_on_error=False)
            if img is not None:
                images.append(img)
            else:
                errors.append(f"Failed to load image from URL: {url}")

    # Load from base64
    if req.images_base64:
        for b64 in req.images_base64:
            try:
                image_bytes = base64.b64decode(b64)
                img = load_image_from_bytes(image_bytes, convert_to_rgb=True, raise_on_error=False)
                if img is not None:
                    images.append(img)
                else:
                    errors.append("Failed to load image from base64")
            except Exception as e:
                errors.append(f"Error decoding base64 image: {str(e)}")

    if not images:
        error_msg = "No images provided or all images failed to load"
        if errors:
            error_msg += f". Errors: {', '.join(errors[:3])}"
        raise HTTPException(status_code=400, detail=error_msg)

    # Ensure all images are RGB
    images = [ensure_rgb_image(img) for img in images]

    embs = service.embed_images(images)
    return {
        "embeddings": [e.tolist() for e in embs],
        "model_version": service.resnet_version,
        "images_loaded": len(images),
        "errors": errors if errors else None
    }
```

@@ -498,14 +551,11 @@ def artifacts() -> dict:

```python
# --------- Gradio UI ---------

def _load_images_from_files(files: List[str]) -> List[Image.Image]:
    """
    Load images from file paths with comprehensive format support.
    Supports JPG, PNG, WEBP, GIF, BMP, TIFF, and other major formats.
    """
    return load_images_from_files(files, convert_to_rgb=True, skip_errors=True)
```

@@ -870,9 +920,9 @@ def start_training_simple(dataset_size: str, res_epochs: int, vit_epochs: int):

```python
    # Train ResNet first and wait for completion
    log_message += f"\n🚀 Starting ResNet training on {dataset_size} samples...\n"
    resnet_result = subprocess.run([
        "python", "train_resnet.py", "--data_root", DATASET_ROOT, "--epochs", str(res_epochs),
        "--batch_size", "4", "--lr", "1e-3", "--early_stopping_patience", "3",
        "--out", os.path.join(export_dir, "resnet_item_embedder.pth")
    ] + dataset_args, capture_output=True, text=True, check=False)

    if resnet_result.returncode == 0:
```

@@ -897,7 +947,7 @@ def start_training_simple(dataset_size: str, res_epochs: int, vit_epochs: int):

```python
    log_message += f"\n🚀 Starting ViT training on {dataset_size} samples...\n"
    vit_result = subprocess.run([
        "python", "train_vit_triplet.py", "--data_root", DATASET_ROOT, "--epochs", str(vit_epochs),
        "--batch_size", "4", "--lr", "5e-4", "--early_stopping_patience", "5",
        "--max_samples", "5000", "--triplet_margin", "0.5", "--gradient_clip", "1.0",
        "--warmup_epochs", "2", "--export", os.path.join(export_dir, "vit_outfit_model.pth")
```

@@ -956,8 +1006,14 @@ with gr.Blocks(fill_height=True, title="Dressify - Advanced Outfit Recommendatio

```python
    with gr.Tab("🎨 Recommend"):
        gr.Markdown("### 🎯 Personalized Outfit Recommendations\n*Upload your wardrobe and customize recommendations with advanced tag preferences*")
        gr.Markdown(f"**Supported Formats:** {', '.join(get_supported_extensions())} (JPG, PNG, WEBP, GIF, BMP, TIFF, and more)")

        inp2 = gr.Files(
            label="Upload wardrobe images",
            file_types=["image"],
            file_count="multiple",
            type="filepath"
        )

        with gr.Accordion("🎯 Primary Tags (Required)", open=True):
            with gr.Row():
```
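For clients of the updated `/embed` endpoint, the base64 path is a plain encode/decode round-trip; a stdlib-only sketch of the transport step (the payload here is a placeholder, not a real PNG, and the decode-to-PIL step is elided):

```python
import base64

# A client sends raw image bytes as base64; the server decodes them back
# before handing the bytes to the image loader. Any byte payload round-trips.
fake_image_bytes = b"\x89PNG\r\n\x1a\n...truncated..."  # placeholder bytes

b64 = base64.b64encode(fake_image_bytes).decode("ascii")   # client side
decoded = base64.b64decode(b64)                            # server side

print(decoded == fake_image_bytes)  # True
```

In a real request, `b64` would be one entry of `images_base64` in the `/embed` JSON body.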
inference.py
CHANGED

@@ -16,6 +16,7 @@ from utils.transforms import build_inference_transform

```python
from models.resnet_embedder import ResNetItemEmbedder
from models.vit_outfit import OutfitCompatibilityModel
from utils.tag_system import TagProcessor, get_all_tag_options, validate_tags
from utils.image_utils import ensure_rgb_image, validate_image_format
```

@@ -312,6 +313,10 @@

```python
    @torch.inference_mode()
    def embed_images(self, images: List[Image.Image]) -> List[np.ndarray]:
        """
        Generate embeddings for images with comprehensive format support.
        All images are validated and converted to RGB before processing.
        """
        print(f"🔍 DEBUG: embed_images called with {len(images)} images")
        if len(images) == 0:
            print("🔍 DEBUG: No images provided, returning empty list")
```

@@ -321,9 +326,27 @@

```python
        if self.resnet is None:
            print("🔍 DEBUG: ResNet model is None, returning empty list")
            return []

        # Validate and convert all images to RGB
        processed_images = []
        for i, img in enumerate(images):
            is_valid, error_msg = validate_image_format(img)
            if not is_valid:
                print(f"⚠️ Skipping invalid image {i}: {error_msg}")
                continue

            # Ensure RGB mode (required for ResNet)
            rgb_img = ensure_rgb_image(img)
            processed_images.append(rgb_img)

        if len(processed_images) == 0:
            print("⚠️ No valid images after processing")
            return []

        print(f"🔍 DEBUG: Processing {len(processed_images)} valid images")

        try:
            batch = torch.stack([self.transform(img) for img in processed_images])
            batch = batch.to(self.device, memory_format=torch.channels_last)
            use_amp = (self.device == "cuda")
            with torch.autocast(device_type=("cuda" if use_amp else "cpu"), enabled=use_amp):
```

@@ -334,6 +357,8 @@

```python
            return result
        except Exception as e:
            print(f"🔍 DEBUG: Error in embed_images: {e}")
            import traceback
            traceback.print_exc()
            return []

    @torch.inference_mode()
```
utils/artifact_manager.py
CHANGED

@@ -90,7 +90,10 @@ class ArtifactManager:

```python
        images_dir = os.path.join(self.data_dir, "images")
        if os.path.exists(images_dir):
            try:
                # Support all major image formats
                from utils.image_utils import get_supported_extensions
                supported_exts = tuple(ext.lower() for ext in get_supported_extensions())
                image_files = [f for f in os.listdir(images_dir) if f.lower().endswith(supported_exts)]
                info["images_count"] = len(image_files)
            except:
                pass
```
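The extension filter added above reduces to a case-insensitive suffix check; a stdlib-only sketch (the extension tuple is a hypothetical subset of what `get_supported_extensions()` returns):

```python
# Count image files by extension, as the updated ArtifactManager does.
# `str.endswith` accepts a tuple, so one call checks every extension.
supported_exts = ('.jpg', '.jpeg', '.png', '.webp', '.gif', '.bmp', '.tiff')

filenames = ["a.JPG", "b.png", "notes.txt", "c.webp", "archive.zip"]
image_files = [f for f in filenames if f.lower().endswith(supported_exts)]

print(len(image_files))  # 3
```

Lowercasing the filename first is what makes `a.JPG` match the lowercase extension tuple.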
utils/image_utils.py
ADDED
|
@@ -0,0 +1,374 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
"""
|
| 2 |
+
Comprehensive Image Format Support Utilities
|
| 3 |
+
|
| 4 |
+
This module provides robust image loading and processing that supports
|
| 5 |
+
all major image formats including JPG, PNG, WEBP, GIF, BMP, TIFF, etc.
|
| 6 |
+
"""
|
| 7 |
+
|
| 8 |
+
import io
|
| 9 |
+
from typing import List, Optional, Tuple, Union
|
| 10 |
+
from pathlib import Path
|
| 11 |
+
|
| 12 |
+
from PIL import Image, ImageFile, UnidentifiedImageError
|
| 13 |
+
import requests
|
| 14 |
+
|
| 15 |
+
|
| 16 |
+
# Enable PIL to load truncated images
|
| 17 |
+
ImageFile.LOAD_TRUNCATED_IMAGES = True
|
| 18 |
+
|
| 19 |
+
# Supported image formats
|
| 20 |
+
SUPPORTED_FORMATS = {
|
| 21 |
+
# Raster formats
|
| 22 |
+
'JPEG', 'JPG', # JPEG
|
| 23 |
+
'PNG', # PNG
|
| 24 |
+
'WEBP', # WebP
|
| 25 |
+
'GIF', # GIF (static frames)
|
| 26 |
+
'BMP', # Bitmap
|
| 27 |
+
'TIFF', 'TIF', # TIFF
|
| 28 |
+
'ICO', # Icon
|
| 29 |
+
'PCX', # PC Paintbrush
|
| 30 |
+
'PPM', # Portable Pixmap
|
| 31 |
+
'PBM', # Portable Bitmap
|
| 32 |
+
'PGM', # Portable Graymap
|
| 33 |
+
'XBM', # X Bitmap
|
| 34 |
+
'XPM', # X Pixmap
|
| 35 |
+
# Additional formats if available
|
| 36 |
+
'HEIF', 'HEIC', # HEIF/HEIC (if pillow-heif installed)
|
| 37 |
+
'AVIF', # AVIF (if pillow-avif-plugin installed)
|
| 38 |
+
}
|
| 39 |
+
|
| 40 |
+
# File extensions mapping
|
| 41 |
+
EXTENSION_TO_FORMAT = {
|
| 42 |
+
'.jpg': 'JPEG',
|
| 43 |
+
'.jpeg': 'JPEG',
|
| 44 |
+
'.png': 'PNG',
|
| 45 |
+
'.webp': 'WEBP',
|
| 46 |
+
'.gif': 'GIF',
|
| 47 |
+
'.bmp': 'BMP',
|
| 48 |
+
'.tiff': 'TIFF',
|
| 49 |
+
'.tif': 'TIFF',
|
| 50 |
+
'.ico': 'ICO',
|
| 51 |
+
'.pcx': 'PCX',
|
| 52 |
+
'.ppm': 'PPM',
|
| 53 |
+
'.pbm': 'PBM',
|
| 54 |
+
'.pgm': 'PGM',
|
| 55 |
+
'.xbm': 'XBM',
|
| 56 |
+
'.xpm': 'XPM',
|
| 57 |
+
'.heif': 'HEIF',
|
| 58 |
+
'.heic': 'HEIC',
|
| 59 |
+
'.avif': 'AVIF',
|
| 60 |
+
}
|
| 61 |
+
|
| 62 |
+
|
| 63 |
+
def is_image_file(filepath: Union[str, Path]) -> bool:
|
| 64 |
+
"""
|
| 65 |
+
Check if a file is a supported image format based on extension.
|
| 66 |
+
|
| 67 |
+
Args:
|
| 68 |
+
filepath: Path to the file
|
| 69 |
+
|
| 70 |
+
Returns:
|
| 71 |
+
True if the file appears to be a supported image format
|
| 72 |
+
"""
|
| 73 |
+
path = Path(filepath)
|
| 74 |
+
ext = path.suffix.lower()
|
| 75 |
+
return ext in EXTENSION_TO_FORMAT
|
| 76 |
+
|
| 77 |
+
|
| 78 |
+
def get_image_format(filepath: Union[str, Path]) -> Optional[str]:
|
| 79 |
+
"""
|
| 80 |
+
Get the image format from file extension.
|
| 81 |
+
|
| 82 |
+
Args:
|
| 83 |
+
filepath: Path to the file
|
| 84 |
+
|
| 85 |
+
Returns:
|
| 86 |
+
Format name (e.g., 'JPEG', 'PNG') or None if unknown
|
| 87 |
+
"""
|
| 88 |
+
path = Path(filepath)
|
| 89 |
+
ext = path.suffix.lower()
|
| 90 |
+
return EXTENSION_TO_FORMAT.get(ext)
|
| 91 |
+
|
| 92 |
+
|
| 93 |
+
def load_image_from_file(
|
| 94 |
+
filepath: Union[str, Path],
|
| 95 |
+
convert_to_rgb: bool = True,
|
| 96 |
+
raise_on_error: bool = False
|
| 97 |
+
) -> Optional[Image.Image]:
|
| 98 |
+
"""
|
| 99 |
+
Load an image from a file path, supporting all major formats.
|
| 100 |
+
|
| 101 |
+
Args:
|
| 102 |
+
filepath: Path to the image file
|
| 103 |
+
convert_to_rgb: Convert image to RGB mode (required for models)
|
| 104 |
+
raise_on_error: If True, raise exception on error; if False, return None
|
| 105 |
+
|
| 106 |
+
Returns:
|
| 107 |
+
PIL Image object or None if loading failed
|
| 108 |
+
"""
|
| 109 |
+
try:
|
| 110 |
+
path = Path(filepath)
|
| 111 |
+
|
| 112 |
+
# Check if file exists
|
| 113 |
+
if not path.exists():
|
| 114 |
+
if raise_on_error:
|
| 115 |
+
raise FileNotFoundError(f"Image file not found: {filepath}")
|
| 116 |
+
return None
|
| 117 |
+
|
| 118 |
+
# Check if it's a supported format
|
| 119 |
+
if not is_image_file(path):
|
| 120 |
+
if raise_on_error:
|
| 121 |
+
raise ValueError(f"Unsupported image format: {filepath}")
|
| 122 |
+
print(f"⚠️ Skipping unsupported format: {filepath}")
|
| 123 |
+
return None
|
| 124 |
+
|
| 125 |
+
# Open and load image
|
| 126 |
+
with Image.open(path) as img:
|
| 127 |
+
# Verify it's actually an image
|
| 128 |
+
img.verify()
|
| 129 |
+
|
| 130 |
+
# Re-open for actual use (verify() closes the file)
|
| 131 |
+
img = Image.open(path)
|
| 132 |
+
|
| 133 |
+
# Convert to RGB if needed (required for deep learning models)
|
| 134 |
+
if convert_to_rgb:
|
| 135 |
+
if img.mode != 'RGB':
|
| 136 |
+
# Handle different modes
|
| 137 |
+
if img.mode in ('RGBA', 'LA', 'P'):
|
| 138 |
+
# Create white background for transparency
|
| 139 |
+
background = Image.new('RGB', img.size, (255, 255, 255))
|
| 140 |
+
if img.mode == 'P':
|
| 141 |
+
img = img.convert('RGBA')
|
| 142 |
+
if img.mode in ('RGBA', 'LA'):
|
| 143 |
+
background.paste(img, mask=img.split()[-1]) # Use alpha channel as mask
|
| 144 |
+
img = background
|
| 145 |
+
else:
|
| 146 |
+
img = img.convert('RGB')
|
| 147 |
+
|
| 148 |
+
return img
|
| 149 |
+
|
| 150 |
+
except UnidentifiedImageError:
|
| 151 |
+
error_msg = f"❌ Cannot identify image format: {filepath}"
|
| 152 |
+
if raise_on_error:
|
| 153 |
+
raise ValueError(error_msg)
|
| 154 |
+
print(error_msg)
|
| 155 |
+
return None
|
| 156 |
+
except Exception as e:
|
| 157 |
+
error_msg = f"❌ Error loading image {filepath}: {str(e)}"
|
| 158 |
+
if raise_on_error:
|
| 159 |
+
raise
|
| 160 |
+
print(error_msg)
|
| 161 |
+
return None
|
| 162 |
+
|
| 163 |
+
|
| 164 |
+
def load_image_from_bytes(
|
| 165 |
+
image_bytes: bytes,
|
| 166 |
+
convert_to_rgb: bool = True,
|
| 167 |
+
raise_on_error: bool = False
|
| 168 |
+
) -> Optional[Image.Image]:
|
| 169 |
+
"""
|
| 170 |
+
Load an image from bytes, supporting all major formats.
|
| 171 |
+
|
| 172 |
+
Args:
|
| 173 |
+
image_bytes: Image data as bytes
|
| 174 |
+
convert_to_rgb: Convert image to RGB mode (required for models)
|
| 175 |
+
raise_on_error: If True, raise exception on error; if False, return None
|
| 176 |
+
|
| 177 |
+
Returns:
|
| 178 |
+
PIL Image object or None if loading failed
|
| 179 |
+
"""
|
| 180 |
+
try:
|
| 181 |
+
# Open from bytes
|
| 182 |
+
img = Image.open(io.BytesIO(image_bytes))
|
| 183 |
+
|
| 184 |
+
# Verify it's actually an image
|
| 185 |
+
img.verify()
|
| 186 |
+
|
| 187 |
+
# Re-open for actual use
|
| 188 |
+
img = Image.open(io.BytesIO(image_bytes))
|
| 189 |
+
|
| 190 |
+
# Convert to RGB if needed
|
| 191 |
+
if convert_to_rgb:
|
| 192 |
+
if img.mode != 'RGB':
|
| 193 |
+
if img.mode in ('RGBA', 'LA', 'P'):
|
| 194 |
+
background = Image.new('RGB', img.size, (255, 255, 255))
|
| 195 |
+
if img.mode == 'P':
|
| 196 |
+
img = img.convert('RGBA')
|
| 197 |
+
if img.mode in ('RGBA', 'LA'):
|
| 198 |
+
background.paste(img, mask=img.split()[-1])
|
| 199 |
+
img = background
|
| 200 |
+
else:
|
| 201 |
+
img = img.convert('RGB')
|
| 202 |
+
|
| 203 |
+
return img
|
| 204 |
+
|
| 205 |
+
except UnidentifiedImageError:
|
| 206 |
+
error_msg = "❌ Cannot identify image format from bytes"
|
| 207 |
+
if raise_on_error:
|
| 208 |
+
raise ValueError(error_msg)
|
| 209 |
+
print(error_msg)
|
| 210 |
+
return None
|
| 211 |
+
except Exception as e:
|
| 212 |
+
error_msg = f"❌ Error loading image from bytes: {str(e)}"
|
| 213 |
+
if raise_on_error:
|
| 214 |
+
raise
|
| 215 |
+
print(error_msg)
|
| 216 |
+
return None
|
| 217 |
+
|
| 218 |
+
|
| 219 |
+
def load_image_from_url(
|
| 220 |
+
url: str,
|
| 221 |
+
timeout: int = 20,
|
| 222 |
+
convert_to_rgb: bool = True,
|
| 223 |
+
raise_on_error: bool = False
|
| 224 |
+
) -> Optional[Image.Image]:
|
| 225 |
+
"""
|
| 226 |
+
Load an image from a URL, supporting all major formats.
|
| 227 |
+
|
| 228 |
+
Args:
|
| 229 |
+
url: URL to the image
|
| 230 |
+
timeout: Request timeout in seconds
|
| 231 |
+
convert_to_rgb: Convert image to RGB mode (required for models)
|
| 232 |
+
raise_on_error: If True, raise exception on error; if False, return None
|
| 233 |
+
|
| 234 |
+
Returns:
|
| 235 |
+
PIL Image object or None if loading failed
|
| 236 |
+
"""
|
| 237 |
+
try:
|
| 238 |
+
resp = requests.get(url, timeout=timeout, stream=True)
|
| 239 |
+
+        resp.raise_for_status()
+
+        # Check content type
+        content_type = resp.headers.get('Content-Type', '').lower()
+        if not any(fmt in content_type for fmt in ['image', 'jpeg', 'png', 'webp', 'gif']):
+            if raise_on_error:
+                raise ValueError(f"URL does not point to an image: {url}")
+            print(f"⚠️ URL does not appear to be an image: {url}")
+            return None
+
+        # Load from bytes
+        return load_image_from_bytes(resp.content, convert_to_rgb, raise_on_error)
+
+    except requests.RequestException as e:
+        error_msg = f"❌ Error fetching image from URL {url}: {str(e)}"
+        if raise_on_error:
+            raise
+        print(error_msg)
+        return None
+    except Exception as e:
+        error_msg = f"❌ Error loading image from URL {url}: {str(e)}"
+        if raise_on_error:
+            raise
+        print(error_msg)
+        return None
+
+
+def load_images_from_files(
+    filepaths: List[Union[str, Path]],
+    convert_to_rgb: bool = True,
+    skip_errors: bool = True
+) -> List[Image.Image]:
+    """
+    Load multiple images from file paths, supporting all major formats.
+
+    Args:
+        filepaths: List of paths to image files
+        convert_to_rgb: Convert images to RGB mode (required for models)
+        skip_errors: If True, skip files that fail to load; if False, raise on first error
+
+    Returns:
+        List of PIL Image objects (only successfully loaded images)
+    """
+    images = []
+    loaded_count = 0
+    failed_count = 0
+
+    for fp in filepaths:
+        img = load_image_from_file(fp, convert_to_rgb, raise_on_error=not skip_errors)
+        if img is not None:
+            images.append(img)
+            loaded_count += 1
+        else:
+            failed_count += 1
+
+    if failed_count > 0:
+        print(f"⚠️ Loaded {loaded_count} images, {failed_count} failed")
+
+    return images
+
+
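The `skip_errors` pattern in `load_images_from_files` (collect successes, count failures, report once) generalizes to any per-item loader. A minimal sketch of the same aggregation logic, using a stand-in loader instead of `load_image_from_file` (the names `load_many` and `demo_loader` are illustrative, not part of this repo):

```python
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")

def load_many(paths: List[str], loader: Callable[[str], Optional[T]]) -> List[T]:
    """Collect items the loader returns, skipping and counting failures."""
    loaded: List[T] = []
    failed = 0
    for p in paths:
        item = loader(p)
        if item is None:
            failed += 1
        else:
            loaded.append(item)
    if failed:
        # Mirrors the one-line summary printed by load_images_from_files
        print(f"Loaded {len(loaded)} items, {failed} failed")
    return loaded

# Stand-in loader: "succeeds" only for paths with a known image extension.
demo_loader = lambda p: p if p.endswith((".jpg", ".png")) else None
result = load_many(["a.jpg", "notes.txt", "b.png"], demo_loader)
```

Here `result` is `["a.jpg", "b.png"]` and one failure is reported, matching the skip-and-summarize behavior above.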
+def validate_image_format(img: Image.Image) -> Tuple[bool, Optional[str]]:
+    """
+    Validate that an image is in a supported format and ready for processing.
+
+    Args:
+        img: PIL Image object
+
+    Returns:
+        Tuple of (is_valid, error_message)
+    """
+    if img is None:
+        return False, "Image is None"
+
+    if not hasattr(img, 'mode'):
+        return False, "Invalid image object"
+
+    # Check if format is supported
+    if hasattr(img, 'format') and img.format:
+        if img.format not in SUPPORTED_FORMATS:
+            return False, f"Unsupported format: {img.format}"
+
+    # Check if image has valid size
+    if img.size[0] == 0 or img.size[1] == 0:
+        return False, "Image has zero dimensions"
+
+    return True, None
+
+
+def ensure_rgb_image(img: Image.Image) -> Image.Image:
+    """
+    Ensure an image is in RGB mode, converting if necessary.
+
+    Args:
+        img: PIL Image object
+
+    Returns:
+        RGB mode PIL Image
+    """
+    if img.mode == 'RGB':
+        return img
+
+    if img.mode in ('RGBA', 'LA', 'P'):
+        # Flatten transparency onto a white background
+        if img.mode == 'P':
+            img = img.convert('RGBA')
+        background = Image.new('RGB', img.size, (255, 255, 255))
+        background.paste(img, mask=img.split()[-1])
+        return background
+
+    return img.convert('RGB')
+
+
+def get_supported_formats() -> List[str]:
+    """
+    Get list of all supported image formats.
+
+    Returns:
+        List of format names
+    """
+    return sorted(SUPPORTED_FORMATS)
+
+
+def get_supported_extensions() -> List[str]:
+    """
+    Get list of all supported file extensions.
+
+    Returns:
+        List of file extensions (with dots)
+    """
+    return sorted(EXTENSION_TO_FORMAT.keys())
+