---
title: HelloWorld
emoji: 🌖
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 5.47.2
app_file: app.py
pinned: false
short_description: trial project for HF
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
Fashion Item Classifier
A Gradio-based web application that classifies fashion items from image URLs using the CLIP (Contrastive Language-Image Pre-training) model.
Steps to Create This Hugging Face Space
Based on the guide from Marqo's blog post, here are the steps followed:
1. Create an Account
- Head to Hugging Face and create an account
- Follow the sign-up process with your details
2. Confirm Your Email Address
- Check your email to confirm your account
- This enables access to all Hugging Face features, including Spaces
3. Head to Spaces
- After confirming email, log in and click on Spaces in the main navigation bar
- This is where you manage and deploy your models and apps
4. Create a New Space
- Click Create New Space
- Configure the following settings:
- Owner: Your Hugging Face account name
- Space name: Choose a descriptive name (e.g., 'fashion-classifier')
- Short Description: Optional description of your project
- License: Optional
- Space SDK: Select Gradio
- Gradio template: Keep as Blank
- Space hardware: Use CPU basic • 2 CPU • 16 GB • FREE for free tier
- Privacy: Select Public to share with others
- Click Create Space
5. Install Git
- If you don't have Git, download it from Git's official page
- Install Git for your operating system
- Verify installation by running:
git --version
6. Clone the Hugging Face Space
git clone https://huggingface.co/spaces/your-username/your-space
Replace your-username and your-space with your actual username and space name.
7. Open the Folder in VSCode
- Navigate to the cloned folder
- Open it in Visual Studio Code (VSCode)
- Initially, you'll only have
the `.gitattributes` and `README.md` files
8. Create an app.py File
- Create a new file named
`app.py` in VSCode
- This contains the main application code for your fashion item classifier (a minimal sketch is shown below)
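
For reference, a minimal `app.py` along these lines might look like the following. This is a sketch based on the final CLIP-based approach described later in this README, not the exact code of the Space; names such as `classify` are illustrative:

```python
import gradio as gr
import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load the standard CLIP model and its processor (the final setup chosen below)
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
fashion_items = ["top", "trousers", "bottom"]

def classify(image_url: str) -> dict:
    # Download the image, then score it against each label with CLIP
    image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")
    inputs = processor(images=image, text=fashion_items, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = outputs.logits_per_image.softmax(dim=1)[0]
    return {label: float(p) for label, p in zip(fashion_items, probs)}

demo = gr.Interface(
    fn=classify,
    inputs=gr.Textbox(label="Image URL"),
    outputs=gr.Label(label="Predicted category"),
    title="Fashion Item Classifier",
)

if __name__ == "__main__":
    demo.launch()
```

The `gr.Label` output simply displays the per-category confidence scores returned by the function.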
9. Add Dependencies
- Create a
`requirements.txt` file (see the example below)
- List all required Python packages for your application
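
A plausible `requirements.txt` for the CLIP-based version of the app could be as simple as the following (package names only; `requests` and `Pillow` are assumptions for URL download and image handling, and exact version pins may differ):

```
gradio
torch
transformers
accelerate
requests
Pillow
```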
10. Test Your App Locally
Create a virtual environment and test locally:
```
# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py
```
11. Upload to Hugging Face Hub
- Create a
`.gitignore` file to exclude unnecessary files (like `venv/`)
- Commit and push your code:

```
git add .
git commit -m "Initial commit"
git push origin main
```
Development Challenges and Solutions
Problem 1: PyTorch Meta Tensor Error
Issue: The original Marqo/marqo-fashionSigLIP model encountered a meta tensor error:
```
NotImplementedError: Cannot copy out of meta tensor; no data!
Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to()
when moving module from meta to a different device.
```
Root Cause: This error occurred due to compatibility issues between the custom SigLIP model and newer versions of PyTorch/transformers. The model was being initialized with meta tensors (placeholder tensors without actual data), and the open_clip library then tried to move them to a device with the standard `.to()` method, which cannot copy data out of meta tensors.
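
The underlying limitation is easy to reproduce in isolation: a module created on the `meta` device has parameter shapes but no data, so `.to()` has nothing to copy. A standalone illustration (not code from this app):

```python
import torch

# Parameters created on the "meta" device have shapes and dtypes, but no storage
layer = torch.nn.Linear(4, 4, device="meta")

try:
    layer.to("cpu")  # fails: there is no data to copy out of a meta tensor
except NotImplementedError as err:
    print(err)

# to_empty() allocates real (uninitialized) storage on the target device instead
layer = layer.to_empty(device="cpu")
```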
Attempted Solutions:
- Environment Variables: Tried setting
`PYTORCH_CUDA_ALLOC_CONF` and disabling meta device initialization
- Model Parameters: Attempted using `torch_dtype=torch.float32`, `device_map="cpu"`, and `low_cpu_mem_usage=False` (roughly sketched below)
- Accelerate Library: Installed the `accelerate` library as required by the error messages
- PyTorch Version Downgrade: Attempted to downgrade PyTorch to version 2.1.0 (not available for Windows)
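
For illustration, the "Model Parameters" attempts above translate into `from_pretrained` calls roughly like the following (a hypothetical sketch; none of these variants resolved the error):

```python
import torch
from transformers import AutoModel

MODEL_ID = "Marqo/marqo-fashionSigLIP"

# Attempt: force full-precision weights and an explicit CPU placement
# (device_map requires the accelerate library; trust_remote_code is assumed
# to be needed because the model ships a custom implementation)
model = AutoModel.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    torch_dtype=torch.float32,
    device_map="cpu",
)

# Attempt: disable the low-memory (meta-device) loading path entirely
model = AutoModel.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    low_cpu_mem_usage=False,
)
```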
Final Solution: Replaced the problematic model with the standard OpenAI CLIP model:
- Original Model:
`Marqo/marqo-fashionSigLIP` (custom SigLIP implementation)
- Final Model: `openai/clip-vit-base-patch32` (standard CLIP model)
Problem 2: Model Architecture Differences
Issue: The code structure needed to be adapted for the different model architecture.
Solution: Updated the prediction function to use CLIP's unified text-image processing:
- Before: Separate text preprocessing and feature extraction using
`get_text_features()` and `get_image_features()`
- After: Combined processing using `processor(images=image, text=fashion_items)` and `model(**inputs)` (see the sketch below)
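
Roughly, the two approaches compare as follows. This is an illustrative sketch using the final CLIP model; the local image path is a placeholder, and the manual similarity step in the "Before" half is an assumption about how separately extracted features would be combined:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
fashion_items = ["top", "trousers", "bottom"]
image = Image.open("example.jpg").convert("RGB")  # placeholder image

# Before: separate text and image feature extraction, combined manually
text_inputs = processor(text=fashion_items, return_tensors="pt", padding=True)
image_inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    text_features = model.get_text_features(**text_inputs)
    image_features = model.get_image_features(**image_inputs)
scores = torch.nn.functional.cosine_similarity(image_features, text_features)

# After: one combined call; the model returns image-text logits directly
inputs = processor(images=image, text=fashion_items, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=1)[0]
```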
Problem 3: Windows Command Compatibility
Issue: The original tutorial used Unix/Linux commands (`source venv/bin/activate`) that do not work in Windows PowerShell.
Solution: Used Windows-compatible commands:
- Virtual Environment Activation: Used direct Python execution via
`venv\Scripts\python.exe` instead of activating the environment
- Package Installation: `venv\Scripts\python.exe -m pip install -r requirements.txt` (the full Windows sequence is shown below)
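
With that workaround, the Windows-side sequence from project folder to running app looks roughly like this in PowerShell:

```
python -m venv venv
venv\Scripts\python.exe -m pip install -r requirements.txt
venv\Scripts\python.exe app.py
```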
Final Model Choice: OpenAI CLIP
Selected Model: openai/clip-vit-base-patch32
Reasons for Selection:
- Stability: Well-tested and widely used in production environments
- Compatibility: Full compatibility with current PyTorch and transformers versions
- Performance: Excellent performance on image-text classification tasks
- Documentation: Extensive documentation and community support
- Simplicity: Straightforward implementation without custom code requirements
Trade-offs:
- Specialization: Less specialized for fashion items compared to the original SigLIP model
- Accuracy: May have slightly lower accuracy on fashion-specific classifications
- Model Size: Standard CLIP model size vs. potentially optimized SigLIP
The final implementation successfully classifies fashion items from image URLs into the categories 'top', 'trousers', and 'bottom'.