---
title: HelloWorld
emoji: 🌖
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 5.47.2
app_file: app.py
pinned: false
short_description: trial project for HF
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

Fashion Item Classifier

A Gradio-based web application that classifies fashion items from image URLs using the CLIP (Contrastive Language-Image Pre-training) model.

Steps to Create This Hugging Face Space

Based on the guide from Marqo's blog post, here are the steps followed:

1. Create an Account

  • Head to Hugging Face and create an account
  • Follow the sign-up process with your details

2. Confirm Your Email Address

  • Check your email to confirm your account
  • This enables access to all Hugging Face features, including Spaces

3. Head to Spaces

  • After confirming your email, log in and click on Spaces in the main navigation bar
  • This is where you manage and deploy your models and apps

4. Create a New Space

  • Click Create New Space
  • Configure the following settings:
    • Owner: Your Hugging Face account name
    • Space name: Choose a descriptive name (e.g., 'fashion-classifier')
    • Short Description: Optional description of your project
    • License: Optional
    • Space SDK: Select Gradio
    • Gradio template: Keep as Blank
    • Space hardware: Use CPU basic • 2 CPU • 16 GB • FREE for the free tier
    • Privacy: Select Public to share with others
  • Click Create Space

5. Install Git

  • If you don't have Git, download it from Git's official page
  • Install Git for your operating system
  • Verify installation by running: git --version

6. Clone the Hugging Face Space

git clone https://huggingface.co/spaces/your-username/your-space

Replace your-username and your-space with your actual username and space name.

7. Open the Folder in VSCode

  • Navigate to the cloned folder
  • Open it in Visual Studio Code (VSCode)
  • Initially, you'll only have .gitattributes and README.md files

8. Create an app.py File

  • Create a new file named app.py in VSCode
  • This contains the main application code for your fashion item classifier
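
The project's actual app.py isn't reproduced in this README, but a minimal sketch, assuming the final CLIP model described later in this document, could look like the following (the function and variable names are illustrative, not the real code):

# app.py: minimal sketch of a CLIP-based fashion classifier (illustrative names)
import requests
import torch
import gradio as gr
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load the standard OpenAI CLIP model and its processor
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Candidate categories used by this project
fashion_items = ["top", "trousers", "bottom"]

def classify(image_url):
    # Download the image from the provided URL
    image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")
    # Process the image and the candidate text labels together
    inputs = processor(images=image, text=fashion_items, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # Convert image-text similarity logits into per-category probabilities
    probs = outputs.logits_per_image.softmax(dim=1).squeeze().tolist()
    return {label: float(p) for label, p in zip(fashion_items, probs)}

demo = gr.Interface(
    fn=classify,
    inputs=gr.Textbox(label="Image URL"),
    outputs=gr.Label(label="Predicted category"),
    title="Fashion Item Classifier",
)

if __name__ == "__main__":
    demo.launch()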

9. Add Dependencies

  • Create a requirements.txt file
  • List all required Python packages for your application
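
For reference, a requirements.txt along these lines would cover the packages used by the app described in this README (version pins omitted; add them as appropriate):

# requirements.txt
torch
transformers
gradio
Pillow
requests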

10. Test Your App Locally

Create a virtual environment and test locally:

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py

11. Upload to Hugging Face Hub

  • Create a .gitignore file to exclude unnecessary files such as venv/ (a sample .gitignore follows the commands below)
  • Commit and push your code:
git add .
git commit -m "Initial commit"
git push origin main
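
A minimal .gitignore for this setup could be as simple as:

# Local virtual environment and Python build artifacts
venv/
__pycache__/
*.pyc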

Development Challenges and Solutions

Problem 1: PyTorch Meta Tensor Error

Issue: The original Marqo/marqo-fashionSigLIP model encountered a meta tensor error:

NotImplementedError: Cannot copy out of meta tensor; no data! 
Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() 
when moving module from meta to a different device.

Root Cause: This error occurred due to compatibility issues between the custom SigLIP model and newer versions of PyTorch/transformers. The model was being initialized with meta tensors (tensors without actual data), and the open_clip library then tried to move it to a device with .to(), which cannot copy data out of meta tensors (hence the error's suggestion to use to_empty() instead).

Attempted Solutions:

  1. Environment Variables: Tried setting PYTORCH_CUDA_ALLOC_CONF and disabling meta device initialization
  2. Model Parameters: Attempted using torch_dtype=torch.float32, device_map="cpu", and low_cpu_mem_usage=False
  3. Accelerate Library: Installed the accelerate library as required by the error messages
  4. PyTorch Version Downgrade: Attempted to downgrade PyTorch to version 2.1.0 (not available for Windows)

Final Solution: Replaced the problematic model with the standard OpenAI CLIP model:

  • Original Model: Marqo/marqo-fashionSigLIP (custom SigLIP implementation)
  • Final Model: openai/clip-vit-base-patch32 (standard CLIP model)
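
In practice the swap comes down to loading the standard CLIP checkpoint through the transformers library, roughly as follows (no custom model code or open_clip involved):

from transformers import CLIPModel, CLIPProcessor

# Standard CLIP checkpoint; loads without any meta-tensor handling
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")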

Problem 2: Model Architecture Differences

Issue: The code structure needed to be adapted for the different model architecture.

Solution: Updated the prediction function to use CLIP's unified text-image processing:

  • Before: Separate text preprocessing and feature extraction using get_text_features() and get_image_features()
  • After: Combined processing using processor(images=image, text=fashion_items) and model(**inputs)
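
A rough, self-contained sketch of that change (illustrative names; the "before" flow is shown as comments since the old model no longer loads):

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
fashion_items = ["top", "trousers", "bottom"]
image = Image.new("RGB", (224, 224))  # stand-in; the real app downloads the image from a URL

# Before: encode text and image separately, then compare the feature vectors manually
# text_features = model.get_text_features(**processor(text=fashion_items, return_tensors="pt", padding=True))
# image_features = model.get_image_features(**processor(images=image, return_tensors="pt"))

# After: one combined call; the model returns image-text similarity logits directly
inputs = processor(images=image, text=fashion_items, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=1)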

Problem 3: Windows Command Compatibility

Issue: The original tutorial used Unix/Linux commands (e.g., source venv/bin/activate), which don't work in Windows PowerShell.

Solution: Used Windows-compatible commands:

  • Virtual Environment Activation: Used direct Python execution via venv\Scripts\python.exe instead of activating the environment
  • Package Installation: venv\Scripts\python.exe -m pip install -r requirements.txt
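
Put together, the Windows-friendly version of the local test from step 10 looks like this (PowerShell):

# Create the virtual environment
python -m venv venv

# Install dependencies with the venv's interpreter directly (no activation step)
venv\Scripts\python.exe -m pip install -r requirements.txt

# Run the app with the same interpreter
venv\Scripts\python.exe app.py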

Final Model Choice: OpenAI CLIP

Selected Model: openai/clip-vit-base-patch32

Reasons for Selection:

  1. Stability: Well-tested and widely used in production environments
  2. Compatibility: Full compatibility with current PyTorch and transformers versions
  3. Performance: Excellent performance on image-text classification tasks
  4. Documentation: Extensive documentation and community support
  5. Simplicity: Straightforward implementation without custom code requirements

Trade-offs:

  • Specialization: Less specialized for fashion items compared to the original SigLIP model
  • Accuracy: May have slightly lower accuracy on fashion-specific classifications
  • Model Size: Standard CLIP model size vs. potentially optimized SigLIP

The final implementation successfully classifies fashion items supplied as image URLs into the categories 'top', 'trousers', and 'bottom'.