---
title: HelloWorld
emoji: 🌖
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 5.47.2
app_file: app.py
pinned: false
short_description: trial project for HF
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Fashion Item Classifier

A Gradio-based web application that classifies fashion items from image URLs using the CLIP (Contrastive Language-Image Pre-training) model.

## Steps to Create This Hugging Face Space

Based on the guide from [Marqo's blog post](https://www.marqo.ai/blog/how-to-create-a-hugging-face-space), here are the steps followed:

### 1. Create an Account

- Head to [Hugging Face](https://huggingface.co/) and create an account
- Follow the sign-up process with your details

### 2. Confirm Your Email Address

- Check your email to confirm your account
- This enables access to all Hugging Face features, including Spaces

### 3. Head to Spaces

- After confirming your email, log in and click **Spaces** in the main navigation bar
- This is where you manage and deploy your models and apps

### 4. Create a New Space

- Click **Create New Space**
- Configure the following settings:
  - **Owner**: Your Hugging Face account name
  - **Space name**: Choose a descriptive name (e.g., 'fashion-classifier')
  - **Short Description**: Optional description of your project
  - **License**: Optional
  - **Space SDK**: Select **Gradio**
  - **Gradio template**: Keep as **Blank**
  - **Space hardware**: Use **CPU basic • 2 CPU • 16 GB • FREE** for the free tier
  - **Privacy**: Select **Public** to share with others
- Click **Create Space**

### 5. Install Git

- If you don't have Git, download it from [Git's official page](https://git-scm.com/downloads)
- Install Git for your operating system
- Verify the installation by running: `git --version`

### 6. Clone the Hugging Face Space

```bash
git clone https://huggingface.co/spaces/your-username/your-space
```

Replace `your-username` and `your-space` with your actual username and Space name.

### 7. Open the Folder in VSCode

- Navigate to the cloned folder
- Open it in Visual Studio Code (VSCode)
- Initially, the folder will contain only the `.gitattributes` and `README.md` files

### 8. Create an app.py File

- Create a new file named `app.py` in VSCode
- This contains the main application code for your fashion item classifier (see the sketch after the next step)

### 9. Add Dependencies

- Create a `requirements.txt` file
- List all required Python packages for your application
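The README doesn't reproduce the contents of these two files, so the following is a minimal sketch of what they might look like, assuming the final model choice (`openai/clip-vit-base-patch32`) and the three categories described later in this README. First, a plausible `requirements.txt` (exact package set and versions are an assumption):

```text
gradio
torch
transformers
Pillow
requests
```

And a hedged sketch of `app.py`; the function name, prompt handling, and UI layout are illustrative, not the exact code used in this Space:

```python
import requests
import torch
import gradio as gr
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Load the standard CLIP model and processor (the final model choice below).
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Candidate labels, matching the categories described at the end of this README.
fashion_items = ["top", "trousers", "bottom"]

def classify_fashion_item(image_url: str) -> dict:
    """Download an image from a URL and score it against the fashion labels."""
    image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")
    # Combined text-image processing, as described in Problem 2 below.
    inputs = processor(images=image, text=fashion_items, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image holds the similarity of the image to each text label.
    probs = outputs.logits_per_image.softmax(dim=1).squeeze().tolist()
    return dict(zip(fashion_items, probs))

demo = gr.Interface(
    fn=classify_fashion_item,
    inputs=gr.Textbox(label="Image URL"),
    outputs=gr.Label(label="Predicted fashion item"),
    title="Fashion Item Classifier",
)

if __name__ == "__main__":
    demo.launch()
```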
### 10. Test Your App Locally

Create a virtual environment and test locally:

```bash
# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py
```

### 11. Upload to Hugging Face Hub

- Create a `.gitignore` file to exclude unnecessary files (such as `venv/`)
- Commit and push your code:

```bash
git add .
git commit -m "Initial commit"
git push origin main
```

## Development Challenges and Solutions

### Problem 1: PyTorch Meta Tensor Error

**Issue**: The original `Marqo/marqo-fashionSigLIP` model encountered a meta tensor error:

```
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
```

**Root Cause**: The error stems from compatibility issues between the custom SigLIP model and newer versions of PyTorch/transformers. The model was being initialized with meta tensors (tensors without actual data), and the `open_clip` library then tried to move them to a device using the deprecated `.to()` method.

**Attempted Solutions**:

1. **Environment variables**: Tried setting `PYTORCH_CUDA_ALLOC_CONF` and disabling meta-device initialization
2. **Model parameters**: Attempted `torch_dtype=torch.float32`, `device_map="cpu"`, and `low_cpu_mem_usage=False`
3. **Accelerate library**: Installed the `accelerate` library as suggested by the error messages
4. **PyTorch version downgrade**: Attempted to downgrade PyTorch to version 2.1.0 (not available for Windows)

**Final Solution**: Replaced the problematic model with the standard OpenAI CLIP model:

- **Original model**: `Marqo/marqo-fashionSigLIP` (custom SigLIP implementation)
- **Final model**: `openai/clip-vit-base-patch32` (standard CLIP model)

### Problem 2: Model Architecture Differences

**Issue**: The code structure needed to be adapted to the different model architecture.

**Solution**: Updated the prediction function to use CLIP's unified text-image processing (see the sketch at the end of this README):

- **Before**: Separate text preprocessing and feature extraction using `get_text_features()` and `get_image_features()`
- **After**: Combined processing using `processor(images=image, text=fashion_items)` and `model(**inputs)`

### Problem 3: Windows Command Compatibility

**Issue**: The original tutorial used Unix/Linux commands (`source venv/bin/activate`) that don't work in Windows PowerShell.

**Solution**: Used Windows-compatible commands:

- **Virtual environment activation**: Ran Python directly via `venv\Scripts\python.exe` instead of activating the environment
- **Package installation**: `venv\Scripts\python.exe -m pip install -r requirements.txt`

### Final Model Choice: OpenAI CLIP

**Selected Model**: `openai/clip-vit-base-patch32`

**Reasons for Selection**:

1. **Stability**: Well-tested and widely used in production environments
2. **Compatibility**: Full compatibility with current PyTorch and transformers versions
3. **Performance**: Strong performance on image-text classification tasks
4. **Documentation**: Extensive documentation and community support
5. **Simplicity**: Straightforward implementation with no custom code required

**Trade-offs**:

- **Specialization**: Less specialized for fashion items than the original SigLIP model
- **Accuracy**: May have slightly lower accuracy on fashion-specific classifications
- **Model size**: Standard CLIP model size vs. a potentially optimized SigLIP

The final implementation successfully classifies fashion items into the categories 'top', 'trousers', and 'bottom' from image URLs.
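For concreteness, here is a minimal sketch of the before/after change described in Problem 2. The "before" half is illustrative only: the original SigLIP code path is not reproduced in this README, so the exact calls, the placeholder image URL, and the raw-label prompts are assumptions.

```python
import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
fashion_items = ["top", "trousers", "bottom"]
# Hypothetical example input; any publicly reachable image URL would do.
image_url = "https://example.com/item.jpg"
image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")

with torch.no_grad():
    # Before (sketch): separate text/image feature extraction, then manual
    # cosine similarity between the normalized embeddings.
    text_features = model.get_text_features(
        **processor(text=fashion_items, return_tensors="pt", padding=True)
    )
    image_features = model.get_image_features(
        **processor(images=image, return_tensors="pt")
    )
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    probs_before = (image_features @ text_features.T).softmax(dim=-1)

    # After: CLIP's unified forward pass scores the image against all labels
    # in one call. Note that logits_per_image also applies CLIP's learned
    # logit scale, so the probabilities differ slightly from the manual version.
    inputs = processor(images=image, text=fashion_items, return_tensors="pt", padding=True)
    probs_after = model(**inputs).logits_per_image.softmax(dim=1)
```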