---
title: HelloWorld
emoji: 🌖
colorFrom: gray
colorTo: pink
sdk: gradio
sdk_version: 5.47.2
app_file: app.py
pinned: false
short_description: trial project for HF
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Fashion Item Classifier

A Gradio-based web application that classifies fashion items from image URLs using the CLIP (Contrastive Language-Image Pre-training) model.

## Steps to Create This Hugging Face Space

Based on the guide from [Marqo's blog post](https://www.marqo.ai/blog/how-to-create-a-hugging-face-space), here are the steps followed:

### 1. Create an Account
- Head to [Hugging Face](https://huggingface.co/) and create an account
- Follow the sign-up process with your details

### 2. Confirm Your Email Address
- Check your email to confirm your account
- This enables access to all Hugging Face features, including Spaces

### 3. Head to Spaces
- After confirming email, log in and click on **Spaces** in the main navigation bar
- This is where you manage and deploy your models and apps

### 4. Create a New Space
- Click **Create New Space**
- Configure the following settings:
  - **Owner**: Your Hugging Face account name
  - **Space name**: Choose a descriptive name (e.g., 'fashion-classifier')
  - **Short Description**: Optional description of your project
  - **License**: Optional
  - **Space SDK**: Select **Gradio**
  - **Gradio template**: Keep as **Blank**
  - **Space hardware**: Use **CPU basic • 2 CPU • 16 GB • FREE** for free tier
  - **Privacy**: Select **Public** to share with others
- Click **Create Space**

### 5. Install Git
- If you don't have Git, download it from [Git's official page](https://git-scm.com/downloads)
- Install Git for your operating system
- Verify installation by running: `git --version`

### 6. Clone the Hugging Face Space
```bash
git clone https://huggingface.co/spaces/your-username/your-space
```
Replace `your-username` and `your-space` with your actual username and space name.

### 7. Open the Folder in VSCode
- Navigate to the cloned folder
- Open it in Visual Studio Code (VSCode)
- Initially, you'll only have `.gitattributes` and `README.md` files

### 8. Create an app.py File
- Create a new file named `app.py` in VSCode
- This contains the main application code for your fashion item classifier
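
  A minimal `app.py` along these lines might look as follows. This is a sketch, not the exact code used in this Space: the label list, function names, and interface layout are illustrative, and it assumes `gradio`, `transformers`, `torch`, `Pillow`, and `requests` are installed.

  ```python
  # Sketch of a minimal app.py for a CLIP-based fashion classifier
  import gradio as gr
  import requests
  import torch
  from PIL import Image
  from transformers import CLIPModel, CLIPProcessor

  MODEL_NAME = "openai/clip-vit-base-patch32"
  FASHION_ITEMS = ["top", "trousers", "bottom"]

  _model = None
  _processor = None

  def _load():
      # Load the model lazily so importing the module stays fast
      global _model, _processor
      if _model is None:
          _model = CLIPModel.from_pretrained(MODEL_NAME)
          _processor = CLIPProcessor.from_pretrained(MODEL_NAME)
      return _model, _processor

  def probs_to_label_dict(labels, probs):
      # Map each label to its probability, the format gr.Label expects
      return {label: float(p) for label, p in zip(labels, probs)}

  def classify_url(image_url: str) -> dict:
      model, processor = _load()
      image = Image.open(requests.get(image_url, stream=True).raw).convert("RGB")
      # One processor call handles both the candidate labels and the image
      inputs = processor(text=FASHION_ITEMS, images=image,
                         return_tensors="pt", padding=True)
      with torch.no_grad():
          outputs = model(**inputs)
      probs = outputs.logits_per_image.softmax(dim=1).squeeze(0)
      return probs_to_label_dict(FASHION_ITEMS, probs)

  demo = gr.Interface(fn=classify_url,
                      inputs=gr.Textbox(label="Image URL"),
                      outputs=gr.Label(num_top_classes=3))

  if __name__ == "__main__":
      demo.launch()
  ```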

### 9. Add Dependencies
- Create a `requirements.txt` file
- List all required Python packages for your application
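
  For this project, the file might look something like this (versions left unpinned for brevity; the exact list depends on what your `app.py` imports):

  ```
  gradio
  torch
  transformers
  Pillow
  requests
  ```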

### 10. Test Your App Locally
Create a virtual environment and test locally:
```bash
# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py
```

### 11. Upload to Hugging Face Hub
- Create a `.gitignore` file to exclude unnecessary files (like `venv/`)
- Commit and push your code:
```bash
git add .
git commit -m "Initial commit"
git push origin main
```
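
A minimal `.gitignore` for this setup could be:

```
venv/
__pycache__/
*.pyc
```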

## Development Challenges and Solutions

### Problem 1: PyTorch Meta Tensor Error
**Issue**: The original `Marqo/marqo-fashionSigLIP` model encountered a meta tensor error:
```
NotImplementedError: Cannot copy out of meta tensor; no data! 
Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() 
when moving module from meta to a different device.
```

**Root Cause**: This error stems from a compatibility issue between the custom SigLIP model and newer versions of PyTorch/transformers. The model was being initialized with meta tensors (placeholder tensors without actual data), and the `open_clip` library then tried to move them to a device with the standard `.to()` method, which cannot copy data out of a meta tensor (hence the suggestion to use `to_empty()` instead).

**Attempted Solutions**:
1. **Environment Variables**: Tried setting `PYTORCH_CUDA_ALLOC_CONF` and disabling meta device initialization
2. **Model Parameters**: Attempted using `torch_dtype=torch.float32`, `device_map="cpu"`, and `low_cpu_mem_usage=False`
3. **Accelerate Library**: Installed the `accelerate` library, as suggested by the error messages
4. **PyTorch Version Downgrade**: Attempted to downgrade PyTorch to version 2.1.0 (not available for Windows)

**Final Solution**: Replaced the problematic model with the standard OpenAI CLIP model:
- **Original Model**: `Marqo/marqo-fashionSigLIP` (custom SigLIP implementation)
- **Final Model**: `openai/clip-vit-base-patch32` (standard CLIP model)

### Problem 2: Model Architecture Differences
**Issue**: The code structure needed to be adapted for the different model architecture.

**Solution**: Updated the prediction function to use CLIP's unified text-image processing:
- **Before**: Separate text preprocessing and feature extraction using `get_text_features()` and `get_image_features()`
- **After**: Combined processing using `processor(images=image, text=fashion_items)` and `model(**inputs)`
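
The change can be sketched roughly as follows (variable and function names are illustrative; `model` and `processor` stand for the CLIP objects loaded with `from_pretrained`):

```python
# Before/after sketch of the prediction path; assumes transformers and
# torch are installed. Names here are illustrative, not the exact code.
import torch
from transformers import CLIPModel, CLIPProcessor

def predict(image, labels, model: CLIPModel, processor: CLIPProcessor):
    # Before (SigLIP-style): separate preprocessing and feature extraction,
    # roughly:
    #   text_feats  = model.get_text_features(**text_inputs)
    #   image_feats = model.get_image_features(**image_inputs)
    # After (CLIP): one processor call covers text and image together
    inputs = processor(text=labels, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image holds the image-to-text similarity scores
    probs = outputs.logits_per_image.softmax(dim=1).squeeze(0)
    return {label: float(p) for label, p in zip(labels, probs)}
```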

### Problem 3: Windows Command Compatibility
**Issue**: The original tutorial used Unix/Linux commands (`source venv/bin/activate`) which don't work in Windows PowerShell.

**Solution**: Used Windows-compatible commands:
- **Virtual Environment Activation**: Used direct Python execution via `venv\Scripts\python.exe` instead of activating the environment
- **Package Installation**: `venv\Scripts\python.exe -m pip install -r requirements.txt`

### Final Model Choice: OpenAI CLIP
**Selected Model**: `openai/clip-vit-base-patch32`

**Reasons for Selection**:
1. **Stability**: Well-tested and widely used in production environments
2. **Compatibility**: Full compatibility with current PyTorch and transformers versions
3. **Performance**: Excellent performance on image-text classification tasks
4. **Documentation**: Extensive documentation and community support
5. **Simplicity**: Straightforward implementation without custom code requirements

**Trade-offs**:
- **Specialization**: Less specialized for fashion items compared to the original SigLIP model
- **Accuracy**: May have slightly lower accuracy on fashion-specific classifications
- **Model Size**: Standard CLIP model size vs. potentially optimized SigLIP

The final implementation successfully classifies fashion items from image URLs into the categories 'top', 'trousers', and 'bottom'.