saadmann18 committed on
Commit 97e38b1 · 1 Parent(s): 980bcf6

initial commit

Files changed (4):
  1. .gitignore +1 -0
  2. README.md +138 -0
  3. app.py +47 -0
  4. requirements.txt +9 -0
.gitignore ADDED
@@ -0,0 +1 @@
venv
README.md CHANGED
@@ -11,3 +11,141 @@ short_description: https://www.marqo.ai/blog/how-to-create-a-hugging-face-space
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Fashion Item Classifier

A Gradio-based web application that classifies fashion items from image URLs using the CLIP (Contrastive Language-Image Pre-training) model.

## Steps to Create This Hugging Face Space

Based on the guide from [Marqo's blog post](https://www.marqo.ai/blog/how-to-create-a-hugging-face-space), here are the steps that were followed:

### 1. Create an Account
- Head to [Hugging Face](https://huggingface.co/) and create an account
- Follow the sign-up process with your details

### 2. Confirm Your Email Address
- Check your email to confirm your account
- This enables access to all Hugging Face features, including Spaces

### 3. Head to Spaces
- After confirming your email, log in and click on **Spaces** in the main navigation bar
- This is where you manage and deploy your models and apps

### 4. Create a New Space
- Click **Create New Space**
- Configure the following settings:
  - **Owner**: Your Hugging Face account name
  - **Space name**: Choose a descriptive name (e.g., 'fashion-classifier')
  - **Short Description**: Optional description of your project
  - **License**: Optional
  - **Space SDK**: Select **Gradio**
  - **Gradio template**: Keep as **Blank**
  - **Space hardware**: Use **CPU basic • 2 CPU • 16 GB • FREE** for the free tier
  - **Privacy**: Select **Public** to share with others
- Click **Create Space**

### 5. Install Git
- If you don't have Git, download it from [Git's official page](https://git-scm.com/downloads)
- Install it for your operating system
- Verify the installation by running: `git --version`

### 6. Clone the Hugging Face Space
```bash
git clone https://huggingface.co/spaces/your-username/your-space
```
Replace `your-username` and `your-space` with your actual username and space name.

### 7. Open the Folder in VSCode
- Navigate to the cloned folder
- Open it in Visual Studio Code (VSCode)
- Initially, you'll only have the `.gitattributes` and `README.md` files

### 8. Create an app.py File
- Create a new file named `app.py` in VSCode
- This contains the main application code for your fashion item classifier; a minimal placeholder sketch follows, and the full version added in this commit appears later in the diff

### 9. Add Dependencies
- Create a `requirements.txt` file
- List all required Python packages for your application, as in the example below
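
For reference, a minimal `requirements.txt` for this app might look like the following (the full file added in this commit is shown at the end of the diff):

```
transformers
torch
requests
Pillow
gradio
```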

### 10. Test Your App Locally
Create a virtual environment and test locally:
```bash
# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py
```

### 11. Upload to Hugging Face Hub
- Create a `.gitignore` file to exclude unnecessary files (like `venv/`); its one-line content in this commit is shown after the commands below
- Commit and push your code:
```bash
git add .
git commit -m "Initial commit"
git push origin main
```
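
The `.gitignore` added in this commit is a single line:

```
venv
```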

## Development Challenges and Solutions

### Problem 1: PyTorch Meta Tensor Error
**Issue**: The original `Marqo/marqo-fashionSigLIP` model encountered a meta tensor error:
```
NotImplementedError: Cannot copy out of meta tensor; no data!
Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to()
when moving module from meta to a different device.
```

**Root Cause**: This error occurred due to compatibility issues between the custom SigLIP model and newer versions of PyTorch/transformers. The model was being initialized with meta tensors (tensors with shape and dtype but no actual data), and the `open_clip` loading path then tried to move them to a device with `.to()`, which cannot copy data out of meta tensors.
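The failure mode can be reproduced in isolation; a minimal sketch:

```python
import torch

# A module created on the meta device has parameters without storage.
# Moving it with .to() raises the same NotImplementedError seen above;
# torch.nn.Module.to_empty() is the supported way to materialize it.
layer = torch.nn.Linear(2, 2, device="meta")
layer.to("cpu")  # NotImplementedError: Cannot copy out of meta tensor; no data!
```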

**Attempted Solutions** (a loading sketch follows the list):
1. **Environment Variables**: Tried setting `PYTORCH_CUDA_ALLOC_CONF` and disabling meta-device initialization
2. **Model Parameters**: Attempted using `torch_dtype=torch.float32`, `device_map="cpu"`, and `low_cpu_mem_usage=False`
3. **Accelerate Library**: Installed the `accelerate` library as requested by the error messages
4. **PyTorch Version Downgrade**: Attempted to downgrade PyTorch to version 2.1.0 (not available for Windows)
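
For illustration, attempt 2 amounted to something like the following sketch (`trust_remote_code=True` is an assumption here, since the model ships custom code; the exact combination of parameters tried may have differed):

```python
import torch
from transformers import AutoModel

# Sketch of the attempted workaround parameters; none of these
# resolved the meta tensor error in this project
model = AutoModel.from_pretrained(
    "Marqo/marqo-fashionSigLIP",
    trust_remote_code=True,     # assumption: model uses custom modeling code
    torch_dtype=torch.float32,  # force full-precision weights
    device_map="cpu",           # pin everything to the CPU
    low_cpu_mem_usage=False,    # avoid meta-tensor lazy initialization
)
```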

**Final Solution**: Replaced the problematic model with the standard OpenAI CLIP model:
- **Original Model**: `Marqo/marqo-fashionSigLIP` (custom SigLIP implementation)
- **Final Model**: `openai/clip-vit-base-patch32` (standard CLIP model)

### Problem 2: Model Architecture Differences
**Issue**: The code structure needed to be adapted for the different model architecture.

**Solution**: Updated the prediction function to use CLIP's unified text-image processing (sketched below):
- **Before**: Separate text preprocessing and feature extraction using `get_text_features()` and `get_image_features()`
- **After**: Combined processing using `processor(images=image, text=fashion_items)` and `model(**inputs)`
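
Roughly, the change looks like this (a sketch; the image path and label list are illustrative):

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
labels = ["top", "trousers", "bottom"]
image = Image.open("example.jpg")  # hypothetical local image

# Before (SigLIP attempt): separate feature extraction, e.g.
#   text_features = model.get_text_features(**text_inputs)
#   image_features = model.get_image_features(**image_inputs)
# followed by a manual similarity + softmax step.

# After (standard CLIP): one processor call and one forward pass
inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)  # image-to-text probabilities
print(dict(zip(labels, probs[0].tolist())))
```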

### Problem 3: Windows Command Compatibility
**Issue**: The original tutorial used Unix/Linux commands (`source venv/bin/activate`) which don't work in Windows PowerShell.

**Solution**: Used Windows-compatible commands (see the snippet below):
- **Virtual Environment Activation**: Used direct Python execution via `venv\Scripts\python.exe` instead of activating the environment
- **Package Installation**: `venv\Scripts\python.exe -m pip install -r requirements.txt`
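
Concretely, the Windows workflow runs the venv's interpreter directly (the second command is the direct-execution equivalent of `python app.py`):

```bash
# Run the venv's Python directly; no activation step needed
venv\Scripts\python.exe -m pip install -r requirements.txt
venv\Scripts\python.exe app.py
```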

### Final Model Choice: OpenAI CLIP
**Selected Model**: `openai/clip-vit-base-patch32`

**Reasons for Selection**:
1. **Stability**: Well-tested and widely used in production environments
2. **Compatibility**: Full compatibility with current PyTorch and transformers versions
3. **Performance**: Strong performance on image-text classification tasks
4. **Documentation**: Extensive documentation and community support
5. **Simplicity**: Straightforward implementation without custom code requirements

**Trade-offs**:
- **Specialization**: Less specialized for fashion items than the original SigLIP model
- **Accuracy**: May have slightly lower accuracy on fashion-specific classifications
- **Model Size**: Standard CLIP model size vs. a potentially optimized SigLIP

The final implementation successfully classifies fashion items from image URLs into the categories 'top', 'trousers', and 'bottom'.
app.py ADDED
@@ -0,0 +1,47 @@
import gradio as gr
from transformers import CLIPProcessor, CLIPModel
import torch
import requests
from PIL import Image
from io import BytesIO

fashion_items = ['top', 'trousers', 'bottom']

# Load model and processor - using the standard CLIP model instead of Marqo's SigLIP
model_name = "openai/clip-vit-base-patch32"
model = CLIPModel.from_pretrained(model_name)
processor = CLIPProcessor.from_pretrained(model_name)

# CLIP processes text and images together, so no separate text preprocessing is needed

# Prediction function
def predict_from_url(url):
    # Check if the URL is empty
    if not url:
        return {"Error": "Please input a URL"}

    try:
        # Fetch the image with a timeout so a dead URL can't hang the app
        image = Image.open(BytesIO(requests.get(url, timeout=10).content))
    except Exception as e:
        return {"Error": f"Failed to load image: {str(e)}"}

    # Batch the candidate labels together with the image in a single processor call
    inputs = processor(images=image, text=fashion_items, return_tensors="pt", padding=True)

    with torch.no_grad():
        outputs = model(**inputs)
        logits_per_image = outputs.logits_per_image
        text_probs = logits_per_image.softmax(dim=-1)

    # Map each label to its probability for Gradio's Label component
    return {fashion_items[i]: float(text_probs[0, i]) for i in range(len(fashion_items))}

# Gradio interface
demo = gr.Interface(
    fn=predict_from_url,
    inputs=gr.Textbox(label="Enter Image URL"),
    outputs=gr.Label(label="Classification Results"),
    title="Fashion Item Classifier",
    allow_flagging="never"
)

# Launch the interface
demo.launch()
requirements.txt ADDED
@@ -0,0 +1,9 @@
transformers
torch
requests
Pillow
open_clip_torch
ftfy

# This is only needed for local deployment
gradio