title: OpenCLIP Waste Classifier
emoji: ♻️
colorFrom: green
colorTo: blue
sdk: docker
sdk_version: 20.10.7
app_file: app.py
pinned: false

🗂️ AI Waste Classification System

A finetuned CLIP model for waste classification, reaching 91.33% validation accuracy across 30 waste categories.

🚀 Proper ML Model Hosting on Hugging Face

What NOT to do:

  • Don't use Git LFS for Hugging Face Spaces
  • Don't commit large model files to git repositories
  • Don't use traditional git hosting for ML models

The RIGHT way:

  1. Host models on Hugging Face Model Hub
  2. Download models at runtime in your Space
  3. Use huggingface_hub library for model management
  4. Separate code (git) from models (HF Hub)
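
As a minimal sketch of steps 2 and 3, the snippet below fetches a checkpoint from the Hub the first time it is needed and reuses the cached copy afterwards. It assumes the weights are stored as a single file. The filename `model.pt` and the helper name `fetch_checkpoint` are hypothetical and not taken from this repository; `hf_cache` matches the cache directory mentioned under Troubleshooting.

```python
def fetch_checkpoint(repo_id, filename="model.pt", cache_dir="hf_cache", downloader=None):
    """Return a local path to the checkpoint, downloading it on first use.

    `filename` is a hypothetical name; use whatever file was uploaded to the repo.
    `downloader` exists only so the function can be exercised without network access.
    """
    if downloader is None:
        # Imported lazily so this module still imports when huggingface_hub is absent.
        from huggingface_hub import hf_hub_download
        downloader = hf_hub_download
    # hf_hub_download reuses the copy in cache_dir when the file is already there.
    return downloader(repo_id=repo_id, filename=filename, cache_dir=cache_dir)


# Example (requires network access and an existing repo):
#   checkpoint_path = fetch_checkpoint("your-username/waste-clip-finetuned")
```

Because the weights never enter the git history, the Space repository stays small and the model can be updated on the Hub without touching the app code.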

📋 Quick Start

1. Setup Environment

pip install -r requirements.txt

2. Download Dataset

python download_dataset.py

3. Finetune Model

python finetune_clip.py --epochs 15 --batch_size 16 --lr 5e-6

4. Upload to Hugging Face Hub

# Login to Hugging Face
huggingface-cli login

# Upload your finetuned model
python upload_to_hf.py --repo_id "your-username/waste-clip-finetuned"
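
For orientation, an upload script of this kind usually comes down to two `huggingface_hub` calls: create the model repo if it does not exist, then upload the checkpoint file. The sketch below is not the actual `upload_to_hf.py`; the function names, the `--checkpoint` flag, and the file names are assumptions.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Upload a finetuned checkpoint to the Hub")
    parser.add_argument("--repo_id", required=True)
    # Hypothetical flag and default; the real script may locate the checkpoint differently.
    parser.add_argument("--checkpoint", default="finetuned_model.pt")
    return parser.parse_args(argv)


def upload_checkpoint(api, repo_id, checkpoint_path, path_in_repo="model.pt"):
    """Create the model repo if needed, then upload one checkpoint file."""
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    api.upload_file(
        path_or_fileobj=checkpoint_path,
        path_in_repo=path_in_repo,
        repo_id=repo_id,
        repo_type="model",
    )
    return f"https://huggingface.co/{repo_id}"


# Typical use after `huggingface-cli login`:
#   from huggingface_hub import HfApi
#   args = parse_args()
#   upload_checkpoint(HfApi(), args.repo_id, args.checkpoint)
```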

5. Update App Configuration

# In app.py, update the model ID:
HF_MODEL_ID = "your-username/waste-clip-finetuned"
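
If you prefer not to edit `app.py` for every deployment, one option is to read the model ID from an environment variable and fall back to the hard-coded value. This is a sketch, not how `app.py` is known to work; using `HF_MODEL_ID` as an environment variable name is an assumption.

```python
import os

DEFAULT_MODEL_ID = "your-username/waste-clip-finetuned"


def get_model_id(env=None):
    """Return the model ID from the environment, or the default when unset or empty."""
    env = os.environ if env is None else env
    return env.get("HF_MODEL_ID") or DEFAULT_MODEL_ID


HF_MODEL_ID = get_model_id()
```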

6. Deploy to Hugging Face Spaces

git add .
git commit -m "Add waste classification app"
git push origin main

🏗️ Architecture

Model Details

  • Base Model: CLIP ViT-B/16 (OpenCLIP implementation)
  • Pretrained: LAION-2B (34B samples seen)
  • Finetuned: 30 waste categories
  • Accuracy: 91.33% validation accuracy
  • Size: ~1.2GB

Classes (30 Categories)

aerosol_cans, aluminum_food_cans, aluminum_soda_cans, 
cardboard_boxes, cardboard_packaging, clothing, 
coffee_grounds, disposable_plastic_cups, eggshells, 
food_waste, glass_beverage_bottles, glass_cosmetic_containers, 
glass_food_jars, magazines, newspaper, office_paper, 
paper_cups, plastic_bottle_caps, plastic_bottles, 
plastic_clothing_hangers, plastic_containers, plastic_cutlery, 
plastic_shopping_bags, shoes, steel_food_cans, styrofoam_cups, 
styrofoam_food_containers, tea_bags, tissues, wooden_utensils
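
In code, the category list is typically kept as an ordered sequence so that a model output index maps back to a label. The sketch below assumes alphabetical ordering (the order shown above), which is what folder-based dataset loaders commonly produce. The index order actually used in training is not stated here, so check it against the training script before relying on it. The names `CLASS_TO_INDEX` and `display_name` are illustrative.

```python
CLASS_NAMES = [
    "aerosol_cans", "aluminum_food_cans", "aluminum_soda_cans",
    "cardboard_boxes", "cardboard_packaging", "clothing",
    "coffee_grounds", "disposable_plastic_cups", "eggshells",
    "food_waste", "glass_beverage_bottles", "glass_cosmetic_containers",
    "glass_food_jars", "magazines", "newspaper", "office_paper",
    "paper_cups", "plastic_bottle_caps", "plastic_bottles",
    "plastic_clothing_hangers", "plastic_containers", "plastic_cutlery",
    "plastic_shopping_bags", "shoes", "steel_food_cans", "styrofoam_cups",
    "styrofoam_food_containers", "tea_bags", "tissues", "wooden_utensils",
]

# Assumed mapping: index i in the model output corresponds to CLASS_NAMES[i].
CLASS_TO_INDEX = {name: index for index, name in enumerate(CLASS_NAMES)}


def display_name(label):
    """Turn a label such as 'glass_food_jars' into 'glass food jars' for the UI."""
    return label.replace("_", " ")
```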

🤗 Hugging Face Integration

Model Loading Priority:

  1. Local file (for development)
  2. Hugging Face Hub (production)
  3. Pretrained fallback (if finetuned unavailable)
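
The priority order above can be sketched as a chain of loaders tried in sequence, where a failure at one level falls through to the next. This is an illustration of the idea, not the code in `finetuned_classifier.py`; the function name `load_with_fallback` and the loader callables are hypothetical.

```python
def load_with_fallback(loaders):
    """Try (name, loader) pairs in order and return (name, model) for the first success.

    Each loader is a zero-argument callable that returns a model or raises.
    """
    errors = []
    for name, loader in loaders:
        try:
            return name, loader()
        except Exception as exc:  # deliberately broad: any failure means "try the next source"
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all model sources failed: " + "; ".join(errors))


# Example wiring (the three loader functions are placeholders):
#   source, model = load_with_fallback([
#       ("local", load_local_checkpoint),
#       ("hub", load_from_hub),
#       ("pretrained", load_pretrained_base),
#   ])
```

Returning the source name alongside the model makes it easy to report which level was actually used, for example in the UI or in logs.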

Example Usage:

from clip_waste_classifier.finetuned_classifier import FinetunedCLIPWasteClassifier

# Load from Hugging Face Hub
classifier = FinetunedCLIPWasteClassifier(
    hf_model_id="your-username/waste-clip-finetuned"
)

# Classify image
result = classifier.classify_image("path/to/image.jpg")
print(f"Predicted: {result['predicted_item']} ({result['best_confidence']:.3f})")

📊 Dataset


🔧 Development Setup

Project Structure

mc-waste/
├── clip_waste_classifier/
│   ├── finetuned_classifier.py     # Main classifier with HF integration
│   └── openclip_classifier.py      # Pretrained fallback
├── app.py                          # Gradio interface
├── finetune_clip.py               # Training script
├── upload_to_hf.py                # HF upload utility
├── database.csv                    # Disposal instructions
├── requirements.txt                # Dependencies
└── README.md                      # This file

Key Features

  • Smart model loading (Local → HF Hub → Fallback)
  • Automatic failover to pretrained if finetuned unavailable
  • Real-time classification with confidence scores
  • Disposal instructions from curated database
  • Modern Gradio UI with detailed results
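
The disposal-instruction lookup can be pictured as a dictionary keyed by the predicted class. The sketch below assumes `database.csv` has one row per item with columns named `item` and `instructions`; both column names and both helper names are hypothetical, since the file's schema is not shown here.

```python
import csv


def load_disposal_instructions(lines, item_column="item", text_column="instructions"):
    """Build {item: instructions} from CSV lines (an open file or a list of strings)."""
    reader = csv.DictReader(lines)
    return {row[item_column].strip(): row[text_column].strip() for row in reader}


def disposal_for(result, instructions, default="No disposal instructions found."):
    """Look up instructions for a classifier result dict with a 'predicted_item' key."""
    return instructions.get(result["predicted_item"], default)


# Example with the real file (column names must match the assumptions above):
#   with open("database.csv", newline="", encoding="utf-8") as f:
#       table = load_disposal_instructions(f)
```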

🚀 Deployment Options

Hugging Face Spaces (Recommended)

  1. Upload model to HF Model Hub
  2. Create Space with this code
  3. Set HF_MODEL_ID in app.py
  4. Deploy automatically

Local Development

python app.py
# Visit: http://localhost:7860

Docker Deployment

FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 7860
CMD ["python", "app.py"]
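
One detail worth checking when running in a container: Gradio binds to 127.0.0.1 by default, which is not reachable through the published port. The sketch below builds launch settings that listen on all interfaces. It assumes `app.py` ends with a `launch(...)` call on a Gradio app object named `demo`; that name and the helper `launch_kwargs` are assumptions.

```python
import os


def launch_kwargs(env=None):
    """Return keyword arguments for Gradio's launch() suitable for a container."""
    env = os.environ if env is None else env
    return {
        # 0.0.0.0 listens on all interfaces so the mapped port is reachable from the host.
        "server_name": env.get("GRADIO_SERVER_NAME", "0.0.0.0"),
        "server_port": int(env.get("GRADIO_SERVER_PORT", "7860")),
    }


# In app.py (assumed layout):
#   demo.launch(**launch_kwargs())
```

With this in place, the container can be started with the port published, for example `docker run -p 7860:7860 <image>`.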

📈 Performance

| Metric              | Value  |
|---------------------|--------|
| Validation Accuracy | 91.33% |
| Training Epochs     | 15     |
| Batch Size          | 16     |
| Learning Rate       | 5e-6   |
| Model Size          | 1.2GB  |
| Inference Time      | ~200ms |
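
Inference time depends heavily on hardware, so it is worth measuring on your own machine. The helper below is a generic timing sketch, not the procedure used to obtain the ~200ms figure; `classify` stands for any callable such as `classifier.classify_image`, and the name `mean_latency_ms` is illustrative.

```python
import time


def mean_latency_ms(classify, image_path, warmup=1, runs=5):
    """Average wall-clock latency of classify(image_path) in milliseconds."""
    for _ in range(warmup):
        classify(image_path)  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(runs):
        classify(image_path)
    return (time.perf_counter() - start) * 1000.0 / runs


# Example:
#   print(f"{mean_latency_ms(classifier.classify_image, 'path/to/image.jpg'):.0f} ms")
```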

🛠️ Troubleshooting

Model Loading Issues

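from clip_waste_classifier.finetuned_classifier import FinetunedCLIPWasteClassifier

# Check model availability and see which model was actually loaded
classifier = FinetunedCLIPWasteClassifier(hf_model_id="your-model-id")
info = classifier.get_model_info()
print(f"Model type: {info['model_type']}")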

Gradio Import Error

pip install gradio==3.50.2

Memory Issues

  • Use CPU-only inference
  • Reduce batch size for training
  • Clear cache: rm -rf hf_cache/

🌍 Environmental Impact

This system helps improve recycling efficiency by:

  • ♻️ Accurate waste classification
  • 📋 Proper disposal instructions
  • 🌱 Reducing contamination in recycling streams
  • 📊 Data-driven waste management

📄 License

MIT License - see LICENSE for details.


🤝 Contributing

  1. Fork the repository
  2. Create feature branch (git checkout -b feature/improvement)
  3. Commit changes (git commit -am 'Add improvement')
  4. Push to branch (git push origin feature/improvement)
  5. Create Pull Request

📧 Contact

For questions about model hosting, deployment, or collaboration:


🎯 Ready to deploy? Follow the Hugging Face model hosting guide above!