---
title: OpenCLIP Waste Classifier
emoji: "♻️"
colorFrom: green
colorTo: blue
sdk: docker
sdk_version: "20.10.7"
app_file: app.py
pinned: false
---
# 🗂️ AI Waste Classification System
A **finetuned CLIP model** for waste classification that reaches **91.33% validation accuracy** across 30 waste categories.
## 🚀 **Proper ML Model Hosting on Hugging Face**
### ❌ **What NOT to do:**
- **Don't use Git LFS** for Hugging Face Spaces
- **Don't commit large model files** to git repositories
- **Don't use traditional git hosting** for ML models
### ✅ **The RIGHT way:**
1. **Host models on Hugging Face Model Hub**
2. **Download models at runtime** in your Space
3. **Use `huggingface_hub` library** for model management
4. **Separate code (git) from models (HF Hub)**
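
As a rough illustration of step 2, the sketch below downloads a checkpoint at startup with `hf_hub_download` from `huggingface_hub`. The checkpoint file name `finetuned_clip.pt` and the `download` parameter are assumptions made for this example and are not taken from the project code; the repo ID is the placeholder used elsewhere in this README.

```python
import os
from pathlib import Path
from typing import Callable, Optional

# Placeholder repo ID from this README; override it with an environment variable.
HF_MODEL_ID = os.environ.get("HF_MODEL_ID", "your-username/waste-clip-finetuned")
CHECKPOINT_NAME = "finetuned_clip.pt"  # hypothetical file name inside the model repo
CACHE_DIR = "hf_cache"                 # cache directory, as in the Troubleshooting section


def fetch_checkpoint(
    repo_id: str = HF_MODEL_ID,
    filename: str = CHECKPOINT_NAME,
    cache_dir: str = CACHE_DIR,
    download: Optional[Callable[..., str]] = None,
) -> Path:
    """Return a local path to the checkpoint, downloading it on first use."""
    if download is None:
        # Imported lazily so this module still loads without huggingface_hub installed.
        from huggingface_hub import hf_hub_download
        download = hf_hub_download
    # hf_hub_download returns the path of the cached file and reuses it on later calls.
    return Path(download(repo_id=repo_id, filename=filename, cache_dir=cache_dir))
```

Calling `fetch_checkpoint()` once when the Space starts keeps the weights out of the git repository; the `download` argument exists only so the function can be exercised without network access.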
---
## 📋 **Quick Start**
### **1. Setup Environment**
```bash
pip install -r requirements.txt
```
### **2. Download Dataset**
```bash
python download_dataset.py
```
### **3. Finetune Model**
```bash
python finetune_clip.py --epochs 15 --batch_size 16 --lr 5e-6
```
### **4. Upload to Hugging Face Hub**
```bash
# Login to Hugging Face
huggingface-cli login
# Upload your finetuned model
python upload_to_hf.py --repo_id "your-username/waste-clip-finetuned"
```
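
For orientation, here is a minimal sketch of what an upload script along the lines of `upload_to_hf.py` could look like. It is not the project's actual script: the `--checkpoint` and `--private` flags and the default file name are hypothetical, while `HfApi.create_repo` and `HfApi.upload_file` are regular `huggingface_hub` calls.

```python
import argparse
import os
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse command-line flags; only --repo_id appears in this README."""
    parser = argparse.ArgumentParser(description="Upload a finetuned checkpoint to the Hub")
    parser.add_argument("--repo_id", required=True, help='e.g. "your-username/waste-clip-finetuned"')
    parser.add_argument("--checkpoint", default="finetuned_clip.pt")  # hypothetical default
    parser.add_argument("--private", action="store_true")             # hypothetical flag
    return parser.parse_args(argv)


def upload(repo_id: str, checkpoint: str, private: bool = False) -> None:
    """Create the model repo if needed and push a single checkpoint file."""
    # Imported lazily so argument parsing works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()  # uses the token stored by `huggingface-cli login`
    api.create_repo(repo_id=repo_id, repo_type="model", private=private, exist_ok=True)
    api.upload_file(
        path_or_fileobj=checkpoint,
        path_in_repo=os.path.basename(checkpoint),
        repo_id=repo_id,
        repo_type="model",
    )
```

A script built this way would end with `args = parse_args()` followed by `upload(args.repo_id, args.checkpoint, args.private)`.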
### **5. Update App Configuration**
```python
# In app.py, update the model ID:
HF_MODEL_ID = "your-username/waste-clip-finetuned"
```
### **6. Deploy to Hugging Face Spaces**
```bash
git add .
git commit -m "Add waste classification app"
git push origin main
```
---
## 🏗️ **Architecture**
### **Model Details**
- **Base Model:** CLIP ViT-B/16 (OpenCLIP implementation)
- **Pretrained:** LAION-2B (34B samples seen during pretraining)
- **Finetuned:** 30 waste categories
- **Accuracy:** 91.33% validation accuracy
- **Size:** ~1.2GB
### **Classes (30 Categories)**
```
aerosol_cans, aluminum_food_cans, aluminum_soda_cans,
cardboard_boxes, cardboard_packaging, clothing,
coffee_grounds, disposable_plastic_cups, eggshells,
food_waste, glass_beverage_bottles, glass_cosmetic_containers,
glass_food_jars, magazines, newspaper, office_paper,
paper_cups, plastic_bottle_caps, plastic_bottles,
plastic_clothing_hangers, plastic_containers, plastic_cutlery,
plastic_shopping_bags, shoes, steel_food_cans, styrofoam_cups,
styrofoam_food_containers, tea_bags, tissues, wooden_utensils
```
---
## 🤗 **Hugging Face Integration**
### **Model Loading Priority:**
1. **Local file** (for development)
2. **Hugging Face Hub** (production)
3. **Pretrained fallback** (if finetuned unavailable)
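
The priority order above can be sketched as a small helper that tries each source in turn and moves on when one fails. This is an illustrative pattern only, not the logic inside `FinetunedCLIPWasteClassifier`; `load_with_fallback` and the loader callables are hypothetical names.

```python
from typing import Callable, List, Tuple

Loader = Callable[[], object]


def load_with_fallback(sources: List[Tuple[str, Loader]]) -> Tuple[str, object]:
    """Try each (name, loader) pair in order and return the first that succeeds."""
    errors = []
    for name, loader in sources:
        try:
            return name, loader()
        except Exception as exc:  # any failure moves on to the next source
            errors.append(f"{name}: {exc}")
    raise RuntimeError("No model source succeeded: " + "; ".join(errors))


def missing_local_file() -> object:
    """Stand-in for a local loader whose checkpoint file is absent."""
    raise FileNotFoundError("no local checkpoint")


# Stand-in loaders; real ones would read a file, query the Hub, or load base weights.
source_name, model = load_with_fallback([
    ("local", missing_local_file),
    ("hub", lambda: "finetuned-from-hub"),
    ("pretrained", lambda: "pretrained-fallback"),
])
print(source_name)
```

Returning the source name alongside the model makes it easy to report which path was taken.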
### **Example Usage:**
```python
from clip_waste_classifier.finetuned_classifier import FinetunedCLIPWasteClassifier
# Load from Hugging Face Hub
classifier = FinetunedCLIPWasteClassifier(
    hf_model_id="your-username/waste-clip-finetuned"
)
# Classify image
result = classifier.classify_image("path/to/image.jpg")
print(f"Predicted: {result['predicted_item']} ({result['best_confidence']:.3f})")
```
---
## 📊 **Dataset**
- **Source:** [Kaggle - Recyclable and Household Waste Classification](https://www.kaggle.com/datasets/alistairking/recyclable-and-household-waste-classification)
- **Images:** 15,000 total (500 per category)
- **Split:** 70% train, 10% validation, 20% test
- **Types:** 250 synthetic + 250 real-world images per category
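
To make the split concrete: with 500 images per category, a 70/10/20 split gives 350 training, 50 validation, and 100 test images per category (10,500 / 1,500 / 3,000 overall). The sketch below shows one way to produce such a per-category split; it is an assumption about how the split could be done, not the procedure used by `finetune_clip.py`, and `split_category` is a hypothetical helper.

```python
import random
from typing import Dict, List, Sequence


def split_category(
    items: Sequence[str],
    train_pct: int = 70,
    val_pct: int = 10,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Shuffle one category's items and cut them into train/val/test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],  # the remainder becomes the test set
    }


# Hypothetical file names standing in for one category's 500 images.
images = [f"plastic_bottles/img_{i:03d}.png" for i in range(500)]
parts = split_category(images)
print({name: len(files) for name, files in parts.items()})
```

Splitting each category separately keeps all 30 classes equally represented in every subset.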
---
## 🔧 **Development Setup**
### **Project Structure**
```
mc-waste/
├── clip_waste_classifier/
│ ├── finetuned_classifier.py # Main classifier with HF integration
│ └── openclip_classifier.py # Pretrained fallback
├── app.py # Gradio interface
├── finetune_clip.py # Training script
├── upload_to_hf.py # HF upload utility
├── database.csv # Disposal instructions
├── requirements.txt # Dependencies
└── README.md # This file
```
### **Key Features**
- **Smart model loading** (Local → HF Hub → Fallback)
- **Automatic failover** to the pretrained model if the finetuned one is unavailable
- **Real-time classification** with confidence scores
- **Disposal instructions** from a curated database
- **Modern Gradio UI** with detailed results
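
As an illustration of the disposal-instructions feature, the sketch below maps a predicted class name to a row of a CSV file. The layout of `database.csv` is not documented here, so the column names `item` and `instructions` and the two sample rows are made-up placeholders, not real entries.

```python
import csv
import io
from typing import Dict, TextIO

# Placeholder content; the real database.csv may use different columns and text.
SAMPLE_CSV = """item,instructions
plastic_bottles,Example instruction for plastic bottles
food_waste,Example instruction for food waste
"""


def load_instructions(handle: TextIO) -> Dict[str, str]:
    """Read rows into a dict keyed by class name (assumed 'item' column)."""
    return {row["item"]: row["instructions"] for row in csv.DictReader(handle)}


def disposal_for(predicted_item: str, table: Dict[str, str],
                 default: str = "No disposal guidance found.") -> str:
    """Look up guidance for a predicted class, falling back to a default message."""
    return table.get(predicted_item, default)


table = load_instructions(io.StringIO(SAMPLE_CSV))
print(disposal_for("plastic_bottles", table))
print(disposal_for("shoes", table))
```

With a real file, `load_instructions(open("database.csv", newline=""))` would replace the in-memory sample.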
---
## 🚀 **Deployment Options**
### **Hugging Face Spaces (Recommended)**
1. Upload model to HF Model Hub
2. Create Space with this code
3. Set `HF_MODEL_ID` in `app.py`
4. Deploy automatically
### **Local Development**
```bash
python app.py
# Visit: http://localhost:7860
```
### **Docker Deployment**
```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 7860
CMD ["python", "app.py"]
```
---
## 📈 **Performance**
| Metric | Value |
|--------|-------|
| **Validation Accuracy** | 91.33% |
| **Training Epochs** | 15 |
| **Batch Size** | 16 |
| **Learning Rate** | 5e-6 |
| **Model Size** | ~1.2GB |
| **Inference Time** | ~200ms |
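
Inference time depends heavily on hardware, so it is worth measuring on your own machine. The helper below is a generic timing sketch using only the standard library; `time_calls` is a hypothetical name, and the lambda stands in for a call such as `classifier.classify_image("path/to/image.jpg")`.

```python
import statistics
import time
from typing import Callable, Dict


def time_calls(fn: Callable[[], object], warmup: int = 2, runs: int = 10) -> Dict[str, float]:
    """Call fn repeatedly and report latency in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed (first calls often include lazy setup)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
        "runs": float(runs),
    }


# Stand-in workload; replace the lambda with a real classification call.
stats = time_calls(lambda: sum(range(10_000)), warmup=1, runs=5)
print(f"median {stats['median_ms']:.3f} ms over {int(stats['runs'])} runs")
```

Reporting the median rather than the mean keeps a single slow call from skewing the result.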
---
## 🛠️ **Troubleshooting**
### **Model Loading Issues**
```python
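from clip_waste_classifier.finetuned_classifier import FinetunedCLIPWasteClassifier

# Check which model was actually loaded (finetuned or pretrained fallback)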
classifier = FinetunedCLIPWasteClassifier(hf_model_id="your-model-id")
info = classifier.get_model_info()
print(f"Model type: {info['model_type']}")
```
### **Gradio Import Error**
```bash
pip install gradio==3.50.2
```
### **Memory Issues**
- Use CPU-only inference
- Reduce batch size for training
- Clear cache: `rm -rf hf_cache/`
---
## 🌍 **Environmental Impact**
This system helps improve recycling efficiency by:
- ♻️ **Accurate waste classification**
- 📋 **Proper disposal instructions**
- 🌱 **Reducing contamination** in recycling streams
- 📊 **Data-driven waste management**
---
## 📄 **License**
MIT License - see [LICENSE](LICENSE) for details.
---
## 🤝 **Contributing**
1. Fork the repository
2. Create feature branch (`git checkout -b feature/improvement`)
3. Commit changes (`git commit -am 'Add improvement'`)
4. Push to branch (`git push origin feature/improvement`)
5. Create Pull Request
---
## 📧 **Contact**
For questions about **model hosting**, **deployment**, or **collaboration**:
- **GitHub Issues:** [Create an issue](https://github.com/your-username/mc-waste/issues)
- **Hugging Face:** [Model page](https://huggingface.co/your-username/waste-clip-finetuned)
---
**🎯 Ready to deploy? Follow the [Hugging Face model hosting guide](#-proper-ml-model-hosting-on-hugging-face) above!**