---
title: OpenCLIP Waste Classifier
emoji: "♻️"
colorFrom: green
colorTo: blue
sdk: docker
sdk_version: "20.10.7"
app_file: app.py
pinned: false
---

# 🗂️ AI Waste Classification System

A **finetuned CLIP model** for waste classification achieving **91.33% validation accuracy** on 30 waste categories.

## 🚀 **Proper ML Model Hosting on Hugging Face**

### ❌ **What NOT to do:**
- **Don't use Git LFS** to ship model weights in a Hugging Face Space
- **Don't commit large model files** to git repositories
- **Don't use traditional git hosting** for ML models

### ✅ **The RIGHT way:**
1. **Host models on Hugging Face Model Hub**
2. **Download models at runtime** in your Space
3. **Use the `huggingface_hub` library** for model management
4. **Separate code (git) from models (HF Hub)**

---

## 📋 **Quick Start**

### **1. Set Up Environment**
```bash
pip install -r requirements.txt
```

### **2. Download Dataset**
```bash
python download_dataset.py
```

### **3. Finetune Model**
```bash
python finetune_clip.py --epochs 15 --batch_size 16 --lr 5e-6
```

### **4. Upload to Hugging Face Hub**
```bash
# Login to Hugging Face
huggingface-cli login

# Upload your finetuned model
python upload_to_hf.py --repo_id "your-username/waste-clip-finetuned"
```

### **5. Update App Configuration**
```python
# In app.py, update the model ID:
HF_MODEL_ID = "your-username/waste-clip-finetuned"
```

### **6. Deploy to Hugging Face Spaces**
```bash
git add .
git commit -m "Add waste classification app"
git push origin main
```

---

## 🏗️ **Architecture**

### **Model Details**
- **Base Model:** CLIP ViT-B/16 (OpenCLIP)
- **Pretrained:** LAION-2B (34B samples seen)
- **Finetuned:** 30 waste categories
- **Accuracy:** 91.33% validation accuracy
- **Size:** ~1.2GB

### **Classes (30 Categories)**
```
aerosol_cans, aluminum_food_cans, aluminum_soda_cans, cardboard_boxes,
cardboard_packaging, clothing, coffee_grounds, disposable_plastic_cups,
eggshells, food_waste, glass_beverage_bottles, glass_cosmetic_containers,
glass_food_jars, magazines, newspaper, office_paper, paper_cups,
plastic_bottle_caps, plastic_bottles, plastic_clothing_hangers,
plastic_containers, plastic_cutlery, plastic_shopping_bags, shoes,
steel_food_cans, styrofoam_cups, styrofoam_food_containers, tea_bags,
tissues, wooden_utensils
```

---

## 🤗 **Hugging Face Integration**

### **Model Loading Priority:**
1. **Local file** (for development)
2. **Hugging Face Hub** (production)
3. **Pretrained fallback** (if finetuned unavailable)

### **Example Usage:**
```python
from clip_waste_classifier.finetuned_classifier import FinetunedCLIPWasteClassifier

# Load from Hugging Face Hub
classifier = FinetunedCLIPWasteClassifier(
    hf_model_id="your-username/waste-clip-finetuned"
)

# Classify image
result = classifier.classify_image("path/to/image.jpg")
print(f"Predicted: {result['predicted_item']} ({result['best_confidence']:.3f})")
```

---

## 📊 **Dataset**

- **Source:** [Kaggle - Recyclable and Household Waste Classification](https://www.kaggle.com/datasets/alistairking/recyclable-and-household-waste-classification)
- **Images:** 15,000 total (500 per category)
- **Split:** 70% train, 10% validation, 20% test
- **Types:** 250 synthetic + 250 real-world images per category

---

## 🔧 **Development Setup**

### **Project Structure**
```
mc-waste/
├── clip_waste_classifier/
│   ├── finetuned_classifier.py   # Main classifier with HF integration
│   └── openclip_classifier.py    # Pretrained fallback
├── app.py                        # Gradio interface
├── finetune_clip.py              # Training script
├── upload_to_hf.py               # HF upload utility
├── database.csv                  # Disposal instructions
├── requirements.txt              # Dependencies
└── README.md                     # This file
```

### **Key Features**
- ✅ **Smart model loading** (Local → HF Hub → Fallback)
- ✅ **Automatic failover** to pretrained if finetuned unavailable
- ✅ **Real-time classification** with confidence scores
- ✅ **Disposal instructions** from curated database
- ✅ **Modern Gradio UI** with detailed results

---

## 🚀 **Deployment Options**

### **Hugging Face Spaces (Recommended)**
1. Upload model to HF Model Hub
2. Create Space with this code
3. Set `HF_MODEL_ID` in `app.py`
4. Push your changes; the Space builds and deploys automatically

### **Local Development**
```bash
python app.py
# Visit: http://localhost:7860
```

### **Docker Deployment**
```dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 7860
CMD ["python", "app.py"]
```

---

## 📈 **Performance**

| Metric | Value |
|--------|-------|
| **Validation Accuracy** | 91.33% |
| **Training Epochs** | 15 |
| **Batch Size** | 16 |
| **Learning Rate** | 5e-6 |
| **Model Size** | ~1.2GB |
| **Inference Time** | ~200ms |

---

## 🛠️ **Troubleshooting**

### **Model Loading Issues**
```python
from clip_waste_classifier.finetuned_classifier import FinetunedCLIPWasteClassifier

# Check model availability
classifier = FinetunedCLIPWasteClassifier(hf_model_id="your-model-id")
info = classifier.get_model_info()
print(f"Model type: {info['model_type']}")
```

### **Gradio Import Error**
```bash
pip install gradio==3.50.2
```

### **Memory Issues**
- Use CPU-only inference
- Reduce batch size for training
- Clear cache: `rm -rf hf_cache/`

---

## 🌍 **Environmental Impact**

This system helps improve recycling efficiency by:
- ♻️ **Accurate waste classification**
- 📋 **Proper disposal instructions**
- 🌱 **Reducing contamination** in recycling streams
- 📊 **Data-driven waste management**

---

## 📄 **License**

MIT License - see [LICENSE](LICENSE) for details.


---

## 🤝 **Contributing**

1. Fork the repository
2. Create feature branch (`git checkout -b feature/improvement`)
3. Commit changes (`git commit -am 'Add improvement'`)
4. Push to branch (`git push origin feature/improvement`)
5. Create Pull Request

---

## 📧 **Contact**

For questions about **model hosting**, **deployment**, or **collaboration**:
- **GitHub Issues:** [Create an issue](https://github.com/your-username/mc-waste/issues)
- **Hugging Face:** [Model page](https://huggingface.co/your-username/waste-clip-finetuned)

---

**🎯 Ready to deploy? Follow the [Hugging Face model hosting guide](#-proper-ml-model-hosting-on-hugging-face) above!**
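
---

## 🧩 **Sketch: Runtime Model Loading**

The hosting advice above (download models at runtime, with a local file first, then the Hugging Face Hub, then the pretrained fallback) can be illustrated with a small helper. This is a minimal sketch under stated assumptions, not the code in `finetuned_classifier.py`: the names `resolve_checkpoint` and `hub_downloader` are hypothetical, and the checkpoint filename you pass in depends on what `upload_to_hf.py` actually uploads. The only library call used is `hf_hub_download` from `huggingface_hub`.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple


def hub_downloader(repo_id: str, filename: str) -> Callable[[], str]:
    """Build a zero-argument callable that fetches one file from the Hub.

    Hypothetical helper: `filename` must match a file in your model repo.
    """
    def _download() -> str:
        # Imported lazily so the fallback logic still works when the
        # huggingface_hub package is not installed.
        from huggingface_hub import hf_hub_download
        return hf_hub_download(repo_id=repo_id, filename=filename)
    return _download


def resolve_checkpoint(
    local_path: str,
    download: Optional[Callable[[], str]] = None,
) -> Tuple[str, Optional[str]]:
    """Return (source, path) using the order local -> hub -> pretrained."""
    # 1. Local file (development)
    if Path(local_path).is_file():
        return "local", str(local_path)
    # 2. Hugging Face Hub (production)
    if download is not None:
        try:
            return "hub", download()
        except Exception as exc:  # offline, missing repo, missing package
            print(f"Hub download failed ({exc!r}); using pretrained weights")
    # 3. Pretrained fallback: no finetuned checkpoint available
    return "pretrained", None
```

A caller might use `resolve_checkpoint("checkpoints/model.pt", hub_downloader(HF_MODEL_ID, "model.pt"))`, where both paths are placeholders, and load the base pretrained weights whenever the returned source is `"pretrained"`. Passing the downloader as a callable keeps the priority logic testable without network access.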