# Hugging Face Spaces Deployment Guide

## Quick Start for Download-at-Runtime

This guide walks you through deploying your WASH CFM Topic Classifier to Hugging Face Spaces using the Download-at-Runtime strategy.

## Prerequisites

1. ✅ Your model files are ready (`.safetensors`, config files, tokenizer files)
2. ✅ You have a Hugging Face account
3. ✅ You've created a public model repository on Hugging Face Hub

## Step 1: Upload Model to Hugging Face Hub

### 1.1 Create Model Repository

1. Go to [huggingface.co/new](https://huggingface.co/new)
2. Select the **"Model"** tab
3. Repository name: `wash-cfm-classifier` (or your preferred name)
4. Make it **Public** (required for Spaces)
5. Click **"Create a new model"**

### 1.2 Upload Model Files

Upload these files to your repository:

```
📁 wash-cfm-classifier/
├── model.safetensors          (~400-500MB)
├── config.json                (~1KB)
├── tokenizer.json             (~2-3MB)
├── tokenizer_config.json      (~1KB)
└── special_tokens_map.json    (~1KB)
```

**Methods to upload:**

- **Web interface**: drag and drop files
- **Git LFS**: for command-line users
- **Python script**: use the `huggingface_hub` library

### 1.3 Add Model Card (Optional but Recommended)

Create a `README.md` in your repository:

````markdown
# WASH CFM Topic Classifier

A fine-tuned ModernBERT model for classifying WASH (Water, Sanitation, and Hygiene) feedback into topic categories.

## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="your-username/wash-cfm-classifier")
result = classifier("The water pump is broken")
```

## Model Details

- **Base Model**: modernbert-large
- **Fine-tuned on**: WASH CFM feedback data
- **Task**: Multi-label text classification
- **Labels**: [Add your actual labels]
````

## Step 2: Update Your Application Code

### 2.1 Configuration

In `app.py`, update the configuration section:

```python
# CONFIGURATION SECTION
HF_REPO_ID = "your-username/wash-cfm-classifier"  # ← Replace with your repo
HF_MODEL_CACHE_DIR = "./model_cache"              # Cache directory
```

### 2.2 Verify Dependencies

Ensure `requirements.txt` includes:

```txt
huggingface_hub>=0.16.0
torch>=2.0.0
transformers>=4.30.0
gradio>=4.0.0
```

## Step 3: Create Hugging Face Space

### 3.1 Create New Space

1. Go to [huggingface.co/spaces](https://huggingface.co/spaces)
2. Click **"Create new Space"**
3. Fill in the details:
   - **Space name**: `wash-cfm-classifier` (or your choice)
   - **License**: `apache-2.0` (or your preference)
   - **Hardware**: `CPU basic` (sufficient for this model)
   - **Visibility**: `Public`

### 3.2 Choose SDK

Select **"Gradio"** as your SDK.

### 3.3 Upload Files

Upload these files to your Space repository:

```
📁 wash-cfm-classifier-space/
├── app.py              # Your main application
├── requirements.txt    # Dependencies
├── README.md           # Space documentation
└── .gitattributes      # Optional: for large file handling
```

## Step 4: Space Configuration

### 4.1 Hardware Recommendations

For your model size (~500MB):

- **CPU Basic**: ✅ sufficient (free tier)
- **CPU Upgrade**: ⚡ faster inference
- **GPU**: 🚀 only if needed for larger models

### 4.2 Environment Variables (Optional)

In Space Settings → Environment, add:

```
HF_HOME=/tmp/.cache/huggingface
TRANSFORMERS_CACHE=/tmp/.cache/transformers
```

This ensures the cache directories have sufficient space.
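Tying Steps 2 and 4 together, the configuration values above can drive a single startup loader in `app.py`. A minimal sketch, assuming `huggingface_hub.snapshot_download` for the fetch (the `load_classifier` helper name is illustrative, not part of any existing code):

```python
# Download-at-Runtime loader sketch for app.py. The repo ID is a placeholder,
# and load_classifier() is an illustrative helper, not a fixed API.
HF_REPO_ID = "your-username/wash-cfm-classifier"  # ← replace with your repo
HF_MODEL_CACHE_DIR = "./model_cache"

def load_classifier():
    """Fetch the model files (a no-op once cached) and build a pipeline."""
    # Imported here so the sketch stays importable before the heavy deps load.
    from huggingface_hub import snapshot_download
    from transformers import pipeline

    local_dir = snapshot_download(repo_id=HF_REPO_ID, cache_dir=HF_MODEL_CACHE_DIR)
    return pipeline("text-classification", model=local_dir)

if __name__ == "__main__":
    classifier = load_classifier()
    print(classifier("The water pump is broken"))
```

Because `snapshot_download` skips files already present in the cache, restarting the Space after the first run reuses the local copy instead of re-downloading.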
### 4.3 Build Logs

Monitor the **"Logs"** tab for:

- ✅ Successful dependency installation
- ✅ Model download progress
- ✅ Application startup

## Step 5: Testing and Validation

### 5.1 First Run

The first run will:

1. **Install dependencies** (~2-3 minutes)
2. **Download model** (~1-2 minutes, depending on connection)
3. **Start application** (~30 seconds)

**Expected timeline**: 5-7 minutes for the first successful run.

### 5.2 Subsequent Runs

After caching:

- **Startup time**: ~10-15 seconds
- **Prediction time**: <1 second per request

### 5.3 Verification Checklist

- [ ] Space builds successfully (green ✅ in status)
- [ ] Model downloads without errors
- [ ] Web interface loads
- [ ] Sample predictions work
- [ ] Performance is acceptable

## Step 6: Optimization and Monitoring

### 6.1 Performance Monitoring

Monitor these metrics:

- **Build time**: first deployment duration
- **Download time**: model download duration
- **Inference time**: response latency
- **Memory usage**: RAM consumption

### 6.2 Common Issues and Solutions

#### Issue: "Model download timeout"

```python
# Solution: use a faster hardware tier, or point the cache at /tmp
HF_MODEL_CACHE_DIR = "/tmp/model_cache"
```

#### Issue: "Out of memory"

```python
# Solution: use smaller hardware, or force CPU if GPU memory is insufficient
device = torch.device("cpu")
```

#### Issue: "Repository not found"

```python
# Solution: verify the repository ID and visibility
HF_REPO_ID = "exact-username/exact-repo-name"  # Case sensitive
```

### 6.3 Space Management

**Regular maintenance:**

- Monitor disk usage in the cache directory
- Update model versions by changing the repository revision
- Scale hardware based on usage patterns

**Version updates:**

- Update the model in your Hub repository
- The Space automatically uses the latest version (or pin a specific revision)

## Step 7: Production Considerations

### 7.1 Security

- ✅ Use public repositories for Spaces
- ✅ Validate model integrity
- ✅ Implement proper error handling
- ✅ Monitor for
unusual access patterns

### 7.2 Reliability

- ✅ Implement retry logic for downloads
- ✅ Add fallback mechanisms
- ✅ Monitor network connectivity
- ✅ Set up alerts for failures

### 7.3 Scalability

- **Multiple Spaces**: same model, different interfaces
- **Load balancing**: distribute across multiple hardware tiers
- **Caching strategy**: optimize for your usage patterns

## Troubleshooting Guide

### Build Failures

| Error | Solution |
|-------|----------|
| `pip install failed` | Check `requirements.txt` syntax |
| `torch install failed` | Verify Python version compatibility |
| `Memory limit exceeded` | Reduce model size or upgrade hardware |

### Runtime Failures

| Error | Solution |
|-------|----------|
| `Download interrupted` | Network issue; the download will auto-resume |
| `Model not found` | Verify repository ID and visibility |
| `CUDA out of memory` | Use CPU fallback or upgrade hardware |

### Performance Issues

| Issue | Solution |
|-------|----------|
| Slow first run | Normal; the model must be downloaded once |
| High memory usage | Consider a hardware upgrade |
| Slow predictions | Optimize the model or upgrade hardware |

## Success Metrics

Your deployment is successful when:

- ✅ The Space builds without errors
- ✅ The model downloads and loads successfully
- ✅ The web interface is responsive
- ✅ Predictions are accurate and fast
- ✅ Resource usage is within limits

## Next Steps

1. **Monitor performance**: track usage and optimize as needed
2. **User feedback**: collect feedback and iterate
3. **Feature updates**: add new features or model improvements
4. **Scaling**: consider multiple Spaces or hardware upgrades

---

**🎉 Congratulations!** Your WASH CFM Topic Classifier is now deployed to Hugging Face Spaces with Download-at-Runtime functionality, bypassing the 1GB storage limit while maintaining excellent performance.
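As a final hardening step, the retry logic recommended under Reliability (7.2) can be sketched as a small wrapper. The `with_retries` helper below is a hypothetical illustration, not part of `huggingface_hub`:

```python
# Illustrative retry wrapper for flaky model downloads; with_retries is
# a hypothetical helper, not a huggingface_hub API.
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(); on failure, retry with exponential backoff (1x, 2x, 4x...)."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * (2 ** attempt))

# Example: wrap the download so transient network errors are retried.
# model_path = with_retries(lambda: snapshot_download(repo_id=HF_REPO_ID))
```

Keeping the backoff exponential avoids hammering the Hub during an outage while still recovering quickly from one-off network blips.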
For additional help, consult:

- [Hugging Face Spaces Documentation](https://huggingface.co/docs/spaces)
- [huggingface_hub Documentation](https://huggingface.co/docs/huggingface_hub)
- [Community Forum](https://discuss.huggingface.co/)