# Hugging Face Spaces Deployment Guide

## Quick Start for Download-at-Runtime
This guide walks you through deploying your WASH CFM Topic Classifier to Hugging Face Spaces using the Download-at-Runtime strategy.
## Prerequisites

- ✅ Your model files are ready (`.safetensors`, config files, tokenizer files)
- ✅ You have a Hugging Face account
- ✅ You've created a public model repository on Hugging Face Hub
## Step 1: Upload Model to Hugging Face Hub

### 1.1 Create Model Repository

- Go to huggingface.co/new
- Select the "Model" tab
- Repository name: `wash-cfm-classifier` (or your preferred name)
- Make it **Public** (required for Spaces)
- Click "Create a new model"
### 1.2 Upload Model Files

Upload these files to your repository:

```
📁 wash-cfm-classifier/
├── model.safetensors        (~400-500MB)
├── config.json              (~1KB)
├── tokenizer.json           (~2-3MB)
├── tokenizer_config.json    (~1KB)
└── special_tokens_map.json  (~1KB)
```
Methods to upload:

- Web Interface: Drag and drop files
- Git LFS: For command-line users
- Python Script: Use the `huggingface_hub` library (sketch below)
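For the Python route, a minimal upload sketch with `huggingface_hub` (assumes your files sit in a local `./wash-cfm-classifier` folder and you have already run `huggingface-cli login`):

```python
from huggingface_hub import HfApi

api = HfApi()

# Upload every file in the local folder to the model repository.
# Large files such as model.safetensors are handled via LFS automatically.
api.upload_folder(
    folder_path="./wash-cfm-classifier",          # local folder with model files
    repo_id="your-username/wash-cfm-classifier",  # your Hub repository
    repo_type="model",
)
```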
### 1.3 Add Model Card (Optional but Recommended)

Create a `README.md` in your repository:

````markdown
# WASH CFM Topic Classifier

A fine-tuned ModernBERT model for classifying WASH (Water, Sanitation, and Hygiene) feedback into topic categories.

## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification",
                      model="your-username/wash-cfm-classifier")
result = classifier("The water pump is broken")
```

## Model Details

- Base Model: modernbert-large
- Fine-tuned on: WASH CFM feedback data
- Task: Multi-label text classification
- Labels: [Add your actual labels]
````
## Step 2: Update Your Application Code
### 2.1 Configuration
In `app.py`, update the configuration section:
```python
# CONFIGURATION SECTION
HF_REPO_ID = "your-username/wash-cfm-classifier"  # ← Replace with your repo
HF_MODEL_CACHE_DIR = "./model_cache"              # Cache directory
```
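The download-at-runtime step itself can be a `snapshot_download` call followed by a normal `transformers` load. A minimal sketch (the function name and loading details are illustrative, not taken from `app.py`):

```python
from huggingface_hub import snapshot_download
from transformers import AutoModelForSequenceClassification, AutoTokenizer

HF_REPO_ID = "your-username/wash-cfm-classifier"
HF_MODEL_CACHE_DIR = "./model_cache"

def load_model():
    # Download the model files on first run; later runs hit the local cache.
    local_dir = snapshot_download(repo_id=HF_REPO_ID, cache_dir=HF_MODEL_CACHE_DIR)
    tokenizer = AutoTokenizer.from_pretrained(local_dir)
    model = AutoModelForSequenceClassification.from_pretrained(local_dir)
    return tokenizer, model
```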
### 2.2 Verify Dependencies

Ensure `requirements.txt` includes:

```text
huggingface_hub>=0.16.0
torch>=2.0.0
transformers>=4.30.0
gradio>=4.0.0
```
## Step 3: Create Hugging Face Space

### 3.1 Create New Space

- Go to huggingface.co/spaces
- Click "Create new Space"
- Fill in the details:
  - Space name: `wash-cfm-classifier` (or your choice)
  - License: `apache-2.0` (or your preference)
  - Hardware: `CPU basic` (sufficient for this model)
  - Visibility: Public
### 3.2 Choose SDK
Select "Gradio" as your SDK.
### 3.3 Upload Files

Upload these files to your Space repository:

```
📁 wash-cfm-classifier-space/
├── app.py            # Your main application
├── requirements.txt  # Dependencies
├── README.md         # Space documentation
└── .gitattributes    # Optional: for large file handling
```
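Note that a Space's `README.md` doubles as its configuration: Spaces read deployment metadata from a YAML header at the top of the file. A minimal example (title and emoji are placeholders):

```yaml
---
title: WASH CFM Topic Classifier
emoji: 💧
sdk: gradio
app_file: app.py
pinned: false
---
```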
## Step 4: Space Configuration

### 4.1 Hardware Recommendations

For your model size (~500MB):

- CPU Basic: ✅ Sufficient (free tier)
- CPU Upgrade: ⚡ Faster inference
- GPU: 🚀 Only if needed for larger models
### 4.2 Environment Variables (Optional)

In Space Settings → Environment, add:

```text
HF_HOME=/tmp/.cache/huggingface
TRANSFORMERS_CACHE=/tmp/.cache/transformers
```
This ensures cache directories have sufficient space.
### 4.3 Build Logs

Monitor the "Logs" tab for:

- ✅ Successful dependency installation
- ✅ Model download progress
- ✅ Application startup
## Step 5: Testing and Validation

### 5.1 First Run
The first run will:
- Install dependencies (~2-3 minutes)
- Download model (~1-2 minutes, depending on connection)
- Start application (~30 seconds)
Expected timeline: 5-7 minutes for first successful run.
### 5.2 Subsequent Runs
After caching:
- Startup time: ~10-15 seconds
- Prediction time: <1 second per request
### 5.3 Verification Checklist

- Space builds successfully (green ✅ in status)
- Model downloads without errors
- Web interface loads
- Sample predictions work
- Performance is acceptable
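For a scripted check of the last two items, the `gradio_client` package can call the deployed Space directly; a minimal smoke test, assuming the default `/predict` endpoint (check your Space's "Use via API" link for the exact name):

```python
from gradio_client import Client

# Connect to the deployed Space and send one sample prediction.
client = Client("your-username/wash-cfm-classifier")
result = client.predict("The water pump is broken", api_name="/predict")
print(result)
```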
## Step 6: Optimization and Monitoring

### 6.1 Performance Monitoring
Monitor these metrics:
- Build time: First deployment duration
- Download time: Model download duration
- Inference time: Response latency
- Memory usage: RAM consumption
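Inference time is easy to sample locally; a quick sketch using the `pipeline` from the model card example:

```python
import time

from transformers import pipeline

classifier = pipeline("text-classification",
                      model="your-username/wash-cfm-classifier")

# Time a single prediction to estimate response latency.
start = time.perf_counter()
classifier("The water pump is broken")
elapsed = time.perf_counter() - start
print(f"Inference time: {elapsed:.3f}s")
```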
### 6.2 Common Issues and Solutions

**Issue: "Model download timeout"**

```python
# Solution: Use faster hardware tier or optimize cache
HF_MODEL_CACHE_DIR = "/tmp/model_cache"
```

**Issue: "Out of memory"**

```python
# Solution: Use smaller hardware or optimize model loading
device = torch.device("cpu")  # Force CPU if GPU memory insufficient
```

**Issue: "Repository not found"**

```python
# Solution: Verify repository ID and visibility
HF_REPO_ID = "exact-username/exact-repo-name"  # Case sensitive
```
### 6.3 Space Management

Regular maintenance:

- Monitor disk usage in the cache directory (see the sketch below)
- Update model versions by changing the repository revision
- Scale hardware based on usage patterns
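For the disk-usage check, the standard library is enough; a small sketch, assuming the `./model_cache` directory from Step 2 exists:

```python
import shutil

# Report free space on the filesystem holding the model cache.
usage = shutil.disk_usage("./model_cache")
print(f"Free: {usage.free / 1e9:.1f} GB of {usage.total / 1e9:.1f} GB")
```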
Version updates:

- Update the model in your Hub repository
- The Space automatically uses the latest version (or pin a specific revision, as shown below)
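To pin a deployment to a known-good model version, `snapshot_download` accepts a `revision` argument; a sketch (the revision value is a placeholder):

```python
from huggingface_hub import snapshot_download

# Pin to a specific branch, tag, or commit instead of the latest "main".
local_dir = snapshot_download(
    repo_id="your-username/wash-cfm-classifier",
    revision="main",  # or a tag / commit hash for reproducible deploys
    cache_dir="./model_cache",
)
```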
## Step 7: Production Considerations

### 7.1 Security

- ✅ Use public repositories for Spaces
- ✅ Validate model integrity
- ✅ Implement proper error handling
- ✅ Monitor for unusual access patterns

### 7.2 Reliability

- ✅ Implement retry logic for downloads (see the sketch below)
- ✅ Add fallback mechanisms
- ✅ Monitor network connectivity
- ✅ Set up alerts for failures
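A minimal retry wrapper for the download step, reusing `snapshot_download` from Step 2 (attempt count and backoff are arbitrary choices):

```python
import time

from huggingface_hub import snapshot_download

def download_with_retries(repo_id: str, cache_dir: str, attempts: int = 3) -> str:
    """Retry the model download a few times before giving up."""
    for attempt in range(1, attempts + 1):
        try:
            return snapshot_download(repo_id=repo_id, cache_dir=cache_dir)
        except Exception:  # network hiccups, transient Hub errors
            if attempt == attempts:
                raise
            time.sleep(5 * attempt)  # simple linear backoff
```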
### 7.3 Scalability
- Multiple Spaces: Same model, different interfaces
- Load Balancing: Distribute across multiple hardware tiers
- Caching Strategy: Optimize for your usage patterns
## Troubleshooting Guide

### Build Failures

| Error | Solution |
|---|---|
| `pip install failed` | Check `requirements.txt` syntax |
| `torch install failed` | Verify Python version compatibility |
| `Memory limit exceeded` | Reduce model size or upgrade hardware |
### Runtime Failures

| Error | Solution |
|---|---|
| `Download interrupted` | Network issues; the download will auto-resume |
| `Model not found` | Verify repository ID and visibility |
| `CUDA out of memory` | Use CPU fallback or upgrade hardware |
### Performance Issues
| Issue | Solution |
|---|---|
| Slow first run | Normal - model download required |
| High memory usage | Consider hardware upgrade |
| Slow predictions | Optimize model or upgrade hardware |
## Success Metrics

Your deployment is successful when:

- ✅ Space builds without errors
- ✅ Model downloads and loads successfully
- ✅ Web interface is responsive
- ✅ Predictions are accurate and fast
- ✅ Resource usage is within limits
## Next Steps
- Monitor Performance: Track usage and optimize as needed
- User Feedback: Collect feedback and iterate
- Feature Updates: Add new features or model improvements
- Scaling: Consider multiple spaces or hardware upgrades
🎉 Congratulations! Your WASH CFM Topic Classifier is now deployed to Hugging Face Spaces with Download-at-Runtime functionality, bypassing the 1GB storage limit while maintaining excellent performance.
For additional help, consult the Hugging Face Spaces documentation (huggingface.co/docs/hub/spaces) and the `huggingface_hub` library reference.