# Hugging Face Spaces Deployment Guide

## Quick Start for Download-at-Runtime

This guide walks you through deploying your WASH CFM Topic Classifier to Hugging Face Spaces using the Download-at-Runtime strategy.

## Prerequisites

1. ✅ Your model files are ready (`.safetensors`, config files, tokenizer files)
2. ✅ You have a Hugging Face account
3. ✅ You've created a public model repository on Hugging Face Hub

## Step 1: Upload Model to Hugging Face Hub

### 1.1 Create Model Repository

1. Go to [huggingface.co/new](https://huggingface.co/new)
2. Select the **"Model"** tab
3. Repository name: `wash-cfm-classifier` (or your preferred name)
4. Make it **Public** (required for Spaces)
5. Click **"Create a new model"**

### 1.2 Upload Model Files

Upload these files to your repository:

```
📁 wash-cfm-classifier/
├── model.safetensors          (~400-500MB)
├── config.json                (~1KB)
├── tokenizer.json             (~2-3MB)
├── tokenizer_config.json      (~1KB)
└── special_tokens_map.json    (~1KB)
```

**Methods to upload:**

- **Web interface**: drag and drop files
- **Git LFS**: for command-line users
- **Python script**: use the `huggingface_hub` library

### 1.3 Add Model Card (Optional but Recommended)

Create a `README.md` in your repository:

````markdown
# WASH CFM Topic Classifier

A fine-tuned ModernBERT model for classifying WASH (Water, Sanitation, and Hygiene) feedback into topic categories.

## Usage

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="your-username/wash-cfm-classifier")
result = classifier("The water pump is broken")
```

## Model Details

- **Base Model**: modernbert-large
- **Fine-tuned on**: WASH CFM feedback data
- **Task**: Multi-label text classification
- **Labels**: [Add your actual labels]
````

## Step 2: Update Your Application Code

### 2.1 Configuration

In `app.py`, update the configuration section:

```python
# CONFIGURATION SECTION
HF_REPO_ID = "your-username/wash-cfm-classifier"  # ← Replace with your repo
HF_MODEL_CACHE_DIR = "./model_cache"              # Cache directory
```

### 2.2 Verify Dependencies

Ensure `requirements.txt` includes:

```txt
huggingface_hub>=0.16.0
torch>=2.0.0
transformers>=4.30.0
gradio>=4.0.0
```

## Step 3: Create Hugging Face Space

### 3.1 Create New Space

1. Go to [huggingface.co/spaces](https://huggingface.co/spaces)
2. Click **"Create new Space"**
3. Fill in the details:
   - **Space name**: `wash-cfm-classifier` (or your choice)
   - **License**: `apache-2.0` (or your preference)
   - **Hardware**: `CPU basic` (sufficient for this model)
   - **Visibility**: `Public`

### 3.2 Choose SDK

Select **"Gradio"** as your SDK.

### 3.3 Upload Files

Upload these files to your Space repository:

```
📁 wash-cfm-classifier-space/
├── app.py              # Your main application
├── requirements.txt    # Dependencies
├── README.md           # Space documentation
└── .gitattributes      # Optional: for large file handling
```

## Step 4: Space Configuration

### 4.1 Hardware Recommendations

For your model size (~500MB):

- **CPU Basic**: ✅ sufficient (free tier)
- **CPU Upgrade**: ⚡ faster inference
- **GPU**: 🚀 only if needed for larger models

### 4.2 Environment Variables (Optional)

In Space Settings → Environment, add:

```
HF_HOME=/tmp/.cache/huggingface
TRANSFORMERS_CACHE=/tmp/.cache/transformers
```

This ensures the cache directories have sufficient space.
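Tying Steps 2 and 4 together, the configuration values above can drive a single startup loader in `app.py`. A minimal sketch, assuming `huggingface_hub.snapshot_download` for the fetch (the `load_classifier` helper name is illustrative, not part of any existing code):

```python
# Download-at-Runtime loader sketch for app.py. The repo ID is a placeholder,
# and load_classifier() is an illustrative helper, not a fixed API.
HF_REPO_ID = "your-username/wash-cfm-classifier"  # ← replace with your repo
HF_MODEL_CACHE_DIR = "./model_cache"

def load_classifier():
    """Fetch the model files (a no-op once cached) and build a pipeline."""
    # Imported here so the sketch stays importable before the heavy deps load.
    from huggingface_hub import snapshot_download
    from transformers import pipeline

    local_dir = snapshot_download(repo_id=HF_REPO_ID, cache_dir=HF_MODEL_CACHE_DIR)
    return pipeline("text-classification", model=local_dir)

if __name__ == "__main__":
    classifier = load_classifier()
    print(classifier("The water pump is broken"))
```

Because `snapshot_download` skips files already present in the cache, restarting the Space after the first run reuses the local copy instead of re-downloading.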
### 4.3 Build Logs

Monitor the **"Logs"** tab for:

- ✅ Successful dependency installation
- ✅ Model download progress
- ✅ Application startup

## Step 5: Testing and Validation

### 5.1 First Run

The first run will:

1. **Install dependencies** (~2-3 minutes)
2. **Download model** (~1-2 minutes, depending on connection)
3. **Start application** (~30 seconds)

**Expected timeline**: 5-7 minutes for the first successful run.

### 5.2 Subsequent Runs

After caching:

- **Startup time**: ~10-15 seconds
- **Prediction time**: <1 second per request

### 5.3 Verification Checklist

- [ ] Space builds successfully (green ✅ in status)
- [ ] Model downloads without errors
- [ ] Web interface loads
- [ ] Sample predictions work
- [ ] Performance is acceptable

## Step 6: Optimization and Monitoring

### 6.1 Performance Monitoring

Monitor these metrics:

- **Build time**: first deployment duration
- **Download time**: model download duration
- **Inference time**: response latency
- **Memory usage**: RAM consumption

### 6.2 Common Issues and Solutions

#### Issue: "Model download timeout"

```python
# Solution: use a faster hardware tier, or point the cache at /tmp
HF_MODEL_CACHE_DIR = "/tmp/model_cache"
```

#### Issue: "Out of memory"

```python
# Solution: use smaller hardware, or force CPU if GPU memory is insufficient
device = torch.device("cpu")
```

#### Issue: "Repository not found"

```python
# Solution: verify the repository ID and visibility
HF_REPO_ID = "exact-username/exact-repo-name"  # Case sensitive
```

### 6.3 Space Management

**Regular maintenance:**

- Monitor disk usage in the cache directory
- Update model versions by changing the repository revision
- Scale hardware based on usage patterns

**Version updates:**

- Update the model in your Hub repository
- The Space automatically uses the latest version (or pin a specific revision)

## Step 7: Production Considerations

### 7.1 Security

- ✅ Use public repositories for Spaces
- ✅ Validate model integrity
- ✅ Implement proper error handling
- ✅ Monitor for
unusual access patterns

### 7.2 Reliability

- ✅ Implement retry logic for downloads
- ✅ Add fallback mechanisms
- ✅ Monitor network connectivity
- ✅ Set up alerts for failures

### 7.3 Scalability

- **Multiple Spaces**: same model, different interfaces
- **Load balancing**: distribute across multiple hardware tiers
- **Caching strategy**: optimize for your usage patterns

## Troubleshooting Guide

### Build Failures

| Error | Solution |
|-------|----------|
| `pip install failed` | Check `requirements.txt` syntax |
| `torch install failed` | Verify Python version compatibility |
| `Memory limit exceeded` | Reduce model size or upgrade hardware |

### Runtime Failures

| Error | Solution |
|-------|----------|
| `Download interrupted` | Network issue; the download will auto-resume |
| `Model not found` | Verify repository ID and visibility |
| `CUDA out of memory` | Use CPU fallback or upgrade hardware |

### Performance Issues

| Issue | Solution |
|-------|----------|
| Slow first run | Normal; the model must be downloaded once |
| High memory usage | Consider a hardware upgrade |
| Slow predictions | Optimize the model or upgrade hardware |

## Success Metrics

Your deployment is successful when:

- ✅ The Space builds without errors
- ✅ The model downloads and loads successfully
- ✅ The web interface is responsive
- ✅ Predictions are accurate and fast
- ✅ Resource usage is within limits

## Next Steps

1. **Monitor performance**: track usage and optimize as needed
2. **User feedback**: collect feedback and iterate
3. **Feature updates**: add new features or model improvements
4. **Scaling**: consider multiple Spaces or hardware upgrades

---

**🎉 Congratulations!** Your WASH CFM Topic Classifier is now deployed to Hugging Face Spaces with Download-at-Runtime functionality, bypassing the 1GB storage limit while maintaining excellent performance.
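As a final hardening step, the retry logic recommended under Reliability (7.2) can be sketched as a small wrapper. The `with_retries` helper below is a hypothetical illustration, not part of `huggingface_hub`:

```python
# Illustrative retry wrapper for flaky model downloads; with_retries is
# a hypothetical helper, not a huggingface_hub API.
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(); on failure, retry with exponential backoff (1x, 2x, 4x...)."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * (2 ** attempt))

# Example: wrap the download so transient network errors are retried.
# model_path = with_retries(lambda: snapshot_download(repo_id=HF_REPO_ID))
```

Keeping the backoff exponential avoids hammering the Hub during an outage while still recovering quickly from one-off network blips.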
For additional help, consult:

- [Hugging Face Spaces Documentation](https://huggingface.co/docs/spaces)
- [huggingface_hub Documentation](https://huggingface.co/docs/huggingface_hub)
- [Community Forum](https://discuss.huggingface.co/)