# HuggingFace Spaces Deployment Checklist
## Pre-Deployment Verification
### 1. Local Testing (Recommended)
```bash
# Install dependencies
pip install -r requirements.txt
# Quick sanity check
python3 test_flan_t5.py
# Full UI test
python3 app.py
# Open http://localhost:7860 and test manually
```
### 2. File Verification
- [ ] `app.py` - HF Spaces entry point ✅
- [ ] `requirements.txt` - All dependencies listed ✅
- [ ] `README_HF_SPACES.md` - HF Spaces README (copy as README.md) ✅
- [ ] `src/writing_studio/` - All source code ✅
- [ ] `LICENSE` - MIT license file
- [ ] `.gitignore` - Ignore logs, cache, etc.
### 3. Configuration Check
- [ ] Default model: `google/flan-t5-base`
- [ ] Max text length: 10,000 characters ✅
- [ ] Log format: `text` (easier to read on HF Spaces) ✅
- [ ] Metrics disabled: `ENABLE_METRICS=false`
- [ ] No .env file required ✅
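If helpful, the defaults above can be sanity-checked from a Python shell. The names below are illustrative stand-ins, not necessarily the app's actual settings module:

```python
import os

# Illustrative defaults mirroring the configuration checklist above.
DEFAULT_MODEL = "google/flan-t5-base"
MAX_TEXT_LENGTH = 10_000   # characters
LOG_FORMAT = "text"        # plain text reads better in HF Spaces logs
# Metrics are off unless ENABLE_METRICS=true is set in the Space.
ENABLE_METRICS = os.getenv("ENABLE_METRICS", "false").lower() == "true"
```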
## HuggingFace Spaces Setup
### Step 1: Create Space
1. Go to https://huggingface.co/new-space
2. Choose a name (e.g., "ai-writing-studio")
3. License: MIT
4. SDK: **Gradio**
5. SDK version: **4.0.0** (must be quoted in YAML)
6. Hardware: **CPU basic** (free tier works!)
7. Visibility: Public or Private
### Step 2: Upload Files
**Option A: Git Push (Recommended)**
```bash
# Initialize git if not already
git init
git add .
git commit -m "Initial commit: FLAN-T5 powered AI Writing Studio"
# Add HF Space as remote
git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
git push hf main
```
**Option B: Web Upload**
1. Click "Files" tab in your Space
2. Upload files one by one or drag-and-drop folders
3. Ensure `app.py` is in root directory
### Step 3: Configure README
1. Copy `README_HF_SPACES.md` to `README.md`
2. Update GitHub URLs if you have a repo
3. Verify YAML frontmatter:
```yaml
---
title: AI Writing Studio
emoji: ✍️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.0.0" # MUST BE QUOTED!
app_file: app.py
suggested_hardware: cpu-basic
---
```
### Step 4: Set Environment Variables (Optional)
In Space settings, add if needed:
- `LOG_LEVEL=INFO`
- `ENVIRONMENT=production`
- `DEBUG=false`
Default values work fine without setting these!
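A sketch of how the app presumably falls back to defaults when these variables are unset (the names come from the list above; the exact lookup code is an assumption):

```python
import os

# Each setting reads the Space's environment, falling back to the
# documented default when the variable is not set.
log_level = os.getenv("LOG_LEVEL", "INFO")
environment = os.getenv("ENVIRONMENT", "production")
# DEBUG is a string in the environment; normalize it to a bool.
debug = os.getenv("DEBUG", "false").lower() == "true"
```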
## Post-Deployment Testing
### Immediate Checks
- [ ] Space builds successfully (no errors in logs)
- [ ] Gradio UI loads
- [ ] All UI elements present (input box, model selector, prompt pack dropdown)
- [ ] No import errors in logs
### First Analysis Test
- [ ] Paste test text (200-500 words)
- [ ] Select "General" revision mode
- [ ] Click "✨ Revise & Analyze"
- [ ] Wait ~60 seconds (first model load)
- [ ] Verify revision is generated
- [ ] Check revision differs from original
- [ ] Verify rubric scores appear
- [ ] Check diff highlighting works
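For reference, word-level diffs like the ones highlighted in the UI can be produced with the standard library's `difflib`. This is a sketch of the general technique, not the app's actual implementation:

```python
import difflib

original = "The quick brown fox jumps over a lazy dog."
revised = "The quick brown fox leaps over the lazy dog."

# SequenceMatcher emits (tag, i1, i2, j1, j2) opcodes describing how
# to turn the original word list into the revised one; the non-"equal"
# spans are what a UI would highlight.
matcher = difflib.SequenceMatcher(a=original.split(), b=revised.split())
changes = [op for op in matcher.get_opcodes() if op[0] != "equal"]
for tag, i1, i2, j1, j2 in changes:
    print(tag, original.split()[i1:i2], "->", revised.split()[j1:j2])
```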
### Second Analysis Test
- [ ] Paste different text
- [ ] Try different revision mode (e.g., "Academic")
- [ ] Click analyze
- [ ] Should be much faster (~5-10s) because the model is already cached
- [ ] Verify revision style matches selected mode
## Common Deployment Issues
### Issue 1: "Missing configuration" error
**Cause**: YAML frontmatter malformed
**Fix**: Ensure `sdk_version: "4.0.0"` is quoted!
### Issue 2: "Module not found" error
**Cause**: Missing dependency in requirements.txt
**Fix**: Check all imports are listed in requirements.txt
### Issue 3: Space crashes on first load
**Cause**: OOM during model download
**Fix**:
- Refresh and try again (may be a transient HF Spaces issue)
- Verify using flan-t5-base (not -large)
- Consider upgrading hardware tier
### Issue 4: Slow response times
**Cause**: Model reloading on each request
**Fix**:
- Check logs for repeated "Loading model" messages
- Verify `get_model_service()` is decorated with `@lru_cache`
- The model should load once and persist for the life of the process
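The caching pattern above looks roughly like this; `get_model_service` is shown with a stand-in body, since the real constructor loads the FLAN-T5 pipeline:

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def get_model_service(model_name: str = "google/flan-t5-base"):
    # Stand-in for the real service constructor. Because of lru_cache,
    # this body runs once per argument set; every later call returns
    # the same cached instance, so the model is loaded exactly once.
    print(f"Loading model {model_name} ...")
    return {"model_name": model_name}

first = get_model_service()
second = get_model_service()
assert first is second  # cached: no second "Loading model" line
```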
### Issue 5: Revision quality is poor
**Cause**: flan-t5-base is the smallest FLAN-T5 variant
**Fix**:
- Upgrade hardware to the CPU Upgrade tier or a T4 GPU
- Switch the model to `google/flan-t5-large`
- Set the environment variable: `DEFAULT_MODEL=google/flan-t5-large`
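Reading that override might look like this, assuming `app.py` resolves the model name via `os.getenv` with the checklist's default as the fallback:

```python
import os

# DEFAULT_MODEL set in the Space's settings overrides the built-in
# default; otherwise the free-tier-friendly base model is used.
model_name = os.getenv("DEFAULT_MODEL", "google/flan-t5-base")
```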
## Performance Expectations
### Free Tier (CPU Basic)
- **Model**: google/flan-t5-base
- **First load**: ~60 seconds
- **Subsequent**: ~5-10 seconds
- **Concurrent users**: 1-2
- **Cost**: $0/month ✅
### CPU Upgrade
- **Model**: google/flan-t5-large possible
- **First load**: ~2-3 minutes
- **Subsequent**: ~10-15 seconds
- **Concurrent users**: 3-5
- **Cost**: ~$0.03/hour when running
### T4 GPU
- **Model**: google/flan-t5-xl possible
- **First load**: ~5 minutes
- **Subsequent**: ~3-5 seconds
- **Concurrent users**: 10+
- **Cost**: ~$0.60/hour when running
## Monitoring
### Check Space Health
1. **Logs**: Click "Logs" tab in Space
- Look for "Model loaded successfully"
- Check for any errors during startup
- Monitor analysis request times
2. **Usage**: Check Space settings
- See user count
- Monitor resource usage
- Check for crashes/restarts
3. **Feedback**: Enable Discussions
- Users can report issues
- Collect feedback on revision quality
## Success Criteria
- [x] Space builds without errors ✅
- [x] UI loads and displays correctly ✅
- [x] First analysis completes in ~60s ✅
- [x] Subsequent analyses in ~5-10s ✅
- [x] AI revisions are coherent and on-topic ✅
- [x] Different prompt packs work differently ✅
- [x] Rubric scores display correctly ✅
- [x] Diff highlighting shows changes ✅
- [x] No crashes or OOM errors ✅
## Post-Launch
### Week 1
- Monitor logs for errors
- Collect user feedback
- Note common issues
- Document workarounds
### Month 1
- Analyze usage patterns
- Consider model upgrade if needed
- Optimize prompt packs based on feedback
- Add new revision modes if requested
### Ongoing
- Keep dependencies updated
- Monitor HF Spaces announcements
- Update FLAN-T5 model if newer versions release
- Consider adding more features (export, history, etc.)
## Support
If deployment issues occur:
1. Check HF Spaces status: https://status.huggingface.co/
2. Review Space logs for errors
3. Compare with working example Spaces
4. Ask in HF Spaces Discord or forums
5. Check this project's GitHub issues
## Next Steps After Deployment
1. ✅ Share your Space URL!
2. ✅ Add to your portfolio/projects
3. ✅ Tweet about it with #HuggingFace #Gradio
4. ✅ Submit to Gradio showcase
5. ✅ Collect user feedback
6. ✅ Iterate based on usage
7. ✅ Consider adding more features
Good luck with deployment! 🚀