# Deployment Guide
## Deploying to HuggingFace Spaces
### Prerequisites
- HuggingFace account
- API token from your LLM provider (or use HF Inference API)
### Step-by-Step Deployment
#### 1. Create a New Space
1. Go to https://huggingface.co/spaces
2. Click "Create new Space"
3. Choose a name (e.g., "conversai-research-assistant")
4. Select SDK: **Gradio**
5. Choose visibility (Public or Private)
6. Click "Create Space"
#### 2. Upload Files
Upload these files to your Space:
**Required Files:**
- `app.py` - Main application
- `llm_backend.py` - LLM interface
- `survey_generator.py` - Survey generation
- `survey_translator.py` - Translation module
- `data_analyzer.py` - Analysis module
- `export_utils.py` - Export utilities
- `requirements.txt` - Dependencies
- `README.md` - Space description
**Optional Files:**
- `.env.example` - Configuration template
- `USAGE_GUIDE.md` - User guide
- `test_app.py` - Testing script
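If you prefer to script the upload instead of using the web UI, the `huggingface_hub` Python library can push a local folder to the Space. A minimal sketch; the repo ID is a placeholder, and the token can come from `huggingface-cli login` instead:
```python
# Programmatic upload with huggingface_hub (pip install huggingface_hub)
from huggingface_hub import HfApi

api = HfApi(token="hf_your-token-here")  # or omit and rely on `huggingface-cli login`
api.upload_folder(
    folder_path=".",  # local directory containing app.py etc.
    repo_id="your-username/conversai-research-assistant",  # placeholder
    repo_type="space",
    allow_patterns=["*.py", "*.md", "requirements.txt", ".env.example"],
)
```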
#### 3. Configure Environment Variables (Optional)
**Default Configuration (Recommended for Quick Start):**
No configuration needed: the app automatically uses the HuggingFace Inference API with the built-in `HF_TOKEN`.
**Optional: Use Premium Providers**
For better performance, you can add these environment variables in Space Settings:
**For OpenAI:**
```
LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-key-here
```
**For Anthropic:**
```
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=your-key-here
```
**For Custom HuggingFace Model:**
```
LLM_MODEL=mistralai/Mistral-7B-Instruct-v0.2
# LLM_PROVIDER defaults to huggingface
```
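For reference, here is roughly how a backend like `llm_backend.py` can resolve these variables; the exact names and defaults in the shipped code may differ:
```python
# Sketch of environment-variable resolution; actual code may differ
import os

provider = os.environ.get("LLM_PROVIDER", "huggingface")  # default provider
model = os.environ.get("LLM_MODEL", "mistralai/Mistral-7B-Instruct-v0.2")

if provider == "openai":
    api_key = os.environ["OPENAI_API_KEY"]
elif provider == "anthropic":
    api_key = os.environ["ANTHROPIC_API_KEY"]
else:  # huggingface
    api_key = os.environ.get("HF_TOKEN")  # available as a Space secret
```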
#### 4. Space Will Auto-Deploy
- HuggingFace will automatically build and deploy
- Check the "Logs" tab for build status
- First build may take 2-3 minutes
#### 5. Test Your Deployment
1. Wait for "Running" status
2. Open the Space URL
3. Test survey generation
4. Test translation
5. Test analysis with example data
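You can also script a quick smoke test with the `gradio_client` library; the Space ID is a placeholder, and the available endpoints depend on how `app.py` wires its Gradio components, so inspect them first:
```python
# Scripted smoke test (pip install gradio_client)
from gradio_client import Client

client = Client("your-username/conversai-research-assistant")  # placeholder
client.view_api()  # prints the callable endpoints and their parameters
```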
### Using HuggingFace Inference API
The easiest option for deployment is to use HuggingFace's free Inference API:
**Pros:**
- No separate API key needed (uses `HF_TOKEN` automatically)
- Free tier available
- Easy setup
**Cons:**
- May have rate limits on free tier
- Slower than paid providers
- May queue during high usage
**Configuration:**
None needed: `huggingface` is the default provider. Set `LLM_PROVIDER=huggingface` explicitly only if you want to be unambiguous.
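For reference, a call through the Inference API with `huggingface_hub` looks roughly like this; the model and prompt are illustrative:
```python
# Text generation via the HuggingFace Inference API
from huggingface_hub import InferenceClient

client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.2")
response = client.text_generation(
    "Write three survey questions about remote work.",  # illustrative prompt
    max_new_tokens=256,
)
print(response)
```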
### Using Other Providers
#### OpenAI (Recommended for Production)
**Pros:**
- Fast and reliable
- High quality outputs
- Good API documentation
**Cons:**
- Requires paid API key
- Usage costs
**Cost Estimate:**
- Survey generation: ~$0.01-0.05 per survey
- Translation: ~$0.01-0.03 per language
- Analysis: ~$0.05-0.15 per batch
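For reference, a survey-generation call with the official `openai` client (v1+) looks roughly like this; the model and prompt are illustrative, and `max_tokens` caps the spend per request:
```python
# Chat completion via the official openai client (pip install openai)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": "Write three survey questions about remote work."}],
    max_tokens=512,  # caps cost per request
)
print(response.choices[0].message.content)
```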
#### Anthropic Claude
**Pros:**
- Excellent for nuanced text
- Strong reasoning capabilities
- Good safety features
**Cons:**
- Requires API key
- Usage costs
**Cost Estimate:**
Comparable to OpenAI; exact costs depend on the model tier and token volume.
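The equivalent call with the official `anthropic` client; the model name is illustrative, and the Messages API requires `max_tokens`:
```python
# Message creation via the official anthropic client (pip install anthropic)
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",  # illustrative model choice
    max_tokens=512,  # required by the Messages API
    messages=[{"role": "user", "content": "Write three survey questions about remote work."}],
)
print(response.content[0].text)
```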
## Deploying Locally
### For Development
```bash
# 1. Clone/download repository
git clone <your-repo-url>
cd ConversAI
# 2. Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Set environment variables
export LLM_PROVIDER="openai"
export OPENAI_API_KEY="your-key"
# 5. Run
python app.py
```
Access at `http://localhost:7860`
### For Production (Self-Hosted)
Use Docker for production deployment:
**Create Dockerfile:**
```dockerfile
FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Destination must end with "/" when COPY matches multiple files
COPY *.py ./
COPY *.md ./
ENV GRADIO_SERVER_NAME="0.0.0.0"
ENV GRADIO_SERVER_PORT=7860
EXPOSE 7860
CMD ["python", "app.py"]
```
**Build and run:**
```bash
docker build -t conversai .
docker run -p 7860:7860 \
  -e LLM_PROVIDER=openai \
  -e OPENAI_API_KEY=your-key \
  conversai
```
## Post-Deployment Checklist
- [ ] App loads without errors
- [ ] Can generate a survey
- [ ] Can translate a survey
- [ ] Can analyze sample data
- [ ] Downloads work correctly
- [ ] Error messages are clear
- [ ] All tabs are accessible
- [ ] Mobile view works (if public)
## Monitoring and Maintenance
### Check Usage
Monitor your LLM API usage:
- OpenAI: https://platform.openai.com/usage
- Anthropic: https://console.anthropic.com
- HuggingFace: Monitor rate limits
### Update Dependencies
Regularly update to pick up security fixes, and mirror any version bumps in `requirements.txt` so the Space rebuild matches:
```bash
pip install --upgrade gradio requests pandas
```
### Backup
Regularly backup:
- Generated surveys
- Analysis results
- User feedback
- Configuration
## Troubleshooting Deployment
### Space Build Fails
**Check:**
- `requirements.txt` is valid
- `README.md` has the correct frontmatter (see the example below)
- No syntax errors in Python files
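A Gradio Space expects YAML frontmatter at the top of `README.md`, roughly like this (the title, emoji, colors, and version are examples; `sdk: gradio` and `app_file: app.py` are the critical fields):
```
---
title: ConversAI Research Assistant
emoji: 📊
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---
```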
### Space Runs But Errors
**Check:**
- Environment variables are set
- API keys are valid
- Provider quotas aren't exceeded
### Slow Performance
**Solutions:**
- Upgrade to paid LLM tier
- Use faster models (e.g., GPT-4o-mini)
- Add caching for common requests (see the sketch below)
- Optimize prompts for shorter responses
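A minimal caching sketch: memoize LLM calls on the exact prompt with `functools.lru_cache`. Here `call_llm` is a hypothetical wrapper around whichever provider you configured, and real code should also key on model and provider:
```python
from functools import lru_cache

@lru_cache(maxsize=256)  # bounded so the cache cannot grow indefinitely
def cached_llm_call(prompt: str) -> str:
    # call_llm is a hypothetical wrapper around your configured provider
    return call_llm(prompt)
```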
## Scaling Considerations
### For Heavy Usage
1. **Use faster models**: GPT-4o-mini instead of GPT-4
2. **Implement caching**: Cache common survey patterns
3. **Add rate limiting**: Prevent abuse
4. **Load balancing**: Use multiple API keys
5. **Queue system**: Handle concurrent requests (see the sketch below)
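For rate limiting and queueing, Gradio ships a built-in request queue. A sketch of what the launch code in `app.py` could look like; the concurrency numbers are illustrative, and the parameter names assume Gradio 4.x:
```python
import gradio as gr

with gr.Blocks() as demo:
    ...  # existing UI definition

# At most 2 requests run concurrently; at most 20 wait in line
demo.queue(max_size=20, default_concurrency_limit=2)
demo.launch()
```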
### Cost Optimization
1. **Optimize prompts**: Shorter prompts = lower costs
2. **Batch operations**: Process multiple items together
3. **Use cheaper models**: For simpler tasks
4. **Set token limits**: Prevent runaway costs
5. **Monitor usage**: Set up alerts
## Security Best Practices
1. **Never commit API keys** to version control
2. **Use environment variables** for secrets
3. **Rotate keys regularly**
4. **Set spending limits** with providers
5. **Monitor for unusual activity**
6. **Use private Spaces** for sensitive research
## Support and Resources
- **HuggingFace Docs**: https://huggingface.co/docs/hub/spaces
- **Gradio Docs**: https://gradio.app/docs
- **OpenAI API**: https://platform.openai.com/docs
- **Anthropic API**: https://docs.anthropic.com
---
Need help? Check the USAGE_GUIDE.md or open an issue!