

Deployment Guide

Deploying to HuggingFace Spaces

Prerequisites

  • HuggingFace account
  • API token from your LLM provider (or use HF Inference API)

Step-by-Step Deployment

1. Create a New Space

  1. Go to https://huggingface.co/spaces
  2. Click "Create new Space"
  3. Choose a name (e.g., "conversai-research-assistant")
  4. Select SDK: Gradio
  5. Choose visibility (Public or Private)
  6. Click "Create Space"

2. Upload Files

Upload these files to your Space:

Required Files:

  • app.py - Main application
  • llm_backend.py - LLM interface
  • survey_generator.py - Survey generation
  • survey_translator.py - Translation module
  • data_analyzer.py - Analysis module
  • export_utils.py - Export utilities
  • requirements.txt - Dependencies
  • README.md - Space description

Optional Files:

  • .env.example - Configuration template
  • USAGE_GUIDE.md - User guide
  • test_app.py - Testing script

3. Configure Environment Variables (Optional)

Default Configuration (Recommended for Quick Start):

No configuration needed! The app automatically uses the HuggingFace Inference API with the Space's built-in HF_TOKEN.

Optional: Use Premium Providers

For better performance, you can add these environment variables in Space Settings:

For OpenAI:

LLM_PROVIDER=openai
OPENAI_API_KEY=sk-your-key-here

For Anthropic:

LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=your-key-here

For Custom HuggingFace Model:

LLM_MODEL=mistralai/Mistral-7B-Instruct-v0.2
# LLM_PROVIDER defaults to huggingface
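The provider selection above can be sketched in Python. The variable names match this guide, but the resolver function itself is illustrative, not the actual llm_backend.py code:

```python
import os

def resolve_provider(env=os.environ):
    """Illustrative sketch: pick the provider and its matching API key
    from environment-style settings, defaulting to the free
    HuggingFace Inference API when LLM_PROVIDER is unset."""
    provider = env.get("LLM_PROVIDER", "huggingface").lower()
    key_var = {
        "openai": "OPENAI_API_KEY",
        "anthropic": "ANTHROPIC_API_KEY",
        "huggingface": "HF_TOKEN",
    }.get(provider)
    return provider, env.get(key_var) if key_var else None
```

With no variables set, this resolves to ("huggingface", None), matching the zero-configuration default described above.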

4. Space Will Auto-Deploy

  • HuggingFace will automatically build and deploy
  • Check the "Logs" tab for build status
  • First build may take 2-3 minutes

5. Test Your Deployment

  1. Wait for "Running" status
  2. Open the Space URL
  3. Test survey generation
  4. Test translation
  5. Test analysis with example data

Using HuggingFace Inference API

The easiest option for deployment is to use HuggingFace's free Inference API:

Pros:

  • No API key needed (uses HF_TOKEN automatically)
  • Free tier available
  • Easy setup

Cons:

  • May have rate limits on free tier
  • Slower than paid providers
  • May queue during high usage

Configuration: None required — huggingface is the default provider. You can set LLM_PROVIDER=huggingface explicitly if you prefer.

Using Other Providers

OpenAI (Recommended for Production)

Pros:

  • Fast and reliable
  • High quality outputs
  • Good API documentation

Cons:

  • Requires paid API key
  • Usage costs

Cost Estimate:

  • Survey generation: ~$0.01-0.05 per survey
  • Translation: ~$0.01-0.03 per language
  • Analysis: ~$0.05-0.15 per batch

Anthropic Claude

Pros:

  • Excellent for nuanced text
  • Strong reasoning capabilities
  • Good safety features

Cons:

  • Requires API key
  • Usage costs

Cost Estimate: Similar to OpenAI pricing

Deploying Locally

For Development

# 1. Clone/download repository
git clone <your-repo-url>
cd ConversAI

# 2. Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Set environment variables
export LLM_PROVIDER="openai"
export OPENAI_API_KEY="your-key"

# 5. Run
python app.py

Access at http://localhost:7860

For Production (Self-Hosted)

Use Docker for production deployment:

Create Dockerfile:

FROM python:3.10-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY *.py ./
COPY *.md ./

ENV GRADIO_SERVER_NAME="0.0.0.0"
ENV GRADIO_SERVER_PORT=7860

EXPOSE 7860

CMD ["python", "app.py"]

Build and run:

docker build -t conversai .
docker run -p 7860:7860 \
  -e LLM_PROVIDER=openai \
  -e OPENAI_API_KEY=your-key \
  conversai

Post-Deployment Checklist

  • App loads without errors
  • Can generate a survey
  • Can translate a survey
  • Can analyze sample data
  • Downloads work correctly
  • Error messages are clear
  • All tabs are accessible
  • Mobile view works (if public)

Monitoring and Maintenance

Check Usage

Monitor your LLM API usage through your provider's dashboard (OpenAI, Anthropic, or HuggingFace) and set billing alerts where available.

Update Dependencies

Regularly update to get security fixes:

pip install --upgrade gradio requests pandas

Backup

Regularly backup:

  • Generated surveys
  • Analysis results
  • User feedback
  • Configuration

Troubleshooting Deployment

Space Build Fails

Check:

  • requirements.txt is valid
  • README.md has correct frontmatter
  • No syntax errors in Python files

Space Runs But Errors

Check:

  • Environment variables are set
  • API keys are valid
  • Provider quotas aren't exceeded

Slow Performance

Solutions:

  • Upgrade to paid LLM tier
  • Use faster models (e.g., GPT-4o-mini)
  • Add caching for common requests
  • Optimize prompts for shorter responses
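Caching can be as simple as memoizing identical prompts. A minimal sketch, where expensive_llm_call is a stand-in for the real provider request (not the actual llm_backend.py function):

```python
from functools import lru_cache

calls = {"n": 0}

def expensive_llm_call(prompt: str) -> str:
    # Stand-in for the real provider request; counts invocations
    # so the cache's effect is visible.
    calls["n"] += 1
    return f"response to: {prompt}"

@lru_cache(maxsize=256)
def cached_generate(prompt: str) -> str:
    # Identical prompts are served from memory instead of re-billed.
    return expensive_llm_call(prompt)
```

Repeated calls with the same prompt hit the cache, so only the first one costs API credits.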

Scaling Considerations

For Heavy Usage

  1. Use faster models: GPT-4o-mini instead of GPT-4
  2. Implement caching: Cache common survey patterns
  3. Add rate limiting: Prevent abuse
  4. Load balancing: Use multiple API keys
  5. Queue system: Handle concurrent requests
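For rate limiting (item 3 above), a small sliding-window limiter is often enough before reaching for external infrastructure. This is a sketch, not part of the shipped code:

```python
import time
from collections import deque

class RateLimiter:
    """Allow at most `max_calls` per `period` seconds (sliding window)."""

    def __init__(self, max_calls: int, period: float):
        self.max_calls, self.period = max_calls, period
        self.calls: deque[float] = deque()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] > self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False
```

Checking limiter.allow() before each LLM request rejects bursts that would otherwise exhaust quotas.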

Cost Optimization

  1. Optimize prompts: Shorter prompts = lower costs
  2. Batch operations: Process multiple items together
  3. Use cheaper models: For simpler tasks
  4. Set token limits: Prevent runaway costs
  5. Monitor usage: Set up alerts
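Setting token limits (item 4) can be enforced at the request-building layer, so no single call can exceed a hard budget. A hypothetical sketch — the parameter names mirror common LLM APIs but are not tied to this app's code:

```python
def build_request(prompt: str, max_tokens: int = 512, hard_cap: int = 1024) -> dict:
    # Clamp the completion budget so a misconfigured caller cannot
    # request an arbitrarily expensive generation.
    return {"prompt": prompt, "max_tokens": min(max_tokens, hard_cap)}
```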

Security Best Practices

  1. Never commit API keys to version control
  2. Use environment variables for secrets
  3. Rotate keys regularly
  4. Set spending limits with providers
  5. Monitor for unusual activity
  6. Use private Spaces for sensitive research

Support and Resources


Need help? Check the USAGE_GUIDE.md or open an issue!