
Deploying the Backend to Hugging Face Spaces with Docker

This guide provides step-by-step instructions for deploying your RAG Chatbot backend to Hugging Face Spaces using Docker.

Prerequisites

  • Hugging Face account
  • Access to a Qdrant vector database (hosted or self-hosted)
  • API keys for your LLM provider (e.g., OpenAI)
  • Git and Docker installed on your local machine

Step 1: Prepare Your Hugging Face Repository

  1. Go to Hugging Face Spaces
  2. Click "Create New Space"
  3. Fill in the details:
    • Name: Choose a name for your space (e.g., your-username/physical-ai-chatbot-backend)
    • License: Select an appropriate license
    • SDK: Choose "Docker"
    • Hardware: Select the hardware specifications based on your needs (CPU is usually sufficient for a demo, GPU if you're running inference locally)
    • Spot Instance: Optional, for cost savings (interruptible hardware)
  4. Click "Create Space"

Step 2: Clone Your Space Repository

  1. Copy the git clone URL for your new space
  2. Clone the repository to your local machine:
git clone [your-space-git-url]
cd [your-space-directory]

Step 3: Add Backend Files to Your Space Repository

In your cloned space directory, copy all the files from the backend directory of your project:

# Copy backend files to the space directory
cp -r /path/to/your/HackthoneI_Book_ChatBot/backend/* ./

Your space directory should now have the following key files:

  • Dockerfile - Defines the Docker image
  • api.py - Main Flask application
  • requirements.txt - Python dependencies
  • app.py - Hugging Face Spaces entry point
  • Other backend files
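
The Dockerfile is the piece Hugging Face actually builds. A minimal sketch for a Python backend might look like the following — this is a hypothetical example, not the project's actual Dockerfile; note that Docker Spaces route traffic to port 7860 by default, so the app must listen there:

```dockerfile
# Hypothetical Dockerfile sketch for a Python backend on Hugging Face Spaces
FROM python:3.11-slim

WORKDIR /app

# Copy and install dependencies first so Docker can cache this layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Spaces expects the container to serve on port 7860 unless app_port is overridden
EXPOSE 7860
CMD ["python", "app.py"]
```

Your actual Dockerfile may differ; the key points are installing from requirements.txt and serving on the port Spaces expects.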

Step 4: Configure Environment Variables

Hugging Face Spaces lets you store sensitive values as secrets, which are exposed to your container as ordinary environment variables:

  1. In your Space on Hugging Face, open the "Settings" tab
  2. Find the secrets section
  3. Add the following environment variables as secrets:
    • QDRANT_URL - Your Qdrant database URL
    • QDRANT_API_KEY - Your Qdrant API key
    • OPENAI_API_KEY - Your OpenAI API key (or your chosen LLM provider's key)
    • OPENAI_BASE_URL - Base URL for OpenAI API (if applicable, otherwise leave empty)
    • EMBEDDING_MODEL - The embedding model you're using (default: text-embedding-ada-002)

⚠️ Important Security Note: Never commit actual API keys to the repository. Always use Hugging Face Secrets for sensitive information.
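
Inside the container, these secrets arrive as plain environment variables. A hypothetical sketch of how api.py might read them (the function name and dictionary keys here are illustrative, not the project's actual code):

```python
import os

# Hypothetical sketch: read the secrets configured above.
# Hugging Face injects Space secrets as environment variables at runtime.
def load_settings(env=os.environ):
    return {
        "qdrant_url": env["QDRANT_URL"],            # required: KeyError if unset
        "qdrant_api_key": env["QDRANT_API_KEY"],    # required
        "openai_api_key": env["OPENAI_API_KEY"],    # required
        "openai_base_url": env.get("OPENAI_BASE_URL") or None,  # optional
        "embedding_model": env.get("EMBEDDING_MODEL", "text-embedding-ada-002"),
    }
```

Using `env[...]` for required values makes a misconfigured Space fail immediately at startup rather than later mid-request.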

Step 5: Commit and Push Your Changes

git add .
git commit -m "Add backend files for RAG Chatbot"
git push origin main

Step 6: Monitor the Build Process

  1. After pushing, Hugging Face will automatically build and deploy your Docker container
  2. You can monitor the build progress in the "Build logs" tab of your Space
  3. The first build may take 5-15 minutes depending on the size of your dependencies
  4. Once the build is complete, your backend will be accessible at https://[your-username]-[space-name].hf.space

Step 7: Test Your Deployment

  1. Once deployed, verify that your backend is running by checking:
    • Health endpoint: https://[your-username]-[space-name].hf.space/health
    • If you see {"status": "healthy"}, your backend is running correctly
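
If you want to automate this check, a small client-side sketch that parses the /health response body and validates the expected status:

```python
import json

# Sketch of a client-side health check: parse the /health response body
# and confirm the backend reports the expected status.
def is_healthy(response_body: str) -> bool:
    try:
        return json.loads(response_body).get("status") == "healthy"
    except (json.JSONDecodeError, AttributeError):
        return False
```

Fetch the body with any HTTP client (curl, requests, urllib) and pass it to this function.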

Step 8: Connect Frontend (Optional)

If you have a frontend application, update it to use the Hugging Face Space URL as the backend endpoint instead of localhost.
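
One common way to avoid hard-coding the endpoint is to resolve it from configuration with a localhost fallback for development. The variable name `BACKEND_URL` and the port 5000 fallback below are hypothetical choices, not part of the original project:

```python
import os

# Sketch: resolve the backend base URL from an environment variable,
# falling back to localhost for local development.
def backend_base_url(env=os.environ):
    # rstrip("/") so callers can safely append paths like "/health"
    return env.get("BACKEND_URL", "http://localhost:5000").rstrip("/")
```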

Troubleshooting

Common Issues and Solutions

  1. Build Fails:

    • Check the build logs for specific error messages
    • Ensure all dependencies in requirements.txt are valid
    • Make sure your Dockerfile is properly formatted
  2. Container Crashes:

    • Check the runtime logs in the "Logs" tab
    • Verify all required environment variables are set in Secrets
    • Check that your Qdrant instance is accessible
  3. API Keys Not Working:

    • Verify that secrets are correctly set in the Hugging Face Space settings
    • Ensure environment variables are referenced correctly in your application code
  4. High Resource Usage:

    • Consider upgrading hardware for your Space if needed
    • Optimize embedding model size if possible
    • Implement caching to reduce API calls

Optimizing for Production

For production use, consider these improvements:

  1. Add Authentication: Implement authentication for your API endpoints
  2. Rate Limiting: Add rate limiting to prevent abuse
  3. Caching: Implement caching for frequent queries
  4. Health Monitoring: Add more comprehensive health checks
  5. Logging: Implement structured logging for easier debugging
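
As an illustration of point 2, rate limiting can be sketched with an in-memory sliding window: allow at most `limit` requests per client within `window` seconds. This is a single-process sketch only; a real deployment with multiple workers would need a shared store such as Redis:

```python
import time
from collections import defaultdict, deque

# Minimal sliding-window rate limiter sketch (in-memory, single process).
class RateLimiter:
    def __init__(self, limit=10, window=60.0):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # client_id -> timestamps of recent requests

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[client_id]
        # Drop timestamps that have aged out of the window
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # over the limit: reject (e.g. respond with HTTP 429)
        q.append(now)
        return True
```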

Updating Your Deployment

To update your backend:

  1. Make changes to your local files
  2. Commit and push the changes:
    git add .
    git commit -m "Describe your changes"
    git push origin main
    
  3. Hugging Face will automatically rebuild and redeploy your container

Cost Considerations

  • Free tier: Limited compute hours per month
  • Paid tier: For more intensive usage or guaranteed uptime
  • Compute costs vary based on hardware selection

Security Best Practices

  1. Never commit API keys or other secrets to the repository
  2. Use Hugging Face Secrets for all sensitive information
  3. Implement proper authentication for your endpoints in production
  4. Regularly rotate your API keys
  5. Monitor API usage to detect potential abuse
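
Practices 1 and 2 pair well with a startup check that every required secret is actually present, so a misconfigured Space fails immediately with a clear error instead of crashing mid-request. A sketch, assuming the variable names from Step 4:

```python
import os

# Fail-fast startup check: verify all required secrets are set.
REQUIRED = ("QDRANT_URL", "QDRANT_API_KEY", "OPENAI_API_KEY")

def check_secrets(env=os.environ, required=REQUIRED):
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required secrets: {', '.join(missing)}")
```

Call this once at application startup, before serving any requests.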