
HuggingFace Spaces Deployment Guide

Overview

This guide walks you through deploying the HR Report Generator API on HuggingFace Spaces using Docker.


Prerequisites

  1. HuggingFace Account: Create a free account at huggingface.co
  2. OpenRouter API Key: Get your key from openrouter.ai

Step-by-Step Deployment

Step 1: Create a New Space

  1. Go to huggingface.co/new-space
  2. Fill in the details:
    • Space name: hr-report-api (or your preferred name)
    • License: Apache 2.0 (or your preference)
    • SDK: Select Docker
    • Visibility: Private (recommended for HR data)
  3. Click Create Space

Step 2: Upload Files

Upload all files from this folder to your Space. The structure should be:

your-space/
├── api.py
├── Dockerfile
├── requirements.txt
├── endpoints.txt
├── README.md
└── src/
    ├── __init__.py
    ├── config.py
    ├── rag/
    │   ├── __init__.py
    │   ├── synthesizer.py
    │   ├── retriever.py
    │   └── prompts.py
    ├── knowledge/
    │   ├── __init__.py
    │   ├── vector_store.py
    │   └── embeddings.py
    └── document_processor/
        ├── __init__.py
        └── chunker.py

You can upload via:

  • Web UI: Drag and drop files
  • Git: Clone the repo and push
git clone https://huggingface.co/spaces/YOUR_USERNAME/hr-report-api
cd hr-report-api
# Copy all files from this folder
git add .
git commit -m "Initial deployment"
git push

Step 3: Configure Secrets

Go to Settings → Secrets in your Space and add:

| Secret Name | Value | Description |
|---|---|---|
| OPENROUTER_API_KEY | sk-or-... | Your OpenRouter API key |
| ALLOWED_ORIGINS | https://checkin.hillsideprimarycare.com,https://hsmg.netlify.app | Comma-separated allowed origins |
| LLM_MODEL | google/gemma-2-9b-it:free | (Optional) Override model from endpoints.txt |
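
Spaces exposes each secret to the running container as an environment variable, so the API can read them at startup. The following is a minimal sketch of that step, not the project's actual src/config.py; the `Settings` class and `read_settings` helper are hypothetical names used only for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    api_key: str
    allowed_origins: list[str]
    model_override: Optional[str]


def read_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the three secrets listed above (hypothetical helper)."""
    api_key = env.get("OPENROUTER_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    # ALLOWED_ORIGINS is comma-separated; drop blanks and surrounding spaces.
    origins = [o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()]
    # LLM_MODEL is optional; an empty value means "no override".
    model = env.get("LLM_MODEL", "").strip() or None
    return Settings(api_key, origins, model)
```

Failing fast on a missing key is a design choice of this sketch: it makes a misconfigured Space show up in the startup logs instead of on the first request.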

Step 4: Upload FAISS Index (Optional)

If you have a pre-built FAISS index with HR policies:

  1. Create a data/embeddings/ folder in your Space
  2. Upload:
    • faiss_index.faiss - The FAISS index file
    • faiss_index.chunks.json - The chunks metadata

Without these files, the API still runs, but it will report "insufficient documentation."
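
Before pushing, it can help to confirm that both index files are where the API is expected to look. The sketch below assumes the folder and file names from the steps above; the `missing_index_files` helper is hypothetical and is not part of the project.

```python
from pathlib import Path

# Folder and file names come from the steps above; the helper itself is hypothetical.
INDEX_DIR = Path("data/embeddings")
REQUIRED_FILES = ("faiss_index.faiss", "faiss_index.chunks.json")


def missing_index_files(index_dir: Path = INDEX_DIR) -> list[str]:
    """Return the names of required index files that are absent from index_dir."""
    return [name for name in REQUIRED_FILES if not (index_dir / name).is_file()]
```

An empty list means both files are present; any names returned are the ones still to upload.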

Step 5: Verify Deployment

  1. Wait for the build to complete (1-3 minutes)
  2. Your API will be available at:
    https://YOUR_USERNAME-hr-report-api.hf.space
    
  3. Check health: https://YOUR_USERNAME-hr-report-api.hf.space/api/health
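
The health check can also be scripted. This is a minimal sketch using only the Python standard library; `health_url` and `is_healthy` are hypothetical helpers, and the check looks only at the HTTP status because the response body format is not documented here. If the Space is private, requests may also need a HuggingFace access token, which this sketch does not handle.

```python
import urllib.request


def health_url(username: str, space_name: str = "hr-report-api") -> str:
    """Build the health-check URL using the pattern shown in the steps above."""
    return f"https://{username}-{space_name}.hf.space/api/health"


def is_healthy(url: str, timeout: float = 10.0) -> bool:
    """Return True if the endpoint answers with HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False
```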

API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| / | GET | API info and status |
| /api/health | GET | Health check |
| /api/generate | POST | Generate HR document |
| /api/status | GET | Knowledge base status |
| /api/config | GET | Public configuration |

Generate Document Example

fetch('https://YOUR-SPACE.hf.space/api/generate', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
        doc_type: 'Memorandum',
        employee_name: 'John Smith',
        date_from: '2026-02-01',
        date_to: '2026-02-01',
        reason: 'Tardiness',
        additional_notes: 'Employee arrived 30 minutes late.'
    })
})
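
For scripts or server-side callers, the same request can be made from Python. The sketch below uses only the standard library; `build_generate_request` and `send` are hypothetical helpers, and the response is returned as raw text because its format is not documented here. Sending an explicit `Origin` header is an assumption based on the Origin Validation section: browsers set that header automatically, but scripts do not.

```python
import json
import urllib.request


def build_generate_request(base_url: str, payload: dict, origin: str) -> urllib.request.Request:
    """Build a POST request for /api/generate (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the Origin value must be one of the ALLOWED_ORIGINS entries.
        headers={"Content-Type": "application/json", "Origin": origin},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Same fields as the JavaScript example above.
payload = {
    "doc_type": "Memorandum",
    "employee_name": "John Smith",
    "date_from": "2026-02-01",
    "date_to": "2026-02-01",
    "reason": "Tardiness",
    "additional_notes": "Employee arrived 30 minutes late.",
}
```

To send it, call `send(build_generate_request("https://YOUR-SPACE.hf.space", payload, "http://localhost:3000"))`. The long default timeout leaves room for a cold start.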

Updating the LLM Model

  1. Edit endpoints.txt in your Space
  2. Uncomment the model you want to use
  3. The first uncommented line will be used
# Free Models:
google/gemma-2-9b-it:free
# meta-llama/llama-3.2-3b-instruct:free

# Paid Models:
# openai/gpt-4o
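
The selection rule above (first uncommented line wins, with the optional LLM_MODEL secret taking precedence) can be sketched as follows. This illustrates the rule as described and is not the project's actual parser; `pick_model` is a hypothetical name.

```python
from typing import Optional


def pick_model(endpoints_text: str, override: Optional[str] = None) -> Optional[str]:
    """Return the override if given, else the first non-blank, uncommented line."""
    if override:
        return override
    for line in endpoints_text.splitlines():
        candidate = line.strip()
        if candidate and not candidate.startswith("#"):
            return candidate
    return None  # every line is blank or commented out


sample = """# Free Models:
google/gemma-2-9b-it:free
# meta-llama/llama-3.2-3b-instruct:free

# Paid Models:
# openai/gpt-4o
"""
print(pick_model(sample))  # prints google/gemma-2-9b-it:free
```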

Origin Validation

The API validates the Origin header against ALLOWED_ORIGINS. Only requests from these domains are allowed:

  • https://checkin.hillsideprimarycare.com
  • https://hsmg.netlify.app
  • http://localhost:3000 (for development)
  • http://localhost:5500

To add more origins, update the ALLOWED_ORIGINS secret (comma-separated).
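
An exact-match origin check of this kind might look like the sketch below. The project's real validation may differ (for example in how it treats requests with no Origin header); `is_origin_allowed` is a hypothetical helper, and rejecting a missing header is an assumption.

```python
from typing import Iterable, Optional


def _normalize(origin: str) -> str:
    # Scheme and host are case-insensitive; tolerate a stray trailing slash.
    return origin.strip().rstrip("/").lower()


def is_origin_allowed(origin: Optional[str], allowed: Iterable[str]) -> bool:
    """Exact-match check of an Origin header value against the allow-list."""
    if not origin:
        return False  # assumption: requests without an Origin header are rejected
    return _normalize(origin) in {_normalize(a) for a in allowed}


allowed = [
    "https://checkin.hillsideprimarycare.com",
    "https://hsmg.netlify.app",
    "http://localhost:3000",
    "http://localhost:5500",
]
```

Because the match covers scheme, host, and port, `https://localhost:3000` and `http://localhost:3001` would both be rejected against this list.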


Troubleshooting

Build Fails

  • Check Dockerfile syntax
  • Ensure all files are uploaded
  • Check the build logs for errors

CORS Errors

  • Verify ALLOWED_ORIGINS includes your frontend domain
  • Make sure each origin includes the https:// prefix

API Returns 500

  • Check if OPENROUTER_API_KEY is set correctly
  • Verify the model in endpoints.txt is available
  • Check Space logs for detailed errors

Slow Response

  • First request may be slow due to model loading (~30s)
  • Subsequent requests should be faster
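
A client can absorb the slow first request by polling until the API responds before sending real work. This is a minimal sketch; `wait_until_ready` is a hypothetical helper, and the default of six attempts ten seconds apart is an assumption chosen to cover a wait of roughly a minute.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 6,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() until it returns True, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:  # no pause after the final attempt
            sleep(delay)
    return False
```

Here `probe` would be any zero-argument function that returns True once /api/health answers successfully. The `sleep` parameter exists so the loop can be tested without real waiting.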

Cost

| Component | Cost |
|---|---|
| HuggingFace Space | Free (with cold starts) |
| OpenRouter (free models) | Free |
| Total | $0/month |

Note: The free tier has 30-60 second cold starts when the Space wakes up after sleeping due to inactivity.


Next Steps

  1. ✅ Deploy to HuggingFace Spaces
  2. ✅ Configure secrets
  3. ⏳ Deploy frontend to Netlify (see netlify/DEPLOY.md)
  4. ⏳ Test end-to-end integration