HuggingFace Spaces Deployment Guide
Overview
This guide walks you through deploying the HR Report Generator API on HuggingFace Spaces using Docker.
Prerequisites
- HuggingFace Account: Create a free account at huggingface.co
- OpenRouter API Key: Get your key from openrouter.ai
Step-by-Step Deployment
Step 1: Create a New Space
- Go to huggingface.co/new-space
- Fill in the details:
  - Space name: hr-report-api (or your preferred name)
  - License: Apache 2.0 (or your preference)
  - SDK: Select Docker
  - Visibility: Private (recommended for HR data)
- Click Create Space
Step 2: Upload Files
Upload all files from this folder to your Space. The structure should be:
your-space/
├── api.py
├── Dockerfile
├── requirements.txt
├── endpoints.txt
├── README.md
└── src/
    ├── __init__.py
    ├── config.py
    ├── rag/
    │   ├── __init__.py
    │   ├── synthesizer.py
    │   ├── retriever.py
    │   └── prompts.py
    ├── knowledge/
    │   ├── __init__.py
    │   ├── vector_store.py
    │   └── embeddings.py
    └── document_processor/
        ├── __init__.py
        └── chunker.py
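Before uploading, it can help to confirm that nothing from the layout above is missing locally. The following is a minimal sketch, not part of the project: `EXPECTED_FILES` and `missing_files` are hypothetical names, and the paths assume the nesting under src/ shown in the tree.

```python
from pathlib import Path

# Paths copied from the layout in this guide; adjust if your folder differs.
EXPECTED_FILES = [
    "api.py",
    "Dockerfile",
    "requirements.txt",
    "endpoints.txt",
    "README.md",
    "src/__init__.py",
    "src/config.py",
    "src/rag/__init__.py",
    "src/rag/synthesizer.py",
    "src/rag/retriever.py",
    "src/rag/prompts.py",
    "src/knowledge/__init__.py",
    "src/knowledge/vector_store.py",
    "src/knowledge/embeddings.py",
    "src/document_processor/__init__.py",
    "src/document_processor/chunker.py",
]


def missing_files(root):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling `missing_files(".")` from the folder you are about to upload returns an empty list when every file is in place.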
You can upload via:
- Web UI: Drag and drop files
- Git: Clone the repo and push
git clone https://huggingface.co/spaces/YOUR_USERNAME/hr-report-api
cd hr-report-api
# Copy all files from this folder
git add .
git commit -m "Initial deployment"
git push
Step 3: Configure Secrets
Go to Settings → Secrets in your Space and add:
| Secret Name | Value | Description |
|---|---|---|
| OPENROUTER_API_KEY | sk-or-... | Your OpenRouter API key |
| ALLOWED_ORIGINS | https://checkin.hillsideprimarycare.com,https://hsmg.netlify.app | Comma-separated allowed origins |
| LLM_MODEL | google/gemma-2-9b-it:free | (Optional) Override model from endpoints.txt |
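Spaces expose secrets to the running container as environment variables. The sketch below shows one way the three values in the table might be read at startup. `load_settings` is a hypothetical helper for illustration, not necessarily how src/config.py does it.

```python
import os


def load_settings(env=None):
    """Read the secrets from the environment (illustrative helper)."""
    env = os.environ if env is None else env

    api_key = env.get("OPENROUTER_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")

    # ALLOWED_ORIGINS is a comma-separated string; drop empty entries.
    raw_origins = env.get("ALLOWED_ORIGINS", "")
    origins = [o.strip() for o in raw_origins.split(",") if o.strip()]

    # LLM_MODEL is optional; None means "fall back to endpoints.txt".
    model = env.get("LLM_MODEL", "").strip() or None

    return {"api_key": api_key, "allowed_origins": origins, "llm_model": model}
```

Failing fast on a missing key turns a later 500 from the LLM call into a clear startup error in the Space logs.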
Step 4: Upload FAISS Index (Optional)
If you have a pre-built FAISS index with HR policies:
- Create a data/embeddings/ folder in your Space
- Upload:
  - faiss_index.faiss - The FAISS index file
  - faiss_index.chunks.json - The chunks metadata
Without the index, the API still works but reports "insufficient documentation."
Step 5: Verify Deployment
- Wait for the build to complete (1-3 minutes)
- Your API will be available at: https://YOUR_USERNAME-hr-report-api.hf.space
- Check health: https://YOUR_USERNAME-hr-report-api.hf.space/api/health
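The health check can also be scripted. This is a rough sketch using only the standard library; `space_base_url` and `check_health` are hypothetical names, and the URL pattern is simply the one shown in this guide.

```python
import urllib.request


def space_base_url(username, space_name="hr-report-api"):
    """Build the Space URL following the pattern used in this guide."""
    return f"https://{username}-{space_name}.hf.space"


def check_health(base_url, timeout=60):
    """Return the HTTP status code of GET /api/health.

    urlopen raises urllib.error.HTTPError or URLError on failure,
    so a returned value means the request itself succeeded.
    """
    url = f"{base_url}/api/health"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status
```

For example, `check_health(space_base_url("YOUR_USERNAME"))` should return 200 once the build has finished.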
API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| / | GET | API info and status |
| /api/health | GET | Health check |
| /api/generate | POST | Generate HR document |
| /api/status | GET | Knowledge base status |
| /api/config | GET | Public configuration |
Generate Document Example
fetch('https://YOUR-SPACE.hf.space/api/generate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    doc_type: 'Memorandum',
    employee_name: 'John Smith',
    date_from: '2026-02-01',
    date_to: '2026-02-01',
    reason: 'Tardiness',
    additional_notes: 'Employee arrived 30 minutes late.'
  })
})
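For scripts or backend tests, the same request can be built in Python. This is a sketch under assumptions: `build_generate_request` is a hypothetical helper, and the explicit Origin header is included only because the API validates it; a browser sets that header automatically.

```python
import json
import urllib.request


def build_generate_request(base_url, payload, origin=None):
    """Build (but do not send) a POST request to /api/generate."""
    headers = {"Content-Type": "application/json"}
    if origin is not None:
        # Outside a browser the Origin header must be set by hand.
        headers["Origin"] = origin
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


payload = {
    "doc_type": "Memorandum",
    "employee_name": "John Smith",
    "date_from": "2026-02-01",
    "date_to": "2026-02-01",
    "reason": "Tardiness",
    "additional_notes": "Employee arrived 30 minutes late.",
}
request = build_generate_request(
    "https://YOUR-SPACE.hf.space", payload, origin="http://localhost:3000"
)
# To send it: urllib.request.urlopen(request, timeout=120)
```

Separating request construction from sending keeps the example inspectable without touching the network.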
Updating the LLM Model
- Edit endpoints.txt in your Space
- Uncomment the model you want to use
- The first uncommented line will be used
# Free Models:
google/gemma-2-9b-it:free
# meta-llama/llama-3.2-3b-instruct:free
# Paid Models:
# openai/gpt-4o
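The "first uncommented line wins" rule can be sketched as follows. `first_active_model` is a hypothetical function written to match the rule described here; the project's actual parser may differ in details such as inline comments.

```python
def first_active_model(text):
    """Return the first non-empty line that does not start with '#', or None."""
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            return line
    return None


sample = """\
# Free Models:
google/gemma-2-9b-it:free
# meta-llama/llama-3.2-3b-instruct:free
# Paid Models:
# openai/gpt-4o
"""
print(first_active_model(sample))  # google/gemma-2-9b-it:free
```

If two models are left uncommented, only the one that appears first in the file is picked up.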
Origin Validation
The API validates the Origin header against ALLOWED_ORIGINS. Only requests from these domains are allowed:
- https://checkin.hillsideprimarycare.com
- https://hsmg.netlify.app
- http://localhost:3000 (for development)
- http://localhost:5500
To add more origins, update the ALLOWED_ORIGINS secret (comma-separated).
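As an illustration of what such a check involves, here is a minimal exact-match sketch. The function names are hypothetical, and the behaviour is an assumption rather than the API's actual code: it rejects requests with no Origin header and does not support wildcards.

```python
def parse_allowed_origins(raw):
    """Split a comma-separated ALLOWED_ORIGINS value into a normalized set."""
    return {o.strip().rstrip("/").lower() for o in raw.split(",") if o.strip()}


def is_origin_allowed(origin, allowed):
    """Return True only if the Origin header exactly matches an allowed entry."""
    if not origin:
        return False  # assumption: a missing Origin header is rejected
    return origin.strip().rstrip("/").lower() in allowed


allowed = parse_allowed_origins(
    "https://checkin.hillsideprimarycare.com,https://hsmg.netlify.app"
)
print(is_origin_allowed("https://hsmg.netlify.app", allowed))  # True
print(is_origin_allowed("http://hsmg.netlify.app", allowed))   # False
```

The second call fails because the scheme is part of the origin, which is why a missing https:// prefix causes CORS errors.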
Troubleshooting
Build Fails
- Check Dockerfile syntax
- Ensure all files are uploaded
- Check the build logs for errors
CORS Errors
- Verify ALLOWED_ORIGINS includes your frontend domain
- Make sure the domain has the https:// prefix
API Returns 500
- Check if OPENROUTER_API_KEY is set correctly
- Verify the model in endpoints.txt is available
- Check Space logs for detailed errors
Slow Response
- First request may be slow due to model loading (~30s)
- Subsequent requests should be faster
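A client can absorb the slow first request by retrying until the Space responds. The sketch below is one possible approach; `wait_until_ready` and its default attempt count and delay are illustrative choices, not values taken from the project.

```python
import time


def wait_until_ready(check, attempts=6, delay=10.0, sleep=time.sleep):
    """Call `check()` until it returns True or the attempts run out.

    `check` is any zero-argument callable. OSError (which covers
    urllib's URLError and socket timeouts) counts as "not ready yet".
    Returns the attempt number that succeeded, or None on failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            if check():
                return attempt
        except OSError:
            pass  # connection errors while the Space wakes up
        if attempt < attempts:
            sleep(delay)
    return None
```

A typical `check` would request /api/health and return whether the status is 200. Once `wait_until_ready` returns a number, later requests should hit the warm container.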
Cost
| Component | Cost |
|---|---|
| HuggingFace Space | Free (with cold starts) |
| OpenRouter (free models) | Free |
| Total | $0/month |
Note: Free tier has 30-60 second cold starts when the Space sleeps after inactivity.
Next Steps
- ✅ Deploy to HuggingFace Spaces
- ✅ Configure secrets
- ⏳ Deploy frontend to Netlify (see netlify/DEPLOY.md)
- ⏳ Test end-to-end integration