🚀 DocGenie Deployment Guide

Complete guide for deploying DocGenie API + Handwriting Service to production with all interdependencies resolved.

📊 System Architecture

┌─────────────────────────────────────────────────────────────┐
│                         Client                               │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│                    Railway (CPU)                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  DocGenie API (Port 8000)                           │   │
│  │  - FastAPI server                                     │   │
│  │  - Imports: docgenie.generation.*                     │   │
│  │  - Endpoints: /generate, /generate/pdf, /generate/async│  │
│  └──────────────┬───────────────────────────────────────┘   │
│                 │                                            │
│  ┌──────────────▼───────────────────────────────────────┐   │
│  │  Background Worker                                    │   │
│  │  - RQ worker (Redis Queue)                           │   │
│  │  - ClaudeBatchedClient (50% cost savings)            │   │
│  │  - Imports: docgenie.generation.*                     │   │
│  └──────────────┬───────────────────────────────────────┘   │
└─────────────────┼────────────────────────────────────────────┘
                  │
        ┌─────────┴──────────┬──────────────┐
        │                    │              │
        ▼                    ▼              ▼
┌───────────────┐  ┌──────────────────┐  ┌──────────────┐
│ Redis (Upstash)│  │ Supabase         │  │ Google Drive │
│ - Job queue    │  │ - PostgreSQL     │  │ - File storage│
│ - Free tier    │  │ - Document DB    │  │ - OAuth 2.0  │
└───────────────┘  └──────────────────┘  └──────────────┘
        │
        ▼
┌─────────────────────────────────────────────────────────────┐
│             RunPod Serverless (GPU)                          │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Handwriting Service (Port 8080)                     │   │
│  │  - WordStylist diffusion model                        │   │
│  │  - PyTorch + CUDA 11.8                                │   │
│  │  - NO docgenie imports (standalone)                   │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘

🔗 Dependency Resolution

✅ Problem: API imports from docgenie package

Solution: Deploy entire monorepo, install as package with pip install -e .

API Service imports:

# api/worker.py
from docgenie.generation.pipeline_01.claude_batching import ClaudeBatchedClient
from docgenie import ENV

# api/utils.py
from docgenie.generation.constants import BS_PARSER, HANDWRITING_CLASS_NAME
from docgenie.generation.pipeline_01.claude_batching import create_message
from docgenie.generation.pipeline_03_process_response import process_response
from docgenie.generation.pipeline_04_render_pdf_and_extract_geos import render_pdf

Dockerfile solution:

# Copy entire monorepo
COPY . .

# Install as editable package
RUN pip install -e .

# Install API requirements
RUN pip install -r api/requirements.txt

✅ Handwriting Service is Independent

No docgenie imports! Can be deployed standalone.

# handwriting_service/main.py - NO docgenie imports
from handwriting_service.inference import HandwritingGenerator
from handwriting_service.models import HandwritingRequest
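The independence claim can be checked mechanically before building the RunPod image. The following is a minimal sketch, not part of the repository: `find_forbidden_imports` is a hypothetical helper that parses a module with Python's `ast` and reports any import of `docgenie`.

```python
import ast


def find_forbidden_imports(source: str, forbidden: str = "docgenie") -> list:
    """Return imported module names equal to `forbidden` or one of its submodules."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        hits += [n for n in names if n == forbidden or n.startswith(forbidden + ".")]
    return hits


# The two import lines shown above for handwriting_service/main.py
standalone = (
    "from handwriting_service.inference import HandwritingGenerator\n"
    "from handwriting_service.models import HandwritingRequest\n"
)
print(find_forbidden_imports(standalone))  # []
```

Running the same function over every `.py` file in `handwriting_service/` would flag an accidental `docgenie` import before it breaks the standalone image.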

📦 Pre-Deployment Checklist

1. Environment Variables

Create api/.env with all required variables:

# Claude API
ANTHROPIC_API_KEY=sk-ant-xxxxx

# Redis (will be replaced with Upstash URL)
REDIS_URL=redis://localhost:6379

# Handwriting Service
HANDWRITING_SERVICE_URL=http://localhost:8080

# Supabase
SUPABASE_URL=https://xxxxx.supabase.co
SUPABASE_KEY=eyJxxxxx

# Google Drive (for token refresh only)
# The frontend handles OAuth and sends tokens in API requests
# These credentials are only needed to refresh expired tokens during long jobs
GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx
GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents
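A missing variable usually surfaces only when the first job runs, so it can help to check the environment at startup. This is a small sketch under stated assumptions: the variable names come from the listing above, while `missing_env_vars` is a hypothetical helper and `GOOGLE_DRIVE_FOLDER_NAME` is treated as optional.

```python
import os

# Names taken from the api/.env listing above.
REQUIRED_VARS = (
    "ANTHROPIC_API_KEY",
    "REDIS_URL",
    "HANDWRITING_SERVICE_URL",
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_SECRET",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required names that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not str(env.get(name, "")).strip()]


# Example with an explicit mapping instead of the real environment.
example_env = {"ANTHROPIC_API_KEY": "sk-ant-xxxxx", "REDIS_URL": "  "}
print(missing_env_vars(example_env))
```

Calling `missing_env_vars()` with no arguments checks the real process environment; the API or worker could exit early if the returned list is not empty.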

2. Test Locally First

# Terminal 1: Start Redis
docker run -p 6379:6379 redis:7-alpine

# Terminal 2: Start Handwriting Service
cd handwriting_service
DEVICE=cpu uvicorn main:app --port 8080

# Terminal 3: Start API
cd api
source ../.venv/bin/activate
uvicorn main:app --reload --port 8000

# Terminal 4: Start Worker
cd api
source ../.venv/bin/activate
python worker.py

Test endpoints:

# Health check
curl http://localhost:8000/health

# Async generation (uses batched API)
curl -X POST http://localhost:8000/generate/async \
  -H "Content-Type: application/json" \
  -d '{"template_name": "DocGenie", "num_pages": 2}'

🚢 Deployment Steps

Option A: Railway + RunPod (RECOMMENDED - $10/month)

Step 1: Deploy Redis to Upstash (FREE)

Go to https://upstash.com
Create account → New Redis Database
Copy the Redis connection URL, not the UPSTASH_REDIS_REST_URL (that one is an https:// REST endpoint). It looks like: rediss://default:xxxxx@xxxxx.upstash.io:6379
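The value copied here ends up in `REDIS_URL`, which the RQ worker expects to be a `redis://` or `rediss://` connection string rather than an `https://` REST endpoint. A quick format check, sketched with only the standard library (`describe_redis_url` is a hypothetical helper name):

```python
from urllib.parse import urlparse


def describe_redis_url(url: str) -> dict:
    """Parse a Redis connection URL and report host, port, and whether TLS is used."""
    parts = urlparse(url)
    if parts.scheme not in ("redis", "rediss"):
        raise ValueError(
            f"expected a redis:// or rediss:// URL, got scheme {parts.scheme!r}"
        )
    return {
        "host": parts.hostname,
        "port": parts.port or 6379,
        "tls": parts.scheme == "rediss",
        "has_password": parts.password is not None,
    }


info = describe_redis_url("rediss://default:xxxxx@xxxxx.upstash.io:6379")
print(info["host"], info["port"], info["tls"])  # xxxxx.upstash.io 6379 True
```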

Step 2: Deploy Handwriting Service to RunPod

Option A: Build from Git Repository (RECOMMENDED - No Docker Hub needed!)

This builds the image directly on RunPod's servers, so you don't have to upload a ~10GB image over your own connection.

Prepare and push code to Git:

# Repository root (adjust the path to your own checkout)
cd /media/ahad-hassan/Volume_E/FYP/FYP/docgenie

# First, prepare optimized WordStylist (removes 432MB of unnecessary files)
cd handwriting_service
./prepare_build.sh
cd ..

# Now commit the optimized WordStylist
git add handwriting_service/
git status  # Verify WordStylist is included (should show WordStylist/models/ema_ckpt.pt, etc.)
git commit -m "Add handwriting service with optimized WordStylist"
git push origin main

Deploy to RunPod:
- Go to https://runpod.io → Serverless → New Endpoint
- Click "Build from Git" (not Docker Image)
- Settings:
  - Name: docgenie-handwriting
  - Git URL: https://github.com/Ahadhassan-2003/FYP.git
  - Git Branch: main
  - Docker Build Context: docgenie/handwriting_service
  - Dockerfile Path: Dockerfile
  - GPU: RTX 4090 or A40
  - Container Disk: 15GB
  - Max Workers: 1
  - Idle Timeout: 5 seconds
  - Exposed Port: 8080
- Environment Variables:
```
DEVICE=cuda
PYTHONUNBUFFERED=1
```
- Build Args (prepare WordStylist):
```
PREPARE_WORDSTYLIST=true
```
- Click "Deploy"

RunPod will clone your repo and build the image on their fast servers!

Option B: Pre-built Docker Image (if Git unavailable)

cd handwriting_service

# Prepare optimized build (removes 432MB)
./prepare_build.sh

# Login to Docker Hub
docker login

# Build image
docker buildx build --platform linux/amd64 \
  -t yourusername/docgenie-handwriting:latest \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  .

# Push to Docker Hub (may take 20-30 minutes for 10GB)
docker push yourusername/docgenie-handwriting:latest

Then deploy on RunPod:

Go to https://runpod.io → Serverless → New Endpoint
Docker Image: yourusername/docgenie-handwriting:latest
GPU: RTX 4090 or A40
Port: 8080
Environment Variables: DEVICE=cuda

Get endpoint URL:

Copy the URL (looks like: https://api.runpod.ai/v2/xxxxx/runsync)
This is your HANDWRITING_SERVICE_URL

Step 3: Deploy API to Railway

Install Railway CLI:

# Install Railway CLI
npm i -g @railway/cli

# Or use curl
bash <(curl -fsSL cli.new) railway

Initialize Railway project:

# Repository root (adjust the path to your own checkout)
cd /media/ahad-hassan/Volume_E/FYP/FYP/docgenie

# Login to Railway
railway login

# Create new project
railway init

# Link to project (creates railway.json)
railway link

Set environment variables:

# Set all environment variables from api/.env
railway variables set ANTHROPIC_API_KEY=sk-ant-xxxxx
railway variables set REDIS_URL=rediss://default:xxxxx@xxxxx.upstash.io:6379
railway variables set HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/xxxxx/runsync
railway variables set SUPABASE_URL=https://xxxxx.supabase.co
railway variables set SUPABASE_KEY=eyJxxxxx

# Google OAuth (for token refresh only - frontend provides tokens in requests)
railway variables set GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com
railway variables set GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx
railway variables set GOOGLE_DRIVE_FOLDER_NAME="DocGenie Documents"

Note: Google access/refresh tokens are NOT environment variables! The frontend authenticates with Google OAuth, then passes google_drive_token and google_drive_refresh_token in the API request body. See API request schema.
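To make the division of labour concrete: the server only needs `GOOGLE_CLIENT_ID` and `GOOGLE_CLIENT_SECRET` from its environment, and combines them with the refresh token from the request body when an access token expires. The sketch below builds (but does not send) a standard OAuth 2.0 refresh request to Google's token endpoint. `build_refresh_request` is a hypothetical helper, not the repository's code.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

GOOGLE_TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_refresh_request(refresh_token: str, client_id: str, client_secret: str) -> Request:
    """Build the form-encoded POST that exchanges a refresh token for a new access token."""
    form = urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,   # from the request body (google_drive_refresh_token)
        "client_id": client_id,           # from GOOGLE_CLIENT_ID
        "client_secret": client_secret,   # from GOOGLE_CLIENT_SECRET
    }).encode("ascii")
    return Request(
        GOOGLE_TOKEN_URL,
        data=form,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_refresh_request("1//xxxxx", "xxxxx.apps.googleusercontent.com", "GOCSPX-xxxxx")
print(req.get_method(), req.full_url)
print(parse_qs(req.data.decode("ascii"))["grant_type"])  # ['refresh_token']
```

Passing `req` to `urllib.request.urlopen` would return a JSON body containing a fresh `access_token`; that call is left out here so the sketch runs without network access or real credentials.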

Deploy API + Worker:

# Railway will detect Dockerfile and deploy automatically
railway up

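# Or connect the GitHub repo to the service in the Railway dashboard for automatic deploys on push
# (`railway connect` opens a database shell; it does not link GitHub)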

Option 1: Separate Worker Service (For Production Scale):

Note: Only needed if processing 50+ concurrent jobs. For most use cases, Option 2 (combined) is sufficient.

Method A: Connect to Same GitHub Repo (Recommended)
- Go to Railway dashboard → Your project → New Service
- Click "GitHub Repo" → Select your repo
- Name: docgenie-worker
- Settings → Deploy:
  - Builder: DOCKERFILE
  - Dockerfile Path: Dockerfile
  - Root Directory: / (same as API)
  - Custom Start Command:
```
rq worker --url $REDIS_URL
```
- Variables: Add all environment variables (same as API service)
- Deploy

Method B: Use Same Docker Image as API
- Railway dashboard → New Service → Empty Service
- Name: docgenie-worker
- Settings → Source: Link to API service's image
- Custom Start Command: rq worker --url $REDIS_URL
- Variables: Copy from API service
- Deploy

Option 2: Combined API + Worker (Recommended for Getting Started):

Update railway.json to run both in one service:
```
{
  "deploy": {
    "startCommand": "uvicorn api.main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait"
  }
}
```
Then push:
```
git add railway.json
git commit -m "feat: Run API and worker in combined service"
git push
```
Benefits:
- ✅ Single service ($5/month instead of $10/month)
- ✅ Simpler logs and monitoring
- ✅ Automatic scaling together
- ✅ Good for 90% of use cases
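One caveat of the combined `... & ... & wait` start command: a bare `wait` in a shell waits for all background jobs, so if only the API or only the worker crashes, the container keeps running with half the system down. One possible alternative, shown purely as a sketch and not as the repository's method, is a tiny launcher that exits as soon as either process exits so the platform can restart the service. The two demo commands are stand-ins for the real `uvicorn` and `rq worker` invocations.

```python
import subprocess
import sys
import time


def run_until_first_exit(commands, poll_interval=0.2):
    """Start every command, wait until any one exits, stop the rest, return that exit code."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        while True:
            for proc in procs:
                code = proc.poll()
                if code is not None:
                    return code
            time.sleep(poll_interval)
    finally:
        # Stop whatever is still running, then reap all children.
        for proc in procs:
            if proc.poll() is None:
                proc.terminate()
        for proc in procs:
            proc.wait()


# Stand-ins: one process that fails quickly, one that would run for a long time.
demo = [
    [sys.executable, "-c", "import sys; sys.exit(3)"],
    [sys.executable, "-c", "import time; time.sleep(30)"],
]
print(run_until_first_exit(demo))  # 3
```

In a real deployment the command list would hold the same two commands as the `startCommand` above, split into argument lists, and the launcher's return value would be passed to `sys.exit`.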
Get API URL:
- Railway dashboard → API service → Settings → Domains
- Generate domain (e.g., docgenie-api.up.railway.app)

Step 4: Update Frontend

Update your frontend API URL to Railway domain:

const API_URL = 'https://docgenie-api.up.railway.app';

Option B: AWS EC2 + RunPod (For Production)

Prerequisites

AWS account with EC2 access
Domain name (optional, for SSL)

Step 1: Launch EC2 Instance

# Launch t3.medium instance
# (AMI IDs are region-specific: replace the example ID with a current Ubuntu AMI for your region)
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type t3.medium \
  --key-name your-key-pair \
  --security-group-ids sg-xxxxx \
  --subnet-id subnet-xxxxx

Security Group Rules:

Port 22 (SSH) - Your IP only
Port 80 (HTTP) - 0.0.0.0/0
Port 443 (HTTPS) - 0.0.0.0/0
Port 8000 (API) - 0.0.0.0/0

Step 2: Setup EC2

# SSH into instance
ssh -i your-key.pem ubuntu@your-ec2-ip

# Update system
sudo apt update && sudo apt upgrade -y

# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu

# Install Docker Compose
sudo apt install docker-compose-plugin -y

# Install Git
sudo apt install git -y

# Clone repository
git clone https://gitlab.cs.hs-rm.de/diss_lamott/docgenie.git
cd docgenie

Step 3: Configure Environment

# Create .env file
cd api
nano .env

# Paste all environment variables
# Save: Ctrl+X, Y, Enter

# Update REDIS_URL to use Upstash
# Update HANDWRITING_SERVICE_URL to RunPod endpoint

Step 4: Deploy with Docker Compose

cd /home/ubuntu/docgenie

# Start services (API + Worker + Redis)
# (the docker-compose-plugin installed in Step 2 provides `docker compose`, not `docker-compose`)
docker compose up -d api worker redis

# Check logs
docker compose logs -f api
docker compose logs -f worker

Step 5: Setup Nginx Reverse Proxy

# Install Nginx
sudo apt install nginx -y

# Create config
sudo nano /etc/nginx/sites-available/docgenie

# Paste configuration:

server {
    listen 80;
    server_name your-domain.com;  # Or use EC2 IP

    location / {
        proxy_pass http://localhost:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # Increase timeout for long-running requests
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;
    }
}

# Enable site
sudo ln -s /etc/nginx/sites-available/docgenie /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl restart nginx

# Optional: Setup SSL with Let's Encrypt
sudo apt install certbot python3-certbot-nginx -y
sudo certbot --nginx -d your-domain.com

Step 6: Setup Systemd Service (Auto-restart)

# Create service file
sudo nano /etc/systemd/system/docgenie.service

[Unit]
Description=DocGenie API
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/home/ubuntu/docgenie
ExecStart=/usr/bin/docker compose up -d api worker redis
ExecStop=/usr/bin/docker compose down
User=ubuntu

[Install]
WantedBy=multi-user.target

# Enable service
sudo systemctl daemon-reload
sudo systemctl enable docgenie
sudo systemctl start docgenie

# Check status
sudo systemctl status docgenie

🧪 Testing Production Deployment

1. Health Check

curl https://your-domain.com/health

2. Sync Generation (Fast)

curl -X POST https://your-domain.com/generate \
  -H "Content-Type: application/json" \
  -d '{
    "template_name": "DocGenie",
    "num_pages": 1
  }'

3. Async Generation (Batched, Cheap)

# Start async job
RESPONSE=$(curl -s -X POST https://your-domain.com/generate/async \
  -H "Content-Type: application/json" \
  -d '{
    "template_name": "DocGenie",
    "num_pages": 2
  }')

REQUEST_ID=$(echo "$RESPONSE" | jq -r '.request_id')
echo "Request ID: $REQUEST_ID"

# Poll status
while true; do
  STATUS=$(curl -s https://your-domain.com/jobs/$REQUEST_ID/status | jq -r '.status')
  echo "Status: $STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
    break
  fi
  sleep 10
done

# Get result
curl https://your-domain.com/jobs/$REQUEST_ID/status | jq
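The shell loop above polls forever if a job never reaches a terminal state. A Python equivalent with an upper bound might look like the sketch below. The terminal states `completed` and `failed` come from the loop above; the intermediate status names in the demo and the `poll_job` helper itself are assumptions for illustration.

```python
import time


def poll_job(fetch_status, interval=10.0, max_wait=3600.0, sleep=time.sleep):
    """Call `fetch_status()` until it returns a terminal status or `max_wait` seconds pass."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status in ("completed", "failed"):
            return status
        if waited >= max_wait:
            raise TimeoutError(f"job still {status!r} after {waited:.0f} s")
        sleep(interval)
        waited += interval


# Demo with a fake status source and no real sleeping.
fake_statuses = iter(["queued", "processing", "completed"])
print(poll_job(lambda: next(fake_statuses), sleep=lambda seconds: None))  # completed
```

In practice `fetch_status` would wrap an HTTP GET on `/jobs/<request_id>/status` and return the `status` field of the JSON response.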

📊 Cost Breakdown

Railway + RunPod (Recommended)

| Service | Cost | Notes |
|---|---|---|
| Railway (API + Worker) | $5-10/month | Includes 500 hours |
| Upstash Redis | FREE | 10K requests/day |
| RunPod Serverless GPU | $0.20/hr | Only charged when active |
| Supabase | FREE | 500MB database |
| Total | ~$10-15/month | + $0.20/hr GPU usage |

EC2 + RunPod

| Service | Cost | Notes |
|---|---|---|
| EC2 t3.medium | $30/month | 2 vCPU, 4GB RAM |
| Upstash Redis | FREE | External Redis |
| RunPod Serverless GPU | $0.20/hr | Only when needed |
| Supabase | FREE | External DB |
| Total | ~$30/month | + $0.20/hr GPU usage |

EC2 + Dedicated GPU (Production)

| Service | Cost | Notes |
|---|---|---|
| EC2 g4dn.xlarge | $150/month | 4 vCPU, 16GB RAM, T4 GPU |
| Supabase | FREE | External DB |
| Total | ~$150/month | All-in-one solution |

🔧 Maintenance

Update Deployment

Railway:

# Push to main branch (auto-deploy)
git push origin main

# Or manual deploy
railway up

EC2:

ssh ubuntu@your-ec2-ip
cd docgenie
git pull
docker compose down
docker compose up -d --build

View Logs

Railway:

railway logs

EC2:

# API logs
docker compose logs -f api

# Worker logs
docker compose logs -f worker

# Nginx logs
sudo tail -f /var/log/nginx/access.log
sudo tail -f /var/log/nginx/error.log

Monitor Redis Queue

# Connect to Redis
redis-cli -u $REDIS_URL

# Check queue status
> LLEN rq:queue:default
> LRANGE rq:queue:default 0 -1

🚨 Troubleshooting

Issue: Worker can't import docgenie package

Solution: Confirm the Dockerfile copies the entire monorepo and runs pip install -e . before the worker starts

Issue: Handwriting service connection timeout

Solution: Use RunPod's synchronous /runsync endpoint, not the asynchronous /run endpoint

Issue: Google token expired during job

Solution: Ensure GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET are set, and that the request body includes google_drive_refresh_token (the refresh token is not an environment variable)

Issue: Railway build fails (too large)

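Solution: Check that .dockerignore excludes the large data/ folders (datasets, embeddings) while keeping data/prompt_templates/ and data/visual_element_prefabs/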

Issue: Worker heartbeat timeout

Solution: The job is usually still running; the batched API takes 10-30 minutes, so wait before treating the timeout as a failure

📚 Next Steps

Monitor costs: Railway dashboard, RunPod usage page
Setup alerts: Railway → Settings → Notifications
Scale workers: Railway → Worker service → Settings → Replicas
Add caching: Redis cache for generated documents
Setup CI/CD: GitHub Actions → Railway auto-deploy

🎉 You're Done!

Your DocGenie API is now deployed with:

✅ All docgenie package imports resolved
✅ GPU handwriting service on RunPod
✅ Background workers for batched API
✅ Auto-scaling and cost optimization
✅ Google token refresh working
✅ Database schema compatibility

API URL: https://your-domain.com
Docs: https://your-domain.com/docs
Health: https://your-domain.com/health

🖥️ Local Testing Guide

Architecture

┌─────────────────────────────────┐
│   DocGenie API (Port 8000)      │──┐ HTTP
└─────────────────────────────────┘  │ localhost:8080
                                     ▼
┌─────────────────────────────────┐
│ Handwriting Service (Port 8080) │
│ - Loads WordStylist model       │
└─────────────────────────────────┘

Prerequisites

Python environment: source .venv/bin/activate
WordStylist Model at WordStylist/models/ckpt.pt and ema_ckpt.pt
api/.env with ANTHROPIC_API_KEY, HANDWRITING_SERVICE_ENABLED=true, HANDWRITING_SERVICE_URL=http://localhost:8080

Step-by-Step Setup

Terminal 1 – Handwriting Service:

cd handwriting_service
DEVICE=cpu ./start.sh          # CPU (no GPU required)
# DEVICE=cuda ./start.sh       # GPU (faster)

Terminal 2 – DocGenie API:

cd api
uvicorn main:app --reload

Terminal 3 – Test:

curl http://localhost:8080/health   # Handwriting service
curl http://localhost:8000/health   # API
cd api && python test_api.py

Performance Notes

CPU mode: ~5–10 s/word | GPU mode: ~0.5–1 s/word
Service processes all words in one batch for efficiency

⚙️ Railway-Specific Configuration

Critical Issues & Fixes

1. .dockerignore – Keep required data folders:

!data/prompt_templates/
!data/visual_element_prefabs/

2. railway.json – Start both API and worker:

"startCommand": "cd api && uvicorn main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait"

Environment Variables

🔴 Required

ANTHROPIC_API_KEY=sk-ant-api03-xxx
REDIS_URL=rediss://default:xxx@xxx.upstash.io:6379
HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/ht9ajgrduitgpr/runsync
HANDWRITING_SERVICE_ENABLED=true
SUPABASE_URL=https://xxx.supabase.co
SUPABASE_KEY=xxx
GOOGLE_CLIENT_ID=xxx.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=xxx

🟡 Recommended

RUNPOD_API_KEY=xxx
OCR_SERVICE_ENABLED=true
OCR_USE_LOCAL=true
OCR_ENGINE=microsoft_di
OCR_DPI=300
HANDWRITING_SERVICE_TIMEOUT=300
HANDWRITING_SERVICE_MAX_RETRIES=3
RQ_QUEUE_NAME=docgenie
LOG_LEVEL=INFO

🟢 Optional (defaults are fine)

API_HOST=0.0.0.0
API_PORT=8000
DEBUG_MODE=false
CLAUDE_MODEL=claude-sonnet-4-5-20250929
CORS_ORIGINS=*
GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents
TEMP_DIR=/tmp/docgenie_api
HANDWRITING_APPLY_BLUR=false
BBOX_NORMALIZATION_ENABLED=false
GT_VERIFICATION_ENABLED=false
ANALYSIS_ENABLED=false
DEBUG_VISUALIZATION_ENABLED=false

Validation Steps

# 1. Health check
curl https://your-app.up.railway.app/health

# 2. Sync generation
curl -X POST https://your-app.up.railway.app/api/generate \
  -H "Content-Type: application/json" \
  -d '{"document_category": "invoice", "pages": 1}'

# 3. Async generation
curl -X POST https://your-app.up.railway.app/api/async/generate \
  -H "Content-Type: application/json" \
  -d '{"document_category": "invoice", "pages": 1, "google_access_token": "ya29.xxx"}'

Common Railway Issues

| Issue | Cause | Solution |
|---|---|---|
| Worker not starting | Missing `rq worker` in start command | Check `railway.json` `startCommand` |
| Missing prompt templates | `.dockerignore` too aggressive | Add `!data/prompt_templates/` |
| Playwright errors | Browser not installed | Ensure `playwright install chromium` in Dockerfile |
| Redis connection errors | Wrong `REDIS_URL` | Verify in Railway env variables |
| Handwriting timeout | Batch too large | Increase `HANDWRITING_SERVICE_TIMEOUT` |
| Large Docker image | `data/` folders included | Check `.dockerignore` excludes datasets/embeddings |

⚡ RunPod Batch Optimization

Problem (Old Parallel Processing)

Each text was sent as a separate RunPod request → N texts = N workers = N× activation cost.

Example: 10 texts → 10 workers × 18 s = 180 worker-seconds + 10× activation fees

Solution (New Batch Processing)

All texts sent in one RunPod request → 1 worker handles everything.

Example: 10 texts → 1 worker × 190 s = 190 worker-seconds + 1× activation fee
Savings: ~45–60% cost reduction (activation fees dominate RunPod pricing)
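The worker-second figures above can be reproduced with a few lines of arithmetic, which also shows that the saving depends entirely on the per-worker activation overhead. This is a rough model, not RunPod's billing formula: the 19 s per text is inferred from the 190 s example, and `activation_overhead` is a hypothetical parameter expressed in worker-second equivalents (the guide does not state its value).

```python
def parallel_cost(num_texts, seconds_per_worker=18.0, activation_overhead=0.0):
    """Old approach: one worker per text, each paying its own activation overhead."""
    return num_texts * (seconds_per_worker + activation_overhead)


def batched_cost(num_texts, seconds_per_text=19.0, activation_overhead=0.0):
    """New approach: one worker handles every text and pays the overhead once."""
    return num_texts * seconds_per_text + activation_overhead


def savings_fraction(num_texts, activation_overhead):
    """Fraction saved by batching under the assumed overhead (negative means batching costs more)."""
    old = parallel_cost(num_texts, activation_overhead=activation_overhead)
    new = batched_cost(num_texts, activation_overhead=activation_overhead)
    return 1.0 - new / old


print(parallel_cost(10), batched_cost(10))  # 180.0 190.0
print(round(savings_fraction(10, activation_overhead=30.0), 3))  # placeholder overhead
```

With zero overhead the batched run uses slightly more worker-seconds (190 vs 180), so it is worth plugging in the activation cost actually observed on the RunPod usage page before relying on a specific percentage.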

Batch Request Format (handler.py)

{
  "input": {
    "texts": [
      {"text": "Hello", "author_id": 42, "hw_id": "hw_0"},
      {"text": "World", "author_id": 42, "hw_id": "hw_1"}
    ],
    "apply_blur": true
  }
}

Response:

{
  "status": "COMPLETED",
  "output": {
    "images": [
      {"image_base64": "...", "width": 217, "height": 61, "text": "Hello", "author_id": 42, "hw_id": "hw_0"},
      {"image_base64": "...", "width": 195, "height": 58, "text": "World", "author_id": 42, "hw_id": "hw_1"}
    ],
    "total_generated": 2
  }
}

Note: Backward-compatible – single text requests (old format) are still supported. Handler auto-detects batch vs single based on the "texts" key.
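The auto-detection described in the note can be sketched as a small normalisation step at the top of the handler. The batch shape follows the request format above; the shape of the old single-text input (fields at the top level of `input`) and the helper name `normalize_texts` are assumptions, since the original format is not shown here.

```python
def normalize_texts(job_input: dict) -> list:
    """Return a list of text items for both batch and single-text inputs."""
    if "texts" in job_input:
        # New batch format: {"texts": [{...}, {...}], "apply_blur": ...}
        return list(job_input["texts"])
    # ASSUMPTION: old format carries one text's fields directly on the input object.
    return [{
        "text": job_input["text"],
        "author_id": job_input.get("author_id"),
        "hw_id": job_input.get("hw_id", "hw_0"),
    }]


batch = {
    "texts": [
        {"text": "Hello", "author_id": 42, "hw_id": "hw_0"},
        {"text": "World", "author_id": 42, "hw_id": "hw_1"},
    ],
    "apply_blur": True,
}
single = {"text": "Hello", "author_id": 42}

print(len(normalize_texts(batch)), len(normalize_texts(single)))  # 2 1
```

Normalising first means the rest of the handler only ever loops over a list, whichever format the caller used.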

Timeout Configuration

Timeout is dynamically calculated: num_texts × 20 + 30 seconds.
For large batches (20+ texts), set RunPod endpoint max execution time to 600 s.
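As a worked example of the formula, the sketch below computes the timeout for a few batch sizes and the largest batch that still fits under a 600 s endpoint limit. The function names are hypothetical; the constants 20, 30, and 600 come from the text above.

```python
def batch_timeout_seconds(num_texts: int, per_text: int = 20, base: int = 30) -> int:
    """Timeout for a batch: num_texts x 20 + 30 seconds."""
    if num_texts < 1:
        raise ValueError("num_texts must be at least 1")
    return num_texts * per_text + base


def max_texts_within(limit_seconds: int, per_text: int = 20, base: int = 30) -> int:
    """Largest batch whose computed timeout does not exceed `limit_seconds`."""
    return max((limit_seconds - base) // per_text, 0)


for n in (2, 10, 25):
    print(n, batch_timeout_seconds(n))  # 2 70, then 10 230, then 25 530

print(max_texts_within(600))  # 28
```

Under this formula a batch of 29 texts would need 610 s, so batches beyond 28 texts require either a higher endpoint limit or splitting into several requests.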

Cost Comparison

| Scenario | OLD (parallel) | NEW (batched) | Savings |
|---|---|---|---|
| 2 texts | 2 workers × 18 s | 1 worker × 38 s | ~50% |
| 10 texts | 10 workers × 18 s | 1 worker × 190 s | ~55% |
| 25 texts | 25 workers × 18 s | 1 worker × 480 s | ~60% |

Integration Test

cd api
python test_runpod_integration.py