
🚀 DocGenie Deployment Guide

Complete guide for deploying DocGenie API + Handwriting Service to production with all interdependencies resolved.

📊 System Architecture

┌───────────────────────────────────────────────────────────────┐
│                            Client                             │
└────────────────────┬──────────────────────────────────────────┘
                     │
                     ▼
┌───────────────────────────────────────────────────────────────┐
│                    Railway (CPU)                              │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │  DocGenie API (Port 8000)                               │  │
│  │  - FastAPI server                                       │  │
│  │  - Imports: docgenie.generation.*                       │  │
│  │  - Endpoints: /generate, /generate/pdf, /generate/async │  │
│  └──────────────┬──────────────────────────────────────────┘  │
│                 │                                             │
│  ┌──────────────▼──────────────────────────────────────────┐  │
│  │  Background Worker                                      │  │
│  │  - RQ worker (Redis Queue)                              │  │
│  │  - ClaudeBatchedClient (50% cost savings)               │  │
│  │  - Imports: docgenie.generation.*                       │  │
│  └──────────────┬──────────────────────────────────────────┘  │
└─────────────────┼─────────────────────────────────────────────┘
                  │
        ┌─────────┴──────────┬──────────────┐
        │                    │              │
        ▼                    ▼              ▼
┌────────────────┐  ┌──────────────────┐  ┌───────────────┐
│ Redis (Upstash)│  │ Supabase         │  │ Google Drive  │
│ - Job queue    │  │ - PostgreSQL     │  │ - File storage│
│ - Free tier    │  │ - Document DB    │  │ - OAuth 2.0   │
└────────────────┘  └──────────────────┘  └───────────────┘
        │
        ▼
┌───────────────────────────────────────────────────────────────┐
│             RunPod Serverless (GPU)                           │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │  Handwriting Service (Port 8080)                        │  │
│  │  - WordStylist diffusion model                          │  │
│  │  - PyTorch + CUDA 11.8                                  │  │
│  │  - NO docgenie imports (standalone)                     │  │
│  └─────────────────────────────────────────────────────────┘  │
└───────────────────────────────────────────────────────────────┘

🔗 Dependency Resolution

✅ Problem: API imports from docgenie package

Solution: Deploy entire monorepo, install as package with pip install -e .

API Service imports:

# api/worker.py
from docgenie.generation.pipeline_01.claude_batching import ClaudeBatchedClient
from docgenie import ENV

# api/utils.py
from docgenie.generation.constants import BS_PARSER, HANDWRITING_CLASS_NAME
from docgenie.generation.pipeline_01.claude_batching import create_message
from docgenie.generation.pipeline_03_process_response import process_response
from docgenie.generation.pipeline_04_render_pdf_and_extract_geos import render_pdf

Dockerfile solution:

# Copy entire monorepo
COPY . .

# Install as editable package
RUN pip install -e .

# Install API requirements
RUN pip install -r api/requirements.txt
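
A quick way to confirm the editable install worked inside the built image is to check that the imported module paths resolve. The sketch below is an illustration, not part of the repository: `missing_modules` is a hypothetical helper, and it only locates modules without executing their top-level code beyond the parent packages.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be located on sys.path."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package of a dotted name is absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Module paths copied from the import statements shown earlier in this section.
DOCGENIE_MODULES = [
    "docgenie",
    "docgenie.generation.constants",
    "docgenie.generation.pipeline_01.claude_batching",
]
```

Running `missing_modules(DOCGENIE_MODULES)` inside the container should return an empty list if `pip install -e .` succeeded; any names it returns point at the part of the monorepo that was not copied or installed.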

✅ Handwriting Service is Independent

No docgenie imports! Can be deployed standalone.

# handwriting_service/main.py - NO docgenie imports
from handwriting_service.inference import HandwritingGenerator
from handwriting_service.models import HandwritingRequest

📦 Pre-Deployment Checklist

1. Environment Variables

Create api/.env with all required variables:

# Claude API
ANTHROPIC_API_KEY=sk-ant-xxxxx

# Redis (will be replaced with Upstash URL)
REDIS_URL=redis://localhost:6379

# Handwriting Service
HANDWRITING_SERVICE_URL=http://localhost:8080

# Supabase
SUPABASE_URL=https://xxxxx.supabase.co
SUPABASE_KEY=eyJxxxxx

# Google Drive (for token refresh only)
# The frontend handles OAuth and sends tokens in API requests
# These credentials are only needed to refresh expired tokens during long jobs
GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx
GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents
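
Before starting the services it can help to fail fast when a variable is missing. The following is a minimal sketch under the assumption that every variable listed above except GOOGLE_DRIVE_FOLDER_NAME is mandatory; `missing_env_vars` is a hypothetical helper, not something the API ships with.

```python
import os

# Variable names taken from the api/.env listing above.
REQUIRED_VARS = (
    "ANTHROPIC_API_KEY",
    "REDIS_URL",
    "HANDWRITING_SERVICE_URL",
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_SECRET",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank in env."""
    env = os.environ if env is None else env
    return [name for name in required if not str(env.get(name, "")).strip()]
```

Calling `missing_env_vars()` with no arguments inspects the current process environment; passing a dict makes the check easy to run against a parsed `.env` file.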

2. Test Locally First

# Terminal 1: Start Redis
docker run -p 6379:6379 redis:7-alpine

# Terminal 2: Start Handwriting Service
cd handwriting_service
DEVICE=cpu uvicorn main:app --port 8080

# Terminal 3: Start API
cd api
source ../.venv/bin/activate
uvicorn main:app --reload --port 8000

# Terminal 4: Start Worker
cd api
source ../.venv/bin/activate
python worker.py

Test endpoints:

# Health check
curl http://localhost:8000/health

# Async generation (uses batched API)
curl -X POST http://localhost:8000/generate/async \
  -H "Content-Type: application/json" \
  -d '{"template_name": "DocGenie", "num_pages": 2}'

🚢 Deployment Steps

Option A: Railway + RunPod (RECOMMENDED - $10/month)

Step 1: Deploy Redis to Upstash (FREE)

  1. Go to https://upstash.com
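  2. Create account → New Redis Database
  3. Copy the Redis connection URL (TLS endpoint, looks like: rediss://default:xxxxx@xxxxx.upstash.io:6379). Do not copy UPSTASH_REDIS_REST_URL: that is an HTTPS REST endpoint, and RQ needs a redis:// or rediss:// URL.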
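
Since the worker connects over the Redis protocol, a small sanity check on the copied URL can catch a wrong value early. This is a sketch only: `check_redis_url` is a hypothetical helper, and the TLS hint assumes the Upstash database has TLS enabled.

```python
from urllib.parse import urlparse


def check_redis_url(url):
    """Return a list of problems found in a Redis connection URL (empty if none)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    problems = []
    if parsed.scheme not in ("redis", "rediss"):
        problems.append(f"scheme is {parsed.scheme!r}; expected redis:// or rediss://")
    if not host:
        problems.append("hostname is missing")
    # Assumption: Upstash databases are reached over TLS, i.e. the rediss:// scheme.
    if parsed.scheme == "redis" and host.endswith(".upstash.io"):
        problems.append("Upstash host with plain redis://; TLS (rediss://) is probably required")
    return problems
```

An HTTPS REST URL is flagged by the scheme check, while a local `redis://localhost:6379` passes untouched.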

Step 2: Deploy Handwriting Service to RunPod

Option A: Build from Git Repository (RECOMMENDED - No Docker Hub needed!)

This builds the image directly on RunPod's servers, so you do not have to upload a 10 GB image over your own connection.

  1. Prepare and push code to Git:
cd /path/to/docgenie  # repository root (the author's checkout was /media/ahad-hassan/Volume_E/FYP/FYP/docgenie)

# First, prepare optimized WordStylist (removes 432MB of unnecessary files)
cd handwriting_service
./prepare_build.sh
cd ..

# Now commit the optimized WordStylist
git add handwriting_service/
git status  # Verify WordStylist is included (should show WordStylist/models/ema_ckpt.pt, etc.)
git commit -m "Add handwriting service with optimized WordStylist"
git push origin main
  2. Deploy to RunPod:
    • Go to https://runpod.io → Serverless → New Endpoint
    • Click "Build from Git" (not Docker Image)
    • Settings:
      • Name: docgenie-handwriting
      • Git URL: https://github.com/Ahadhassan-2003/FYP.git
      • Git Branch: main
      • Docker Build Context: docgenie/handwriting_service
      • Dockerfile Path: Dockerfile
      • GPU: RTX 4090 or A40
      • Container Disk: 15GB
      • Max Workers: 1
      • Idle Timeout: 5 seconds
      • Exposed Port: 8080
    • Environment Variables:
      DEVICE=cuda
      PYTHONUNBUFFERED=1
      
    • Build Args (prepare WordStylist):
      PREPARE_WORDSTYLIST=true
      
    • Click "Deploy"

RunPod will clone your repo and build the image on their fast servers!

Option B: Pre-built Docker Image (if Git unavailable)

cd handwriting_service

# Prepare optimized build (removes 432MB)
./prepare_build.sh

# Login to Docker Hub
docker login

# Build image
docker buildx build --platform linux/amd64 \
  -t yourusername/docgenie-handwriting:latest \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  .

# Push to Docker Hub (may take 20-30 minutes for 10GB)
docker push yourusername/docgenie-handwriting:latest

Then deploy on RunPod:

  • Go to https://runpod.io → Serverless → New Endpoint
  • Docker Image: yourusername/docgenie-handwriting:latest
  • GPU: RTX 4090 or A40
  • Port: 8080
  • Environment Variables: DEVICE=cuda

  3. Get endpoint URL:
    • Copy the URL (looks like: https://api.runpod.ai/v2/xxxxx/runsync)
    • This is your HANDWRITING_SERVICE_URL

Step 3: Deploy API to Railway

  1. Install Railway CLI:
# Install Railway CLI
npm i -g @railway/cli

# Or use curl
bash <(curl -fsSL cli.new) railway
  2. Initialize Railway project:
cd /path/to/docgenie  # repository root

# Login to Railway
railway login

# Create new project
railway init

# Link to project (creates railway.json)
railway link
  3. Set environment variables:
# Set all environment variables from api/.env
railway variables set ANTHROPIC_API_KEY=sk-ant-xxxxx
railway variables set REDIS_URL=rediss://default:xxxxx@xxxxx.upstash.io:6379
railway variables set HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/xxxxx/runsync
railway variables set SUPABASE_URL=https://xxxxx.supabase.co
railway variables set SUPABASE_KEY=eyJxxxxx

# Google OAuth (for token refresh only - frontend provides tokens in requests)
railway variables set GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com
railway variables set GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx
railway variables set GOOGLE_DRIVE_FOLDER_NAME="DocGenie Documents"

Note: Google access/refresh tokens are NOT environment variables! The frontend authenticates with Google OAuth, then passes google_drive_token and google_drive_refresh_token in the API request body. See API request schema.

  4. Deploy API + Worker:
# Railway will detect Dockerfile and deploy automatically
railway up

# Or connect the GitHub repo from the Railway dashboard for automatic deploys
# (the CLI's "railway connect" opens a database shell; it does not link GitHub)
  5. Option 1: Separate Worker Service (For Production Scale):

    Note: Only needed if processing 50+ concurrent jobs. For most use cases, Option 2 (combined) is sufficient.

    Method A: Connect to Same GitHub Repo (Recommended)

    • Go to Railway dashboard → Your project → New Service
    • Click "GitHub Repo" → Select your repo
    • Name: docgenie-worker
    • Settings → Deploy:
      • Builder: DOCKERFILE
      • Dockerfile Path: Dockerfile
      • Root Directory: / (same as API)
      • Custom Start Command:
        rq worker --url $REDIS_URL
        
    • Variables: Add all environment variables (same as API service)
    • Deploy

    Method B: Use Same Docker Image as API

    • Railway dashboard → New Service → Empty Service
    • Name: docgenie-worker
    • Settings → Source: Link to API service's image
    • Custom Start Command: rq worker --url $REDIS_URL
    • Variables: Copy from API service
    • Deploy
  6. Option 2: Combined API + Worker (Recommended for Getting Started):

    Update railway.json to run both in one service:

    {
      "deploy": {
        "startCommand": "uvicorn api.main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait"
      }
    }
    

    Then push:

    git add railway.json
    git commit -m "feat: Run API and worker in combined service"
    git push
    

    Benefits:

    • ✅ Single service ($5/month instead of $10/month)
    • ✅ Simpler logs and monitoring
    • ✅ Automatic scaling together
    • ✅ Good for 90% of use cases
  7. Get API URL:

    • Railway dashboard → API service → Settings → Domains
    • Generate domain (e.g., docgenie-api.up.railway.app)

Step 4: Update Frontend

Point your frontend's API URL at the Railway domain:

const API_URL = 'https://docgenie-api.up.railway.app';

Option B: AWS EC2 + RunPod (For Production)

Prerequisites

  • AWS account with EC2 access
  • Domain name (optional, for SSL)

Step 1: Launch EC2 Instance

# Launch t3.medium instance (AMI IDs are region-specific; replace the ID below with a current Ubuntu AMI for your region)
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type t3.medium \
  --key-name your-key-pair \
  --security-group-ids sg-xxxxx \
  --subnet-id subnet-xxxxx

Security Group Rules:

  • Port 22 (SSH) - Your IP only
  • Port 80 (HTTP) - 0.0.0.0/0
  • Port 443 (HTTPS) - 0.0.0.0/0
  • Port 8000 (API) - 0.0.0.0/0

Step 2: Setup EC2

# SSH into instance
ssh -i your-key.pem ubuntu@your-ec2-ip

# Update system
sudo apt update && sudo apt upgrade -y

# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu

# Install Docker Compose
sudo apt install docker-compose-plugin -y

# Install Git
sudo apt install git -y

# Clone repository
git clone https://gitlab.cs.hs-rm.de/diss_lamott/docgenie.git
cd docgenie

Step 3: Configure Environment

# Create .env file
cd api
nano .env

# Paste all environment variables
# Save: Ctrl+X, Y, Enter

# Update REDIS_URL to use Upstash
# Update HANDWRITING_SERVICE_URL to RunPod endpoint

Step 4: Deploy with Docker Compose

cd /home/ubuntu/docgenie

# Start services (API + Worker + Redis)
# The docker-compose-plugin installed in Step 2 provides the "docker compose" subcommand
docker compose up -d api worker redis

# Check logs
docker compose logs -f api
docker compose logs -f worker

Step 5: Setup Nginx Reverse Proxy

# Install Nginx
sudo apt install nginx -y

# Create config
sudo nano /etc/nginx/sites-available/docgenie

# Paste configuration:
server {
    listen 80;
    server_name your-domain.com;  # Or use EC2 IP

    location / {
        proxy_pass http://localhost:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # Increase timeout for long-running requests
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;
    }
}
# Enable site
sudo ln -s /etc/nginx/sites-available/docgenie /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl restart nginx

# Optional: Setup SSL with Let's Encrypt
sudo apt install certbot python3-certbot-nginx -y
sudo certbot --nginx -d your-domain.com

Step 6: Setup Systemd Service (Auto-restart)

# Create service file
sudo nano /etc/systemd/system/docgenie.service
[Unit]
Description=DocGenie API
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/home/ubuntu/docgenie
ExecStart=/usr/bin/docker compose up -d api worker redis
ExecStop=/usr/bin/docker compose down
User=ubuntu

[Install]
WantedBy=multi-user.target
# Enable service
sudo systemctl daemon-reload
sudo systemctl enable docgenie
sudo systemctl start docgenie

# Check status
sudo systemctl status docgenie

🧪 Testing Production Deployment

1. Health Check

curl https://your-domain.com/health

2. Sync Generation (Fast)

curl -X POST https://your-domain.com/generate \
  -H "Content-Type: application/json" \
  -d '{
    "template_name": "DocGenie",
    "num_pages": 1
  }'

3. Async Generation (Batched, Cheap)

# Start async job
RESPONSE=$(curl -s -X POST https://your-domain.com/generate/async \
  -H "Content-Type: application/json" \
  -d '{
    "template_name": "DocGenie",
    "num_pages": 2
  }')

REQUEST_ID=$(echo "$RESPONSE" | jq -r '.request_id')
echo "Request ID: $REQUEST_ID"

# Poll status
while true; do
  STATUS=$(curl -s https://your-domain.com/jobs/$REQUEST_ID/status | jq -r '.status')
  echo "Status: $STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
    break
  fi
  sleep 10
done

# Get result
curl https://your-domain.com/jobs/$REQUEST_ID/status | jq
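
The shell loop above polls forever if a job never reaches a terminal state. A Python variant with an upper bound might look like the sketch below. The function names are hypothetical, the 1800-second default is an assumption chosen to cover a 30-minute batched job, and the response is assumed to be JSON with a `status` field, as the `jq` filter above implies.

```python
import json
import time
from urllib.request import urlopen

TERMINAL_STATES = ("completed", "failed")  # same two states the shell loop checks


def wait_for_job(fetch_status, interval=10.0, timeout=1800.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() every `interval` seconds until the job finishes.

    Returns the terminal status, or raises TimeoutError after `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} s")
        sleep(interval)


def http_status_fetcher(base_url, request_id):
    """Build a fetch_status callable for GET {base_url}/jobs/{request_id}/status."""
    def fetch():
        with urlopen(f"{base_url}/jobs/{request_id}/status", timeout=30) as response:
            return json.load(response).get("status")
    return fetch
```

Usage would be `wait_for_job(http_status_fetcher("https://your-domain.com", request_id))`. Passing the fetcher, sleep, and clock as parameters keeps the loop testable without a live deployment.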

📊 Cost Breakdown

Railway + RunPod (Recommended)

| Service | Cost | Notes |
|---|---|---|
| Railway (API + Worker) | $5-10/month | Includes 500 hours |
| Upstash Redis | FREE | 10K requests/day |
| RunPod Serverless GPU | $0.20/hr | Only charged when active |
| Supabase | FREE | 500MB database |
| Total | ~$10-15/month | + $0.20/hr GPU usage |

EC2 + RunPod

| Service | Cost | Notes |
|---|---|---|
| EC2 t3.medium | $30/month | 2 vCPU, 4GB RAM |
| Upstash Redis | FREE | External Redis |
| RunPod Serverless GPU | $0.20/hr | Only when needed |
| Supabase | FREE | External DB |
| Total | ~$30/month | + $0.20/hr GPU usage |

EC2 + Dedicated GPU (Production)

| Service | Cost | Notes |
|---|---|---|
| EC2 g4dn.xlarge | $150/month | 4 vCPU, 16GB RAM, T4 GPU |
| Supabase | FREE | External DB |
| Total | ~$150/month | All-in-one solution |

🔧 Maintenance

Update Deployment

Railway:

# Push to main branch (auto-deploy)
git push origin main

# Or manual deploy
railway up

EC2:

ssh ubuntu@your-ec2-ip
cd docgenie
git pull
docker compose down
docker compose up -d --build

View Logs

Railway:

railway logs

EC2:

# API logs
docker compose logs -f api

# Worker logs
docker compose logs -f worker

# Nginx logs
sudo tail -f /var/log/nginx/access.log
sudo tail -f /var/log/nginx/error.log

Monitor Redis Queue

# Connect to Redis
redis-cli -u $REDIS_URL

# Check queue status
> LLEN rq:queue:default
> LRANGE rq:queue:default 0 -1

🚨 Troubleshooting

Issue: Worker can't import docgenie package

Solution: Dockerfile installs entire monorepo with pip install -e .

Issue: Handwriting service connection timeout

Solution: Use RunPod's synchronous /runsync endpoint, not the asynchronous /run endpoint

Issue: Google token expired during job

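Solution: Ensure GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET are set, and that the request body includes google_drive_refresh_token (refresh tokens are passed per request, not as environment variables)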
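
For context, refreshing a Google access token is a single form-encoded POST to Google's OAuth 2.0 token endpoint, which is why the client ID and secret must be available to the backend. The sketch below is illustrative and not taken from DocGenie's code; both function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

GOOGLE_TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id, client_secret, refresh_token):
    """Build the form-encoded POST that exchanges a refresh token for a new access token."""
    body = urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }).encode("ascii")
    return Request(
        GOOGLE_TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def refresh_access_token(client_id, client_secret, refresh_token):
    """Send the refresh request and return the new access token string."""
    request = build_refresh_request(client_id, client_secret, refresh_token)
    with urlopen(request, timeout=30) as response:
        return json.load(response)["access_token"]
```

If any of the three inputs is missing or wrong, Google rejects the request, which surfaces in the worker as an expired-token failure partway through a long job.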

Issue: Railway build fails (too large)

Solution: Check .dockerignore excludes data/ folders

Issue: Worker heartbeat timeout

Solution: The job is most likely still running; the batched API takes 10-30 minutes

📚 Next Steps

  1. Monitor costs: Railway dashboard, RunPod usage page
  2. Setup alerts: Railway → Settings → Notifications
  3. Scale workers: Railway → Worker service → Settings → Replicas
  4. Add caching: Redis cache for generated documents
  5. Setup CI/CD: GitHub Actions → Railway auto-deploy

🎉 You're Done!

Your DocGenie API is now deployed with:

  • ✅ All docgenie package imports resolved
  • ✅ GPU handwriting service on RunPod
  • ✅ Background workers for batched API
  • ✅ Auto-scaling and cost optimization
  • ✅ Google token refresh working
  • ✅ Database schema compatibility

API URL: https://your-domain.com
Docs: https://your-domain.com/docs
Health: https://your-domain.com/health


🖥️ Local Testing Guide

Architecture

┌─────────────────────────────────┐
│   DocGenie API (Port 8000)      │──┐ HTTP
└─────────────────────────────────┘  │ localhost:8080
                                     ▼
┌─────────────────────────────────┐
│ Handwriting Service (Port 8080) │
│ - Loads WordStylist model       │
└─────────────────────────────────┘

Prerequisites

  1. Python environment: source .venv/bin/activate
  2. WordStylist model checkpoints at WordStylist/models/ckpt.pt and WordStylist/models/ema_ckpt.pt
  3. api/.env with ANTHROPIC_API_KEY, HANDWRITING_SERVICE_ENABLED=true, HANDWRITING_SERVICE_URL=http://localhost:8080

Step-by-Step Setup

Terminal 1 – Handwriting Service:

cd handwriting_service
DEVICE=cpu ./start.sh          # CPU (no GPU required)
# DEVICE=cuda ./start.sh       # GPU (faster)

Terminal 2 – DocGenie API:

cd api
uvicorn main:app --reload

Terminal 3 – Test:

curl http://localhost:8080/health   # Handwriting service
curl http://localhost:8000/health   # API
cd api && python test_api.py

Performance Notes

  • CPU mode: ~5–10 s/word | GPU mode: ~0.5–1 s/word
  • Service processes all words in one batch for efficiency
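
The per-word figures above translate directly into a rough time budget for a document. A minimal sketch of that arithmetic follows; `estimate_generation_seconds` is a hypothetical helper, and the ranges are simply the ones quoted in the notes.

```python
# Per-word latency ranges quoted in the performance notes, in seconds.
SECONDS_PER_WORD = {"cpu": (5.0, 10.0), "cuda": (0.5, 1.0)}


def estimate_generation_seconds(num_words, device="cpu"):
    """Return (low, high) estimated seconds to render num_words handwritten words."""
    low, high = SECONDS_PER_WORD[device]
    return (num_words * low, num_words * high)
```

For example, 20 handwritten words come out at roughly 100 to 200 seconds on CPU and 10 to 20 seconds on GPU, which is worth keeping in mind when choosing `DEVICE` for local testing.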

⚙️ Railway-Specific Configuration

Critical Issues & Fixes

1. .dockerignore – Keep required data folders:

!data/prompt_templates/
!data/visual_element_prefabs/

2. railway.json – Start both API and worker:

"startCommand": "cd api && uvicorn main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait"

Environment Variables

🔴 Required

ANTHROPIC_API_KEY=sk-ant-api03-xxx
REDIS_URL=rediss://default:xxx@xxx.upstash.io:6379
HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/xxxxx/runsync
HANDWRITING_SERVICE_ENABLED=true
SUPABASE_URL=https://xxx.supabase.co
SUPABASE_KEY=xxx
GOOGLE_CLIENT_ID=xxx.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=xxx

🟡 Recommended

RUNPOD_API_KEY=xxx
OCR_SERVICE_ENABLED=true
OCR_USE_LOCAL=true
OCR_ENGINE=microsoft_di
OCR_DPI=300
HANDWRITING_SERVICE_TIMEOUT=300
HANDWRITING_SERVICE_MAX_RETRIES=3
RQ_QUEUE_NAME=docgenie
LOG_LEVEL=INFO

🟢 Optional (defaults are fine)

API_HOST=0.0.0.0
API_PORT=8000
DEBUG_MODE=false
CLAUDE_MODEL=claude-sonnet-4-5-20250929
CORS_ORIGINS=*
GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents
TEMP_DIR=/tmp/docgenie_api
HANDWRITING_APPLY_BLUR=false
BBOX_NORMALIZATION_ENABLED=false
GT_VERIFICATION_ENABLED=false
ANALYSIS_ENABLED=false
DEBUG_VISUALIZATION_ENABLED=false

Validation Steps

# 1. Health check
curl https://your-app.up.railway.app/health

# 2. Sync generation
curl -X POST https://your-app.up.railway.app/api/generate \
  -H "Content-Type: application/json" \
  -d '{"document_category": "invoice", "pages": 1}'

# 3. Async generation
curl -X POST https://your-app.up.railway.app/api/async/generate \
  -H "Content-Type: application/json" \
  -d '{"document_category": "invoice", "pages": 1, "google_access_token": "ya29.xxx"}'

Common Railway Issues

| Issue | Cause | Solution |
|---|---|---|
| Worker not starting | Missing rq worker in start command | Check railway.json startCommand |
| Missing prompt templates | .dockerignore too aggressive | Add !data/prompt_templates/ |
| Playwright errors | Browser not installed | Ensure playwright install chromium in Dockerfile |
| Redis connection errors | Wrong REDIS_URL | Verify in Railway env variables |
| Handwriting timeout | Batch too large | Increase HANDWRITING_SERVICE_TIMEOUT |
| Large Docker image | data/ folders included | Check .dockerignore excludes datasets/embeddings |

⚡ RunPod Batch Optimization

Problem (Old Parallel Processing)

Each text was sent as a separate RunPod request → N texts = N workers = N× activation cost.

Example: 10 texts → 10 workers × 18 s = 180 worker-seconds + 10× activation fees

Solution (New Batch Processing)

All texts sent in one RunPod request → 1 worker handles everything.

Example: 10 texts → 1 worker × 190 s = 190 worker-seconds + 1× activation fee
Savings: ~45–60% cost reduction (activation fees dominate RunPod pricing)

Batch Request Format (handler.py)

{
  "input": {
    "texts": [
      {"text": "Hello", "author_id": 42, "hw_id": "hw_0"},
      {"text": "World", "author_id": 42, "hw_id": "hw_1"}
    ],
    "apply_blur": true
  }
}

Response:

{
  "status": "COMPLETED",
  "output": {
    "images": [
      {"image_base64": "...", "width": 217, "height": 61, "text": "Hello", "author_id": 42, "hw_id": "hw_0"},
      {"image_base64": "...", "width": 195, "height": 58, "text": "World", "author_id": 42, "hw_id": "hw_1"}
    ],
    "total_generated": 2
  }
}

Note: Backward-compatible – single text requests (old format) are still supported. Handler auto-detects batch vs single based on the "texts" key.
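
To make the request and response shapes concrete, here is a small client-side sketch that builds a batch payload, mirrors the documented "texts"-key detection, and indexes the returned images by hw_id. All three function names are hypothetical, and the single-text payload used for comparison is an assumed shape, since the old format is not shown here.

```python
def build_batch_payload(items, apply_blur=True):
    """Wrap (text, author_id) pairs in the batch request format, assigning hw_N ids."""
    texts = [
        {"text": text, "author_id": author_id, "hw_id": f"hw_{index}"}
        for index, (text, author_id) in enumerate(items)
    ]
    return {"input": {"texts": texts, "apply_blur": apply_blur}}


def is_batch_request(payload):
    """Mirror the documented auto-detection: a batch request carries a 'texts' key."""
    return "texts" in payload.get("input", {})


def images_by_hw_id(response):
    """Index the images of a COMPLETED response by their hw_id."""
    if response.get("status") != "COMPLETED":
        raise ValueError(f"unexpected status: {response.get('status')!r}")
    return {image["hw_id"]: image for image in response["output"]["images"]}
```

`build_batch_payload([("Hello", 42), ("World", 42)])` reproduces the example request above, and `images_by_hw_id` lets the caller match each returned image back to the placeholder it was generated for.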

Timeout Configuration

Timeout is dynamically calculated: num_texts × 20 + 30 seconds.
For large batches (20+ texts), set RunPod endpoint max execution time to 600 s.
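
The formula is easy to check against the endpoint limit. The sketch below restates it in Python; the function names are hypothetical, and the 600-second default is the recommended endpoint setting from above.

```python
def batch_timeout_seconds(num_texts, per_text=20, base=30):
    """Client-side timeout for a batch: num_texts x 20 + 30 seconds."""
    return num_texts * per_text + base


def max_texts_within(endpoint_max_seconds=600, per_text=20, base=30):
    """Largest batch whose computed timeout still fits the endpoint's max execution time."""
    return max(0, (endpoint_max_seconds - base) // per_text)
```

With these numbers a 10-text batch gets a 230-second timeout, and a 600-second endpoint limit accommodates batches of up to 28 texts before the computed timeout exceeds it.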

Cost Comparison

| Scenario | OLD (parallel) | NEW (batched) | Savings |
|---|---|---|---|
| 2 texts | 2 workers × 18 s | 1 worker × 38 s | ~50% |
| 10 texts | 10 workers × 18 s | 1 worker × 190 s | ~55% |
| 25 texts | 25 workers × 18 s | 1 worker × 480 s | ~60% |

Integration Test

cd api
python test_runpod_integration.py