| # DocGenie Deployment Guide |
|
|
| Complete guide for deploying DocGenie API + Handwriting Service to production with all interdependencies resolved. |
|
|
| ## System Architecture |
|
|
| ``` |
| ┌─────────────────────────────────────────────────────────────┐ |
| │                           Client                            │ |
| └───────────────────┬─────────────────────────────────────────┘ |
|                     │ |
|                     ▼ |
| ┌─────────────────────────────────────────────────────────────┐ |
| │                        Railway (CPU)                        │ |
| │ ┌─────────────────────────────────────────────────────────┐ │ |
| │ │ DocGenie API (Port 8000)                                │ │ |
| │ │ - FastAPI server                                        │ │ |
| │ │ - Imports: docgenie.generation.*                        │ │ |
| │ │ - Endpoints: /generate, /generate/pdf, /generate/async  │ │ |
| │ └─────────────────┬───────────────────────────────────────┘ │ |
| │                   │                                         │ |
| │ ┌─────────────────▼───────────────────────────────────────┐ │ |
| │ │ Background Worker                                       │ │ |
| │ │ - RQ worker (Redis Queue)                               │ │ |
| │ │ - ClaudeBatchedClient (50% cost savings)                │ │ |
| │ │ - Imports: docgenie.generation.*                        │ │ |
| │ └─────────────────┬───────────────────────────────────────┘ │ |
| └───────────────────┼─────────────────────────────────────────┘ |
|                     │ |
|          ┌──────────┴─────────┬───────────────────┐ |
|          │                    │                   │ |
|          ▼                    ▼                   ▼ |
| ┌─────────────────┐ ┌──────────────────┐ ┌────────────────┐ |
| │ Redis (Upstash) │ │ Supabase         │ │ Google Drive   │ |
| │ - Job queue     │ │ - PostgreSQL     │ │ - File storage │ |
| │ - Free tier     │ │ - Document DB    │ │ - OAuth 2.0    │ |
| └─────────────────┘ └──────────────────┘ └────────────────┘ |
|                     │ |
|                     ▼ |
| ┌─────────────────────────────────────────────────────────────┐ |
| │                   RunPod Serverless (GPU)                   │ |
| │ ┌─────────────────────────────────────────────────────────┐ │ |
| │ │ Handwriting Service (Port 8080)                         │ │ |
| │ │ - WordStylist diffusion model                           │ │ |
| │ │ - PyTorch + CUDA 11.8                                   │ │ |
| │ │ - NO docgenie imports (standalone)                      │ │ |
| │ └─────────────────────────────────────────────────────────┘ │ |
| └─────────────────────────────────────────────────────────────┘ |
| ``` |
|
|
| ## Dependency Resolution |
|
|
| ### Problem: API imports from docgenie package |
| **Solution:** Deploy entire monorepo, install as package with `pip install -e .` |
|
|
| **API Service imports:** |
| ```python |
| # api/worker.py |
| from docgenie.generation.pipeline_01.claude_batching import ClaudeBatchedClient |
| from docgenie import ENV |
| |
| # api/utils.py |
| from docgenie.generation.constants import BS_PARSER, HANDWRITING_CLASS_NAME |
| from docgenie.generation.pipeline_01.claude_batching import create_message |
| from docgenie.generation.pipeline_03_process_response import process_response |
| from docgenie.generation.pipeline_04_render_pdf_and_extract_geos import render_pdf |
| ``` |
|
|
| **Dockerfile solution:** |
| ```dockerfile |
| # Copy entire monorepo |
| COPY . . |
| |
| # Install as editable package |
| RUN pip install -e . |
| |
| # Install API requirements |
| RUN pip install -r api/requirements.txt |
| ``` |
|
|
| ### Handwriting Service is Independent |
| **No docgenie imports!** Can be deployed standalone. |
|
|
| ```python |
| # handwriting_service/main.py - NO docgenie imports |
| from handwriting_service.inference import HandwritingGenerator |
| from handwriting_service.models import HandwritingRequest |
| ``` |
|
|
| ## Pre-Deployment Checklist |
|
|
| ### 1. Environment Variables |
| Create `api/.env` with all required variables: |
|
|
| ```bash |
| # Claude API |
| ANTHROPIC_API_KEY=sk-ant-xxxxx |
| |
| # Redis (will be replaced with Upstash URL) |
| REDIS_URL=redis://localhost:6379 |
| |
| # Handwriting Service |
| HANDWRITING_SERVICE_URL=http://localhost:8080 |
| |
| # Supabase |
| SUPABASE_URL=https://xxxxx.supabase.co |
| SUPABASE_KEY=eyJxxxxx |
| |
| # Google Drive (for token refresh only) |
| # The frontend handles OAuth and sends tokens in API requests |
| # These credentials are only needed to refresh expired tokens during long jobs |
| GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com |
| GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx |
| GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents |
| ``` |
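
A small startup check can catch a missing variable before the API or worker boots. The sketch below is illustrative only: `REQUIRED_VARS` mirrors the names listed above (treating all of them as mandatory is an assumption), and `missing_vars` is a hypothetical helper, not part of the DocGenie codebase.

```python
import os

# Names copied from the api/.env listing above. Which of them are strictly
# mandatory is an assumption made for this sketch.
REQUIRED_VARS = [
    "ANTHROPIC_API_KEY",
    "REDIS_URL",
    "HANDWRITING_SERVICE_URL",
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_SECRET",
]


def missing_vars(env, required=REQUIRED_VARS):
    """Return the required names that are absent or blank in `env`."""
    return [name for name in required if not str(env.get(name, "")).strip()]
```

Calling `missing_vars(os.environ)` at startup and aborting when the returned list is non-empty gives a clearer error than a failure deep inside a job.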
|
|
| ### 2. Test Locally First |
| ```bash |
| # Terminal 1: Start Redis |
| docker run -p 6379:6379 redis:7-alpine |
| |
| # Terminal 2: Start Handwriting Service |
| cd handwriting_service |
| DEVICE=cpu uvicorn main:app --port 8080 |
| |
| # Terminal 3: Start API |
| cd api |
| source ../.venv/bin/activate |
| uvicorn main:app --reload --port 8000 |
| |
| # Terminal 4: Start Worker |
| cd api |
| source ../.venv/bin/activate |
| python worker.py |
| ``` |
|
|
| Test endpoints: |
| ```bash |
| # Health check |
| curl http://localhost:8000/health |
| |
| # Async generation (uses batched API) |
| curl -X POST http://localhost:8000/generate/async \ |
| -H "Content-Type: application/json" \ |
| -d '{"template_name": "DocGenie", "num_pages": 2}' |
| ``` |
|
|
| ## Deployment Steps |
|
|
| ### Option A: Railway + RunPod (RECOMMENDED - $10/month) |
|
|
| #### Step 1: Deploy Redis to Upstash (FREE) |
|
|
| 1. Go to https://upstash.com |
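| 2. Create account → New Redis Database |
| 3. Copy the Redis connection URL (looks like: `redis://default:xxxxx@xxxxx.upstash.io:6379`, or `rediss://...` when TLS is enabled). Do not use `UPSTASH_REDIS_REST_URL`: that is the HTTPS REST endpoint, which RQ cannot connect to. |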
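
Pasting the REST URL instead of the Redis connection URL is an easy mistake to make. The following sketch shows one way to sanity-check the value before setting `REDIS_URL`; `check_redis_url` is a hypothetical helper that uses only the Python standard library.

```python
from urllib.parse import urlparse


def check_redis_url(url):
    """Return a list of problems found in a Redis connection URL (empty if none)."""
    parsed = urlparse(url)
    problems = []
    if parsed.scheme not in ("redis", "rediss"):
        problems.append(f"scheme is {parsed.scheme!r}; expected 'redis' or 'rediss'")
    if not parsed.hostname:
        problems.append("missing hostname")
    if parsed.password is None:
        problems.append("missing password")
    return problems
```

An empty list means the URL has the expected shape. It does not prove that the credentials are valid.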
|
|
| #### Step 2: Deploy Handwriting Service to RunPod |
|
|
| **Option A: Build from Git Repository (RECOMMENDED - No Docker Hub needed!)** |
|
|
| This builds the image directly on RunPod's servers, so you do not have to upload a ~10GB image over your own connection. |
|
|
| 1. **Prepare and push code to Git:** |
| ```bash |
| cd /path/to/docgenie  # repository root (adjust to your checkout) |
| |
| # First, prepare optimized WordStylist (removes 432MB of unnecessary files) |
| cd handwriting_service |
| ./prepare_build.sh |
| cd .. |
| |
| # Now commit the optimized WordStylist |
| git add handwriting_service/ |
| git status # Verify WordStylist is included (should show WordStylist/models/ema_ckpt.pt, etc.) |
| git commit -m "Add handwriting service with optimized WordStylist" |
| git push origin main |
| ``` |
|
|
| 2. **Deploy to RunPod:** |
| - Go to https://runpod.io → Serverless → New Endpoint |
| - Click "Build from Git" (not Docker Image) |
| - Settings: |
| - Name: `docgenie-handwriting` |
| - Git URL: `https://github.com/Ahadhassan-2003/FYP.git` |
| - Git Branch: `main` |
| - Docker Build Context: `docgenie/handwriting_service` |
| - Dockerfile Path: `Dockerfile` |
| - GPU: RTX 4090 or A40 |
| - Container Disk: 15GB |
| - Max Workers: 1 |
| - Idle Timeout: 5 seconds |
| - Exposed Port: 8080 |
| - Environment Variables: |
| ``` |
| DEVICE=cuda |
| PYTHONUNBUFFERED=1 |
| ``` |
| - Build Args (prepare WordStylist): |
| ``` |
| PREPARE_WORDSTYLIST=true |
| ``` |
| - Click "Deploy" |
| |
| RunPod will clone your repo and build the image on their fast servers! |
|
|
| **Option B: Pre-built Docker Image (if Git unavailable)** |
|
|
| <details> |
| <summary>Click to expand Docker Hub method</summary> |
|
|
| ```bash |
| cd handwriting_service |
| |
| # Prepare optimized build (removes 432MB) |
| ./prepare_build.sh |
| |
| # Login to Docker Hub |
| docker login |
| |
| # Build image |
| docker buildx build --platform linux/amd64 \ |
| -t yourusername/docgenie-handwriting:latest \ |
| --build-arg BUILDKIT_INLINE_CACHE=1 \ |
| . |
| |
| # Push to Docker Hub (may take 20-30 minutes for 10GB) |
| docker push yourusername/docgenie-handwriting:latest |
| ``` |
|
|
| Then deploy on RunPod: |
| - Go to https://runpod.io → Serverless → New Endpoint |
| - Docker Image: `yourusername/docgenie-handwriting:latest` |
| - GPU: RTX 4090 or A40 |
| - Port: 8080 |
| - Environment Variables: `DEVICE=cuda` |
|
|
| </details> |
| 3. **Get endpoint URL:** |
| - Copy the URL (looks like: `https://api.runpod.ai/v2/xxxxx/runsync`) |
| - This is your `HANDWRITING_SERVICE_URL` |
|
|
| #### Step 3: Deploy API to Railway |
|
|
| 1. **Install Railway CLI:** |
| ```bash |
| # Install Railway CLI |
| npm i -g @railway/cli |
| |
| # Or use curl |
| bash <(curl -fsSL cli.new) railway |
| ``` |
|
|
| 2. **Initialize Railway project:** |
| ```bash |
| cd /path/to/docgenie  # repository root (adjust to your checkout) |
| |
| # Login to Railway |
| railway login |
| |
| # Create new project |
| railway init |
| |
| # Link to project (creates railway.json) |
| railway link |
| ``` |
|
|
| 3. **Set environment variables:** |
| ```bash |
| # Set all environment variables from api/.env |
| railway variables set ANTHROPIC_API_KEY=sk-ant-xxxxx |
| railway variables set REDIS_URL=redis://default:xxxxx@xxxxx.upstash.io:6379 |
| railway variables set HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/xxxxx/runsync |
| railway variables set SUPABASE_URL=https://xxxxx.supabase.co |
| railway variables set SUPABASE_KEY=eyJxxxxx |
| |
| # Google OAuth (for token refresh only - frontend provides tokens in requests) |
| railway variables set GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com |
| railway variables set GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx |
| railway variables set GOOGLE_DRIVE_FOLDER_NAME="DocGenie Documents" |
| ``` |
|
|
| **Note:** Google access/refresh tokens are NOT environment variables! The frontend authenticates with Google OAuth, then passes `google_drive_token` and `google_drive_refresh_token` in the API request body. See [API request schema](api/schemas.py#L108-L114). |
|
|
| 4. **Deploy API + Worker:** |
| ```bash |
| # Railway will detect Dockerfile and deploy automatically |
| railway up |
| |
| # Or connect the GitHub repo in the Railway dashboard to deploy on every push |
| # (`railway connect` opens a database shell; it does not link GitHub) |
| ``` |
|
|
| 5. **Option 1: Separate Worker Service (For Production Scale):** |
| |
| *Note: Only needed if processing 50+ concurrent jobs. For most use cases, Option 2 (combined) is sufficient.* |
| |
| **Method A: Connect to Same GitHub Repo (Recommended)** |
| - Go to Railway dashboard → Your project → **New Service** |
| - Click **"GitHub Repo"** → Select your repo |
| - Name: `docgenie-worker` |
| - **Settings** → **Deploy**: |
| - Builder: `DOCKERFILE` |
| - Dockerfile Path: `Dockerfile` |
| - Root Directory: `/` (same as API) |
| - **Custom Start Command**: |
| ```bash |
| rq worker --url $REDIS_URL |
| ``` |
| - **Variables**: Add all environment variables (same as API service) |
| - **Deploy** |
| |
| **Method B: Use Same Docker Image as API** |
| - Railway dashboard → New Service → **Empty Service** |
| - Name: `docgenie-worker` |
| - **Settings** → **Source**: Link to API service's image |
| - **Custom Start Command**: `rq worker --url $REDIS_URL` |
| - **Variables**: Copy from API service |
| - **Deploy** |
| |
| 6. **Option 2: Combined API + Worker (Recommended for Getting Started):** |
| |
| Update `railway.json` to run both in one service: |
| ```json |
| { |
| "deploy": { |
| "startCommand": "uvicorn api.main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait" |
| } |
| } |
| ``` |
| |
| Then push: |
| ```bash |
| git add railway.json |
| git commit -m "feat: Run API and worker in combined service" |
| git push |
| ``` |
| |
| **Benefits:** |
| - ✅ Single service ($5/month instead of $10/month) |
| - ✅ Simpler logs and monitoring |
| - ✅ Automatic scaling together |
| - ✅ Good for 90% of use cases |
|
|
| 7. **Get API URL:** |
| - Railway dashboard → API service → Settings → Domains |
| - Generate domain (e.g., `docgenie-api.up.railway.app`) |
|
|
| #### Step 4: Update Frontend |
|
|
| Update your frontend API URL to Railway domain: |
| ```javascript |
| const API_URL = 'https://docgenie-api.up.railway.app'; |
| ``` |
|
|
| ### Option B: AWS EC2 + RunPod (For Production) |
|
|
| #### Prerequisites |
| - AWS account with EC2 access |
| - Domain name (optional, for SSL) |
|
|
| #### Step 1: Launch EC2 Instance |
|
|
| ```bash |
| # Launch a t3.medium instance (AMI IDs are region-specific; substitute a current Ubuntu AMI for your region) |
| aws ec2 run-instances \ |
| --image-id ami-0c55b159cbfafe1f0 \ |
| --instance-type t3.medium \ |
| --key-name your-key-pair \ |
| --security-group-ids sg-xxxxx \ |
| --subnet-id subnet-xxxxx |
| ``` |
|
|
| **Security Group Rules:** |
| - Port 22 (SSH) - Your IP only |
| - Port 80 (HTTP) - 0.0.0.0/0 |
| - Port 443 (HTTPS) - 0.0.0.0/0 |
| - Port 8000 (API) - 0.0.0.0/0 (only needed for direct access; can be closed once Nginx proxies ports 80/443 to it) |
|
|
| #### Step 2: Setup EC2 |
|
|
| ```bash |
| # SSH into instance |
| ssh -i your-key.pem ubuntu@your-ec2-ip |
| |
| # Update system |
| sudo apt update && sudo apt upgrade -y |
| |
| # Install Docker |
| curl -fsSL https://get.docker.com -o get-docker.sh |
| sudo sh get-docker.sh |
| sudo usermod -aG docker ubuntu |
| |
| # Install Docker Compose |
| sudo apt install docker-compose-plugin -y |
| |
| # Install Git |
| sudo apt install git -y |
| |
| # Clone repository |
| git clone https://gitlab.cs.hs-rm.de/diss_lamott/docgenie.git |
| cd docgenie |
| ``` |
|
|
| #### Step 3: Configure Environment |
|
|
| ```bash |
| # Create .env file |
| cd api |
| nano .env |
| |
| # Paste all environment variables |
| # Save: Ctrl+X, Y, Enter |
| |
| # Update REDIS_URL to use Upstash |
| # Update HANDWRITING_SERVICE_URL to RunPod endpoint |
| ``` |
|
|
| #### Step 4: Deploy with Docker Compose |
|
|
| ```bash |
| cd /home/ubuntu/docgenie |
| |
| # Start services (API + Worker + Redis) |
| # The docker-compose-plugin installed in Step 2 provides `docker compose` (with a space) |
| docker compose up -d api worker redis |
| |
| # Check logs |
| docker compose logs -f api |
| docker compose logs -f worker |
| ``` |
|
|
| #### Step 5: Setup Nginx Reverse Proxy |
|
|
| ```bash |
| # Install Nginx |
| sudo apt install nginx -y |
| |
| # Create config |
| sudo nano /etc/nginx/sites-available/docgenie |
| |
| # Paste configuration: |
| ``` |
|
|
| ```nginx |
| server { |
| listen 80; |
| server_name your-domain.com; # Or use EC2 IP |
| |
| location / { |
| proxy_pass http://localhost:8000; |
| proxy_http_version 1.1; |
| proxy_set_header Upgrade $http_upgrade; |
| proxy_set_header Connection 'upgrade'; |
| proxy_set_header Host $host; |
| proxy_cache_bypass $http_upgrade; |
| proxy_set_header X-Real-IP $remote_addr; |
| proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; |
| proxy_set_header X-Forwarded-Proto $scheme; |
| |
| # Increase timeout for long-running requests |
| proxy_read_timeout 300s; |
| proxy_connect_timeout 75s; |
| } |
| } |
| ``` |
|
|
| ```bash |
| # Enable site |
| sudo ln -s /etc/nginx/sites-available/docgenie /etc/nginx/sites-enabled/ |
| sudo nginx -t |
| sudo systemctl restart nginx |
| |
| # Optional: Setup SSL with Let's Encrypt |
| sudo apt install certbot python3-certbot-nginx -y |
| sudo certbot --nginx -d your-domain.com |
| ``` |
|
|
| #### Step 6: Setup Systemd Service (Auto-restart) |
|
|
| ```bash |
| # Create service file |
| sudo nano /etc/systemd/system/docgenie.service |
| ``` |
|
|
| ```ini |
| [Unit] |
| Description=DocGenie API |
| After=docker.service |
| Requires=docker.service |
| |
| [Service] |
| Type=oneshot |
| RemainAfterExit=yes |
| WorkingDirectory=/home/ubuntu/docgenie |
| ExecStart=/usr/bin/docker compose up -d api worker redis |
| ExecStop=/usr/bin/docker compose down |
| User=ubuntu |
| |
| [Install] |
| WantedBy=multi-user.target |
| ``` |
|
|
| ```bash |
| # Enable service |
| sudo systemctl daemon-reload |
| sudo systemctl enable docgenie |
| sudo systemctl start docgenie |
| |
| # Check status |
| sudo systemctl status docgenie |
| ``` |
|
|
| ## Testing Production Deployment |
|
|
| ### 1. Health Check |
| ```bash |
| curl https://your-domain.com/health |
| ``` |
|
|
| ### 2. Sync Generation (Fast) |
| ```bash |
| curl -X POST https://your-domain.com/generate \ |
| -H "Content-Type: application/json" \ |
| -d '{ |
| "template_name": "DocGenie", |
| "num_pages": 1 |
| }' |
| ``` |
|
|
| ### 3. Async Generation (Batched, Cheap) |
| ```bash |
| # Start async job |
| RESPONSE=$(curl -s -X POST https://your-domain.com/generate/async \ |
| -H "Content-Type: application/json" \ |
| -d '{ |
| "template_name": "DocGenie", |
| "num_pages": 2 |
| }') |
| |
| REQUEST_ID=$(echo "$RESPONSE" | jq -r '.request_id') |
| echo "Request ID: $REQUEST_ID" |
| |
| # Poll status |
| while true; do |
| STATUS=$(curl -s https://your-domain.com/jobs/$REQUEST_ID/status | jq -r '.status') |
| echo "Status: $STATUS" |
| if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then |
| break |
| fi |
| sleep 10 |
| done |
| |
| # Get result |
| curl https://your-domain.com/jobs/$REQUEST_ID/status | jq |
| ``` |
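
The shell loop above polls forever if a job never reaches a terminal state. As a hedged alternative, the same flow can be sketched in Python with an upper bound on the wait. The `/jobs/<id>/status` path and the `completed` / `failed` values come from the example above; the helper names, the 10-second interval and the one-hour limit are assumptions for illustration.

```python
import json
import time
import urllib.request


def fetch_status(base_url, request_id):
    """GET <base_url>/jobs/<request_id>/status and decode the JSON body."""
    url = f"{base_url}/jobs/{request_id}/status"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def wait_for_job(request_id, fetch, interval=10.0, max_wait=3600.0, sleep=time.sleep):
    """Poll until the job reports 'completed' or 'failed', or max_wait elapses.

    `fetch` is any callable taking a request id and returning the status dict,
    which keeps the loop testable without a live server.
    """
    waited = 0.0
    while True:
        body = fetch(request_id)
        if body.get("status") in ("completed", "failed"):
            return body
        if waited >= max_wait:
            raise TimeoutError(
                f"job {request_id} still {body.get('status')!r} after {waited:.0f}s"
            )
        sleep(interval)
        waited += interval
```

A typical call would be `wait_for_job(request_id, lambda rid: fetch_status("https://your-domain.com", rid))`.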
|
|
| ## Cost Breakdown |
|
|
| ### Railway + RunPod (Recommended) |
| | Service | Cost | Notes | |
| |---------|------|-------| |
| | Railway (API + Worker) | $5-10/month | Includes 500 hours | |
| | Upstash Redis | FREE | 10K requests/day | |
| | RunPod Serverless GPU | $0.20/hr | Only charged when active | |
| | Supabase | FREE | 500MB database | |
| | **Total** | **~$10-15/month** | + $0.20/hr GPU usage | |
|
|
| ### EC2 + RunPod |
| | Service | Cost | Notes | |
| |---------|------|-------| |
| | EC2 t3.medium | $30/month | 2 vCPU, 4GB RAM | |
| | Upstash Redis | FREE | External Redis | |
| | RunPod Serverless GPU | $0.20/hr | Only when needed | |
| | Supabase | FREE | External DB | |
| | **Total** | **~$30/month** | + $0.20/hr GPU usage | |
|
|
| ### EC2 + Dedicated GPU (Production) |
| | Service | Cost | Notes | |
| |---------|------|-------| |
| | EC2 g4dn.xlarge | $150/month | 4 vCPU, 16GB RAM, T4 GPU | |
| | Supabase | FREE | External DB | |
| | **Total** | **~$150/month** | All-in-one solution | |
|
|
| ## Maintenance |
|
|
| ### Update Deployment |
|
|
| **Railway:** |
| ```bash |
| # Push to main branch (auto-deploy) |
| git push origin main |
| |
| # Or manual deploy |
| railway up |
| ``` |
|
|
| **EC2:** |
| ```bash |
| ssh ubuntu@your-ec2-ip |
| cd docgenie |
| git pull |
| docker compose down |
| docker compose up -d --build |
| ``` |
|
|
| ### View Logs |
|
|
| **Railway:** |
| ```bash |
| railway logs |
| ``` |
|
|
| **EC2:** |
| ```bash |
| # API logs |
| docker compose logs -f api |
| |
| # Worker logs |
| docker compose logs -f worker |
| |
| # Nginx logs |
| sudo tail -f /var/log/nginx/access.log |
| sudo tail -f /var/log/nginx/error.log |
| ``` |
|
|
| ### Monitor Redis Queue |
|
|
| ```bash |
| # Connect to Redis |
| redis-cli -u $REDIS_URL |
| |
| # Check queue status |
| > LLEN rq:queue:default |
| > LRANGE rq:queue:default 0 -1 |
| ``` |
|
|
| ## Troubleshooting |
|
|
| ### Issue: Worker can't import docgenie package |
| **Solution:** Make sure the Dockerfile copies the entire monorepo and installs it with `pip install -e .` |
|
|
| ### Issue: Handwriting service connection timeout |
| **Solution:** Use RunPod's synchronous `/runsync` endpoint, not the asynchronous `/run` endpoint |
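
If the configured URL may have been copied with the wrong suffix, a guard like the one below can normalize it before use. This is a minimal sketch: `to_runsync_url` is a hypothetical helper, and it only assumes the URL shape shown earlier in this guide (`https://api.runpod.ai/v2/xxxxx/runsync`).

```python
def to_runsync_url(url):
    """Rewrite a RunPod endpoint URL ending in /run to its /runsync form."""
    trimmed = url.rstrip("/")
    if trimmed.endswith("/runsync"):
        return trimmed
    if trimmed.endswith("/run"):
        return trimmed[: -len("/run")] + "/runsync"
    raise ValueError(f"not a RunPod run/runsync URL: {url}")
```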
|
|
| ### Issue: Google token expired during job |
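| **Solution:** Ensure `GOOGLE_CLIENT_ID` and `GOOGLE_CLIENT_SECRET` are set, and that the request body includes `google_drive_refresh_token` (the refresh token is sent per request, not stored as an environment variable) |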
|
|
| ### Issue: Railway build fails (too large) |
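| **Solution:** Check that `.dockerignore` excludes the large `data/` folders (datasets, embeddings) while keeping `data/prompt_templates/` and `data/visual_element_prefabs/` |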
|
|
| ### Issue: Worker heartbeat timeout |
| **Solution:** The job is usually still running; the batched API can take 10-30 minutes to finish |
|
|
| ## Next Steps |
|
|
| 1. **Monitor costs:** Railway dashboard, RunPod usage page |
| 2. **Setup alerts:** Railway → Settings → Notifications |
| 3. **Scale workers:** Railway → Worker service → Settings → Replicas |
| 4. **Add caching:** Redis cache for generated documents |
| 5. **Setup CI/CD:** GitHub Actions → Railway auto-deploy |
|
|
| ## You're Done! |
|
|
| Your DocGenie API is now deployed with: |
| - ✅ All docgenie package imports resolved |
| - ✅ GPU handwriting service on RunPod |
| - ✅ Background workers for batched API |
| - ✅ Auto-scaling and cost optimization |
| - ✅ Google token refresh working |
| - ✅ Database schema compatibility |
|
|
| **API URL:** `https://your-domain.com` |
| **Docs:** `https://your-domain.com/docs` |
| **Health:** `https://your-domain.com/health` |
|
|
| --- |
|
|
| ## Local Testing Guide |
|
|
| ### Architecture |
|
|
| ``` |
| ┌─────────────────────────────────┐ |
| │ DocGenie API (Port 8000)        │───┐ HTTP |
| └─────────────────────────────────┘   │ localhost:8080 |
|                                       ▼ |
| ┌─────────────────────────────────┐ |
| │ Handwriting Service (Port 8080) │ |
| │ - Loads WordStylist model       │ |
| └─────────────────────────────────┘ |
| ``` |
|
|
| ### Prerequisites |
|
|
| 1. **Python environment**: `source .venv/bin/activate` |
| 2. **WordStylist Model** at `WordStylist/models/ckpt.pt` and `ema_ckpt.pt` |
| 3. **`api/.env`** with `ANTHROPIC_API_KEY`, `HANDWRITING_SERVICE_ENABLED=true`, `HANDWRITING_SERVICE_URL=http://localhost:8080` |
|
|
| ### Step-by-Step Setup |
|
|
| **Terminal 1 - Handwriting Service:** |
| ```bash |
| cd handwriting_service |
| DEVICE=cpu ./start.sh # CPU (no GPU required) |
| # DEVICE=cuda ./start.sh # GPU (faster) |
| ``` |
|
|
| **Terminal 2 - DocGenie API:** |
| ```bash |
| cd api |
| uvicorn main:app --reload |
| ``` |
|
|
| **Terminal 3 - Test:** |
| ```bash |
| curl http://localhost:8080/health # Handwriting service |
| curl http://localhost:8000/health # API |
| cd api && python test_api.py |
| ``` |
|
|
| ### Performance Notes |
| - CPU mode: ~5-10 s/word | GPU mode: ~0.5-1 s/word |
| - Service processes all words in one batch for efficiency |
|
|
| --- |
|
|
| ## Railway-Specific Configuration |
|
|
| ### Critical Issues & Fixes |
|
|
| **1. `.dockerignore` - Keep required data folders:** |
| ``` |
| !data/prompt_templates/ |
| !data/visual_element_prefabs/ |
| ``` |
|
|
| **2. `railway.json` - Start both API and worker:** |
| ```json |
| "startCommand": "cd api && uvicorn main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait" |
| ``` |
|
|
| ### Environment Variables |
|
|
| #### Required |
| ```bash |
| ANTHROPIC_API_KEY=sk-ant-api03-xxx |
| REDIS_URL=rediss://default:xxx@xxx.upstash.io:6379 |
| HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/xxxxx/runsync |
| HANDWRITING_SERVICE_ENABLED=true |
| SUPABASE_URL=https://xxx.supabase.co |
| SUPABASE_KEY=xxx |
| GOOGLE_CLIENT_ID=xxx.apps.googleusercontent.com |
| GOOGLE_CLIENT_SECRET=xxx |
| ``` |
|
|
| #### Recommended |
| ```bash |
| RUNPOD_API_KEY=xxx |
| OCR_SERVICE_ENABLED=true |
| OCR_USE_LOCAL=true |
| OCR_ENGINE=microsoft_di |
| OCR_DPI=300 |
| HANDWRITING_SERVICE_TIMEOUT=300 |
| HANDWRITING_SERVICE_MAX_RETRIES=3 |
| RQ_QUEUE_NAME=docgenie |
| LOG_LEVEL=INFO |
| ``` |
|
|
| #### Optional (defaults are fine) |
| ```bash |
| API_HOST=0.0.0.0 |
| API_PORT=8000 |
| DEBUG_MODE=false |
| CLAUDE_MODEL=claude-sonnet-4-5-20250929 |
| CORS_ORIGINS=* |
| GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents |
| TEMP_DIR=/tmp/docgenie_api |
| HANDWRITING_APPLY_BLUR=false |
| BBOX_NORMALIZATION_ENABLED=false |
| GT_VERIFICATION_ENABLED=false |
| ANALYSIS_ENABLED=false |
| DEBUG_VISUALIZATION_ENABLED=false |
| ``` |
|
|
| ### Validation Steps |
|
|
| ```bash |
| # 1. Health check |
| curl https://your-app.up.railway.app/health |
| |
| # 2. Sync generation |
| curl -X POST https://your-app.up.railway.app/api/generate \ |
| -H "Content-Type: application/json" \ |
| -d '{"document_category": "invoice", "pages": 1}' |
| |
| # 3. Async generation |
| curl -X POST https://your-app.up.railway.app/api/async/generate \ |
| -H "Content-Type: application/json" \ |
| -d '{"document_category": "invoice", "pages": 1, "google_access_token": "ya29.xxx"}' |
| ``` |
|
|
| ### Common Railway Issues |
|
|
| | Issue | Cause | Solution | |
| |-------|-------|----------| |
| | Worker not starting | Missing `rq worker` in start command | Check `railway.json` `startCommand` | |
| | Missing prompt templates | `.dockerignore` too aggressive | Add `!data/prompt_templates/` | |
| | Playwright errors | Browser not installed | Ensure `playwright install chromium` in Dockerfile | |
| | Redis connection errors | Wrong `REDIS_URL` | Verify in Railway env variables | |
| | Handwriting timeout | Batch too large | Increase `HANDWRITING_SERVICE_TIMEOUT` | |
| | Large Docker image | `data/` folders included | Check `.dockerignore` excludes datasets/embeddings | |
|
|
| --- |
|
|
| ## RunPod Batch Optimization |
|
|
| ### Problem (Old Parallel Processing) |
| Each text was sent as a separate RunPod request → N texts = N workers = N× activation cost. |
|
|
| **Example:** 10 texts → 10 workers × 18 s = 180 worker-seconds + 10× activation fees |
|
|
| ### Solution (New Batch Processing) |
| All texts sent in **one** RunPod request → 1 worker handles everything. |
|
|
| **Example:** 10 texts → 1 worker × 190 s = 190 worker-seconds + 1× activation fee |
| **Savings: ~45-60% cost reduction** (activation fees dominate RunPod pricing) |
|
|
| ### Batch Request Format (handler.py) |
|
|
| ```json |
| { |
| "input": { |
| "texts": [ |
| {"text": "Hello", "author_id": 42, "hw_id": "hw_0"}, |
| {"text": "World", "author_id": 42, "hw_id": "hw_1"} |
| ], |
| "apply_blur": true |
| } |
| } |
| ``` |
|
|
| **Response:** |
| ```json |
| { |
| "status": "COMPLETED", |
| "output": { |
| "images": [ |
| {"image_base64": "...", "width": 217, "height": 61, "text": "Hello", "author_id": 42, "hw_id": "hw_0"}, |
| {"image_base64": "...", "width": 195, "height": 58, "text": "World", "author_id": 42, "hw_id": "hw_1"} |
| ], |
| "total_generated": 2 |
| } |
| } |
| ``` |
|
|
| > **Note:** Backward-compatible: single text requests (old format) are still supported. Handler auto-detects batch vs single based on the `"texts"` key. |
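
On the client side, the batch format above can be assembled and unpacked with two small helpers. This is a sketch under stated assumptions: the function names are hypothetical, `hw_id` values are generated as `hw_0`, `hw_1`, ... to match the example, and `image_base64` is assumed to hold standard base64 of the image bytes.

```python
import base64


def build_batch_payload(items, apply_blur=True):
    """Build the {"input": {...}} body for one batched request.

    `items` is a list of (text, author_id) pairs.
    """
    texts = [
        {"text": text, "author_id": author_id, "hw_id": f"hw_{i}"}
        for i, (text, author_id) in enumerate(items)
    ]
    return {"input": {"texts": texts, "apply_blur": apply_blur}}


def images_by_hw_id(response):
    """Map hw_id to decoded image bytes from a COMPLETED batch response."""
    if response.get("status") != "COMPLETED":
        raise RuntimeError(f"batch not completed: {response.get('status')!r}")
    return {
        img["hw_id"]: base64.b64decode(img["image_base64"])
        for img in response["output"]["images"]
    }
```

Keying the results by `hw_id` lets the caller match each image to the text that requested it without relying on the order of the response.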
|
|
| ### Timeout Configuration |
| Timeout is dynamically calculated: `num_texts × 20 + 30` seconds. |
| For large batches (20+ texts), set RunPod endpoint max execution time to 600 s. |
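
The formula is easy to check by hand. The sketch below restates it in Python and compares the result with the 600 s endpoint limit mentioned above; both function names are hypothetical.

```python
def batch_timeout_seconds(num_texts, per_text=20, base=30):
    """Client-side timeout for a batch: num_texts * 20 + 30 seconds."""
    if num_texts < 1:
        raise ValueError("num_texts must be at least 1")
    return num_texts * per_text + base


def exceeds_endpoint_limit(num_texts, endpoint_limit=600):
    """True when the computed timeout is longer than the endpoint's max execution time."""
    return batch_timeout_seconds(num_texts) > endpoint_limit
```

With these defaults, 10 texts give 230 s and 25 texts give 530 s. From 29 texts onward the computed timeout (610 s) is longer than a 600 s endpoint limit, so larger batches would need a higher limit or splitting.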
|
|
| ### Cost Comparison |
|
|
| | Scenario | OLD (parallel) | NEW (batched) | Savings | |
| |----------|---------------|---------------|---------| |
| | 2 texts | 2 workers × 18 s | 1 worker × 38 s | ~50% | |
| | 10 texts | 10 workers × 18 s | 1 worker × 190 s | ~55% | |
| | 25 texts | 25 workers × 18 s | 1 worker × 480 s | ~60% | |
|
|
| ### Integration Test |
| ```bash |
| cd api |
| python test_runpod_integration.py |
| ``` |
|
|