# 🚀 DocGenie Deployment Guide

Complete guide for deploying DocGenie API + Handwriting Service to production with all interdependencies resolved.

## 📊 System Architecture

```
┌──────────────────────────────────────────────────────────────┐
│                            Client                            │
└─────────────────┬────────────────────────────────────────────┘
                  │
                  ▼
┌──────────────────────────────────────────────────────────────┐
│                        Railway (CPU)                         │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ DocGenie API (Port 8000)                               │  │
│  │ - FastAPI server                                       │  │
│  │ - Imports: docgenie.generation.*                       │  │
│  │ - Endpoints: /generate, /generate/pdf, /generate/async │  │
│  └──────────────┬─────────────────────────────────────────┘  │
│                 │                                            │
│  ┌──────────────▼─────────────────────────────────────────┐  │
│  │ Background Worker                                      │  │
│  │ - RQ worker (Redis Queue)                              │  │
│  │ - ClaudeBatchedClient (50% cost savings)               │  │
│  │ - Imports: docgenie.generation.*                       │  │
│  └──────────────┬─────────────────────────────────────────┘  │
└─────────────────┼────────────────────────────────────────────┘
                  │
         ┌────────┴───────────┬───────────────────┐
         │                    │                   │
         ▼                    ▼                   ▼
┌─────────────────┐ ┌──────────────────┐ ┌────────────────┐
│ Redis (Upstash) │ │ Supabase         │ │ Google Drive   │
│ - Job queue     │ │ - PostgreSQL     │ │ - File storage │
│ - Free tier     │ │ - Document DB    │ │ - OAuth 2.0    │
└─────────────────┘ └──────────────────┘ └────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────┐
│                   RunPod Serverless (GPU)                    │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Handwriting Service (Port 8080)                        │  │
│  │ - WordStylist diffusion model                          │  │
│  │ - PyTorch + CUDA 11.8                                  │  │
│  │ - NO docgenie imports (standalone)                     │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
```

## 🔗 Dependency Resolution

### ✅ Problem: API imports from docgenie package

**Solution:** Deploy entire monorepo, install as package with `pip install -e .`

**API Service imports:**
```python
# api/worker.py
from docgenie.generation.pipeline_01.claude_batching import ClaudeBatchedClient
from docgenie import ENV

# api/utils.py
from docgenie.generation.constants import BS_PARSER, HANDWRITING_CLASS_NAME
from docgenie.generation.pipeline_01.claude_batching import create_message
from docgenie.generation.pipeline_03_process_response import process_response
from docgenie.generation.pipeline_04_render_pdf_and_extract_geos import render_pdf
```

**Dockerfile solution:**

```dockerfile
# Copy entire monorepo
COPY . .

# Install as editable package
RUN pip install -e .

# Install API requirements
RUN pip install -r api/requirements.txt
```

### ✅ Handwriting Service is Independent

**No docgenie imports!** Can be deployed standalone.

```python
# handwriting_service/main.py - NO docgenie imports
from handwriting_service.inference import HandwritingGenerator
from handwriting_service.models import HandwritingRequest
```

## 📦 Pre-Deployment Checklist

### 1. Environment Variables

Create `api/.env` with all required variables:

```bash
# Claude API
ANTHROPIC_API_KEY=sk-ant-xxxxx

# Redis (will be replaced with Upstash URL)
REDIS_URL=redis://localhost:6379

# Handwriting Service
HANDWRITING_SERVICE_URL=http://localhost:8080

# Supabase
SUPABASE_URL=https://xxxxx.supabase.co
SUPABASE_KEY=eyJxxxxx

# Google Drive (for token refresh only)
# The frontend handles OAuth and sends tokens in API requests
# These credentials are only needed to refresh expired tokens during long jobs
GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx
GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents
```

### 2. Test Locally First

```bash
# Terminal 1: Start Redis
docker run -p 6379:6379 redis:7-alpine

# Terminal 2: Start Handwriting Service
cd handwriting_service
DEVICE=cpu uvicorn main:app --port 8080

# Terminal 3: Start API
cd api
source ../.venv/bin/activate
uvicorn main:app --reload --port 8000

# Terminal 4: Start Worker
cd api
source ../.venv/bin/activate
python worker.py
```

Test endpoints:

```bash
# Health check
curl http://localhost:8000/health

# Async generation (uses batched API)
curl -X POST http://localhost:8000/generate/async \
  -H "Content-Type: application/json" \
  -d '{"template_name": "DocGenie", "num_pages": 2}'
```

## 🚢 Deployment Steps

### Option A: Railway + RunPod (RECOMMENDED - $10/month)

#### Step 1: Deploy Redis to Upstash (FREE)

1. Go to https://upstash.com
2. Create account → New Redis Database
3. Copy the Redis connection URL, not the `UPSTASH_REDIS_REST_URL` (the REST URL is an `https://` endpoint that the RQ worker cannot use). It looks like: `rediss://default:xxxxx@xxxxx.upstash.io:6379`

#### Step 2: Deploy Handwriting Service to RunPod

**Option A: Build from Git Repository (RECOMMENDED - No Docker Hub needed!)**

This builds directly on RunPod's servers, so you do not have to upload a 10GB image over your own connection.

1. **Prepare and push code to Git:**
   ```bash
   cd /media/ahad-hassan/Volume_E/FYP/FYP/docgenie

   # First, prepare optimized WordStylist (removes 432MB of unnecessary files)
   cd handwriting_service
   ./prepare_build.sh
   cd ..

   # Now commit the optimized WordStylist
   git add handwriting_service/
   git status  # Verify WordStylist is included (should show WordStylist/models/ema_ckpt.pt, etc.)
   git commit -m "Add handwriting service with optimized WordStylist"
   git push origin main
   ```

2. **Deploy to RunPod:**
   - Go to https://runpod.io → Serverless → New Endpoint
   - Click "Build from Git" (not Docker Image)
   - Settings:
     - Name: `docgenie-handwriting`
     - Git URL: `https://github.com/Ahadhassan-2003/FYP.git`
     - Git Branch: `main`
     - Docker Build Context: `docgenie/handwriting_service`
     - Dockerfile Path: `Dockerfile`
     - GPU: RTX 4090 or A40
     - Container Disk: 15GB
     - Max Workers: 1
     - Idle Timeout: 5 seconds
     - Exposed Port: 8080
   - Environment Variables:
     ```
     DEVICE=cuda
     PYTHONUNBUFFERED=1
     ```
   - Build Args (prepare WordStylist):
     ```
     PREPARE_WORDSTYLIST=true
     ```
   - Click "Deploy"

   RunPod will clone your repo and build the image on its own servers.

**Option B: Pre-built Docker Image (if Git unavailable)**
Docker Hub method:

```bash
cd handwriting_service

# Prepare optimized build (removes 432MB)
./prepare_build.sh

# Login to Docker Hub
docker login

# Build image
docker buildx build --platform linux/amd64 \
  -t yourusername/docgenie-handwriting:latest \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  .

# Push to Docker Hub (may take 20-30 minutes for 10GB)
docker push yourusername/docgenie-handwriting:latest
```

Then deploy on RunPod:
- Go to https://runpod.io → Serverless → New Endpoint
- Docker Image: `yourusername/docgenie-handwriting:latest`
- GPU: RTX 4090 or A40
- Port: 8080
- Environment Variables: `DEVICE=cuda`
3. **Get endpoint URL:**
   - Copy the URL (looks like: `https://api.runpod.ai/v2/xxxxx/runsync`)
   - This is your `HANDWRITING_SERVICE_URL`

#### Step 3: Deploy API to Railway

1. **Install Railway CLI:**
   ```bash
   # Install Railway CLI
   npm i -g @railway/cli

   # Or use curl
   bash <(curl -fsSL cli.new) railway
   ```

2. **Initialize Railway project:**
   ```bash
   cd /media/ahad-hassan/Volume_E/FYP/FYP/docgenie

   # Login to Railway
   railway login

   # Create new project
   railway init

   # Link to project (creates railway.json)
   railway link
   ```

3. **Set environment variables:**
   ```bash
   # Set all environment variables from api/.env
   railway variables set ANTHROPIC_API_KEY=sk-ant-xxxxx
   railway variables set REDIS_URL=rediss://default:xxxxx@xxxxx.upstash.io:6379
   railway variables set HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/xxxxx/runsync
   railway variables set SUPABASE_URL=https://xxxxx.supabase.co
   railway variables set SUPABASE_KEY=eyJxxxxx

   # Google OAuth (for token refresh only - frontend provides tokens in requests)
   railway variables set GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com
   railway variables set GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx
   railway variables set GOOGLE_DRIVE_FOLDER_NAME="DocGenie Documents"
   ```

   **Note:** Google access/refresh tokens are NOT environment variables! The frontend authenticates with Google OAuth, then passes `google_drive_token` and `google_drive_refresh_token` in the API request body. See [API request schema](api/schemas.py#L108-L114).

4. **Deploy API + Worker:**
   ```bash
   # Railway will detect Dockerfile and deploy automatically
   railway up
   ```
   Alternatively, connect the GitHub repository to the service in the Railway dashboard and deploy from there. Note that the `railway connect` CLI command opens a database shell; it does not link a repository.

5. **Option 1: Separate Worker Service (For Production Scale):**

   *Note: Only needed if processing 50+ concurrent jobs.
For most use cases, Option 2 (combined) is sufficient.*

   **Method A: Connect to Same GitHub Repo (Recommended)**
   - Go to Railway dashboard → Your project → **New Service**
   - Click **"GitHub Repo"** → Select your repo
   - Name: `docgenie-worker`
   - **Settings** → **Deploy**:
     - Builder: `DOCKERFILE`
     - Dockerfile Path: `Dockerfile`
     - Root Directory: `/` (same as API)
   - **Custom Start Command**:
     ```bash
     rq worker --url $REDIS_URL
     ```
   - **Variables**: Add all environment variables (same as API service)
   - **Deploy**

   **Method B: Use Same Docker Image as API**
   - Railway dashboard → New Service → **Empty Service**
   - Name: `docgenie-worker`
   - **Settings** → **Source**: Link to API service's image
   - **Custom Start Command**: `rq worker --url $REDIS_URL`
   - **Variables**: Copy from API service
   - **Deploy**

6. **Option 2: Combined API + Worker (Recommended for Getting Started):**

   Update `railway.json` to run both in one service:
   ```json
   {
     "deploy": {
       "startCommand": "uvicorn api.main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait"
     }
   }
   ```

   Then push:
   ```bash
   git add railway.json
   git commit -m "feat: Run API and worker in combined service"
   git push
   ```

   **Benefits:**
   - ✅ Single service ($5/month instead of $10/month)
   - ✅ Simpler logs and monitoring
   - ✅ Automatic scaling together
   - ✅ Good for 90% of use cases

7. **Get API URL:**
   - Railway dashboard → API service → Settings → Domains
   - Generate domain (e.g., `docgenie-api.up.railway.app`)

#### Step 4: Update Frontend

Update your frontend API URL to the Railway domain:

```javascript
const API_URL = 'https://docgenie-api.up.railway.app';
```

### Option B: AWS EC2 + RunPod (For Production)

#### Prerequisites

- AWS account with EC2 access
- Domain name (optional, for SSL)

#### Step 1: Launch EC2 Instance

```bash
# Launch t3.medium instance
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type t3.medium \
  --key-name your-key-pair \
  --security-group-ids sg-xxxxx \
  --subnet-id subnet-xxxxx
```

**Security Group Rules:**
- Port 22 (SSH) - Your IP only
- Port 80 (HTTP) - 0.0.0.0/0
- Port 443 (HTTPS) - 0.0.0.0/0
- Port 8000 (API) - 0.0.0.0/0

#### Step 2: Setup EC2

```bash
# SSH into instance
ssh -i your-key.pem ubuntu@your-ec2-ip

# Update system
sudo apt update && sudo apt upgrade -y

# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu

# Install Docker Compose
sudo apt install docker-compose-plugin -y

# Install Git
sudo apt install git -y

# Clone repository
git clone https://gitlab.cs.hs-rm.de/diss_lamott/docgenie.git
cd docgenie
```

#### Step 3: Configure Environment

```bash
# Create .env file
cd api
nano .env

# Paste all environment variables
# Save: Ctrl+X, Y, Enter

# Update REDIS_URL to use Upstash
# Update HANDWRITING_SERVICE_URL to RunPod endpoint
```

#### Step 4: Deploy with Docker Compose

```bash
cd /home/ubuntu/docgenie

# The docker-compose-plugin package installed in Step 2 provides the
# `docker compose` subcommand (with a space), not a `docker-compose` binary.

# Start services (API + Worker + Redis)
docker compose up -d api worker redis

# Check logs
docker compose logs -f api
docker compose logs -f worker
```

#### Step 5: Setup Nginx Reverse Proxy

```bash
# Install Nginx
sudo apt install nginx -y

# Create config
sudo nano /etc/nginx/sites-available/docgenie

# Paste configuration:
```

```nginx
server {
    listen 80;
    server_name your-domain.com;  # Or use EC2 IP

    location / {
        proxy_pass http://localhost:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Increase timeout for long-running requests
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;
    }
}
```

```bash
# Enable site
sudo ln -s /etc/nginx/sites-available/docgenie /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl restart nginx

# Optional: Setup SSL with Let's Encrypt
sudo apt install certbot python3-certbot-nginx -y
sudo certbot --nginx -d your-domain.com
```

#### Step 6: Setup Systemd Service (Auto-restart)

```bash
# Create service file
sudo nano /etc/systemd/system/docgenie.service
```

```ini
[Unit]
Description=DocGenie API
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/home/ubuntu/docgenie
ExecStart=/usr/bin/docker compose up -d api worker redis
ExecStop=/usr/bin/docker compose down
User=ubuntu

[Install]
WantedBy=multi-user.target
```

```bash
# Enable service
sudo systemctl daemon-reload
sudo systemctl enable docgenie
sudo systemctl start docgenie

# Check status
sudo systemctl status docgenie
```

## 🧪 Testing Production Deployment

### 1. Health Check

```bash
curl https://your-domain.com/health
```

### 2. Sync Generation (Fast)

```bash
curl -X POST https://your-domain.com/generate \
  -H "Content-Type: application/json" \
  -d '{
    "template_name": "DocGenie",
    "num_pages": 1
  }'
```

### 3. Async Generation (Batched, Cheap)

```bash
# Start async job (-s keeps curl's progress output out of the terminal)
RESPONSE=$(curl -s -X POST https://your-domain.com/generate/async \
  -H "Content-Type: application/json" \
  -d '{
    "template_name": "DocGenie",
    "num_pages": 2
  }')

REQUEST_ID=$(echo "$RESPONSE" | jq -r '.request_id')
echo "Request ID: $REQUEST_ID"

# Poll status
while true; do
  STATUS=$(curl -s https://your-domain.com/jobs/$REQUEST_ID/status | jq -r '.status')
  echo "Status: $STATUS"

  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
    break
  fi

  sleep 10
done

# Get result
curl https://your-domain.com/jobs/$REQUEST_ID/status | jq
```

## 📊 Cost Breakdown

### Railway + RunPod (Recommended)

| Service | Cost | Notes |
|---------|------|-------|
| Railway (API + Worker) | $5-10/month | Includes 500 hours |
| Upstash Redis | FREE | 10K requests/day |
| RunPod Serverless GPU | $0.20/hr | Only charged when active |
| Supabase | FREE | 500MB database |
| **Total** | **~$10-15/month** | + $0.20/hr GPU usage |

### EC2 + RunPod

| Service | Cost | Notes |
|---------|------|-------|
| EC2 t3.medium | $30/month | 2 vCPU, 4GB RAM |
| Upstash Redis | FREE | External Redis |
| RunPod Serverless GPU | $0.20/hr | Only when needed |
| Supabase | FREE | External DB |
| **Total** | **~$30/month** | + $0.20/hr GPU usage |

### EC2 + Dedicated GPU (Production)

| Service | Cost | Notes |
|---------|------|-------|
| EC2 g4dn.xlarge | $150/month | 4 vCPU, 16GB RAM, T4 GPU |
| Supabase | FREE | External DB |
| **Total** | **~$150/month** | All-in-one solution |

## 🔧 Maintenance

### Update Deployment

**Railway:**
```bash
# Push to main branch (auto-deploy)
git push origin main

# Or manual deploy
railway up
```

**EC2:**
```bash
ssh ubuntu@your-ec2-ip
cd docgenie
git pull
docker compose down
docker compose up -d --build
```

### View Logs

**Railway:**
```bash
railway logs
```

**EC2:**
```bash
# API logs
docker compose logs -f api

# Worker logs
docker compose logs -f worker

# Nginx logs
sudo tail -f /var/log/nginx/access.log
sudo tail -f /var/log/nginx/error.log
```

### Monitor Redis Queue

```bash
# Connect to Redis
redis-cli -u $REDIS_URL

# Check queue status
> LLEN rq:queue:default
> LRANGE rq:queue:default 0 -1
```

## 🚨 Troubleshooting

### Issue: Worker can't import docgenie package
**Solution:** Make sure the Dockerfile installs the entire monorepo with `pip install -e .`

### Issue: Handwriting service connection timeout
**Solution:** Use RunPod's synchronous `/runsync` endpoint, not the asynchronous `/run` endpoint

### Issue: Google token expired during job
**Solution:** Ensure `GOOGLE_CLIENT_ID` and `GOOGLE_CLIENT_SECRET` are set, and that the request body includes `google_drive_refresh_token` (the refresh token is sent by the frontend, not stored as an environment variable)

### Issue: Railway build fails (too large)
**Solution:** Check that `.dockerignore` excludes the `data/` folders

### Issue: Worker heartbeat timeout
**Solution:** The job is still running; the batched API takes 10-30 minutes

## 📚 Next Steps

1. **Monitor costs:** Railway dashboard, RunPod usage page
2. **Setup alerts:** Railway → Settings → Notifications
3. **Scale workers:** Railway → Worker service → Settings → Replicas
4. **Add caching:** Redis cache for generated documents
5. **Setup CI/CD:** GitHub Actions → Railway auto-deploy

## 🎉 You're Done!
Your DocGenie API is now deployed with:
- ✅ All docgenie package imports resolved
- ✅ GPU handwriting service on RunPod
- ✅ Background workers for batched API
- ✅ Auto-scaling and cost optimization
- ✅ Google token refresh working
- ✅ Database schema compatibility

**API URL:** `https://your-domain.com`
**Docs:** `https://your-domain.com/docs`
**Health:** `https://your-domain.com/health`

---

## 🖥️ Local Testing Guide

### Architecture

```
┌─────────────────────────────────┐
│ DocGenie API (Port 8000)        │──┐ HTTP
└─────────────────────────────────┘  │ localhost:8080
                                     ▼
┌─────────────────────────────────┐
│ Handwriting Service (Port 8080) │
│ - Loads WordStylist model       │
└─────────────────────────────────┘
```

### Prerequisites

1. **Python environment**: `source .venv/bin/activate`
2. **WordStylist Model** at `WordStylist/models/ckpt.pt` and `ema_ckpt.pt`
3. **`api/.env`** with `ANTHROPIC_API_KEY`, `HANDWRITING_SERVICE_ENABLED=true`, `HANDWRITING_SERVICE_URL=http://localhost:8080`

### Step-by-Step Setup

**Terminal 1 – Handwriting Service:**
```bash
cd handwriting_service
DEVICE=cpu ./start.sh     # CPU (no GPU required)
# DEVICE=cuda ./start.sh  # GPU (faster)
```

**Terminal 2 – DocGenie API:**
```bash
cd api
uvicorn main:app --reload
```

**Terminal 3 – Test:**
```bash
curl http://localhost:8080/health  # Handwriting service
curl http://localhost:8000/health  # API
cd api && python test_api.py
```

### Performance Notes

- CPU mode: ~5–10 s/word | GPU mode: ~0.5–1 s/word
- Service processes all words in one batch for efficiency

---

## ⚙️ Railway-Specific Configuration

### Critical Issues & Fixes

**1. `.dockerignore` – Keep required data folders:**
```
!data/prompt_templates/
!data/visual_element_prefabs/
```

**2. `railway.json` – Start both API and worker:**
```json
"startCommand": "cd api && uvicorn main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait"
```

### Environment Variables

#### 🔴 Required
```bash
ANTHROPIC_API_KEY=sk-ant-api03-xxx
REDIS_URL=rediss://default:xxx@xxx.upstash.io:6379
HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/ht9ajgrduitgpr/runsync
HANDWRITING_SERVICE_ENABLED=true
SUPABASE_URL=https://xxx.supabase.co
SUPABASE_KEY=xxx
GOOGLE_CLIENT_ID=xxx.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=xxx
```

#### 🟡 Recommended
```bash
RUNPOD_API_KEY=xxx
OCR_SERVICE_ENABLED=true
OCR_USE_LOCAL=true
OCR_ENGINE=microsoft_di
OCR_DPI=300
HANDWRITING_SERVICE_TIMEOUT=300
HANDWRITING_SERVICE_MAX_RETRIES=3
RQ_QUEUE_NAME=docgenie
LOG_LEVEL=INFO
```

#### 🟢 Optional (defaults are fine)
```bash
API_HOST=0.0.0.0
API_PORT=8000
DEBUG_MODE=false
CLAUDE_MODEL=claude-sonnet-4-5-20250929
CORS_ORIGINS=*
GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents
TEMP_DIR=/tmp/docgenie_api
HANDWRITING_APPLY_BLUR=false
BBOX_NORMALIZATION_ENABLED=false
GT_VERIFICATION_ENABLED=false
ANALYSIS_ENABLED=false
DEBUG_VISUALIZATION_ENABLED=false
```

### Validation Steps

```bash
# 1. Health check
curl https://your-app.up.railway.app/health

# 2. Sync generation
curl -X POST https://your-app.up.railway.app/api/generate \
  -H "Content-Type: application/json" \
  -d '{"document_category": "invoice", "pages": 1}'

# 3. Async generation
curl -X POST https://your-app.up.railway.app/api/async/generate \
  -H "Content-Type: application/json" \
  -d '{"document_category": "invoice", "pages": 1, "google_access_token": "ya29.xxx"}'
```

### Common Railway Issues

| Issue | Cause | Solution |
|-------|-------|----------|
| Worker not starting | Missing `rq worker` in start command | Check `railway.json` `startCommand` |
| Missing prompt templates | `.dockerignore` too aggressive | Add `!data/prompt_templates/` |
| Playwright errors | Browser not installed | Ensure `playwright install chromium` in Dockerfile |
| Redis connection errors | Wrong `REDIS_URL` | Verify in Railway env variables |
| Handwriting timeout | Batch too large | Increase `HANDWRITING_SERVICE_TIMEOUT` |
| Large Docker image | `data/` folders included | Check `.dockerignore` excludes datasets/embeddings |

---

## ⚡ RunPod Batch Optimization

### Problem (Old Parallel Processing)

Each text was sent as a separate RunPod request → N texts = N workers = N× activation cost.

**Example:** 10 texts → 10 workers × 18 s = 180 worker-seconds + 10× activation fees

### Solution (New Batch Processing)

All texts sent in **one** RunPod request → 1 worker handles everything.

**Example:** 10 texts → 1 worker × 190 s = 190 worker-seconds + 1× activation fee

**Savings: ~45–60% cost reduction** (activation fees dominate RunPod pricing)

### Batch Request Format (handler.py)

```json
{
  "input": {
    "texts": [
      {"text": "Hello", "author_id": 42, "hw_id": "hw_0"},
      {"text": "World", "author_id": 42, "hw_id": "hw_1"}
    ],
    "apply_blur": true
  }
}
```

**Response:**
```json
{
  "status": "COMPLETED",
  "output": {
    "images": [
      {"image_base64": "...", "width": 217, "height": 61, "text": "Hello", "author_id": 42, "hw_id": "hw_0"},
      {"image_base64": "...", "width": 195, "height": 58, "text": "World", "author_id": 42, "hw_id": "hw_1"}
    ],
    "total_generated": 2
  }
}
```

> **Note:** Backward-compatible – single text requests (old format) are still supported.
Handler auto-detects batch vs single based on the `"texts"` key.

### Timeout Configuration

Timeout is dynamically calculated: `num_texts × 20 + 30` seconds. For large batches (20+ texts), set RunPod endpoint max execution time to 600 s.

### Cost Comparison

| Scenario | OLD (parallel) | NEW (batched) | Savings |
|----------|----------------|---------------|---------|
| 2 texts | 2 workers × 18 s | 1 worker × 38 s | ~50% |
| 10 texts | 10 workers × 18 s | 1 worker × 190 s | ~55% |
| 25 texts | 25 workers × 18 s | 1 worker × 480 s | ~60% |

### Integration Test

```bash
cd api
python test_runpod_integration.py
```
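
### Sketch: Building a Batch Request and Its Timeout

To make the batch request format and the timeout rule above concrete, here is a minimal sketch that builds a payload in the shape shown under "Batch Request Format" and applies the `num_texts × 20 + 30` formula. The function names `build_batch_payload` and `batch_timeout_seconds` are hypothetical and not taken from the DocGenie codebase, and the `hw_{i}` ID scheme is only assumed from the example payload.

```python
import json


def batch_timeout_seconds(num_texts: int) -> int:
    """Timeout rule quoted in this guide: num_texts x 20 + 30 seconds."""
    return num_texts * 20 + 30


def build_batch_payload(texts, author_id, apply_blur=True):
    """Build a request body in the batch format shown for handler.py.

    Hypothetical helper: hw_id values follow the 'hw_<index>' pattern
    seen in the example payload, which is an assumption.
    """
    return {
        "input": {
            "texts": [
                {"text": text, "author_id": author_id, "hw_id": f"hw_{i}"}
                for i, text in enumerate(texts)
            ],
            "apply_blur": apply_blur,
        }
    }


payload = build_batch_payload(["Hello", "World"], author_id=42)
timeout = batch_timeout_seconds(len(payload["input"]["texts"]))

print(json.dumps(payload, indent=2))
print(f"timeout: {timeout} s")  # 2 texts -> 2 * 20 + 30 = 70 s
```

With this rule a 25-text batch gets 530 s, which is why the 600 s endpoint limit mentioned above leaves some headroom for large batches.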