# 🚀 DocGenie Deployment Guide

Complete guide for deploying DocGenie API + Handwriting Service to production with all interdependencies resolved.

## 📊 System Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                         Client                               │
└────────────────────┬────────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────────┐
│                    Railway (CPU)                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  DocGenie API (Port 8000)                           │   │
│  │  - FastAPI server                                     │   │
│  │  - Imports: docgenie.generation.*                     │   │
│  │  - Endpoints: /generate, /generate/pdf, /generate/async│  │
│  └──────────────┬───────────────────────────────────────┘   │
│                 │                                            │
│  ┌──────────────▼───────────────────────────────────────┐   │
│  │  Background Worker                                    │   │
│  │  - RQ worker (Redis Queue)                           │   │
│  │  - ClaudeBatchedClient (50% cost savings)            │   │
│  │  - Imports: docgenie.generation.*                     │   │
│  └──────────────┬───────────────────────────────────────┘   │
└─────────────────┼────────────────────────────────────────────┘
                  │
        ┌─────────┴──────────┬──────────────┐
        │                    │              │
        ▼                    ▼              ▼
┌───────────────┐  ┌──────────────────┐  ┌──────────────┐
│ Redis (Upstash)│  │ Supabase         │  │ Google Drive │
│ - Job queue    │  │ - PostgreSQL     │  │ - File storage│
│ - Free tier    │  │ - Document DB    │  │ - OAuth 2.0  │
└───────────────┘  └──────────────────┘  └──────────────┘
        │
        ▼
┌─────────────────────────────────────────────────────────────┐
│             RunPod Serverless (GPU)                          │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Handwriting Service (Port 8080)                     │   │
│  │  - WordStylist diffusion model                        │   │
│  │  - PyTorch + CUDA 11.8                                │   │
│  │  - NO docgenie imports (standalone)                   │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
```

## 🔗 Dependency Resolution

### ✅ Problem: API imports from docgenie package
**Solution:** Deploy entire monorepo, install as package with `pip install -e .`

**API Service imports:**
```python
# api/worker.py
from docgenie.generation.pipeline_01.claude_batching import ClaudeBatchedClient
from docgenie import ENV

# api/utils.py
from docgenie.generation.constants import BS_PARSER, HANDWRITING_CLASS_NAME
from docgenie.generation.pipeline_01.claude_batching import create_message
from docgenie.generation.pipeline_03_process_response import process_response
from docgenie.generation.pipeline_04_render_pdf_and_extract_geos import render_pdf
```

**Dockerfile solution:**
```dockerfile
# Copy entire monorepo
COPY . .

# Install as editable package
RUN pip install -e .

# Install API requirements
RUN pip install -r api/requirements.txt
```
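
To confirm that the editable install really made the package importable inside the image, a small smoke check can be run in the container. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the module list simply repeats some of the imports shown above.

```python
import importlib.util

# Module names copied from the API import list above.
REQUIRED_MODULES = [
    "docgenie",
    "docgenie.generation.constants",
    "docgenie.generation.pipeline_01.claude_batching",
]


def find_missing(module_names):
    """Return the module names that cannot be located on the import path."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except ImportError:
            # Raised when a parent package of a dotted name is absent or broken.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

Calling `find_missing(REQUIRED_MODULES)` inside the built container should return an empty list if `pip install -e .` succeeded; any names it returns point at modules the image cannot import.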

### ✅ Handwriting Service is Independent
**No docgenie imports!** Can be deployed standalone.

```python
# handwriting_service/main.py - NO docgenie imports
from handwriting_service.inference import HandwritingGenerator
from handwriting_service.models import HandwritingRequest
```

## 📦 Pre-Deployment Checklist

### 1. Environment Variables
Create `api/.env` with all required variables:

```bash
# Claude API
ANTHROPIC_API_KEY=sk-ant-xxxxx

# Redis (will be replaced with Upstash URL)
REDIS_URL=redis://localhost:6379

# Handwriting Service
HANDWRITING_SERVICE_URL=http://localhost:8080

# Supabase
SUPABASE_URL=https://xxxxx.supabase.co
SUPABASE_KEY=eyJxxxxx

# Google Drive (for token refresh only)
# The frontend handles OAuth and sends tokens in API requests
# These credentials are only needed to refresh expired tokens during long jobs
GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx
GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents
```
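
A quick pre-flight check can catch missing variables before a deploy. This is a sketch under the assumption that the seven variables below are the required ones (`GOOGLE_DRIVE_FOLDER_NAME` is treated as optional); the helper name `missing_variables` is hypothetical.

```python
import os

# Variable names taken from the api/.env example above.
REQUIRED_VARIABLES = (
    "ANTHROPIC_API_KEY",
    "REDIS_URL",
    "HANDWRITING_SERVICE_URL",
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_SECRET",
)


def missing_variables(env=None, required=REQUIRED_VARIABLES):
    """Return the required variable names that are unset or blank in env."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
```

Running `missing_variables()` with the environment loaded returns an empty list when everything is set, which makes it easy to fail fast in a startup script.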

### 2. Test Locally First
```bash
# Terminal 1: Start Redis
docker run -p 6379:6379 redis:7-alpine

# Terminal 2: Start Handwriting Service
cd handwriting_service
DEVICE=cpu uvicorn main:app --port 8080

# Terminal 3: Start API
cd api
source ../.venv/bin/activate
uvicorn main:app --reload --port 8000

# Terminal 4: Start Worker
cd api
source ../.venv/bin/activate
python worker.py
```

Test endpoints:
```bash
# Health check
curl http://localhost:8000/health

# Async generation (uses batched API)
curl -X POST http://localhost:8000/generate/async \
  -H "Content-Type: application/json" \
  -d '{"template_name": "DocGenie", "num_pages": 2}'
```

## 🚢 Deployment Steps

### Option A: Railway + RunPod (RECOMMENDED - $10/month)

#### Step 1: Deploy Redis to Upstash (FREE)

1. Go to https://upstash.com
2. Create account → New Redis Database
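3. Copy the Redis connection URL, not `UPSTASH_REDIS_REST_URL` (the REST URL is an HTTPS endpoint that RQ cannot use). It looks like `redis://default:xxxxx@xxxxx.upstash.io:6379`, or `rediss://...` when TLS is enabled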
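
Because the worker talks to Redis over the Redis protocol, the value that ends up in `REDIS_URL` should use the `redis://` or `rediss://` scheme rather than an HTTPS address. A minimal sanity check, with the hypothetical helper name `check_redis_url`, might look like this:

```python
from urllib.parse import urlparse


def check_redis_url(url):
    """Return a list of problems found in a Redis connection URL (empty if none)."""
    parsed = urlparse(url)
    problems = []
    if parsed.scheme not in ("redis", "rediss"):
        problems.append(
            f"unexpected scheme {parsed.scheme!r}; expected redis:// or rediss://"
        )
    if not parsed.hostname:
        problems.append("missing hostname")
    if parsed.port is None:
        problems.append("missing port")
    return problems
```

An empty result means the URL at least has the right shape; it does not prove that the credentials are valid.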

#### Step 2: Deploy Handwriting Service to RunPod

**Option A: Build from Git Repository (RECOMMENDED - No Docker Hub needed!)**

This builds the image directly on RunPod's servers, so you do not have to upload a ~10GB image over your own connection.

1. **Prepare and push code to Git:**
```bash
cd /path/to/docgenie  # repository root on your machine

# First, prepare optimized WordStylist (removes 432MB of unnecessary files)
cd handwriting_service
./prepare_build.sh
cd ..

# Now commit the optimized WordStylist
git add handwriting_service/
git status  # Verify WordStylist is included (should show WordStylist/models/ema_ckpt.pt, etc.)
git commit -m "Add handwriting service with optimized WordStylist"
git push origin main
```

2. **Deploy to RunPod:**
   - Go to https://runpod.io → Serverless → New Endpoint
   - Click "Build from Git" (not Docker Image)
   - Settings:
     - Name: `docgenie-handwriting`
     - Git URL: `https://github.com/Ahadhassan-2003/FYP.git`
     - Git Branch: `main`
     - Docker Build Context: `docgenie/handwriting_service`
     - Dockerfile Path: `Dockerfile`
     - GPU: RTX 4090 or A40
     - Container Disk: 15GB
     - Max Workers: 1
     - Idle Timeout: 5 seconds
     - Exposed Port: 8080
   - Environment Variables:
     ```
     DEVICE=cuda
     PYTHONUNBUFFERED=1
     ```
   - Build Args (prepare WordStylist):
     ```
     PREPARE_WORDSTYLIST=true
     ```
   - Click "Deploy"

RunPod will clone your repo and build the image on their fast servers!

**Option B: Pre-built Docker Image (if Git unavailable)**

<details>
<summary>Click to expand Docker Hub method</summary>

```bash
cd handwriting_service

# Prepare optimized build (removes 432MB)
./prepare_build.sh

# Login to Docker Hub
docker login

# Build image
docker buildx build --platform linux/amd64 \
  -t yourusername/docgenie-handwriting:latest \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  .

# Push to Docker Hub (may take 20-30 minutes for 10GB)
docker push yourusername/docgenie-handwriting:latest
```

Then deploy on RunPod:
   - Go to https://runpod.io → Serverless → New Endpoint
   - Docker Image: `yourusername/docgenie-handwriting:latest`
   - GPU: RTX 4090 or A40
   - Port: 8080
   - Environment Variables: `DEVICE=cuda`

</details>

3. **Get endpoint URL:**
   - Copy the URL (looks like: `https://api.runpod.ai/v2/xxxxx/runsync`)
   - This is your `HANDWRITING_SERVICE_URL`

#### Step 3: Deploy API to Railway

1. **Install Railway CLI:**
```bash
# Install Railway CLI
npm i -g @railway/cli

# Or use curl
bash <(curl -fsSL cli.new) railway
```

2. **Initialize Railway project:**
```bash
cd /path/to/docgenie  # repository root on your machine

# Login to Railway
railway login

# Create new project
railway init

# Link to project (creates railway.json)
railway link
```

3. **Set environment variables:**
```bash
# Set all environment variables from api/.env
railway variables set ANTHROPIC_API_KEY=sk-ant-xxxxx
railway variables set REDIS_URL=redis://default:xxxxx@xxxxx.upstash.io:6379
railway variables set HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/xxxxx/runsync
railway variables set SUPABASE_URL=https://xxxxx.supabase.co
railway variables set SUPABASE_KEY=eyJxxxxx

# Google OAuth (for token refresh only - frontend provides tokens in requests)
railway variables set GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com
railway variables set GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx
railway variables set GOOGLE_DRIVE_FOLDER_NAME="DocGenie Documents"
```

**Note:** Google access/refresh tokens are NOT environment variables! The frontend authenticates with Google OAuth, then passes `google_drive_token` and `google_drive_refresh_token` in the API request body. See [API request schema](api/schemas.py#L108-L114).
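
To illustrate where the tokens go, here is a sketch of a request body that carries them. The token field names come from the note above, and `template_name` / `num_pages` come from the local test example; the full schema lives in `api/schemas.py` and may require other fields, so treat the helper `build_async_request` as hypothetical.

```python
import json


def build_async_request(template_name, num_pages, drive_token, drive_refresh_token):
    """Build the JSON body for an async generation request that uploads to Drive."""
    return {
        "template_name": template_name,
        "num_pages": num_pages,
        # Tokens obtained by the frontend's OAuth flow, sent with each request.
        "google_drive_token": drive_token,
        "google_drive_refresh_token": drive_refresh_token,
    }


# Placeholder token values; real ones come from the frontend's Google login.
body = build_async_request("DocGenie", 2, "ya29.xxxxx", "1//xxxxx")
print(json.dumps(body, indent=2))
```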

4. **Deploy API + Worker:**
```bash
# Railway will detect Dockerfile and deploy automatically
railway up

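# Or connect the GitHub repository in the Railway dashboard to deploy on every push
# (note: `railway connect` opens a database shell; it does not link a repository)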
```

5. **Option 1: Separate Worker Service (For Production Scale):**
   
   *Note: Only needed if processing 50+ concurrent jobs. For most use cases, Option 2 (combined) is sufficient.*
   
   **Method A: Connect to Same GitHub Repo (Recommended)**
   - Go to Railway dashboard → Your project → **New Service**
   - Click **"GitHub Repo"** → Select your repo
   - Name: `docgenie-worker`
   - **Settings** → **Deploy**:
     - Builder: `DOCKERFILE`
     - Dockerfile Path: `Dockerfile`
     - Root Directory: `/` (same as API)
     - **Custom Start Command**:
       ```bash
       rq worker --url $REDIS_URL
       ```
   - **Variables**: Add all environment variables (same as API service)
   - **Deploy**
   
   **Method B: Use Same Docker Image as API**
   - Railway dashboard → New Service → **Empty Service**
   - Name: `docgenie-worker`
   - **Settings** → **Source**: Link to API service's image
   - **Custom Start Command**: `rq worker --url $REDIS_URL`
   - **Variables**: Copy from API service
   - **Deploy**

6. **Option 2: Combined API + Worker (Recommended for Getting Started):**
   
   Update `railway.json` to run both in one service:
   ```json
   {
     "deploy": {
       "startCommand": "uvicorn api.main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait"
     }
   }
   ```
   
   Then push:
   ```bash
   git add railway.json
   git commit -m "feat: Run API and worker in combined service"
   git push
   ```
   
   **Benefits:**
   - ✅ Single service ($5/month instead of $10/month)
   - ✅ Simpler logs and monitoring
   - ✅ Automatic scaling together
   - ✅ Good for 90% of use cases

7. **Get API URL:**
   - Railway dashboard → API service → Settings → Domains
   - Generate domain (e.g., `docgenie-api.up.railway.app`)

#### Step 4: Update Frontend

Update your frontend API URL to Railway domain:
```javascript
const API_URL = 'https://docgenie-api.up.railway.app';
```

### Option B: AWS EC2 + RunPod (For Production)

#### Prerequisites
- AWS account with EC2 access
- Domain name (optional, for SSL)

#### Step 1: Launch EC2 Instance

```bash
# Launch t3.medium instance
# AMI IDs are region-specific: replace the ID below with a current Ubuntu AMI for your region
aws ec2 run-instances \
  --image-id ami-0c55b159cbfafe1f0 \
  --instance-type t3.medium \
  --key-name your-key-pair \
  --security-group-ids sg-xxxxx \
  --subnet-id subnet-xxxxx
```

**Security Group Rules:**
- Port 22 (SSH) - Your IP only
- Port 80 (HTTP) - 0.0.0.0/0
- Port 443 (HTTPS) - 0.0.0.0/0
- Port 8000 (API) - 0.0.0.0/0 (only needed for direct testing; close it once Nginx proxies ports 80/443)

#### Step 2: Setup EC2

```bash
# SSH into instance
ssh -i your-key.pem ubuntu@your-ec2-ip

# Update system
sudo apt update && sudo apt upgrade -y

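# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu   # log out and back in for the group change to take effect

# Install the Docker Compose plugin (provides the `docker compose` command)
sudo apt install docker-compose-plugin -y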

# Install Git
sudo apt install git -y

# Clone repository
git clone https://gitlab.cs.hs-rm.de/diss_lamott/docgenie.git
cd docgenie
```

#### Step 3: Configure Environment

```bash
# Create .env file
cd api
nano .env

# Paste all environment variables
# Save: Ctrl+X, Y, Enter

# Update REDIS_URL to use Upstash
# Update HANDWRITING_SERVICE_URL to RunPod endpoint
```

#### Step 4: Deploy with Docker Compose

```bash
cd /home/ubuntu/docgenie

# Start services (API + Worker + Redis)
docker compose up -d api worker redis

# Follow logs for both services (Ctrl+C to stop)
docker compose logs -f api worker
```

#### Step 5: Setup Nginx Reverse Proxy

```bash
# Install Nginx
sudo apt install nginx -y

# Create config
sudo nano /etc/nginx/sites-available/docgenie

# Paste configuration:
```

```nginx
server {
    listen 80;
    server_name your-domain.com;  # Or use EC2 IP

    location / {
        proxy_pass http://localhost:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        
        # Increase timeout for long-running requests
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;
    }
}
```

```bash
# Enable site
sudo ln -s /etc/nginx/sites-available/docgenie /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl restart nginx

# Optional: Setup SSL with Let's Encrypt
sudo apt install certbot python3-certbot-nginx -y
sudo certbot --nginx -d your-domain.com
```

#### Step 6: Setup Systemd Service (Auto-restart)

```bash
# Create service file
sudo nano /etc/systemd/system/docgenie.service
```

```ini
[Unit]
Description=DocGenie API
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/home/ubuntu/docgenie
ExecStart=/usr/bin/docker compose up -d api worker redis
ExecStop=/usr/bin/docker compose down
User=ubuntu

[Install]
WantedBy=multi-user.target
```

```bash
# Enable service
sudo systemctl daemon-reload
sudo systemctl enable docgenie
sudo systemctl start docgenie

# Check status
sudo systemctl status docgenie
```

## 🧪 Testing Production Deployment

### 1. Health Check
```bash
curl https://your-domain.com/health
```

### 2. Sync Generation (Fast)
```bash
curl -X POST https://your-domain.com/generate \
  -H "Content-Type: application/json" \
  -d '{
    "template_name": "DocGenie",
    "num_pages": 1
  }'
```

### 3. Async Generation (Batched, Cheap)
```bash
# Start async job
RESPONSE=$(curl -s -X POST https://your-domain.com/generate/async \
  -H "Content-Type: application/json" \
  -d '{
    "template_name": "DocGenie",
    "num_pages": 2
  }')

REQUEST_ID=$(echo "$RESPONSE" | jq -r '.request_id')
echo "Request ID: $REQUEST_ID"

# Poll status
while true; do
  STATUS=$(curl -s https://your-domain.com/jobs/$REQUEST_ID/status | jq -r '.status')
  echo "Status: $STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
    break
  fi
  sleep 10
done

# Get result
curl -s "https://your-domain.com/jobs/$REQUEST_ID/status" | jq
```
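
The same polling loop can be written in Python when a client needs a hard upper bound on waiting time. This is a sketch under stated assumptions: the `/jobs/<id>/status` path and the `completed` / `failed` values come from the shell example above, while the function names and the default of 180 polls (about 30 minutes at 10 s intervals) are illustrative choices.

```python
import json
import time
import urllib.request


def fetch_status(base_url, request_id):
    """GET /jobs/<request_id>/status and return the decoded JSON body."""
    url = f"{base_url}/jobs/{request_id}/status"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def poll_until_done(get_status, interval=10.0, max_attempts=180, sleep=time.sleep):
    """Call get_status() until the job reports 'completed' or 'failed'.

    Raises TimeoutError if the job is still running after max_attempts polls.
    """
    for _ in range(max_attempts):
        result = get_status()
        if result.get("status") in ("completed", "failed"):
            return result
        sleep(interval)
    raise TimeoutError(f"job not finished after {max_attempts} polls")
```

A typical call would be `poll_until_done(lambda: fetch_status("https://your-domain.com", request_id))`. Passing the status getter as a callable keeps the loop easy to test without a network.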

## 📊 Cost Breakdown

### Railway + RunPod (Recommended)
| Service | Cost | Notes |
|---------|------|-------|
| Railway (API + Worker) | $5-10/month | Includes 500 hours |
| Upstash Redis | FREE | 10K requests/day |
| RunPod Serverless GPU | $0.20/hr | Only charged when active |
| Supabase | FREE | 500MB database |
| **Total** | **~$5-10/month** | + $0.20/hr GPU usage |

### EC2 + RunPod
| Service | Cost | Notes |
|---------|------|-------|
| EC2 t3.medium | $30/month | 2 vCPU, 4GB RAM |
| Upstash Redis | FREE | External Redis |
| RunPod Serverless GPU | $0.20/hr | Only when needed |
| Supabase | FREE | External DB |
| **Total** | **~$30/month** | + $0.20/hr GPU usage |

### EC2 + Dedicated GPU (Production)
| Service | Cost | Notes |
|---------|------|-------|
| EC2 g4dn.xlarge | ~$380/month on-demand | 4 vCPU, 16GB RAM, T4 GPU (about $0.526/hr in us-east-1) |
| Supabase | FREE | External DB |
| **Total** | **~$380/month** | All-in-one solution; spot or reserved pricing can be substantially lower |

## 🔧 Maintenance

### Update Deployment

**Railway:**
```bash
# Push to main branch (auto-deploy)
git push origin main

# Or manual deploy
railway up
```

**EC2:**
```bash
ssh -i your-key.pem ubuntu@your-ec2-ip
cd docgenie
git pull
docker compose down
docker compose up -d --build
```

### View Logs

**Railway:**
```bash
railway logs
```

**EC2:**
```bash
# API logs
docker compose logs -f api

# Worker logs
docker compose logs -f worker

# Nginx logs
sudo tail -f /var/log/nginx/access.log
sudo tail -f /var/log/nginx/error.log
```

### Monitor Redis Queue

```bash
# Connect to Redis
redis-cli -u $REDIS_URL

# Check queue status (replace `default` with your RQ_QUEUE_NAME, e.g. `docgenie`, if one is set)
> LLEN rq:queue:default
> LRANGE rq:queue:default 0 -1
```

## 🚨 Troubleshooting

### Issue: Worker can't import docgenie package
**Solution:** Dockerfile installs entire monorepo with `pip install -e .`

### Issue: Handwriting service connection timeout
**Solution:** Use RunPod's synchronous `/runsync` endpoint, not the asynchronous `/run` endpoint

### Issue: Google token expired during job
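**Solution:** Ensure `GOOGLE_CLIENT_ID` and `GOOGLE_CLIENT_SECRET` are set, and that the request body includes `google_drive_refresh_token` (refresh tokens are passed per request, not as environment variables)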

### Issue: Railway build fails (too large)
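**Solution:** Check that `.dockerignore` excludes the large `data/` folders (datasets, embeddings) while keeping `data/prompt_templates/` and `data/visual_element_prefabs/`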

### Issue: Worker heartbeat timeout
**Solution:** This usually means the job is still running; the batched API takes 10-30 minutes to return results

## 📚 Next Steps

1. **Monitor costs:** Railway dashboard, RunPod usage page
2. **Setup alerts:** Railway → Settings → Notifications
3. **Scale workers:** Railway → Worker service → Settings → Replicas
4. **Add caching:** Redis cache for generated documents
5. **Setup CI/CD:** GitHub Actions → Railway auto-deploy

## 🎉 You're Done!

Your DocGenie API is now deployed with:
- ✅ All docgenie package imports resolved
- ✅ GPU handwriting service on RunPod
- ✅ Background workers for batched API
- ✅ Auto-scaling and cost optimization
- ✅ Google token refresh working
- ✅ Database schema compatibility

**API URL:** `https://your-domain.com`  
**Docs:** `https://your-domain.com/docs`  
**Health:** `https://your-domain.com/health`

---

## 🖥️ Local Testing Guide

### Architecture

```
┌─────────────────────────────────┐
│   DocGenie API (Port 8000)      │──┐ HTTP
└─────────────────────────────────┘  │ localhost:8080
                                     ▼
┌─────────────────────────────────┐
│ Handwriting Service (Port 8080) │
│ - Loads WordStylist model       │
└─────────────────────────────────┘
```

### Prerequisites

1. **Python environment**: `source .venv/bin/activate`
2. **WordStylist model checkpoints** at `WordStylist/models/ckpt.pt` and `WordStylist/models/ema_ckpt.pt`
3. **`api/.env`** with `ANTHROPIC_API_KEY`, `HANDWRITING_SERVICE_ENABLED=true`, `HANDWRITING_SERVICE_URL=http://localhost:8080`

### Step-by-Step Setup

**Terminal 1 – Handwriting Service:**
```bash
cd handwriting_service
DEVICE=cpu ./start.sh          # CPU (no GPU required)
# DEVICE=cuda ./start.sh       # GPU (faster)
```

**Terminal 2 – DocGenie API:**
```bash
cd api
uvicorn main:app --reload
```

**Terminal 3 – Test:**
```bash
curl http://localhost:8080/health   # Handwriting service
curl http://localhost:8000/health   # API
cd api && python test_api.py
```

### Performance Notes
- CPU mode: ~5–10 s/word | GPU mode: ~0.5–1 s/word
- Service processes all words in one batch for efficiency
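
For rough planning, the per-word figures above can be turned into a time estimate. This sketch assumes generation time scales linearly with word count, which is an approximation; the helper name `estimate_seconds` is hypothetical.

```python
# Per-word latency ranges in seconds, taken from the performance notes above.
SECONDS_PER_WORD = {"cpu": (5.0, 10.0), "cuda": (0.5, 1.0)}


def estimate_seconds(num_words, device="cpu"):
    """Return a (low, high) estimate of generation time in seconds."""
    low, high = SECONDS_PER_WORD[device]
    return (num_words * low, num_words * high)
```

For example, 20 handwritten words come out at roughly 100 to 200 s on CPU and 10 to 20 s on GPU under this assumption.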

---

## ⚙️ Railway-Specific Configuration

### Critical Issues & Fixes

**1. `.dockerignore` – Keep required data folders:**
```
!data/prompt_templates/
!data/visual_element_prefabs/
```

**2. `railway.json` – Start both API and worker:**
```json
"startCommand": "cd api && uvicorn main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait"
```
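
One caveat with the `... & ... & wait` pattern: `wait` returns only after both background processes have exited, so if the worker crashes the API keeps running and the service can look healthy with no worker behind it. A small launcher that exits as soon as either process stops lets the platform restart the whole service. The following is a sketch, not the project's actual entry point; `run_until_first_exit` is a hypothetical helper.

```python
import subprocess
import time


def run_until_first_exit(commands, poll_interval=0.5):
    """Start every command, wait until one exits, then stop the rest.

    Returns the exit code of the process that exited first.
    """
    processes = [subprocess.Popen(command) for command in commands]
    try:
        while True:
            for process in processes:
                code = process.poll()
                if code is not None:
                    return code
            time.sleep(poll_interval)
    finally:
        # Stop whatever is still running so the container exits as a unit.
        for process in processes:
            if process.poll() is None:
                process.terminate()
        for process in processes:
            try:
                process.wait(timeout=10)
            except subprocess.TimeoutExpired:
                process.kill()
                process.wait()


# Hypothetical usage as a start script (commands mirror the startCommand above):
#   port = os.environ.get("PORT", "8000")
#   sys.exit(run_until_first_exit([
#       ["uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", port],
#       ["rq", "worker", "--url", os.environ["REDIS_URL"]],
#   ]))
```

Whether this is worth the extra file depends on how much worker downtime matters; for a first deployment the one-line shell command is simpler.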

### Environment Variables

#### 🔴 Required
```bash
ANTHROPIC_API_KEY=sk-ant-api03-xxx
REDIS_URL=rediss://default:xxx@xxx.upstash.io:6379
HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/xxxxx/runsync
HANDWRITING_SERVICE_ENABLED=true
SUPABASE_URL=https://xxx.supabase.co
SUPABASE_KEY=xxx
GOOGLE_CLIENT_ID=xxx.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=xxx
```

#### 🟡 Recommended
```bash
RUNPOD_API_KEY=xxx
OCR_SERVICE_ENABLED=true
OCR_USE_LOCAL=true
OCR_ENGINE=microsoft_di
OCR_DPI=300
HANDWRITING_SERVICE_TIMEOUT=300
HANDWRITING_SERVICE_MAX_RETRIES=3
RQ_QUEUE_NAME=docgenie
LOG_LEVEL=INFO
```

#### 🟢 Optional (defaults are fine)
```bash
API_HOST=0.0.0.0
API_PORT=8000
DEBUG_MODE=false
CLAUDE_MODEL=claude-sonnet-4-5-20250929
CORS_ORIGINS=*
GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents
TEMP_DIR=/tmp/docgenie_api
HANDWRITING_APPLY_BLUR=false
BBOX_NORMALIZATION_ENABLED=false
GT_VERIFICATION_ENABLED=false
ANALYSIS_ENABLED=false
DEBUG_VISUALIZATION_ENABLED=false
```

### Validation Steps

```bash
# 1. Health check
curl https://your-app.up.railway.app/health

# 2. Sync generation
curl -X POST https://your-app.up.railway.app/api/generate \
  -H "Content-Type: application/json" \
  -d '{"document_category": "invoice", "pages": 1}'

# 3. Async generation
curl -X POST https://your-app.up.railway.app/api/async/generate \
  -H "Content-Type: application/json" \
  -d '{"document_category": "invoice", "pages": 1, "google_access_token": "ya29.xxx"}'
```

### Common Railway Issues

| Issue | Cause | Solution |
|-------|-------|----------|
| Worker not starting | Missing `rq worker` in start command | Check `railway.json` `startCommand` |
| Missing prompt templates | `.dockerignore` too aggressive | Add `!data/prompt_templates/` |
| Playwright errors | Browser not installed | Ensure `playwright install chromium` in Dockerfile |
| Redis connection errors | Wrong `REDIS_URL` | Verify in Railway env variables |
| Handwriting timeout | Batch too large | Increase `HANDWRITING_SERVICE_TIMEOUT` |
| Large Docker image | `data/` folders included | Check `.dockerignore` excludes datasets/embeddings |

---

## ⚡ RunPod Batch Optimization

### Problem (Old Parallel Processing)
Each text was sent as a separate RunPod request → N texts = N workers = N× activation cost.

**Example:** 10 texts → 10 workers × 18 s = 180 worker-seconds + 10× activation fees

### Solution (New Batch Processing)
All texts sent in **one** RunPod request → 1 worker handles everything.

**Example:** 10 texts → 1 worker × 190 s = 190 worker-seconds + 1× activation fee  
**Savings: ~45–60% cost reduction** (activation fees dominate RunPod pricing)

### Batch Request Format (handler.py)

```json
{
  "input": {
    "texts": [
      {"text": "Hello", "author_id": 42, "hw_id": "hw_0"},
      {"text": "World", "author_id": 42, "hw_id": "hw_1"}
    ],
    "apply_blur": true
  }
}
```

**Response:**
```json
{
  "status": "COMPLETED",
  "output": {
    "images": [
      {"image_base64": "...", "width": 217, "height": 61, "text": "Hello", "author_id": 42, "hw_id": "hw_0"},
      {"image_base64": "...", "width": 195, "height": 58, "text": "World", "author_id": 42, "hw_id": "hw_1"}
    ],
    "total_generated": 2
  }
}
```

> **Note:** Backward-compatible – single text requests (old format) are still supported. Handler auto-detects batch vs single based on the `"texts"` key.
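
On the client side, the batch format above can be produced and consumed with a few small helpers. This is a sketch: the payload and response shapes follow the JSON examples above, the `hw_<index>` naming mirrors those examples, and the `Authorization: Bearer` header with `RUNPOD_API_KEY` is an assumption about how the endpoint is authenticated. All function names are hypothetical.

```python
import json
import urllib.request


def build_batch_payload(texts, author_id, apply_blur=True):
    """Build the batch request body, assigning hw_0, hw_1, ... as hw_id."""
    return {
        "input": {
            "texts": [
                {"text": text, "author_id": author_id, "hw_id": f"hw_{index}"}
                for index, text in enumerate(texts)
            ],
            "apply_blur": apply_blur,
        }
    }


def images_by_hw_id(response):
    """Map hw_id to its image entry from a COMPLETED response."""
    if response.get("status") != "COMPLETED":
        raise RuntimeError(f"unexpected status: {response.get('status')}")
    return {image["hw_id"]: image for image in response["output"]["images"]}


def send_batch(endpoint_url, api_key, payload, timeout=300):
    """POST the payload to the RunPod endpoint and return the decoded JSON."""
    request = urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keying the results by `hw_id` means the caller does not have to rely on the response preserving the request order.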

### Timeout Configuration
Timeout is dynamically calculated: `num_texts × 20 + 30` seconds.  
For large batches (20+ texts), set RunPod endpoint max execution time to 600 s.
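
As a worked example of the timeout formula, the sketch below evaluates it for the batch sizes used elsewhere in this section. The function name and parameter names are illustrative; only the `num_texts × 20 + 30` rule comes from the text above.

```python
def batch_timeout_seconds(num_texts, per_text=20, base=30):
    """Timeout for one batch request: num_texts * per_text + base seconds."""
    if num_texts < 1:
        raise ValueError("num_texts must be at least 1")
    return num_texts * per_text + base


for count in (2, 10, 25):
    print(count, batch_timeout_seconds(count))
```

This gives 70 s for 2 texts, 230 s for 10 and 530 s for 25. By the same formula, a batch of 29 or more texts would need more than 600 s, so very large batches may need to be split or given a higher endpoint limit.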

### Cost Comparison

| Scenario | OLD (parallel) | NEW (batched) | Savings |
|----------|---------------|---------------|---------|
| 2 texts  | 2 workers × 18 s | 1 worker × 38 s | ~50% |
| 10 texts | 10 workers × 18 s | 1 worker × 190 s | ~55% |
| 25 texts | 25 workers × 18 s | 1 worker × 480 s | ~60% |

### Integration Test
```bash
cd api
python test_runpod_integration.py
```