# 🚀 DocGenie Deployment Guide
A complete guide to deploying the DocGenie API and the Handwriting Service to production, with all package interdependencies resolved.
## 📊 System Architecture
```
┌──────────────────────────────────────────────────────────────┐
│                            Client                            │
└──────────────────────────────┬───────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                        Railway (CPU)                         │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ DocGenie API (Port 8000)                               │  │
│  │ - FastAPI server                                       │  │
│  │ - Imports: docgenie.generation.*                       │  │
│  │ - Endpoints: /generate, /generate/pdf, /generate/async │  │
│  └──────────────┬─────────────────────────────────────────┘  │
│                 │                                            │
│  ┌──────────────▼─────────────────────────────────────────┐  │
│  │ Background Worker                                      │  │
│  │ - RQ worker (Redis Queue)                              │  │
│  │ - ClaudeBatchedClient (50% cost savings)               │  │
│  │ - Imports: docgenie.generation.*                       │  │
│  └──────────────┬─────────────────────────────────────────┘  │
└─────────────────┼────────────────────────────────────────────┘
                  │
         ┌────────┴───────────┬─────────────────────┐
         │                    │                     │
         ▼                    ▼                     ▼
┌─────────────────┐  ┌──────────────────┐  ┌─────────────────┐
│ Redis (Upstash) │  │ Supabase         │  │ Google Drive    │
│ - Job queue     │  │ - PostgreSQL     │  │ - File storage  │
│ - Free tier     │  │ - Document DB    │  │ - OAuth 2.0     │
└─────────────────┘  └──────────────────┘  └─────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                   RunPod Serverless (GPU)                    │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Handwriting Service (Port 8080)                        │  │
│  │ - WordStylist diffusion model                          │  │
│  │ - PyTorch + CUDA 11.8                                  │  │
│  │ - NO docgenie imports (standalone)                     │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
```
## 🔗 Dependency Resolution
### ✅ Problem: API imports from docgenie package
**Solution:** Deploy the entire monorepo and install it as a package with `pip install -e .`
**API Service imports:**
```python
# api/worker.py
from docgenie.generation.pipeline_01.claude_batching import ClaudeBatchedClient
from docgenie import ENV
# api/utils.py
from docgenie.generation.constants import BS_PARSER, HANDWRITING_CLASS_NAME
from docgenie.generation.pipeline_01.claude_batching import create_message
from docgenie.generation.pipeline_03_process_response import process_response
from docgenie.generation.pipeline_04_render_pdf_and_extract_geos import render_pdf
```
**Dockerfile solution:**
```dockerfile
# Copy entire monorepo
COPY . .
# Install as editable package
RUN pip install -e .
# Install API requirements
RUN pip install -r api/requirements.txt
```
### ✅ Handwriting Service is Independent
The handwriting service has **no docgenie imports**, so it can be deployed standalone.
```python
# handwriting_service/main.py - NO docgenie imports
from handwriting_service.inference import HandwritingGenerator
from handwriting_service.models import HandwritingRequest
```
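A quick way to confirm that the editable install worked is to check, inside the container, that the `docgenie` modules listed above can be resolved before the API starts. The following is a minimal sketch under that assumption; `find_missing_modules` is a hypothetical helper name and is not part of the repository.

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the names from module_names that the import system cannot resolve."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package of a dotted name is itself missing.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


# Module paths copied from the API imports shown above.
REQUIRED_MODULES = [
    "docgenie",
    "docgenie.generation.constants",
    "docgenie.generation.pipeline_01.claude_batching",
]

if __name__ == "__main__":
    print("missing modules:", find_missing_modules(REQUIRED_MODULES) or "none")
```

Running this as a script in the built image should print `none` when `pip install -e .` succeeded. Note that `find_spec` imports the parent package of a dotted name, so any import-time side effects of `docgenie` will run.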
## 📦 Pre-Deployment Checklist
### 1. Environment Variables
Create `api/.env` with all required variables:
```bash
# Claude API
ANTHROPIC_API_KEY=sk-ant-xxxxx
# Redis (will be replaced with Upstash URL)
REDIS_URL=redis://localhost:6379
# Handwriting Service
HANDWRITING_SERVICE_URL=http://localhost:8080
# Supabase
SUPABASE_URL=https://xxxxx.supabase.co
SUPABASE_KEY=eyJxxxxx
# Google Drive (for token refresh only)
# The frontend handles OAuth and sends tokens in API requests
# These credentials are only needed to refresh expired tokens during long jobs
GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx
GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents
```
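Missing variables are easier to catch at startup than in the middle of a job. The sketch below checks the variables from the listing above; which ones count as required is an assumption based on this guide, and `missing_env_vars` is a hypothetical helper, not something the API ships with.

```python
import os

# Variable names taken from the api/.env listing above.
REQUIRED_VARS = (
    "ANTHROPIC_API_KEY",
    "REDIS_URL",
    "HANDWRITING_SERVICE_URL",
    "SUPABASE_URL",
    "SUPABASE_KEY",
    "GOOGLE_CLIENT_ID",
    "GOOGLE_CLIENT_SECRET",
)


def missing_env_vars(env, required=REQUIRED_VARS):
    """Return the required variable names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars(os.environ)
    print("missing variables:", ", ".join(missing) if missing else "none")
```

Passing the mapping in as an argument keeps the check easy to test with a plain dictionary instead of the real environment.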
### 2. Test Locally First
```bash
# Terminal 1: Start Redis
docker run -p 6379:6379 redis:7-alpine
# Terminal 2: Start Handwriting Service
cd handwriting_service
DEVICE=cpu uvicorn main:app --port 8080
# Terminal 3: Start API
cd api
source ../.venv/bin/activate
uvicorn main:app --reload --port 8000
# Terminal 4: Start Worker
cd api
source ../.venv/bin/activate
python worker.py
```
Test endpoints:
```bash
# Health check
curl http://localhost:8000/health
# Async generation (uses batched API)
curl -X POST http://localhost:8000/generate/async \
-H "Content-Type: application/json" \
-d '{"template_name": "DocGenie", "num_pages": 2}'
```
## 🚒 Deployment Steps
### Option A: Railway + RunPod (RECOMMENDED - $10/month)
#### Step 1: Deploy Redis to Upstash (FREE)
1. Go to https://upstash.com
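2. Create account → New Redis Database
3. Copy the Redis connection URL, not the `UPSTASH_REDIS_REST_URL`. The REST URL is an `https://` endpoint, while RQ needs a Redis protocol URL that looks like `rediss://default:xxxxx@xxxxx.upstash.io:6379`. This is your `REDIS_URL`.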
#### Step 2: Deploy Handwriting Service to RunPod
**Option A: Build from Git Repository (RECOMMENDED - No Docker Hub needed!)**
This builds the image directly on RunPod's servers, so you do not have to upload a ~10 GB image over your own connection.
1. **Prepare and push code to Git:**
```bash
# Run from the repository root (adjust the path to your own checkout)
cd /media/ahad-hassan/Volume_E/FYP/FYP/docgenie
# First, prepare optimized WordStylist (removes 432MB of unnecessary files)
cd handwriting_service
./prepare_build.sh
cd ..
# Now commit the optimized WordStylist
git add handwriting_service/
git status # Verify WordStylist is included (should show WordStylist/models/ema_ckpt.pt, etc.)
git commit -m "Add handwriting service with optimized WordStylist"
git push origin main
```
2. **Deploy to RunPod:**
- Go to https://runpod.io → Serverless → New Endpoint
- Click "Build from Git" (not Docker Image)
- Settings:
- Name: `docgenie-handwriting`
- Git URL: `https://github.com/Ahadhassan-2003/FYP.git`
- Git Branch: `main`
- Docker Build Context: `docgenie/handwriting_service`
- Dockerfile Path: `Dockerfile`
- GPU: RTX 4090 or A40
- Container Disk: 15GB
- Max Workers: 1
- Idle Timeout: 5 seconds
- Exposed Port: 8080
- Environment Variables:
```
DEVICE=cuda
PYTHONUNBUFFERED=1
```
- Build Args (prepare WordStylist):
```
PREPARE_WORDSTYLIST=true
```
- Click "Deploy"
RunPod will clone your repo and build the image on their fast servers!
**Option B: Pre-built Docker Image (if Git unavailable)**
<details>
<summary>Click to expand Docker Hub method</summary>
```bash
cd handwriting_service
# Prepare optimized build (removes 432MB)
./prepare_build.sh
# Login to Docker Hub
docker login
# Build image
docker buildx build --platform linux/amd64 \
-t yourusername/docgenie-handwriting:latest \
--build-arg BUILDKIT_INLINE_CACHE=1 \
.
# Push to Docker Hub (may take 20-30 minutes for 10GB)
docker push yourusername/docgenie-handwriting:latest
```
Then deploy on RunPod:
- Go to https://runpod.io → Serverless → New Endpoint
- Docker Image: `yourusername/docgenie-handwriting:latest`
- GPU: RTX 4090 or A40
- Port: 8080
- Environment Variables: `DEVICE=cuda`
</details>
3. **Get endpoint URL:**
- Copy the URL (looks like: `https://api.runpod.ai/v2/xxxxx/runsync`)
- This is your `HANDWRITING_SERVICE_URL`
#### Step 3: Deploy API to Railway
1. **Install Railway CLI:**
```bash
# Install Railway CLI
npm i -g @railway/cli
# Or use curl
bash <(curl -fsSL cli.new) railway
```
2. **Initialize Railway project:**
```bash
# Run from the repository root (adjust the path to your own checkout)
cd /media/ahad-hassan/Volume_E/FYP/FYP/docgenie
# Login to Railway
railway login
# Create new project
railway init
# Link to project (creates railway.json)
railway link
```
3. **Set environment variables:**
```bash
# Set all environment variables from api/.env
railway variables set ANTHROPIC_API_KEY=sk-ant-xxxxx
railway variables set REDIS_URL=rediss://default:xxxxx@xxxxx.upstash.io:6379
railway variables set HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/xxxxx/runsync
railway variables set SUPABASE_URL=https://xxxxx.supabase.co
railway variables set SUPABASE_KEY=eyJxxxxx
# Google OAuth (for token refresh only - frontend provides tokens in requests)
railway variables set GOOGLE_CLIENT_ID=xxxxx.apps.googleusercontent.com
railway variables set GOOGLE_CLIENT_SECRET=GOCSPX-xxxxx
railway variables set GOOGLE_DRIVE_FOLDER_NAME="DocGenie Documents"
```
**Note:** Google access/refresh tokens are NOT environment variables! The frontend authenticates with Google OAuth, then passes `google_drive_token` and `google_drive_refresh_token` in the API request body. See [API request schema](api/schemas.py#L108-L114).
4. **Deploy API + Worker:**
```bash
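# Railway will detect the Dockerfile and deploy automatically
railway up
# Alternatively, connect the GitHub repo to the service in the Railway dashboard
# so that pushes deploy automatically. Note that `railway connect` opens a
# database shell; it does not link a GitHub repository.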
```
5. **Option 1: Separate Worker Service (For Production Scale):**
*Note: Only needed if processing 50+ concurrent jobs. For most use cases, Option 2 (combined) is sufficient.*
**Method A: Connect to Same GitHub Repo (Recommended)**
- Go to Railway dashboard → Your project → **New Service**
- Click **"GitHub Repo"** → Select your repo
- Name: `docgenie-worker`
- **Settings** → **Deploy**:
- Builder: `DOCKERFILE`
- Dockerfile Path: `Dockerfile`
- Root Directory: `/` (same as API)
- **Custom Start Command**:
```bash
rq worker --url $REDIS_URL
```
- **Variables**: Add all environment variables (same as API service)
- **Deploy**
**Method B: Use Same Docker Image as API**
- Railway dashboard → New Service → **Empty Service**
- Name: `docgenie-worker`
- **Settings** → **Source**: Link to API service's image
- **Custom Start Command**: `rq worker --url $REDIS_URL`
- **Variables**: Copy from API service
- **Deploy**
6. **Option 2: Combined API + Worker (Recommended for Getting Started):**
Update `railway.json` to run both in one service:
```json
{
"deploy": {
"startCommand": "uvicorn api.main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait"
}
}
```
Then push:
```bash
git add railway.json
git commit -m "feat: Run API and worker in combined service"
git push
```
**Benefits:**
- ✅ Single service ($5/month instead of $10/month)
- ✅ Simpler logs and monitoring
- ✅ Automatic scaling together
- ✅ Good for 90% of use cases
7. **Get API URL:**
- Railway dashboard → API service → Settings → Domains
- Generate domain (e.g., `docgenie-api.up.railway.app`)
#### Step 4: Update Frontend
Point your frontend's API URL at the Railway domain:
```javascript
const API_URL = 'https://docgenie-api.up.railway.app';
```
### Option B: AWS EC2 + RunPod (For Production)
#### Prerequisites
- AWS account with EC2 access
- Domain name (optional, for SSL)
#### Step 1: Launch EC2 Instance
```bash
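# Launch a t3.medium instance. AMI IDs are region-specific: replace the ID below
# with a current Ubuntu AMI for your region (later steps log in as `ubuntu`).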
aws ec2 run-instances \
--image-id ami-0c55b159cbfafe1f0 \
--instance-type t3.medium \
--key-name your-key-pair \
--security-group-ids sg-xxxxx \
--subnet-id subnet-xxxxx
```
**Security Group Rules:**
- Port 22 (SSH) - Your IP only
- Port 80 (HTTP) - 0.0.0.0/0
- Port 443 (HTTPS) - 0.0.0.0/0
- Port 8000 (API) - 0.0.0.0/0 (only needed for direct access; can be closed once Nginx proxies ports 80/443)
#### Step 2: Setup EC2
```bash
# SSH into instance
ssh -i your-key.pem ubuntu@your-ec2-ip
# Update system
sudo apt update && sudo apt upgrade -y
# Install Docker
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
# Install the Docker Compose plugin (provides the `docker compose` subcommand)
sudo apt install docker-compose-plugin -y
# Install Git
sudo apt install git -y
# Clone repository
git clone https://gitlab.cs.hs-rm.de/diss_lamott/docgenie.git
cd docgenie
```
#### Step 3: Configure Environment
```bash
# Create .env file
cd api
nano .env
# Paste all environment variables
# Save: Ctrl+X, Y, Enter
# Update REDIS_URL to use Upstash
# Update HANDWRITING_SERVICE_URL to RunPod endpoint
```
#### Step 4: Deploy with Docker Compose
```bash
cd /home/ubuntu/docgenie
# Start services (API + Worker + Redis)
docker compose up -d api worker redis
# Check logs
docker compose logs -f api
docker compose logs -f worker
```
#### Step 5: Setup Nginx Reverse Proxy
```bash
# Install Nginx
sudo apt install nginx -y
# Create config
sudo nano /etc/nginx/sites-available/docgenie
# Paste configuration:
```
```nginx
server {
listen 80;
server_name your-domain.com; # Or use EC2 IP
location / {
proxy_pass http://localhost:8000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Increase timeout for long-running requests
proxy_read_timeout 300s;
proxy_connect_timeout 75s;
}
}
```
```bash
# Enable site
sudo ln -s /etc/nginx/sites-available/docgenie /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl restart nginx
# Optional: Setup SSL with Let's Encrypt
sudo apt install certbot python3-certbot-nginx -y
sudo certbot --nginx -d your-domain.com
```
#### Step 6: Setup Systemd Service (Auto-restart)
```bash
# Create service file
sudo nano /etc/systemd/system/docgenie.service
```
```ini
[Unit]
Description=DocGenie API
After=docker.service
Requires=docker.service
[Service]
Type=oneshot
RemainAfterExit=yes
WorkingDirectory=/home/ubuntu/docgenie
ExecStart=/usr/bin/docker compose up -d api worker redis
ExecStop=/usr/bin/docker compose down
User=ubuntu
[Install]
WantedBy=multi-user.target
```
```bash
# Enable service
sudo systemctl daemon-reload
sudo systemctl enable docgenie
sudo systemctl start docgenie
# Check status
sudo systemctl status docgenie
```
## 🧪 Testing Production Deployment
### 1. Health Check
```bash
curl https://your-domain.com/health
```
### 2. Sync Generation (Fast)
```bash
curl -X POST https://your-domain.com/generate \
-H "Content-Type: application/json" \
-d '{
"template_name": "DocGenie",
"num_pages": 1
}'
```
### 3. Async Generation (Batched, Cheap)
```bash
# Start async job
RESPONSE=$(curl -s -X POST https://your-domain.com/generate/async \
  -H "Content-Type: application/json" \
  -d '{
    "template_name": "DocGenie",
    "num_pages": 2
  }')
REQUEST_ID=$(echo "$RESPONSE" | jq -r '.request_id')
echo "Request ID: $REQUEST_ID"
# Poll status
while true; do
  STATUS=$(curl -s "https://your-domain.com/jobs/$REQUEST_ID/status" | jq -r '.status')
  echo "Status: $STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then
    break
  fi
  sleep 10
done
# Get result
curl -s "https://your-domain.com/jobs/$REQUEST_ID/status" | jq
```
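The same polling pattern can be written in Python when the client is not a shell script. This is a minimal sketch that assumes only what the shell example shows: a `/jobs/{request_id}/status` endpoint returning JSON with a `status` field that ends in `completed` or `failed`. The function names, the `max_wait` cap, and the injectable `sleep` are illustrative assumptions.

```python
import json
import time
import urllib.request


def fetch_status(base_url, request_id):
    """GET {base_url}/jobs/{request_id}/status and return the decoded JSON body."""
    url = f"{base_url}/jobs/{request_id}/status"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def wait_for_job(get_status, interval=10.0, max_wait=3600.0, sleep=time.sleep):
    """Call get_status() until the job is completed/failed or max_wait seconds pass."""
    waited = 0.0
    while True:
        payload = get_status()
        if payload.get("status") in ("completed", "failed"):
            return payload
        if waited >= max_wait:
            raise TimeoutError(f"job still {payload.get('status')!r} after {waited:.0f} s")
        sleep(interval)
        waited += interval
```

A caller would use it as `wait_for_job(lambda: fetch_status("https://your-domain.com", request_id))`. Unlike the `while true` loop, the cap in `max_wait` stops the client from polling forever if a job never reaches a terminal state.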
## 📊 Cost Breakdown
### Railway + RunPod (Recommended)
| Service | Cost | Notes |
|---------|------|-------|
| Railway (API + Worker) | $5-10/month | Includes 500 hours |
| Upstash Redis | FREE | 10K requests/day |
| RunPod Serverless GPU | $0.20/hr | Only charged when active |
| Supabase | FREE | 500MB database |
| **Total** | **~$5-10/month** | + $0.20/hr GPU usage |
### EC2 + RunPod
| Service | Cost | Notes |
|---------|------|-------|
| EC2 t3.medium | $30/month | 2 vCPU, 4GB RAM |
| Upstash Redis | FREE | External Redis |
| RunPod Serverless GPU | $0.20/hr | Only when needed |
| Supabase | FREE | External DB |
| **Total** | **~$30/month** | + $0.20/hr GPU usage |
### EC2 + Dedicated GPU (Production)
| Service | Cost | Notes |
|---------|------|-------|
| EC2 g4dn.xlarge | $150/month | 4 vCPU, 16GB RAM, T4 GPU |
| Supabase | FREE | External DB |
| **Total** | **~$150/month** | All-in-one solution |
## 🔧 Maintenance
### Update Deployment
**Railway:**
```bash
# Push to main branch (auto-deploy)
git push origin main
# Or manual deploy
railway up
```
**EC2:**
```bash
ssh ubuntu@your-ec2-ip
cd docgenie
git pull
docker compose down
docker compose up -d --build
```
### View Logs
**Railway:**
```bash
railway logs
```
**EC2:**
```bash
# API logs
docker compose logs -f api
# Worker logs
docker compose logs -f worker
# Nginx logs
sudo tail -f /var/log/nginx/access.log
sudo tail -f /var/log/nginx/error.log
```
### Monitor Redis Queue
```bash
# Connect to Redis
redis-cli -u $REDIS_URL
# Check queue status (replace `default` with your queue name if the worker uses RQ_QUEUE_NAME)
> LLEN rq:queue:default
> LRANGE rq:queue:default 0 -1
```
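The same check can be scripted for a dashboard or alert. The sketch below relies on RQ storing pending job IDs in a Redis list named `rq:queue:<queue name>`, which is the key the `redis-cli` commands above inspect. `rq_queue_key` and `queue_depth` are hypothetical helper names, and `client` is assumed to be any object exposing redis-py's `llen` method.

```python
def rq_queue_key(queue_name="default"):
    """Redis list key under which RQ keeps the pending job IDs of a queue."""
    return f"rq:queue:{queue_name}"


def queue_depth(client, queue_name="default"):
    """Return the number of pending jobs, using the client's LLEN command."""
    return client.llen(rq_queue_key(queue_name))


# Usage with redis-py (assumed to be installed, since RQ depends on it):
#   import os, redis
#   client = redis.Redis.from_url(os.environ["REDIS_URL"])
#   print(queue_depth(client, "default"))
```

Taking the client as a parameter keeps the helper independent of how the connection is created.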
## 🚨 Troubleshooting
### Issue: Worker can't import docgenie package
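**Solution:** Make sure the Dockerfile installs the entire monorepo with `pip install -e .`
### Issue: Handwriting service connection timeout
**Solution:** Use RunPod's synchronous `/runsync` endpoint rather than the asynchronous `/run` endpoint.
### Issue: Google token expired during job
**Solution:** Ensure `GOOGLE_CLIENT_ID` and `GOOGLE_CLIENT_SECRET` are set, and that the request body includes `google_drive_refresh_token`. Refresh tokens are passed per request, not as environment variables.
### Issue: Railway build fails (too large)
**Solution:** Check that `.dockerignore` excludes the large `data/` folders (datasets, embeddings) while keeping `data/prompt_templates/` and `data/visual_element_prefabs/`.
### Issue: Worker heartbeat timeout
**Solution:** The job is usually still running; the batched API takes 10-30 minutes.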
## 📚 Next Steps
1. **Monitor costs:** Railway dashboard, RunPod usage page
2. **Setup alerts:** Railway → Settings → Notifications
3. **Scale workers:** Railway → Worker service → Settings → Replicas
4. **Add caching:** Redis cache for generated documents
5. **Setup CI/CD:** GitHub Actions → Railway auto-deploy
## 🎉 You're Done!
Your DocGenie API is now deployed with:
- ✅ All docgenie package imports resolved
- ✅ GPU handwriting service on RunPod
- ✅ Background workers for batched API
- ✅ Auto-scaling and cost optimization
- ✅ Google token refresh working
- ✅ Database schema compatibility
**API URL:** `https://your-domain.com`
**Docs:** `https://your-domain.com/docs`
**Health:** `https://your-domain.com/health`
---
## 🖥️ Local Testing Guide
### Architecture
```
┌─────────────────────────────────┐
│ DocGenie API (Port 8000)        │──┐ HTTP
└─────────────────────────────────┘  │ localhost:8080
                                     ▼
┌─────────────────────────────────┐
│ Handwriting Service (Port 8080) │
│ - Loads WordStylist model       │
└─────────────────────────────────┘
```
### Prerequisites
1. **Python environment**: `source .venv/bin/activate`
2. **WordStylist Model** at `WordStylist/models/ckpt.pt` and `ema_ckpt.pt`
3. **`api/.env`** with `ANTHROPIC_API_KEY`, `HANDWRITING_SERVICE_ENABLED=true`, `HANDWRITING_SERVICE_URL=http://localhost:8080`
### Step-by-Step Setup
**Terminal 1 – Handwriting Service:**
```bash
cd handwriting_service
DEVICE=cpu ./start.sh # CPU (no GPU required)
# DEVICE=cuda ./start.sh # GPU (faster)
```
**Terminal 2 – DocGenie API:**
```bash
cd api
uvicorn main:app --reload
```
**Terminal 3 – Test:**
```bash
curl http://localhost:8080/health # Handwriting service
curl http://localhost:8000/health # API
cd api && python test_api.py
```
### Performance Notes
- CPU mode: ~5–10 s/word | GPU mode: ~0.5–1 s/word
- Service processes all words in one batch for efficiency
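As a rough planning aid, the per-word figures above can be turned into a time estimate for a whole document. The sketch assumes latency scales linearly with the word count, which the guide does not state and which batching may improve on; `estimate_seconds` is a hypothetical helper.

```python
# Per-word latency ranges in seconds, as quoted in the performance notes above.
SECONDS_PER_WORD = {"cpu": (5.0, 10.0), "cuda": (0.5, 1.0)}


def estimate_seconds(num_words, device="cpu"):
    """Return a (low, high) estimate in seconds for generating num_words words."""
    low, high = SECONDS_PER_WORD[device]
    return num_words * low, num_words * high
```

Under this assumption, 12 handwritten words take roughly 60 to 120 seconds on CPU and 6 to 12 seconds on GPU, which is why CPU mode is only practical for local testing.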
---
## ⚙️ Railway-Specific Configuration
### Critical Issues & Fixes
**1. `.dockerignore` – Keep required data folders:**
```
!data/prompt_templates/
!data/visual_element_prefabs/
```
**2. `railway.json` – Start both API and worker:**
```json
"startCommand": "cd api && uvicorn main:app --host 0.0.0.0 --port $PORT & rq worker --url $REDIS_URL & wait"
```
### Environment Variables
#### 🔴 Required
```bash
ANTHROPIC_API_KEY=sk-ant-api03-xxx
REDIS_URL=rediss://default:xxx@xxx.upstash.io:6379
HANDWRITING_SERVICE_URL=https://api.runpod.ai/v2/xxx/runsync
HANDWRITING_SERVICE_ENABLED=true
SUPABASE_URL=https://xxx.supabase.co
SUPABASE_KEY=xxx
GOOGLE_CLIENT_ID=xxx.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=xxx
```
#### 🟑 Recommended
```bash
RUNPOD_API_KEY=xxx
OCR_SERVICE_ENABLED=true
OCR_USE_LOCAL=true
OCR_ENGINE=microsoft_di
OCR_DPI=300
HANDWRITING_SERVICE_TIMEOUT=300
HANDWRITING_SERVICE_MAX_RETRIES=3
RQ_QUEUE_NAME=docgenie
LOG_LEVEL=INFO
```
#### 🟒 Optional (defaults are fine)
```bash
API_HOST=0.0.0.0
API_PORT=8000
DEBUG_MODE=false
CLAUDE_MODEL=claude-sonnet-4-5-20250929
CORS_ORIGINS=*
GOOGLE_DRIVE_FOLDER_NAME=DocGenie Documents
TEMP_DIR=/tmp/docgenie_api
HANDWRITING_APPLY_BLUR=false
BBOX_NORMALIZATION_ENABLED=false
GT_VERIFICATION_ENABLED=false
ANALYSIS_ENABLED=false
DEBUG_VISUALIZATION_ENABLED=false
```
### Validation Steps
```bash
# 1. Health check
curl https://your-app.up.railway.app/health
# 2. Sync generation
curl -X POST https://your-app.up.railway.app/api/generate \
-H "Content-Type: application/json" \
-d '{"document_category": "invoice", "pages": 1}'
# 3. Async generation
curl -X POST https://your-app.up.railway.app/api/async/generate \
-H "Content-Type: application/json" \
-d '{"document_category": "invoice", "pages": 1, "google_access_token": "ya29.xxx"}'
```
### Common Railway Issues
| Issue | Cause | Solution |
|-------|-------|----------|
| Worker not starting | Missing `rq worker` in start command | Check `railway.json` `startCommand` |
| Missing prompt templates | `.dockerignore` too aggressive | Add `!data/prompt_templates/` |
| Playwright errors | Browser not installed | Ensure `playwright install chromium` in Dockerfile |
| Redis connection errors | Wrong `REDIS_URL` | Verify in Railway env variables |
| Handwriting timeout | Batch too large | Increase `HANDWRITING_SERVICE_TIMEOUT` |
| Large Docker image | `data/` folders included | Check `.dockerignore` excludes datasets/embeddings |
---
## ⚑ RunPod Batch Optimization
### Problem (Old Parallel Processing)
Each text was sent as a separate RunPod request → N texts = N workers = N× activation cost.
**Example:** 10 texts → 10 workers × 18 s = 180 worker-seconds + 10× activation fees
### Solution (New Batch Processing)
All texts are sent in **one** RunPod request → 1 worker handles everything.
**Example:** 10 texts → 1 worker × 190 s = 190 worker-seconds + 1× activation fee
**Savings: ~45–60% cost reduction** (activation fees dominate RunPod pricing)
### Batch Request Format (handler.py)
```json
{
"input": {
"texts": [
{"text": "Hello", "author_id": 42, "hw_id": "hw_0"},
{"text": "World", "author_id": 42, "hw_id": "hw_1"}
],
"apply_blur": true
}
}
```
**Response:**
```json
{
"status": "COMPLETED",
"output": {
"images": [
{"image_base64": "...", "width": 217, "height": 61, "text": "Hello", "author_id": 42, "hw_id": "hw_0"},
{"image_base64": "...", "width": 195, "height": 58, "text": "World", "author_id": 42, "hw_id": "hw_1"}
],
"total_generated": 2
}
}
```
> **Note:** The handler is backward-compatible: single-text requests (old format) are still supported. It auto-detects batch vs. single requests based on the `"texts"` key.
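On the client side, the batch format above can be assembled and unpacked with a few small helpers. This is a sketch derived only from the JSON examples shown here: the `hw_{index}` naming follows the sample IDs, and the function names are hypothetical, not the actual `handler.py` or API code.

```python
def build_batch_payload(texts, author_id, apply_blur=True):
    """Build the RunPod request body for a list of strings in the batch format."""
    items = [
        {"text": text, "author_id": author_id, "hw_id": f"hw_{index}"}
        for index, text in enumerate(texts)
    ]
    return {"input": {"texts": items, "apply_blur": apply_blur}}


def is_batch_input(job_input):
    """Mirror the described auto-detection: a batch request carries a "texts" key."""
    return "texts" in job_input


def images_by_hw_id(response):
    """Index returned images by hw_id so each one can be matched to its request item."""
    return {image["hw_id"]: image for image in response["output"]["images"]}
```

Indexing the response by `hw_id` avoids relying on the images coming back in request order, which the guide does not guarantee.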
### Timeout Configuration
The timeout is calculated dynamically: `num_texts × 20 + 30` seconds.
For large batches (20+ texts), set the RunPod endpoint's max execution time to 600 s.
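The formula is easy to check against the endpoint limit before submitting a batch. The sketch below implements the stated formula; the helper names and the idea of splitting oversized batches are assumptions, not behaviour of the existing client.

```python
def batch_timeout_seconds(num_texts, per_text=20, base=30):
    """Timeout from the formula above: num_texts x 20 + 30 seconds."""
    return num_texts * per_text + base


def max_texts_within(limit_seconds, per_text=20, base=30):
    """Largest batch size whose computed timeout still fits within limit_seconds."""
    return max(0, (limit_seconds - base) // per_text)
```

With these defaults, a 10-text batch gets a 230 s timeout, and a 600 s execution limit accommodates at most 28 texts (28 × 20 + 30 = 590 s), so larger batches would need a higher limit or splitting.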
### Cost Comparison
| Scenario | OLD (parallel) | NEW (batched) | Savings |
|----------|---------------|---------------|---------|
| 2 texts | 2 workers × 18 s | 1 worker × 38 s | ~50% |
| 10 texts | 10 workers × 18 s | 1 worker × 190 s | ~55% |
| 25 texts | 25 workers × 18 s | 1 worker × 480 s | ~60% |
### Integration Test
```bash
cd api
python test_runpod_integration.py
```