# Deployment Guide - Runpod Cloud

After local testing is complete, follow this guide to deploy your Feedback Analysis Agent to Runpod.

---

## ✅ Pre-Deployment Checklist

Before deploying to Runpod, ensure:

- [ ] All local tests pass: `python3 scripts/validate_local.py` shows 7/7 ✅
- [ ] API server runs locally: `python3 run.py` starts without errors
- [ ] Endpoints tested: Use TESTING_CHECKLIST.md or curl commands
- [ ] Git repository clean: `git status` shows no uncommitted changes
- [ ] All code committed: `git log --oneline | head -5` shows your commits
- [ ] Docker image builds: `docker build -t feedback-analysis:latest .` succeeds
- [ ] `requirements.txt` updated: All dependencies listed

---

## 📦 Step 1: Prepare Docker Image

### 1.1 Build Docker Image Locally

```bash
# Change to your local checkout of the project (adjust the path for your machine)
cd /Users/galbd/Desktop/personal/software/ai_agent_gov/Feedback_Analysis_RAG_Agent_runpod

# Build the image
docker build -t feedback-analysis:latest .

# Verify it built
docker images | grep feedback-analysis
```

**Expected output:**

```
REPOSITORY          TAG       IMAGE ID       CREATED          SIZE
feedback-analysis   latest    abc123def456   2 minutes ago    2.5GB
```

### 1.2 Test Docker Image Locally (Optional)

```bash
# Run container (host port 8001 maps to container port 8000)
docker run -p 8001:8000 feedback-analysis:latest

# In another terminal, test
curl -X POST http://localhost:8001/health
```

**Expected:** `{"status":"ok"}`

---

## 🔑 Step 2: Set Up Docker Registry

### Option A: Docker Hub (Easiest)

**2A.1 Create Docker Hub Account**

- Go to https://hub.docker.com
- Sign up for a free account
- Note your username (e.g., `galbendavids`)

**2A.2 Log In to Docker**

```bash
docker login
# Enter your Docker Hub username and password
```

**2A.3 Tag and Push Image**

```bash
# Tag with your Docker Hub username
docker tag feedback-analysis:latest galbendavids/feedback-analysis:latest

# Push to Docker Hub
docker push galbendavids/feedback-analysis:latest

# Verify it's uploaded
# Visit https://hub.docker.com/r/YOUR_USERNAME/feedback-analysis
```

### Option B: Private Registry (Advanced)

- Use AWS ECR, Google Container Registry, or Azure Container Registry
- Follow their documentation for authentication and push

---

## 🚀 Step 3: Create Runpod Template

### 3.1 Access Runpod Console

1. Go to https://www.runpod.io
2. Sign in to your account (create one if needed)
3. Click **"Console"** in the top menu
4. Go to the **"Serverless"** or **"Pods"** section

### 3.2 Create New Template

**For Serverless Endpoints (Recommended):**

1. Click **"Create New"** → **"API Endpoint Template"**
2. Fill in:
   - **Template Name:** `feedback-analysis-sql`
   - **Docker Image:** `galbendavids/feedback-analysis:latest`
   - **Ports:** `8000`
   - **GPU:** None (CPU-only is fine)
   - **Memory:** 4GB minimum
   - **Environment Variables:**
     ```
     GEMINI_API_KEY=your_key_here (optional)
     OPENAI_API_KEY=sk-... (optional)
     ```
3. Click **"Save Template"**

**For Pods (Traditional VM):**

1. Click **"Create"** → **"New Pod"**
2. Select your template
3. Choose GPU type (optional, not needed for this workload)
4. Set min/max auto-scale settings
5. Click **"Run Pod"**

### 3.3 Configure Networking

- **Expose Port:** 8000
- **HTTPS:** Enabled automatically
- **Public URL:** Runpod generates automatically

---

## 🧪 Step 4: Test Deployed Endpoint

### 4.1 Get Endpoint URL

After deployment, Runpod provides a URL like:

```
https://your-endpoint-id.runpod-pods.net/
```

Or for Serverless:

```
https://api.runpod.io/v1/YOUR_ENDPOINT_ID/run
```

### 4.2 Test Basic Connectivity

```bash
# For Pods (direct connection)
curl -X POST https://your-endpoint-id.runpod-pods.net/health

# For Serverless (requires different format)
# See Runpod API documentation
```

**Expected response:**

```json
{"status":"ok"}
```

### 4.3 Test Query Endpoint

```bash
# The query is in Hebrew: "How many users wrote thank you"
curl -X POST https://your-endpoint-id.runpod-pods.net/query \
  -H "Content-Type: application/json" \
  -d '{"query":"כמה משתמשים כתבו תודה","top_k":5}'
```

**Expected response** (the summary reads "1168 feedback items contain expressions of thanks."):

```json
{
  "query": "כמה משתמשים כתבו תודה",
  "summary": "1168 משובים מכילים ביטויי תודה.",
  "results": [...]
}
```

### 4.4 Test All Endpoints

Use the same curl commands from TESTING_CHECKLIST.md, but replace:

- `http://localhost:8000` → `https://your-endpoint-id.runpod-pods.net`

Or use Swagger UI:

- `https://your-endpoint-id.runpod-pods.net/docs`

---

## 💰 Step 5: Configure Auto-Scaling (Optional)

In Runpod Pod settings:

1. **Minimum GPUs:** 0 (not needed)
2. **Maximum GPUs:** 1 (if you add GPU support)
3. **Idle timeout:** 5 minutes
4. **Auto-pause:** Enabled (to save costs)

---

## 🔐 Step 6: Add API Keys (Optional)

Add keys only if you want LLM summaries. They are not required; the system works without them.

### 6.1 In Runpod Dashboard

1. Go to Pod settings
2. Add Environment Variables:
   ```
   GEMINI_API_KEY=your_actual_key
   OPENAI_API_KEY=sk-your_actual_key
   ```
3. Restart pod

### 6.2 Get API Keys

**For Google Gemini:**

1. Go to https://makersuite.google.com/app/apikeys
2. Click "Create API Key"
3. Copy the key

**For OpenAI:**

1. Go to https://platform.openai.com/api-keys
2. Create new secret key
3. Copy the key

---

## 📊 Step 7: Monitor & Manage

### 7.1 Check Logs

In Runpod dashboard:

1. Click on your pod/endpoint
2. View the **Logs** tab
3. Look for errors or warnings

### 7.2 Performance Metrics

Monitor:

- **CPU usage:** Should be <50% at rest
- **Memory:** Should be <80% usage
- **Response times:** Query endpoint 1-3 seconds
- **Uptime:** Should be 99%+

### 7.3 Scale & Pricing

- **Auto-scaling:** Runpod manages based on demand
- **Costs:** Typically $0.25-$0.50/hour for 4GB CPU-only pod
- **Savings:** Pod auto-pauses when idle (no charge)

---

## 🔄 Step 8: Update Deployment

### When You Update Code

1. **Make changes locally**
   ```bash
   # Edit code, test locally
   git add .
   git commit -m "feat: new feature"
   git push origin main
   ```

2. **Rebuild Docker image**
   ```bash
   docker build -t feedback-analysis:v2 .
   docker tag feedback-analysis:v2 galbendavids/feedback-analysis:v2
   docker push galbendavids/feedback-analysis:v2
   ```

3. **Update Runpod template**
   - Edit template image: `galbendavids/feedback-analysis:v2`
   - Save
   - Restart pod with new image

4. **Or redeploy**
   - Delete old pod
   - Create new pod from updated template

---

## ✨ Advanced: Optimization for Cloud

### A. Pre-download Models in Dockerfile

To avoid long first-request delays in the cloud, add to the Dockerfile:

```dockerfile
# Place after the RUN step that installs requirements.txt

# Pre-download embedding model
RUN python3 -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('paraphrase-multilingual-MiniLM-L12-v2')"

# Pre-download sentiment model
RUN python3 -c "from transformers import pipeline; pipeline('sentiment-analysis', model='nlptown/bert-base-multilingual-uncased-sentiment')"
```

This adds ~2GB to the image but eliminates the download on first request.

### B. Use GPU for Faster Embeddings

```dockerfile
# Install GPU support
RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
RUN pip install faiss-gpu
```

Then in Runpod, select a GPU pod (more expensive but faster).

### C. Enable Caching

Add to `app/config.py`:

```python
EMBEDDING_CACHE_SIZE = 10000  # Cache more embeddings
INDEX_RELOAD_INTERVAL = 3600  # Reload index hourly
```

---

## 🐛 Troubleshooting

### Problem: Pod won't start

```
Error: Container failed to start
```

**Fix:** Check Dockerfile syntax and ensure the image builds locally first.

### Problem: Out of memory

```
OOMKilled or similar
```

**Fix:** Increase allocated memory in pod settings (go from 4GB to 8GB).

### Problem: Slow responses

```
Queries taking >10 seconds
```

**Fix:**

- Add GPU support
- Pre-download models (see optimization section)
- Increase allocated CPU cores

### Problem: Model not found

```
Error: Model 'xyz' not found
```

**Fix:** Add model download to Dockerfile (see optimization section).

### Problem: HTTPS certificate error

```
SSL Certificate verification failed
```

**Fix:** Runpod handles certificates automatically, so this should not occur.

---

## 📈 Monitoring & Alerts

### Set Up Alerts (Optional)

1. Go to the Runpod **Billing** tab
2. Set max spend limit
3. Enable email alerts

### Check Status

```bash
# Query your endpoint
curl -X POST https://your-endpoint-id.runpod-pods.net/health

# If it fails, pod may be down
# Check Runpod dashboard for status
```

---

## 🔄 Rollback Plan

If deployment has issues:

1. **Keep previous image tagged**
   ```bash
   docker tag galbendavids/feedback-analysis:v1 galbendavids/feedback-analysis:latest-stable
   docker push galbendavids/feedback-analysis:latest-stable
   ```

2. **If new deployment fails, revert**
   - Update Runpod template back to `latest-stable`
   - Restart pod
   - Investigate issue locally

3. **Don't delete old pods immediately**
   - Keep them for at least 1 day
   - Then delete them once the new version is stable

---

## 🎯 Testing Checklist Before Going Live

Before sharing the endpoint with users:

- [ ] `/health` endpoint responds
- [ ] `/query` endpoint returns results
- [ ] Hebrew queries work correctly
- [ ] Response times acceptable (<5s for most queries)
- [ ] Error handling working (try invalid JSON)
- [ ] Swagger UI accessible at `/docs`
- [ ] SSL/HTTPS working (URL is secure)
- [ ] Logs show no errors
- [ ] Auto-scaling responding to load

---

## 📋 Production Deployment Checklist

Before announcing to users:

- [ ] Load tested with 100+ concurrent requests
- [ ] Backup plan documented
- [ ] Monitoring alerts set up
- [ ] Support procedure documented
- [ ] SLA defined (99.9% uptime target, etc.)
- [ ] Rate limiting configured (optional)
- [ ] API key authentication enforced (optional)
- [ ] CORS settings reviewed
- [ ] Backup of deployment config saved
- [ ] Runpod support ticket submitted for any questions

---

## 📞 Support & Resources

- **Runpod Docs:** https://docs.runpod.io
- **Runpod Community:** https://forums.runpod.io
- **FastAPI Docs:** https://fastapi.tiangolo.com
- **Docker Docs:** https://docs.docker.com

---

## 🎓 What's Next

After successful deployment:

1. **Monitor the endpoint** - Check logs daily
2. **Gather feedback** - What works well, what needs improvement
3. **Iterate** - Make improvements, redeploy
4. **Scale** - Add more features, more data
5. **Secure** - Add authentication, rate limiting as needed

---

## ✅ Congratulations!

Your SQL-based feedback analysis agent is now live in the cloud! 🎉

**Summary:**

- ✅ Local validation complete
- ✅ Docker image built
- ✅ Deployed to Runpod
- ✅ Cloud endpoint tested
- ✅ Ready for production

**Next:** Share the endpoint URL with users or integrate into your application.

---

*Last Updated: Today*
*Version: 1.0*
*Status: Production Ready* ✨
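
---

## Appendix: Scripted Smoke Test (Sketch)

The first few items of the going-live checklist (`/health` responds, `/query` returns results, Hebrew queries work) could be automated with a small script. The following is a minimal sketch, not part of the project: the function names are hypothetical, the base URL is the placeholder used throughout this guide, and the response checks assume that real responses match the "Expected response" examples in Step 4 (a `status` of `ok`, and a `query`/`summary`/`results` shape). It uses POST for both endpoints only because the curl examples in this guide do.

```python
"""Post-deployment smoke test sketch for the endpoints described in this guide."""
import json
import urllib.request

# Placeholder copied from this guide; replace with your real endpoint URL.
BASE_URL = "https://your-endpoint-id.runpod-pods.net"

# Hebrew test query from Step 4.3: "How many users wrote thank you"
HEBREW_QUERY = "כמה משתמשים כתבו תודה"


def build_request(base_url, path, payload=None):
    """Build a POST request; the body is JSON when a payload is given."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_health(body):
    """Return a list of problems found in a /health response body."""
    if body == {"status": "ok"}:
        return []
    return [f"unexpected health body: {body!r}"]


def check_query(body, sent_query):
    """Return a list of problems found in a /query response body.

    Assumes the response shape shown in this guide's expected output.
    """
    problems = []
    if body.get("query") != sent_query:
        problems.append("response does not echo the query")
    if not isinstance(body.get("summary"), str) or not body.get("summary"):
        problems.append("missing or empty summary")
    if not isinstance(body.get("results"), list):
        problems.append("results is not a list")
    return problems


def fetch_json(request, timeout=30):
    """Send the request over the network and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def run_smoke_test(base_url, query=HEBREW_QUERY, fetch=fetch_json):
    """Run both checks; an empty list means every check passed."""
    problems = check_health(fetch(build_request(base_url, "/health")))
    payload = {"query": query, "top_k": 5}
    problems += check_query(fetch(build_request(base_url, "/query", payload)), query)
    return problems
```

Calling `run_smoke_test(BASE_URL)` after a deploy returns a list of problems, so an empty list means the checks passed. The `fetch` parameter exists so the checks can be exercised against a stub without touching the network.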