Pulastya B committed on Dec 27, 2025

Commit c10a976 (1 parent: 48520bd)

feat: Add Render deployment configuration

- Add render.yaml for automated deployment via Blueprint
- Update Dockerfile to use uvicorn for better performance
- Create comprehensive RENDER_DEPLOYMENT.md guide
- Configure health check endpoint at /api/health
- Set up environment variables for Render
- Optimize for ephemeral storage with /tmp directories

Deployment instructions:
1. Connect GitHub repo to Render
2. Deploy via Blueprint (render.yaml)
3. Add GOOGLE_API_KEY as secret
4. App will be live at https://data-science-agent.onrender.com

Supports both free tier and paid plans with clear upgrade path

Files changed (3)

Dockerfile +2 -5
RENDER_DEPLOYMENT.md +246 -0
render.yaml +37 -0

Dockerfile CHANGED

@@ -70,9 +70,6 @@ ENV ARTIFACT_BACKEND=local
 # Cloud Run expects the service to listen on the PORT env variable
 EXPOSE 8080
-# Health check (optional, Cloud Run handles this)
-HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
-    CMD python -c "import requests; requests.get('http://localhost:8080/health')" || exit 1
-# Run the FastAPI application
-CMD ["python", "src/api/app.py"]
+# Run the FastAPI application with uvicorn
+CMD ["uvicorn", "src.api.app:app", "--host", "0.0.0.0", "--port", "8080"]
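
The new `CMD` hardcodes port 8080, which matches the `PORT=8080` value set in `render.yaml`. If the port should instead follow the `PORT` variable, one option is a small Python launcher. The sketch below is hypothetical and not part of this commit: `resolve_port` and `uvicorn_command` are illustrative names, and the code only builds the same uvicorn argument list with the port taken from the environment.

```python
import os
from typing import List, Mapping

DEFAULT_PORT = 8080  # same value the Dockerfile's CMD hardcodes


def resolve_port(env: Mapping[str, str] = os.environ) -> int:
    """Return the port from PORT, or DEFAULT_PORT if it is unset or invalid."""
    try:
        port = int(env.get("PORT", ""))
    except ValueError:
        return DEFAULT_PORT
    return port if 0 < port < 65536 else DEFAULT_PORT


def uvicorn_command(env: Mapping[str, str] = os.environ) -> List[str]:
    """Build an argv equivalent to the Dockerfile CMD, with a dynamic port."""
    return [
        "uvicorn", "src.api.app:app",
        "--host", "0.0.0.0",
        "--port", str(resolve_port(env)),
    ]
```

A launcher script could pass the result to `os.execvp(cmd[0], cmd)` so that uvicorn replaces the launcher process. Whether that is worth the extra file depends on how often the port actually changes.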

RENDER_DEPLOYMENT.md ADDED

@@ -0,0 +1,246 @@

+# Render Deployment Guide
+## Prerequisites
+1. A [Render account](https://render.com/) (free tier available)
+2. Your GitHub repository connected to Render
+3. Google Gemini API key
+## Quick Deploy (Recommended)
+### Option 1: Using render.yaml (Infrastructure as Code)
+1. **Push your code to GitHub** (already done)
+2. **Create a new Blueprint on Render:**
+   - Go to https://dashboard.render.com/
+   - Click "New +" → "Blueprint"
+   - Connect your GitHub repository: `Pulastya-B/DevSprint-Data-Science-Agent`
+   - Render will automatically detect the `render.yaml` file
+   - Click "Apply"
+3. **Add Secret Environment Variable:**
+   - Go to your service dashboard
+   - Navigate to "Environment" tab
+   - Add your `GOOGLE_API_KEY` (this is sensitive and not included in render.yaml)
+   - Click "Save Changes"
+4. **Deploy:**
+   - Render will automatically build and deploy your application
+   - Wait for the build to complete (~5-10 minutes for first deploy)
+   - Your app will be available at: `https://data-science-agent.onrender.com`
+### Option 2: Manual Setup
+1. **Create a new Web Service:**
+   - Go to https://dashboard.render.com/
+   - Click "New +" → "Web Service"
+   - Connect your GitHub repository
+2. **Configure the service:**
+   - **Name:** `data-science-agent`
+   - **Region:** Oregon (US West)
+   - **Branch:** `main`
+   - **Runtime:** Docker
+   - **Plan:** Free (or Starter for production)
+3. **Add Environment Variables:**
+   ```
+   LLM_PROVIDER=gemini
+   GOOGLE_API_KEY=<your-api-key-here>
+   GEMINI_MODEL=gemini-2.5-flash
+   REASONING_EFFORT=medium
+   CACHE_DB_PATH=/tmp/cache_db/cache.db
+   CACHE_TTL_SECONDS=86400
+   OUTPUT_DIR=/tmp/outputs
+   DATA_DIR=/tmp/data
+   MAX_PARALLEL_TOOLS=5
+   MAX_RETRIES=3
+   TIMEOUT_SECONDS=300
+   PORT=8080
+   ARTIFACT_BACKEND=local
+   ```
+4. **Configure Health Check:**
+   - **Health Check Path:** `/api/health`
+5. **Deploy:**
+   - Click "Create Web Service"
+   - Wait for the build to complete
+## Important Notes
+### Free Tier Limitations
+- **Spin down after inactivity:** Free tier services spin down after 15 minutes of inactivity
+- **Cold starts:** First request after spin-down will take 30-60 seconds
+- **Memory:** 512 MB RAM (may be tight for large ML models)
+- **Build time:** Free tier has slower build times
+### Upgrading to Paid Plan
+For production use, consider upgrading to at least the **Starter plan ($7/month)**:
+- No spin-down
+- Faster builds
+- More CPU at the same 512 MB RAM (2 GB starts at the Standard plan; confirm current figures on Render's pricing page)
+- Better performance
+### Storage Considerations
+- Render's filesystem is **ephemeral**: files written at runtime, including everything under `/tmp`, are lost on every restart or redeploy
+- For persistent storage, consider:
+  - Connecting to external storage (S3, GCS)
+  - Using Render's persistent disk (paid plans only)
+  - Storing only temporary analysis results
+### Performance Optimization
+1. **Use caching:** The app includes SQLite caching for repeated queries (the cache lives at `/tmp/cache_db/cache.db`, so it starts empty after each restart)
+2. **Monitor memory usage:** Large datasets may exceed free tier limits
+3. **Optimize Docker image:** The multi-stage build already optimizes image size
+4. **Regional selection:** Choose a region close to your users
+## Deployment Commands
+### Manual Rebuild (if needed)
+```bash
+# Trigger rebuild via Render Dashboard
+# or use Render API
+curl -X POST https://api.render.com/v1/services/<service-id>/deploys \
+  -H "Authorization: Bearer <your-api-key>"
+```
+### Check Logs
+```bash
+# View logs in Render Dashboard
+# or use Render CLI
+render logs -s data-science-agent
+```
+## Custom Domain (Optional)
+1. Go to your service dashboard
+2. Click "Settings" → "Custom Domain"
+3. Add your domain (e.g., `agent.yourdomain.com`)
+4. Update your DNS records as instructed
+5. Render automatically provisions SSL certificates
+## Troubleshooting
+### Build Fails
+**Issue:** Docker build timeout
+- **Solution:** Increase build timeout in Render settings
+- **Alternative:** Optimize Dockerfile to reduce build time
+**Issue:** Out of memory during build
+- **Solution:** Upgrade to paid plan with more memory
+- **Alternative:** Reduce dependencies in requirements.txt
+### App Crashes on Startup
+**Issue:** Missing environment variables
+- **Solution:** Verify all required env vars are set in Render dashboard
+**Issue:** Port binding error
+- **Solution:** Ensure the app binds to `0.0.0.0` and listens on the port given by the `PORT` env variable
+### Slow Performance
+**Issue:** Cold starts on free tier
+- **Solution:** Upgrade to paid plan to prevent spin-down
+- **Workaround:** Use a cron job to ping your app every 10 minutes
+**Issue:** Large dataset processing timeout
+- **Solution:** Increase TIMEOUT_SECONDS env variable
+- **Consider:** Processing large datasets asynchronously
+## Monitoring
+### Health Check
+Your app exposes a health check endpoint at `/api/health`:
+```bash
+curl https://data-science-agent.onrender.com/api/health
+```
+### Logs
+- View real-time logs in Render Dashboard
+- Configure log drains for external monitoring (paid plans)
+### Metrics
+Render provides built-in metrics:
+- CPU usage
+- Memory usage
+- Request count
+- Response time
+## Security Best Practices
+1. **Never commit API keys** to Git (use environment variables)
+2. **Enable CORS** only for trusted domains in production
+3. **Use HTTPS** (Render provides this automatically)
+4. **Rotate API keys** regularly
+5. **Monitor usage** to detect anomalies
+## Cost Estimation
+### Free Tier
+- Cost: $0/month
+- Best for: Development, testing, hackathons
+- Limitations: Spin-down, slower builds, 512 MB RAM
+### Starter Plan ($7/month)
+- No spin-down
+- 512 MB RAM, with more CPU than the free tier
+- Faster builds
+- Better for: Small production apps
+### Standard Plan ($25/month)
+- 2 GB RAM
+- High performance
+- Best for: Production apps with moderate traffic
+
+Confirm current prices and memory limits on Render's pricing page before choosing a plan.
+## Deployment Checklist
+- [ ] Code pushed to GitHub
+- [ ] `render.yaml` committed to repository
+- [ ] Render account created
+- [ ] GitHub repository connected to Render
+- [ ] Blueprint deployed (or manual service created)
+- [ ] `GOOGLE_API_KEY` added as secret environment variable
+- [ ] Health check endpoint verified
+- [ ] Application accessible at Render URL
+- [ ] Custom domain configured (optional)
+- [ ] Monitoring and alerts set up
+## Support
+- **Render Documentation:** https://render.com/docs
+- **Render Community:** https://community.render.com/
+- **GitHub Issues:** https://github.com/Pulastya-B/DevSprint-Data-Science-Agent/issues
+## Next Steps
+After successful deployment:
+1. **Test the deployment:**
+   ```bash
+   curl https://data-science-agent.onrender.com/api/health
+   ```
+2. **Upload a test dataset** via the web interface
+3. **Monitor logs** for any errors
+4. **Configure custom domain** (optional)
+5. **Set up monitoring** and alerts
+6. **Share your deployed app!** 🚀
+---
+**Your app will be live at:**
+`https://data-science-agent.onrender.com`
+(URL will be different if you choose a different service name)
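
The guide checks `/api/health` with `curl` and notes that a free-tier service may need 30-60 seconds to answer after a spin-down. The same check can be scripted in Python, for example as a post-deploy smoke test. This is a minimal sketch using only the standard library. The names `is_healthy` and `wait_until_healthy` and the retry defaults are assumptions, and it treats any 2xx response as healthy because the response body of the endpoint is not shown in this commit.

```python
import time
from typing import Callable
from urllib import request

HEALTH_PATH = "/api/health"  # path configured in render.yaml


def is_healthy(base_url: str, timeout: float = 10.0,
               opener: Callable = request.urlopen) -> bool:
    """Return True if GET <base_url>/api/health answers with a 2xx status."""
    url = base_url.rstrip("/") + HEALTH_PATH
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError, and socket timeouts all derive from OSError.
        return False


def wait_until_healthy(base_url: str, attempts: int = 6, delay: float = 15.0,
                       check: Callable[[str], bool] = is_healthy,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the health endpoint, leaving room for a free-tier cold start."""
    for attempt in range(attempts):
        if check(base_url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next probe
    return False
```

Calling `wait_until_healthy("https://data-science-agent.onrender.com")` with these defaults waits up to roughly 75 seconds between probes before giving up. The `opener`, `check`, and `sleep` parameters exist only so the functions can be exercised without network access.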

render.yaml ADDED

@@ -0,0 +1,37 @@

+services:
+  - type: web
+    name: data-science-agent
+    runtime: docker
+    plan: free  # Change to 'starter' or higher for production
+    region: oregon  # Change to your preferred region
+    branch: main
+    dockerfilePath: ./Dockerfile
+    envVars:
+      - key: LLM_PROVIDER
+        value: gemini
+      - key: GOOGLE_API_KEY
+        sync: false  # Mark as secret - add via Render dashboard
+      - key: GEMINI_MODEL
+        value: gemini-2.5-flash
+      - key: REASONING_EFFORT
+        value: medium
+      - key: CACHE_DB_PATH
+        value: /tmp/cache_db/cache.db
+      - key: CACHE_TTL_SECONDS
+        value: 86400
+      - key: OUTPUT_DIR
+        value: /tmp/outputs
+      - key: DATA_DIR
+        value: /tmp/data
+      - key: MAX_PARALLEL_TOOLS
+        value: 5
+      - key: MAX_RETRIES
+        value: 3
+      - key: TIMEOUT_SECONDS
+        value: 300
+      - key: PORT
+        value: 8080
+      - key: ARTIFACT_BACKEND
+        value: local
+    healthCheckPath: /api/health
+    autoDeploy: true
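
The `envVars` above point the cache, output, and data paths at `/tmp`, which starts empty after every restart. The sketch below shows one way an application could read these variables and recreate the directories at startup. It is an illustration only: the `Settings` class, `load_settings`, and `ensure_dirs` are hypothetical names, and the actual code in `src/api/app.py` is not shown in this commit and may handle configuration differently.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    cache_db_path: Path
    output_dir: Path
    data_dir: Path
    cache_ttl_seconds: int
    max_parallel_tools: int
    max_retries: int
    timeout_seconds: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables declared in render.yaml, using its values as defaults."""
    return Settings(
        cache_db_path=Path(env.get("CACHE_DB_PATH", "/tmp/cache_db/cache.db")),
        output_dir=Path(env.get("OUTPUT_DIR", "/tmp/outputs")),
        data_dir=Path(env.get("DATA_DIR", "/tmp/data")),
        cache_ttl_seconds=int(env.get("CACHE_TTL_SECONDS", "86400")),
        max_parallel_tools=int(env.get("MAX_PARALLEL_TOOLS", "5")),
        max_retries=int(env.get("MAX_RETRIES", "3")),
        timeout_seconds=int(env.get("TIMEOUT_SECONDS", "300")),
    )


def ensure_dirs(settings: Settings) -> None:
    """Recreate the working directories, which are empty after a restart."""
    for directory in (settings.cache_db_path.parent,
                      settings.output_dir,
                      settings.data_dir):
        directory.mkdir(parents=True, exist_ok=True)
```

Environment variables always arrive as strings, so the numeric values in `render.yaml` are converted with `int()` here. A malformed value raises `ValueError` at startup, which surfaces the misconfiguration in the deploy logs instead of at request time.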