# Deployment Guide
This guide covers deploying MediGuard AI to various environments.
## Table of Contents
1. [Prerequisites](#prerequisites)
2. [Environment Configuration](#environment-configuration)
3. [Local Development](#local-development)
4. [Docker Deployment](#docker-deployment)
5. [Kubernetes Deployment](#kubernetes-deployment)
6. [Cloud Deployment](#cloud-deployment)
7. [Monitoring and Logging](#monitoring-and-logging)
8. [Security Considerations](#security-considerations)
9. [Troubleshooting](#troubleshooting)
10. [Backup and Recovery](#backup-and-recovery)
11. [Scaling Guidelines](#scaling-guidelines)
12. [Support](#support)
## Prerequisites
### System Requirements
- **CPU**: 4+ cores recommended
- **RAM**: 8GB+ minimum, 16GB+ recommended
- **Storage**: 10GB+ for vector stores
- **Network**: Stable internet connection for LLM APIs
### Software Requirements
- Python 3.11+
- Docker & Docker Compose
- Node.js 18+ (for frontend development)
- Git
## Environment Configuration
Create a `.env` file from the template:
```bash
cp .env.example .env
```
### Required Environment Variables
```bash
# API Configuration
API__HOST=127.0.0.1
API__PORT=8000
API__WORKERS=4
# LLM Configuration (choose one)
GROQ_API_KEY=your_groq_api_key
# OR
OLLAMA_BASE_URL=http://localhost:11434
# Database Configuration
OPENSEARCH_HOST=localhost
OPENSEARCH_PORT=9200
OPENSEARCH_USERNAME=admin
OPENSEARCH_PASSWORD=StrongPassword123!
# Cache Configuration
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=
# Security
SECRET_KEY=your_secret_key_here
CORS_ALLOWED_ORIGINS=http://localhost:3000,http://localhost:7860
# Optional: Monitoring
LANGFUSE_HOST=http://localhost:3000
LANGFUSE_SECRET_KEY=your_langfuse_secret
LANGFUSE_PUBLIC_KEY=your_langfuse_public
```
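The double underscores in keys like `API__HOST` are a nested-settings delimiter (the convention used by pydantic-settings): `API__HOST` maps to the `host` field of the `api` settings group. A stdlib-only sketch of that mapping, with an illustrative function name, looks like this:

```python
def parse_nested_env(environ, delimiter="__"):
    """Fold FLAT__KEYS into nested dicts: API__HOST -> {"api": {"host": ...}}."""
    config = {}
    for key, value in environ.items():
        parts = [part.lower() for part in key.split(delimiter)]
        node = config
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # descend, creating groups as needed
        node[parts[-1]] = value
    return config

settings = parse_nested_env({"API__HOST": "127.0.0.1", "API__PORT": "8000"})
# settings["api"] -> {"host": "127.0.0.1", "port": "8000"}
```

Keys without the delimiter (e.g. `GROQ_API_KEY`) stay flat, which matches how the variables above are grouped.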
## Local Development
### Quick Start
```bash
# Clone repository
git clone https://github.com/yourusername/Agentic-RagBot.git
cd Agentic-RagBot
# Setup environment
python -m venv .venv
source .venv/bin/activate   # Linux/macOS
.venv\Scripts\activate      # Windows
# Install dependencies
pip install -r requirements.txt
# Initialize embeddings
python scripts/setup_embeddings.py
# Start development server
uvicorn src.main:app --reload --host 0.0.0.0 --port 8000
```
### Using Docker Compose
```bash
# Start all services
docker compose up -d
# View logs
docker compose logs -f api
# Stop services (append -v to also delete volumes and their data)
docker compose down
```
## Docker Deployment
### Single Container
```bash
# Build image
docker build -t mediguard-ai .
# Run container
docker run -d \
  --name mediguard \
  -p 8000:8000 \
  -p 7860:7860 \
  --env-file .env \
  -v "$(pwd)/data:/app/data" \
  mediguard-ai
```
### Production with Docker Compose
```bash
# Use production compose file
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
# Scale API services
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --scale api=3
```
### Production Docker Compose Override
Create `docker-compose.prod.yml`:
```yaml
version: '3.8'

services:
  api:
    environment:
      - API__WORKERS=8
      - API__RELOAD=false
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '1'
          memory: 2G
        reservations:
          cpus: '0.5'
          memory: 1G

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/ssl:/etc/nginx/ssl:ro
    depends_on:
      - api

  opensearch:
    environment:
      - cluster.name=mediguard-prod
      - "OPENSEARCH_JAVA_OPTS=-Xms2g -Xmx2g"
    deploy:
      resources:
        limits:
          memory: 4G
```
## Kubernetes Deployment
### Namespace and ConfigMap
```yaml
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: mediguard
---
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mediguard-config
  namespace: mediguard
data:
  API__HOST: "0.0.0.0"
  API__PORT: "8000"
  OPENSEARCH__HOST: "opensearch"
  OPENSEARCH__PORT: "9200"
  REDIS__HOST: "redis"
  REDIS__PORT: "6379"
```
### Secret
```yaml
# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: mediguard-secrets
  namespace: mediguard
type: Opaque
data:
  GROQ_API_KEY: <base64-encoded-key>
  SECRET_KEY: <base64-encoded-secret>
  OPENSEARCH_PASSWORD: <base64-encoded-password>
```
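Rather than hand-encoding the values for `secret.yaml`, `kubectl create secret` can encode them for you. A sketch using the key names above (the literal values are placeholders):

```shell
# Option 1: let kubectl do the base64 encoding
kubectl create secret generic mediguard-secrets \
  --namespace mediguard \
  --from-literal=GROQ_API_KEY='your_groq_api_key' \
  --from-literal=SECRET_KEY='your_secret_key_here' \
  --from-literal=OPENSEARCH_PASSWORD='StrongPassword123!'

# Option 2: encode a value manually for secret.yaml
# (-n avoids encoding a trailing newline into the secret)
echo -n 'your_groq_api_key' | base64
```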
### Deployment
```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mediguard-api
  namespace: mediguard
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mediguard-api
  template:
    metadata:
      labels:
        app: mediguard-api
    spec:
      containers:
        - name: api
          image: mediguard-ai:latest
          ports:
            - containerPort: 8000
          envFrom:
            - configMapRef:
                name: mediguard-config
            - secretRef:
                name: mediguard-secrets
          resources:
            requests:
              memory: "1Gi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 5
```
### Service and Ingress
```yaml
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: mediguard-service
  namespace: mediguard
spec:
  selector:
    app: mediguard-api
  ports:
    - port: 80
      targetPort: 8000
  type: ClusterIP
---
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mediguard-ingress
  namespace: mediguard
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  tls:
    - hosts:
        - api.mediguard-ai.com
      secretName: mediguard-tls
  rules:
    - host: api.mediguard-ai.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mediguard-service
                port:
                  number: 80
```
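With the manifests above saved under the filenames shown in their comments, a typical apply-and-verify sequence looks like this:

```shell
# Apply manifests in dependency order
kubectl apply -f namespace.yaml
kubectl apply -f configmap.yaml -f secret.yaml
kubectl apply -f deployment.yaml -f service.yaml -f ingress.yaml

# Wait for the rollout and inspect the result
kubectl -n mediguard rollout status deployment/mediguard-api
kubectl -n mediguard get pods,svc,ingress
```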
## Cloud Deployment
### AWS ECS
1. Create ECR repository:
```bash
aws ecr create-repository --repository-name mediguard-ai
```
2. Push image:
```bash
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-west-2.amazonaws.com
docker tag mediguard-ai:latest <account-id>.dkr.ecr.us-west-2.amazonaws.com/mediguard-ai:latest
docker push <account-id>.dkr.ecr.us-west-2.amazonaws.com/mediguard-ai:latest
```
3. Deploy using ECS task definition
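A minimal Fargate rollout from the pushed image could look like the following sketch; the cluster and service names, subnet/security-group IDs, and the `task-definition.json` contents are all illustrative and must match your own setup:

```shell
aws ecs create-cluster --cluster-name mediguard-cluster
aws ecs register-task-definition --cli-input-json file://task-definition.json
aws ecs create-service \
  --cluster mediguard-cluster \
  --service-name mediguard-service \
  --task-definition mediguard-ai \
  --desired-count 2 \
  --launch-type FARGATE \
  --network-configuration 'awsvpcConfiguration={subnets=[subnet-abc123],securityGroups=[sg-abc123],assignPublicIp=ENABLED}'
```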
### Google Cloud Run
```bash
# Build and push
gcloud builds submit --tag gcr.io/PROJECT-ID/mediguard-ai
# Deploy
gcloud run deploy mediguard-ai \
  --image gcr.io/PROJECT-ID/mediguard-ai \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --memory 2Gi \
  --cpu 1 \
  --max-instances 10
```
### Azure Container Instances
```bash
# Create resource group
az group create --name mediguard-rg --location eastus
# Deploy container
az container create \
  --resource-group mediguard-rg \
  --name mediguard-ai \
  --image mediguard-ai:latest \
  --cpu 1 \
  --memory 2 \
  --ports 8000 \
  --environment-variables \
    API__HOST=0.0.0.0 \
    API__PORT=8000
```
## Monitoring and Logging
### Prometheus Metrics
Add to your FastAPI app:
```python
from prometheus_fastapi_instrumentator import Instrumentator
Instrumentator().instrument(app).expose(app)
```
### ELK Stack
```yaml
# docker-compose.monitoring.yml
version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
    ports:
      - "9200:9200"
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data

  logstash:
    image: docker.elastic.co/logstash/logstash:8.11.0
    volumes:
      - ./logstash/pipeline:/usr/share/logstash/pipeline
    ports:
      - "5044:5044"
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    ports:
      - "5601:5601"
    environment:
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200
    depends_on:
      - elasticsearch

volumes:
  elasticsearch-data:
```
### Health Checks
The application includes built-in health checks:
```bash
# Basic health
curl http://localhost:8000/health
# Detailed health with dependencies
curl http://localhost:8000/health/detailed
```
## Security Considerations
### SSL/TLS Configuration
```nginx
# nginx/nginx.conf
server {
    listen 443 ssl http2;
    server_name api.mediguard-ai.com;

    ssl_certificate     /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    ssl_protocols       TLSv1.2 TLSv1.3;
    ssl_ciphers         HIGH:!aNULL:!MD5;

    location / {
        proxy_pass http://api:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```
### Rate Limiting
```python
# Add to main.py
from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/api/analyze")
@limiter.limit("10/minute")
async def analyze(request: Request):  # slowapi requires the Request argument
    ...
```
### Security Headers
These headers are added by `SecurityHeadersMiddleware`, already included in `src/middlewares.py`:

- `X-Content-Type-Options: nosniff`
- `X-Frame-Options: DENY`
- `X-XSS-Protection: 1; mode=block`
- `Strict-Transport-Security`
## Troubleshooting
### Common Issues
1. **Memory Issues**:
- Increase container memory limits
- Optimize vector store size
- Use Redis for caching
2. **Slow Response Times**:
- Check LLM provider latency
- Optimize retriever settings
- Add caching layers
3. **Database Connection Errors**:
- Verify OpenSearch is running
- Check network connectivity
- Validate credentials
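Quick checks for the dependency errors above, using the default hosts and credentials from the `.env` template:

```shell
# Is OpenSearch up and reachable?
curl -u admin:StrongPassword123! http://localhost:9200/_cluster/health

# Is Redis answering? (expects PONG)
redis-cli -h localhost -p 6379 ping

# Is the API healthy end to end?
curl http://localhost:8000/health/detailed
```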
### Debug Mode
Enable debug logging:
```bash
export LOG_LEVEL=DEBUG
python -m src.main
```
### Performance Tuning
1. **Vector Store Optimization**:
```python
# Adjust in config
RETRIEVAL_K=10 # Reduce for faster retrieval
EMBEDDING_BATCH_SIZE=32 # Optimize based on GPU memory
```
2. **Async Optimization**:
```python
# Use connection pooling
import httpx

HTTPX_LIMITS = httpx.Limits(max_connections=100, max_keepalive_connections=20)
```
3. **Caching Strategy**:
```python
# Cache frequent queries
CACHE_TTL=3600 # 1 hour
CACHE_MAX_SIZE=1000
```
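To illustrate the strategy behind `CACHE_TTL` and `CACHE_MAX_SIZE`, here is a minimal in-process TTL cache; it is a stand-in sketch only, since the deployment caches through Redis:

```python
import time
from collections import OrderedDict

class TTLCache:
    """Expire entries after ttl seconds; evict oldest-first beyond max_size."""

    def __init__(self, ttl=3600, max_size=1000):
        self.ttl = ttl
        self.max_size = max_size
        self._store = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:  # stale: drop and miss
            del self._store[key]
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)
        self._store.move_to_end(key)          # mark as most recently written
        while len(self._store) > self.max_size:
            self._store.popitem(last=False)   # evict the oldest entry
```

The same get/set shape maps directly onto Redis `GET`/`SETEX` with `CACHE_TTL` as the expiry.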
## Backup and Recovery
### Data Backup
```bash
# Backup vector stores (for a live cluster, prefer the OpenSearch snapshot API)
docker exec opensearch tar czf /backup/$(date +%Y%m%d)_opensearch.tar.gz /usr/share/opensearch/data
# Backup Redis
docker exec redis redis-cli BGSAVE
docker cp redis:/data/dump.rdb ./backup/redis_$(date +%Y%m%d).rdb
```
### Disaster Recovery
1. Restore from backups
2. Verify data integrity
3. Update configuration if needed
4. Restart services
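Step 1 mirrors the backup commands above. A hedged restore sketch, assuming the dated filenames from the backup examples and an `opensearch-data` volume name, which are placeholders for your actual names:

```shell
# Restore OpenSearch data (stop the node before touching its data directory)
docker compose stop opensearch
docker run --rm \
  -v opensearch-data:/usr/share/opensearch/data \
  -v "$(pwd)/backup:/backup" \
  alpine tar xzf /backup/20240101_opensearch.tar.gz -C /
docker compose start opensearch

# Restore Redis from an RDB snapshot (container must be stopped while copying)
docker compose stop redis
docker cp ./backup/redis_20240101.rdb redis:/data/dump.rdb
docker compose start redis
```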
## Scaling Guidelines
### Horizontal Scaling
- Use load balancer (nginx/HAProxy)
- Deploy multiple API instances
- Consider session affinity if needed
### Vertical Scaling
- Monitor resource usage
- Adjust CPU/memory limits
- Optimize database queries
### Auto-scaling (Kubernetes)
```yaml
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mediguard-hpa
  namespace: mediguard   # must match the target Deployment's namespace
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mediguard-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
## Support
For deployment issues:
- Check logs: `docker compose logs -f`
- Review monitoring dashboards
- Consult troubleshooting guide
- Contact support at deploy@mediguard-ai.com