# Deployment Guide

This guide covers deploying MediGuard AI locally, with Docker, on Kubernetes, and on AWS, Google Cloud, and Azure.

## Table of Contents

1. [Prerequisites](#prerequisites)
2. [Environment Configuration](#environment-configuration)
3. [Local Development](#local-development)
4. [Docker Deployment](#docker-deployment)
5. [Kubernetes Deployment](#kubernetes-deployment)
6. [Cloud Deployment](#cloud-deployment)
7. [Monitoring and Logging](#monitoring-and-logging)
8. [Security Considerations](#security-considerations)
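9. [Troubleshooting](#troubleshooting)
10. [Backup and Recovery](#backup-and-recovery)
11. [Scaling Guidelines](#scaling-guidelines)
12. [Support](#support)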

## Prerequisites

### System Requirements

- **CPU**: 4+ cores recommended
- **RAM**: 8GB minimum, 16GB+ recommended
- **Storage**: 10GB+ for vector stores
- **Network**: Stable internet connection for LLM APIs

### Software Requirements

- Python 3.11+
- Docker & Docker Compose
- Node.js 18+ (for frontend development)
- Git

## Environment Configuration

Create a `.env` file from the template:

```bash
cp .env.example .env
```

### Required Environment Variables

```bash
# API Configuration (use API__HOST=0.0.0.0 inside containers so published ports are reachable)
API__HOST=127.0.0.1
API__PORT=8000
API__WORKERS=4

# LLM Configuration (choose one)
GROQ_API_KEY=your_groq_api_key
# OR
OLLAMA_BASE_URL=http://localhost:11434

# Database Configuration
OPENSEARCH_HOST=localhost
OPENSEARCH_PORT=9200
OPENSEARCH_USERNAME=admin
OPENSEARCH_PASSWORD=StrongPassword123!

# Cache Configuration
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=

# Security
SECRET_KEY=your_secret_key_here
CORS_ALLOWED_ORIGINS=http://localhost:3000,http://localhost:7860

# Optional: Monitoring
LANGFUSE_HOST=http://localhost:3000
LANGFUSE_SECRET_KEY=your_langfuse_secret
LANGFUSE_PUBLIC_KEY=your_langfuse_public
```
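Before starting the service, it can help to confirm that the required settings are present. The following is a minimal sketch, not part of the repository: `missing_settings` is a hypothetical helper, the variable names are taken from the example above, and it assumes that `GROQ_API_KEY` and `OLLAMA_BASE_URL` are alternatives, as the "choose one" comment suggests.

```python
# Variable names come from the .env example above; adjust them to your setup.
REQUIRED = ["API__HOST", "API__PORT", "OPENSEARCH_HOST", "OPENSEARCH_PORT", "SECRET_KEY"]
# At least one of these LLM settings must be set.
LLM_ALTERNATIVES = ["GROQ_API_KEY", "OLLAMA_BASE_URL"]


def missing_settings(env):
    """Return the names of settings that are absent or empty in `env`."""
    problems = [name for name in REQUIRED if not env.get(name)]
    if not any(env.get(name) for name in LLM_ALTERNATIVES):
        problems.append(" or ".join(LLM_ALTERNATIVES))
    return problems
```

Calling `missing_settings(os.environ)` after loading `.env` returns an empty list when everything is set, so a startup script can exit early with a readable message otherwise.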

## Local Development

### Quick Start

```bash
# Clone repository
git clone https://github.com/yourusername/Agentic-RagBot.git
cd Agentic-RagBot

# Setup environment
python -m venv .venv
source .venv/bin/activate      # Linux/Mac
# .venv\Scripts\activate       # Windows (run this instead of the line above)

# Install dependencies
pip install -r requirements.txt

# Initialize embeddings
python scripts/setup_embeddings.py

# Start development server
uvicorn src.main:app --reload --host 0.0.0.0 --port 8000
```

### Using Docker Compose

```bash
# Start all services
docker compose up -d

# View logs
docker compose logs -f api

# Stop services (add -v only if you also want to delete the named volumes and their data)
docker compose down
```

## Docker Deployment

### Single Container

```bash
# Build image
docker build -t mediguard-ai .

# Run container
docker run -d \
  --name mediguard \
  -p 8000:8000 \
  -p 7860:7860 \
  --env-file .env \
  -v "$(pwd)/data:/app/data" \
  mediguard-ai
```

### Production with Docker Compose

```bash
# Use production compose file
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d

# Scale API services
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --scale api=3
```

### Production Docker Compose Override

Create `docker-compose.prod.yml`:

```yaml
version: '3.8'

services:
  api:
    environment:
      - API__WORKERS=8
      - API__RELOAD=false
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '1'
          memory: 2G
        reservations:
          cpus: '0.5'
          memory: 1G

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/ssl:/etc/nginx/ssl:ro
    depends_on:
      - api

  opensearch:
    environment:
      - cluster.name=mediguard-prod
      - "OPENSEARCH_JAVA_OPTS=-Xms2g -Xmx2g"
    deploy:
      resources:
        limits:
          memory: 4G
```

## Kubernetes Deployment

### Namespace and ConfigMap

```yaml
# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: mediguard

---
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mediguard-config
  namespace: mediguard
data:
  API__HOST: "0.0.0.0"
  API__PORT: "8000"
  OPENSEARCH__HOST: "opensearch"
  OPENSEARCH__PORT: "9200"
  REDIS__HOST: "redis"
  REDIS__PORT: "6379"
```

### Secret

```yaml
# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: mediguard-secrets
  namespace: mediguard
type: Opaque
data:
  GROQ_API_KEY: <base64-encoded-key>
  SECRET_KEY: <base64-encoded-secret>
  OPENSEARCH_PASSWORD: <base64-encoded-password>
```
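The values under `data:` must be base64-encoded. A minimal sketch of producing them follows; `encode_secret` and `decode_secret` are hypothetical helper names used only for illustration.

```python
import base64


def encode_secret(value: str) -> str:
    """Base64-encode a plain-text value for the `data:` field of a Secret."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_secret(encoded: str) -> str:
    """Decode a value copied from a Secret manifest, e.g. to double-check it."""
    return base64.b64decode(encoded).decode("utf-8")
```

Base64 is an encoding, not encryption, so the manifest should still be kept out of version control. Kubernetes also accepts plain-text values under `stringData:` if you prefer to skip the encoding step.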

### Deployment

```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mediguard-api
  namespace: mediguard
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mediguard-api
  template:
    metadata:
      labels:
        app: mediguard-api
    spec:
      containers:
      - name: api
        image: mediguard-ai:latest
        ports:
        - containerPort: 8000
        envFrom:
        - configMapRef:
            name: mediguard-config
        - secretRef:
            name: mediguard-secrets
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
          limits:
            memory: "2Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 5
          periodSeconds: 5
```

### Service and Ingress

```yaml
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: mediguard-service
  namespace: mediguard
spec:
  selector:
    app: mediguard-api
  ports:
  - port: 80
    targetPort: 8000
  type: ClusterIP

---
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mediguard-ingress
  namespace: mediguard
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - api.mediguard-ai.com
    secretName: mediguard-tls
  rules:
  - host: api.mediguard-ai.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: mediguard-service
            port:
              number: 80
```

## Cloud Deployment

### AWS ECS

1. Create ECR repository:
```bash
aws ecr create-repository --repository-name mediguard-ai
```

2. Push image:
```bash
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-west-2.amazonaws.com
docker tag mediguard-ai:latest <account-id>.dkr.ecr.us-west-2.amazonaws.com/mediguard-ai:latest
docker push <account-id>.dkr.ecr.us-west-2.amazonaws.com/mediguard-ai:latest
```

3. Deploy the pushed image using an ECS task definition

### Google Cloud Run

```bash
# Build and push
gcloud builds submit --tag gcr.io/PROJECT-ID/mediguard-ai

# Deploy (--port matches the port the API listens on; Cloud Run defaults to 8080)
gcloud run deploy mediguard-ai \
  --image gcr.io/PROJECT-ID/mediguard-ai \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --memory 2Gi \
  --cpu 1 \
  --port 8000 \
  --max-instances 10
```

### Azure Container Instances

```bash
# Create resource group
az group create --name mediguard-rg --location eastus

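# Deploy container (the image must live in a registry ACI can pull from,
# e.g. Azure Container Registry; a private registry also needs credentials)
az container create \
  --resource-group mediguard-rg \
  --name mediguard-ai \
  --image <registry-name>.azurecr.io/mediguard-ai:latest \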
  --cpu 1 \
  --memory 2 \
  --ports 8000 \
  --environment-variables \
    API__HOST=0.0.0.0 \
    API__PORT=8000
```

## Monitoring and Logging

### Prometheus Metrics

Add to your FastAPI app:

```python
from prometheus_fastapi_instrumentator import Instrumentator

Instrumentator().instrument(app).expose(app)
```

### ELK Stack

```yaml
# docker-compose.monitoring.yml
version: '3.8'

services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
    ports:
      - "9200:9200"
    volumes:
      - elasticsearch-data:/usr/share/elasticsearch/data

  logstash:
    image: docker.elastic.co/logstash/logstash:8.11.0
    volumes:
      - ./logstash/pipeline:/usr/share/logstash/pipeline
    ports:
      - "5044:5044"
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    ports:
      - "5601:5601"
    environment:
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200
    depends_on:
      - elasticsearch

volumes:
  elasticsearch-data:
```

### Health Checks

The application includes built-in health checks:

```bash
# Basic health
curl http://localhost:8000/health

# Detailed health with dependencies
curl http://localhost:8000/health/detailed
```
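In a deploy script it is often useful to wait for the service to come up before running further steps. The sketch below polls the basic health endpoint using only the standard library. It assumes that `/health` answers with HTTP 200 once the API is ready; `fetch_status` and `wait_until_healthy` are hypothetical names, not part of the application.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for `url`, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, but with an error status
    except OSError:
        return None  # connection refused, DNS failure, timeout, ...


def wait_until_healthy(url, attempts=30, delay=2.0, fetch=fetch_status, sleep=time.sleep):
    """Poll `url` until it returns HTTP 200; return True on success."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # back off before the next attempt
    return False
```

For example, `wait_until_healthy("http://localhost:8000/health")` can gate a smoke test after `docker compose up -d`. The `fetch` and `sleep` parameters exist only so the loop can be exercised without a running server.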

## Security Considerations

### SSL/TLS Configuration

```nginx
# nginx/nginx.conf
# Mounted as the main config file, so it needs top-level events and http blocks.
events {}

http {
    server {
        listen 443 ssl http2;
        server_name api.mediguard-ai.com;

        ssl_certificate /etc/nginx/ssl/cert.pem;
        ssl_certificate_key /etc/nginx/ssl/key.pem;
        ssl_protocols TLSv1.2 TLSv1.3;
        ssl_ciphers HIGH:!aNULL:!MD5;

        location / {
            proxy_pass http://api:8000;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
    }
}
```

### Rate Limiting

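```python
# Add to main.py (assumes `app` is the existing FastAPI instance)
from fastapi import Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter  # slowapi looks the limiter up on app.state
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)  # returns HTTP 429

@app.get("/api/analyze")
@limiter.limit("10/minute")
async def analyze(request: Request):  # slowapi requires the `request` parameter
    pass
```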
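To make the meaning of a limit such as `10/minute` concrete, here is a simplified fixed-window counter. It is an illustration only, not slowapi's implementation: `FixedWindowLimiter` is a hypothetical class, it keeps its state in process memory, and it never prunes old windows.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each key."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._counts = defaultdict(int)  # (key, window index) -> calls seen

    def allow(self, key):
        """Record one call for `key`; return False once the window is full."""
        window_index = int(self.clock() // self.window)
        bucket = (key, window_index)
        if self._counts[bucket] >= self.limit:
            return False
        self._counts[bucket] += 1
        return True
```

With `FixedWindowLimiter(limit=10, window=60)`, the eleventh call from the same client address inside one window is rejected, and the counter starts again when the next window begins. In-memory state like this is not shared between API replicas, which is one reason to prefer a shared backend when running several instances.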

### Security Headers

`SecurityHeadersMiddleware`, already included in `src/middlewares.py`, adds the following response headers:

- `X-Content-Type-Options: nosniff`
- `X-Frame-Options: DENY`
- `X-XSS-Protection: 1; mode=block`
- `Strict-Transport-Security`

## Troubleshooting

### Common Issues

1. **Memory Issues**:
   - Increase container memory limits
   - Optimize vector store size
   - Use Redis for caching

2. **Slow Response Times**:
   - Check LLM provider latency
   - Optimize retriever settings
   - Add caching layers

3. **Database Connection Errors**:
   - Verify OpenSearch is running
   - Check network connectivity
   - Validate credentials
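For the connection errors above, a quick first step is to check whether the service ports are reachable at all. The sketch below uses plain TCP connections; `can_connect` and `check_services` are hypothetical helpers, and the hosts and ports in the usage note are the defaults from the environment example.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, unreachable, timed out, or name not resolved


def check_services(services):
    """Map each service name to whether its (host, port) pair is reachable."""
    return {name: can_connect(host, port) for name, (host, port) in services.items()}
```

For example, `check_services({"opensearch": ("localhost", 9200), "redis": ("localhost", 6379)})` shows which dependency is down. A successful TCP connection only proves the port is open; credentials and TLS settings still need to be verified separately.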

### Debug Mode

Enable debug logging:

```bash
export LOG_LEVEL=DEBUG
python -m src.main
```

### Performance Tuning

1. **Vector Store Optimization**:
   ```python
   # Adjust in config
   RETRIEVAL_K=10  # Reduce for faster retrieval
   EMBEDDING_BATCH_SIZE=32  # Optimize based on GPU memory
   ```

2. **Async Optimization**:
   ```python
   # Use connection pooling
   import httpx  # pass the limits to httpx.AsyncClient(limits=HTTPX_LIMITS)
   HTTPX_LIMITS = httpx.Limits(max_connections=100, max_keepalive_connections=20)
   ```

3. **Caching Strategy**:
   ```python
   # Cache frequent queries
   CACHE_TTL=3600  # 1 hour
   CACHE_MAX_SIZE=1000
   ```
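The caching settings above can be pictured with a small in-process cache that combines a time-to-live with a size cap. This is a sketch under assumptions, not the application's cache: `TTLCache` is a hypothetical class, and a shared Redis cache remains the better fit when several API replicas are running.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Small in-memory cache with a time-to-live and a maximum size."""

    def __init__(self, ttl=3600, max_size=1000, clock=time.monotonic):
        self.ttl = ttl
        self.max_size = max_size
        self.clock = clock
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key, default=None):
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired: drop it and report a miss
            return default
        self._items.move_to_end(key)  # mark as recently used
        return value

    def set(self, key, value):
        self._items[key] = (self.clock() + self.ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)  # evict the least recently used entry
```

The defaults mirror `CACHE_TTL=3600` and `CACHE_MAX_SIZE=1000`. A lower TTL gives fresher answers at the cost of more LLM and retriever calls.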

## Backup and Recovery

### Data Backup

```bash
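# Backup vector stores (assumes /backup exists inside the container, e.g. as a mounted volume)
docker exec opensearch tar czf /backup/$(date +%Y%m%d)_opensearch.tar.gz /usr/share/opensearch/data

# Backup Redis (SAVE blocks Redis until the snapshot is on disk;
# BGSAVE returns before the dump is complete, so copying right after it can pick up a stale file)
mkdir -p ./backup
docker exec redis redis-cli SAVE
docker cp redis:/data/dump.rdb ./backup/redis_$(date +%Y%m%d).rdb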
```

### Disaster Recovery

1. Restore from backups
2. Verify data integrity
3. Update configuration if needed
4. Restart services
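One way to approach the "verify data integrity" step is to record a checksum when a backup is created and compare it before restoring. The sketch below assumes you store a SHA-256 digest alongside each archive; `sha256_of` and `verify_backup` are hypothetical helper names.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(path, expected_digest):
    """Return True if the file matches the digest recorded at backup time."""
    return sha256_of(path) == expected_digest.strip().lower()
```

A matching digest shows the archive was not corrupted in storage or transfer. It does not prove the snapshot itself was consistent, so a test restore into a scratch environment is still worthwhile.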

## Scaling Guidelines

### Horizontal Scaling

- Use load balancer (nginx/HAProxy)
- Deploy multiple API instances
- Consider session affinity if needed

### Vertical Scaling

- Monitor resource usage
- Adjust CPU/memory limits
- Optimize database queries

### Auto-scaling (Kubernetes)

```yaml
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mediguard-hpa
  namespace: mediguard
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mediguard-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
```
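To get a feel for how this autoscaler reacts, the sketch below applies the scaling rule from the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of observed to target utilization, rounded up, with no change while the ratio stays within a tolerance of 1.0 (0.1 by default). `desired_replicas` is a hypothetical helper; it ignores stabilization windows and pod readiness, so treat it as an approximation.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=2, max_replicas=10, tolerance=0.1):
    """Estimate the replica count the HPA would aim for, for a single metric."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: no scaling
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the minReplicas/maxReplicas bounds from the manifest.
    return max(min_replicas, min(max_replicas, desired))
```

With 3 replicas at 90% average CPU against the 70% target, the estimate is 4 replicas. When several metrics are configured, as in the manifest above, the autoscaler uses the largest of the per-metric results, for example `max(desired_replicas(n, cpu, 70), desired_replicas(n, memory, 80))`.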

## Support

For deployment issues:
- Check logs: `docker compose logs -f`
- Review monitoring dashboards
- Consult the [Troubleshooting](#troubleshooting) section
- Contact support at deploy@mediguard-ai.com