# Deployment Guide

Comprehensive deployment guide for the MCP Orchestration Platform across different environments and platforms.

## Table of Contents

1. [Prerequisites](#prerequisites)
2. [Environment Setup](#environment-setup)
3. [Local Development](#local-development)
4. [Docker Deployment](#docker-deployment)
5. [Kubernetes Deployment](#kubernetes-deployment)
6. [Cloud Platform Deployment](#cloud-platform-deployment)
7. [Production Configuration](#production-configuration)
8. [Monitoring and Logging](#monitoring-and-logging)
9. [Security Configuration](#security-configuration)
10. [Troubleshooting](#troubleshooting)

## Prerequisites

### System Requirements

**Minimum Requirements:**

- CPU: 2 cores
- RAM: 4GB
- Storage: 20GB SSD
- Network: 100 Mbps

**Recommended Production Requirements:**

- CPU: 4+ cores
- RAM: 8GB+
- Storage: 50GB+ NVMe SSD
- Network: 1 Gbps

### Software Dependencies

**Required:**

- Python 3.8+
- pip (Python package manager)
- git (for cloning the repository)

**Optional (depending on deployment):**

- Docker 20.10+
- Docker Compose 2.0+
- kubectl (for Kubernetes)
- Terraform (for infrastructure as code)

### Infrastructure Dependencies

**Database:**

- PostgreSQL 12+ (recommended)
- Redis 6.0+ (for caching)
- Optional: MongoDB (for audit logs)

**Monitoring:**

- Prometheus (metrics collection)
- Grafana (dashboard visualization)
- ELK Stack (log aggregation)

**Security:**

- HashiCorp Vault (enterprise secrets management)
- AWS Secrets Manager (cloud deployment)
- TLS certificates

## Environment Setup

### Development Environment

1. **Clone the repository**

   ```bash
   git clone https://github.com/your-org/mcp-orchestration-platform.git
   cd mcp-orchestration-platform/orchestration_platform
   ```

2. **Create a virtual environment**

   ```bash
   python -m venv venv
   source venv/bin/activate  # Linux/Mac
   # or
   venv\Scripts\activate     # Windows
   ```

3. **Install dependencies**

   ```bash
   pip install -r requirements.txt
   pip install -r requirements-dev.txt  # For development
   ```

4. **Set up environment variables**

   ```bash
   cp .env.example .env
   # Edit .env with your configuration
   ```

5. **Initialize the database**

   ```bash
   python -c "from orchestration_platform.mcp_orchestrator import MCPOrchestrator; import asyncio; asyncio.run(MCPOrchestrator().initialize())"
   ```

### Testing the Setup

```bash
# Run tests
python -m pytest test_orchestrator.py

# Run demo application
python demo.py
```

## Local Development

### Quick Start

1. **Start required services**

   ```bash
   # Start PostgreSQL and Redis
   docker-compose up -d postgres redis

   # Or use local installations
   sudo service postgresql start
   sudo service redis-server start
   ```

2. **Run the orchestrator**

   ```bash
   python demo.py
   ```

3. **Start sample servers (separate terminals)**

   ```bash
   # Terminal 1: Weather server
   python sample_servers/weather_server.py

   # Terminal 2: CRM server
   python sample_servers/crm_server.py
   ```

### Development Configuration

Create a `.env` file:

```bash
# Core Configuration
ORCHESTRATOR_HOST=localhost
ORCHESTRATOR_PORT=7860
LOG_LEVEL=DEBUG

# Database
DATABASE_URL=postgresql://postgres:password@localhost:5432/orchestrator_dev
CACHE_URL=redis://localhost:6379

# Security
JWT_SECRET=your-development-secret-key
ENCRYPTION_KEY=your-development-encryption-key

# Secrets (Development)
SECRETS_BACKEND=local
SECRETS_ENCRYPTION_KEY=dev-encryption-key

# Monitoring
PROMETHEUS_ENABLED=true
METRICS_PORT=9090
```

### Hot Reloading

For development with auto-reload:

```bash
pip install watchdog
watchmedo auto-restart --patterns="*.py" --recursive -- python demo.py
```

## Docker Deployment

### Single Container Deployment

1. **Create the Dockerfile**

   ```dockerfile
   FROM python:3.11-slim

   WORKDIR /app

   # Install system dependencies
   RUN apt-get update && apt-get install -y \
       gcc \
       curl \
       && rm -rf /var/lib/apt/lists/*

   # Copy requirements and install Python dependencies
   COPY requirements.txt .
   RUN pip install --no-cache-dir -r requirements.txt

   # Copy application code
   COPY . .

   # Create non-root user
   RUN useradd -m -u 1000 orchestrator
   USER orchestrator

   # Expose port
   EXPOSE 7860

   # Health check
   HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
       CMD curl -f http://localhost:7860/health/ready || exit 1

   # Run application
   CMD ["python", "demo.py"]
   ```

2. **Build and run**

   ```bash
   docker build -t mcp-orchestrator:latest .
   docker run -p 7860:7860 --env-file .env mcp-orchestrator:latest
   ```

### Docker Compose Deployment

1. **Create docker-compose.yml**

   ```yaml
   version: '3.8'

   services:
     orchestrator:
       build: .
       ports:
         - "7860:7860"
       environment:
         - DATABASE_URL=postgresql://postgres:${POSTGRES_PASSWORD}@postgres:5432/orchestrator
         - CACHE_URL=redis://redis:6379
         - SECRETS_BACKEND=vault
         - VAULT_ADDR=http://vault:8200
       depends_on:
         - postgres
         - redis
         - vault
       volumes:
         - ./logs:/app/logs
         - ./config:/app/config
       restart: unless-stopped
       healthcheck:
         test: ["CMD", "curl", "-f", "http://localhost:7860/health/ready"]
         interval: 30s
         timeout: 10s
         retries: 3

     postgres:
       image: postgres:15-alpine
       environment:
         - POSTGRES_DB=orchestrator
         - POSTGRES_USER=postgres
         - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
       volumes:
         - postgres_data:/var/lib/postgresql/data
         - ./init.sql:/docker-entrypoint-initdb.d/init.sql
       restart: unless-stopped

     redis:
       image: redis:7-alpine
       command: redis-server --appendonly yes
       volumes:
         - redis_data:/data
       restart: unless-stopped

     vault:
       image: hashicorp/vault:latest
       cap_add:
         - IPC_LOCK
       environment:
         - VAULT_DEV_ROOT_TOKEN_ID=dev-root-token
         - VAULT_DEV_LISTEN_ADDRESS=0.0.0.0:8200
       ports:
         - "8200:8200"
       restart: unless-stopped

     prometheus:
       image: prom/prometheus:latest
       ports:
         - "9090:9090"
       volumes:
         - ./prometheus.yml:/etc/prometheus/prometheus.yml
         - prometheus_data:/prometheus
       command:
         - '--config.file=/etc/prometheus/prometheus.yml'
         - '--storage.tsdb.path=/prometheus'
         - '--web.console.libraries=/etc/prometheus/console_libraries'
         - '--web.console.templates=/etc/prometheus/consoles'
         - '--web.enable-lifecycle'
       restart: unless-stopped

     grafana:
       image: grafana/grafana:latest
       ports:
         - "3000:3000"
       environment:
         - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
       volumes:
         - grafana_data:/var/lib/grafana
         - ./grafana/dashboards:/etc/grafana/provisioning/dashboards
         - ./grafana/datasources:/etc/grafana/provisioning/datasources
       restart: unless-stopped

   volumes:
     postgres_data:
     redis_data:
     prometheus_data:
     grafana_data:

   networks:
     default:
       driver: bridge
   ```

2. **Create environment file**

   ```bash
   # .env
   POSTGRES_PASSWORD=secure-password-here
   GRAFANA_PASSWORD=admin-password-here
   VAULT_TOKEN=dev-root-token
   ```

3. **Deploy with Docker Compose**

   ```bash
   docker-compose up -d
   ```

4. **Verify deployment**

   ```bash
   docker-compose ps
   curl http://localhost:7860/health/ready
   curl http://localhost:3000  # Grafana
   ```

### Production Docker Configuration

1. **Use a multi-stage build for optimization**

   ```dockerfile
   # Build stage
   FROM python:3.11-slim as builder

   WORKDIR /app
   COPY requirements.txt .
   RUN pip install --user --no-cache-dir -r requirements.txt

   # Runtime stage
   FROM python:3.11-slim

   WORKDIR /app
   RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/* \
       && useradd -m -u 1000 orchestrator

   # Copy installed packages into the runtime user's home; pip placed them
   # under /root/.local in the build stage, which is unreadable after USER 1000
   COPY --from=builder --chown=1000:1000 /root/.local /home/orchestrator/.local
   COPY --chown=1000:1000 . .
   ENV PATH=/home/orchestrator/.local/bin:$PATH

   USER 1000
   EXPOSE 7860

   HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
       CMD curl -f http://localhost:7860/health/ready || exit 1

   CMD ["python", "demo.py"]
   ```

2. **Security optimizations**

   ```dockerfile
   # Run as non-root user
   USER 1000

   # Remove unnecessary packages
   RUN apt-get clean && rm -rf /var/lib/apt/lists/*

   # Use read-only filesystem where possible
   VOLUME ["/app/logs", "/app/config"]
   ```

## Kubernetes Deployment

### Basic Deployment

1. **Create a namespace**

   ```yaml
   # namespace.yaml
   apiVersion: v1
   kind: Namespace
   metadata:
     name: mcp-orchestrator
   ```
2. **Create a ConfigMap**

   ```yaml
   # configmap.yaml
   apiVersion: v1
   kind: ConfigMap
   metadata:
     name: orchestrator-config
     namespace: mcp-orchestrator
   data:
     ORCHESTRATOR_HOST: "0.0.0.0"
     ORCHESTRATOR_PORT: "7860"
     LOG_LEVEL: "INFO"
     PROMETHEUS_ENABLED: "true"
     METRICS_PORT: "9090"
   ```

3. **Create a Secret**

   ```yaml
   # secret.yaml
   apiVersion: v1
   kind: Secret
   metadata:
     name: orchestrator-secrets
     namespace: mcp-orchestrator
   type: Opaque
   data:
     DATABASE_URL: cG9zdGdyZXNxbDovL3VzZXI6cGFzc3dvcmRAcG9zdGdyZXM6NTQzMi9vcmNoZXN0cmF0b3I=  # base64 encoded
     JWT_SECRET: eW91ci1qd3Qtc2VjcmV0LWtleQ==  # base64 encoded
     ENCRYPTION_KEY: eW91ci1lbmNyeXB0aW9uLWtleQ==  # base64 encoded
   ```

4. **Create a Deployment**

   ```yaml
   # deployment.yaml
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: orchestrator
     namespace: mcp-orchestrator
     labels:
       app: mcp-orchestrator
   spec:
     replicas: 3
     strategy:
       type: RollingUpdate
       rollingUpdate:
         maxSurge: 1
         maxUnavailable: 0
     selector:
       matchLabels:
         app: mcp-orchestrator
     template:
       metadata:
         labels:
           app: mcp-orchestrator
       spec:
         containers:
           - name: orchestrator
             image: mcp-orchestrator:latest
             imagePullPolicy: Always
             ports:
               - containerPort: 7860
                 name: http
               - containerPort: 9090
                 name: metrics
             env:
               - name: DATABASE_URL
                 valueFrom:
                   secretKeyRef:
                     name: orchestrator-secrets
                     key: DATABASE_URL
               - name: JWT_SECRET
                 valueFrom:
                   secretKeyRef:
                     name: orchestrator-secrets
                     key: JWT_SECRET
               - name: ENCRYPTION_KEY
                 valueFrom:
                   secretKeyRef:
                     name: orchestrator-secrets
                     key: ENCRYPTION_KEY
             envFrom:
               - configMapRef:
                   name: orchestrator-config
             resources:
               requests:
                 memory: "512Mi"
                 cpu: "250m"
               limits:
                 memory: "1Gi"
                 cpu: "500m"
             livenessProbe:
               httpGet:
                 path: /health/live
                 port: 7860
               initialDelaySeconds: 30
               periodSeconds: 10
             readinessProbe:
               httpGet:
                 path: /health/ready
                 port: 7860
               initialDelaySeconds: 5
               periodSeconds: 5
             volumeMounts:
               - name: config-volume
                 mountPath: /app/config
               - name: logs-volume
                 mountPath: /app/logs
         volumes:
           - name: config-volume
             configMap:
               name: orchestrator-config
           - name: logs-volume
             emptyDir: {}
         securityContext:
           runAsNonRoot: true
           runAsUser: 1000
           fsGroup: 1000
   ```

5. **Create a Service**

   ```yaml
   # service.yaml
   apiVersion: v1
   kind: Service
   metadata:
     name: orchestrator-service
     namespace: mcp-orchestrator
     labels:
       app: mcp-orchestrator
   spec:
     type: ClusterIP
     ports:
       - port: 80
         targetPort: 7860
         protocol: TCP
         name: http
       - port: 9090
         targetPort: 9090
         protocol: TCP
         name: metrics
     selector:
       app: mcp-orchestrator
   ```

6. **Create an Ingress**

   ```yaml
   # ingress.yaml
   apiVersion: networking.k8s.io/v1
   kind: Ingress
   metadata:
     name: orchestrator-ingress
     namespace: mcp-orchestrator
     annotations:
       kubernetes.io/ingress.class: nginx
       cert-manager.io/cluster-issuer: letsencrypt-prod
       nginx.ingress.kubernetes.io/ssl-redirect: "true"
       nginx.ingress.kubernetes.io/proxy-body-size: "10m"
   spec:
     tls:
       - hosts:
           - orchestrator.yourdomain.com
         secretName: orchestrator-tls
     rules:
       - host: orchestrator.yourdomain.com
         http:
           paths:
             - path: /
               pathType: Prefix
               backend:
                 service:
                   name: orchestrator-service
                   port:
                     number: 80
   ```

### Deploy to Kubernetes

```bash
# Apply all resources
kubectl apply -f namespace.yaml
kubectl apply -f configmap.yaml
kubectl apply -f secret.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f ingress.yaml

# Verify deployment
kubectl get pods -n mcp-orchestrator
kubectl get services -n mcp-orchestrator
kubectl get ingress -n mcp-orchestrator
```

### Helm Chart Deployment

1. **Create the Helm chart structure**

   ```bash
   helm create mcp-orchestrator
   ```
2. **Configure values.yaml**

   ```yaml
   # values.yaml
   replicaCount: 3

   image:
     repository: mcp-orchestrator
     tag: latest
     pullPolicy: Always

   service:
     type: ClusterIP
     port: 80
     targetPort: 7860

   ingress:
     enabled: true
     className: nginx
     annotations:
       kubernetes.io/ingress.class: nginx
       cert-manager.io/cluster-issuer: letsencrypt-prod
     hosts:
       - host: orchestrator.yourdomain.com
         paths:
           - path: /
             pathType: Prefix
     tls:
       - secretName: orchestrator-tls
         hosts:
           - orchestrator.yourdomain.com

   resources:
     limits:
       cpu: 500m
       memory: 1Gi
     requests:
       cpu: 250m
       memory: 512Mi

   autoscaling:
     enabled: true
     minReplicas: 3
     maxReplicas: 10
     targetCPUUtilizationPercentage: 70

   nodeSelector: {}
   tolerations: []
   affinity: {}

   config:
     ORCHESTRATOR_HOST: "0.0.0.0"
     ORCHESTRATOR_PORT: "7860"
     LOG_LEVEL: "INFO"
     PROMETHEUS_ENABLED: "true"
     METRICS_PORT: "9090"
   ```

3. **Deploy with Helm**

   ```bash
   # Install
   helm install orchestrator ./mcp-orchestrator -n mcp-orchestrator

   # Upgrade
   helm upgrade orchestrator ./mcp-orchestrator -n mcp-orchestrator

   # Uninstall
   helm uninstall orchestrator -n mcp-orchestrator
   ```

## Cloud Platform Deployment

### AWS Deployment

#### ECS with Fargate
1. **Create the task definition**

   ```json
   {
     "family": "mcp-orchestrator",
     "networkMode": "awsvpc",
     "requiresCompatibilities": ["FARGATE"],
     "cpu": "512",
     "memory": "1024",
     "executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole",
     "taskRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskRole",
     "containerDefinitions": [
       {
         "name": "orchestrator",
         "image": "ACCOUNT.dkr.ecr.REGION.amazonaws.com/mcp-orchestrator:latest",
         "portMappings": [
           {
             "containerPort": 7860,
             "protocol": "tcp"
           }
         ],
         "environment": [
           { "name": "ORCHESTRATOR_HOST", "value": "0.0.0.0" },
           { "name": "ORCHESTRATOR_PORT", "value": "7860" }
         ],
         "secrets": [
           {
             "name": "DATABASE_URL",
             "valueFrom": "arn:aws:ssm:REGION:ACCOUNT:parameter/orchestrator/database-url"
           },
           {
             "name": "JWT_SECRET",
             "valueFrom": "arn:aws:ssm:REGION:ACCOUNT:parameter/orchestrator/jwt-secret"
           }
         ],
         "logConfiguration": {
           "logDriver": "awslogs",
           "options": {
             "awslogs-group": "/ecs/mcp-orchestrator",
             "awslogs-region": "REGION",
             "awslogs-stream-prefix": "ecs"
           }
         }
       }
     ]
   }
   ```
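Fargate only accepts specific CPU/memory pairings, and a task definition with an invalid pair is rejected at registration time. A small, hypothetical pre-flight check is sketched below; the combination table is a partial reproduction from memory of AWS's published Fargate sizes, so verify it against current AWS documentation before relying on it:

```python
# fargate_check.py — sanity-check task size before registering (sketch)
FARGATE_SIZES = {  # CPU units -> allowed memory values (MiB); partial table
    256: [512, 1024, 2048],
    512: [1024, 2048, 3072, 4096],
    1024: [2048, 3072, 4096, 5120, 6144, 7168, 8192],
}

def is_valid_fargate_size(cpu: int, memory: int) -> bool:
    """Return True if the cpu/memory pair is a supported Fargate combination."""
    return memory in FARGATE_SIZES.get(cpu, [])

# The task definition above requests cpu=512, memory=1024
assert is_valid_fargate_size(512, 1024)
assert not is_valid_fargate_size(512, 512)  # too little memory for 512 CPU units
```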
2. **Deploy with CloudFormation**

   ```yaml
   # cloudformation-template.yaml
   AWSTemplateFormatVersion: '2010-09-09'
   Description: 'MCP Orchestrator Platform'

   Parameters:
     DatabasePassword:
       Type: String
       NoEcho: true
       Description: 'Database password'

   Resources:
     # ECR Repository
     ECRRepository:
       Type: AWS::ECR::Repository
       Properties:
         RepositoryName: mcp-orchestrator

     # ECS Cluster
     ECSCluster:
       Type: AWS::ECS::Cluster
       Properties:
         ClusterName: mcp-orchestrator-cluster

     # Task Definition
     TaskDefinition:
       Type: AWS::ECS::TaskDefinition
       Properties:
         Family: mcp-orchestrator
         NetworkMode: awsvpc
         RequiresCompatibilities:
           - FARGATE
         Cpu: 512
         Memory: 1024
         ExecutionRoleArn: !Ref ECSExecutionRole
         TaskRoleArn: !Ref ECSTaskRole
         ContainerDefinitions:
           - Name: orchestrator
             Image: !Sub '${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/mcp-orchestrator:latest'
             PortMappings:
               - ContainerPort: 7860
             Environment:
               - Name: ORCHESTRATOR_HOST
                 Value: 0.0.0.0
               - Name: ORCHESTRATOR_PORT
                 Value: '7860'
             Secrets:
               - Name: DATABASE_URL
                 ValueFrom: !Ref DatabaseSecret
             LogConfiguration:
               LogDriver: awslogs
               Options:
                 awslogs-group: !Ref CloudWatchLogsGroup
                 awslogs-region: !Ref AWS::Region
                 awslogs-stream-prefix: ecs

     # Service
     ECSService:
       Type: AWS::ECS::Service
       Properties:
         ServiceName: mcp-orchestrator-service
         Cluster: !Ref ECSCluster
         TaskDefinition: !Ref TaskDefinition
         DesiredCount: 2
         LaunchType: FARGATE
         NetworkConfiguration:
           AwsvpcConfiguration:
             AssignPublicIp: ENABLED
             SecurityGroups:
               - !Ref ECSSecurityGroup
             Subnets:
               - !Ref PublicSubnet1
               - !Ref PublicSubnet2
         LoadBalancers:
           - ContainerName: orchestrator
             ContainerPort: 7860
             TargetGroupArn: !Ref TargetGroup

     # Load Balancer
     ApplicationLoadBalancer:
       Type: AWS::ElasticLoadBalancingV2::LoadBalancer
       Properties:
         Name: mcp-orchestrator-alb
         Scheme: internet-facing
         Type: application
         SecurityGroups:
           - !Ref ALBSecurityGroup
         Subnets:
           - !Ref PublicSubnet1
           - !Ref PublicSubnet2

     # Target Group
     TargetGroup:
       Type: AWS::ElasticLoadBalancingV2::TargetGroup
       Properties:
         Name: mcp-orchestrator-tg
         Port: 7860
         Protocol: HTTP
         VpcId: !Ref VPC
         TargetGroupAttributes:
           - Key: deregistration_delay.timeout_seconds
             Value: 30

     # Listener
     Listener:
       Type: AWS::ElasticLoadBalancingV2::Listener
       Properties:
         DefaultActions:
           - Type: forward
             TargetGroupArn: !Ref TargetGroup
         LoadBalancerArn: !Ref ApplicationLoadBalancer
         Port: 80
         Protocol: HTTP

   # NOTE: The IAM roles, security groups, subnets, VPC, database secret, and
   # log group referenced above are not defined in this excerpt and must be
   # added or imported before the template will deploy.

   Outputs:
     ServiceURL:
       Value: !GetAtt ApplicationLoadBalancer.DNSName
       Description: URL for the MCP Orchestrator service
   ```

3. **Deploy**

   ```bash
   # Build and push the image
   aws ecr get-login-password --region REGION | docker login --username AWS --password-stdin ACCOUNT.dkr.ecr.REGION.amazonaws.com
   docker build -t mcp-orchestrator .
   docker tag mcp-orchestrator:latest ACCOUNT.dkr.ecr.REGION.amazonaws.com/mcp-orchestrator:latest
   docker push ACCOUNT.dkr.ecr.REGION.amazonaws.com/mcp-orchestrator:latest

   # Deploy with CloudFormation
   aws cloudformation deploy \
     --template-file cloudformation-template.yaml \
     --stack-name mcp-orchestrator \
     --parameter-overrides DatabasePassword=your-secure-password \
     --capabilities CAPABILITY_IAM
   ```

#### AWS EKS Deployment

```yaml
# eks-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-orchestrator
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mcp-orchestrator
  template:
    metadata:
      labels:
        app: mcp-orchestrator
    spec:
      containers:
        - name: orchestrator
          image: ACCOUNT.dkr.ecr.REGION.amazonaws.com/mcp-orchestrator:latest
          ports:
            - containerPort: 7860
          env:
            - name: ORCHESTRATOR_HOST
              value: "0.0.0.0"
            - name: ORCHESTRATOR_PORT
              value: "7860"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: orchestrator-secrets
                  key: database-url
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-orchestrator-service
spec:
  selector:
    app: mcp-orchestrator
  ports:
    - port: 80
      targetPort: 7860
  type: LoadBalancer
```

### Azure Container Instances
1. **Create a resource group**

   ```bash
   az group create --name mcp-orchestrator-rg --location eastus
   ```

2. **Deploy the container**

   ```bash
   az container create \
     --resource-group mcp-orchestrator-rg \
     --name mcp-orchestrator \
     --image mcp-orchestrator:latest \
     --cpu 2 \
     --memory 4 \
     --ports 7860 \
     --environment-variables \
       ORCHESTRATOR_HOST=0.0.0.0 \
       ORCHESTRATOR_PORT=7860 \
       LOG_LEVEL=INFO \
     --secure-environment-variables \
       DATABASE_URL=postgresql://user:pass@server:5432/db \
       JWT_SECRET=your-jwt-secret \
     --restart-policy Always
   ```

3. **Create Azure Database for PostgreSQL**

   ```bash
   az postgres server create \
     --resource-group mcp-orchestrator-rg \
     --name mcp-orchestrator-db \
     --location eastus \
     --admin-user orchestrator \
     --admin-password secure-password \
     --sku-name B_Gen5_1
   ```

### Google Cloud Run Deployment

1. **Build and push the image**

   ```bash
   gcloud builds submit --tag gcr.io/PROJECT-ID/mcp-orchestrator
   ```

2. **Deploy to Cloud Run**

   ```bash
   gcloud run deploy mcp-orchestrator \
     --image gcr.io/PROJECT-ID/mcp-orchestrator \
     --platform managed \
     --region us-central1 \
     --allow-unauthenticated \
     --port 7860 \
     --memory 1Gi \
     --cpu 2 \
     --set-env-vars ORCHESTRATOR_HOST=0.0.0.0,ORCHESTRATOR_PORT=7860,LOG_LEVEL=INFO \
     --set-secrets DATABASE_URL=mcp-orchestrator-db-url:latest \
     --set-secrets JWT_SECRET=mcp-orchestrator-jwt-secret:latest
   ```

## Production Configuration

### Environment Variables

```bash
# Core Application
ORCHESTRATOR_HOST=0.0.0.0
ORCHESTRATOR_PORT=7860
LOG_LEVEL=INFO
DEBUG=false

# Database Configuration
DATABASE_URL=postgresql://user:password@host:5432/database
DATABASE_POOL_SIZE=20
DATABASE_MAX_OVERFLOW=30
DATABASE_POOL_TIMEOUT=30

# Cache Configuration
CACHE_URL=redis://redis:6379/0
CACHE_POOL_SIZE=20
CACHE_TTL=3600

# Security
JWT_SECRET=your-super-secure-jwt-secret-key
ENCRYPTION_KEY=your-32-byte-encryption-key
SECRET_KEY_ROTATION_DAYS=90
SESSION_TTL=3600
MAX_SESSIONS=10000

# Secrets Management
SECRETS_BACKEND=vault  # local, vault, aws, environment
VAULT_ADDR=http://vault:8200
VAULT_TOKEN=your-vault-token
AWS_REGION=us-east-1

# Rate Limiting
RATE_LIMIT_REQUESTS=1000
RATE_LIMIT_WINDOW=3600
RATE_LIMIT_STORAGE=redis

# Monitoring
PROMETHEUS_ENABLED=true
METRICS_PORT=9090
HEALTH_CHECK_INTERVAL=30
METRICS_RETENTION_DAYS=30

# Performance
MAX_CONNECTIONS=200
CONNECTION_TIMEOUT=30
REQUEST_TIMEOUT=60
MAX_RETRIES=3
CIRCUIT_BREAKER_FAILURE_THRESHOLD=5
CIRCUIT_BREAKER_RECOVERY_TIMEOUT=60

# SSL/TLS
SSL_ENABLED=true
SSL_CERT_PATH=/app/certs/orchestrator.crt
SSL_KEY_PATH=/app/certs/orchestrator.key
SSL_VERIFY=true

# CORS
CORS_ORIGINS=https://yourdomain.com,https://app.yourdomain.com
CORS_METHODS=GET,POST,PUT,DELETE,OPTIONS
CORS_HEADERS=Content-Type,Authorization,X-Requested-With

# Feature Flags
FEATURE_REAL_TIME_UPDATES=true
FEATURE_ADVANCED_ANALYTICS=true
FEATURE_PLUGIN_SYSTEM=true
```

### Database Configuration

#### PostgreSQL Optimization

```ini
# postgresql.conf
shared_buffers = 256MB
effective_cache_size = 1GB
maintenance_work_mem = 64MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
```

#### Redis Configuration

```bash
# redis.conf
maxmemory 512mb
maxmemory-policy allkeys-lru
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
```

### Nginx Reverse Proxy

```nginx
# /etc/nginx/sites-available/mcp-orchestrator
upstream orchestrator_backend {
    server orchestrator1:7860 weight=3 max_fails=3 fail_timeout=30s;
    server orchestrator2:7860 weight=3 max_fails=3 fail_timeout=30s;
    server orchestrator3:7860 weight=3 max_fails=3 fail_timeout=30s backup;
}

# Metrics are served on a separate port, so they need their own upstream;
# appending a port to an upstream name (orchestrator_backend:9090) is invalid.
upstream orchestrator_metrics {
    server orchestrator1:9090;
    server orchestrator2:9090;
    server orchestrator3:9090;
}

server {
    listen 80;
    server_name orchestrator.yourdomain.com;
    return 301 https://$server_name$request_uri;
}

server {
    listen 443 ssl http2;
    server_name orchestrator.yourdomain.com;

    ssl_certificate /etc/letsencrypt/live/orchestrator.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/orchestrator.yourdomain.com/privkey.pem;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers off;

    client_max_body_size 50M;
    client_body_timeout 60s;
    client_header_timeout 60s;

    location / {
        proxy_pass http://orchestrator_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_cache_bypass $http_upgrade;
        proxy_read_timeout 300s;
        proxy_connect_timeout 75s;
    }

    location /metrics {
        proxy_pass http://orchestrator_metrics/metrics;
        allow 127.0.0.1;
        allow 10.0.0.0/8;
        allow 172.16.0.0/12;
        allow 192.168.0.0/16;
        deny all;
    }

    location /health {
        proxy_pass http://orchestrator_backend/health;
        access_log off;
    }
}
```

## Monitoring and Logging

### Prometheus Configuration

```yaml
# prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

rule_files:
  - "orchestrator_alerts.yml"

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - alertmanager:9093

scrape_configs:
  - job_name: 'mcp-orchestrator'
    static_configs:
      - targets: ['orchestrator:9090']
    metrics_path: /metrics
    scrape_interval: 10s
    scrape_timeout: 5s

  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
```

### Grafana Dashboards
1. **Orchestrator Overview Dashboard**

   ```json
   {
     "dashboard": {
       "title": "MCP Orchestrator Overview",
       "panels": [
         {
           "title": "Request Rate",
           "type": "graph",
           "targets": [
             {
               "expr": "rate(orchestrator_requests_total[5m])",
               "legendFormat": "{{method}} {{status}}"
             }
           ]
         },
         {
           "title": "Response Time",
           "type": "graph",
           "targets": [
             {
               "expr": "histogram_quantile(0.95, rate(orchestrator_request_duration_seconds_bucket[5m]))",
               "legendFormat": "95th percentile"
             },
             {
               "expr": "histogram_quantile(0.50, rate(orchestrator_request_duration_seconds_bucket[5m]))",
               "legendFormat": "50th percentile"
             }
           ]
         },
         {
           "title": "Active Connections",
           "type": "singlestat",
           "targets": [
             {
               "expr": "orchestrator_active_connections"
             }
           ]
         }
       ]
     }
   }
   ```

### Structured Logging

```python
import structlog

# Configure structured logging
structlog.configure(
    processors=[
        structlog.stdlib.filter_by_level,
        structlog.stdlib.add_logger_name,
        structlog.stdlib.add_log_level,
        structlog.stdlib.PositionalArgumentsFormatter(),
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.StackInfoRenderer(),
        structlog.processors.format_exc_info,
        structlog.processors.UnicodeDecoder(),
        structlog.processors.JSONRenderer()
    ],
    context_class=dict,
    logger_factory=structlog.stdlib.LoggerFactory(),
    wrapper_class=structlog.stdlib.BoundLogger,
    cache_logger_on_first_use=True,
)
```

## Security Configuration

### TLS/SSL Setup

1. **Generate self-signed certificates (development)**

   ```bash
   openssl req -x509 -newkey rsa:4096 -keyout orchestrator.key -out orchestrator.crt -days 365 -nodes
   ```
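If the orchestrator terminates TLS itself (the `SSL_CERT_PATH`/`SSL_KEY_PATH` settings in Production Configuration), the certificate pair produced above can be loaded into a standard-library SSL context. A minimal sketch — the helper name is illustrative, and the `load_cert_chain` call is commented out so the snippet runs without certificate files present:

```python
import ssl

def build_ssl_context(cert_path: str, key_path: str) -> ssl.SSLContext:
    """Build a server-side TLS context enforcing TLS 1.2+ (sketch)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)  # enable with real files
    return ctx

ctx = build_ssl_context("/app/certs/orchestrator.crt", "/app/certs/orchestrator.key")
assert ctx.minimum_version == ssl.TLSVersion.TLSv1_2
```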
2. **Let's Encrypt certificates (production)**

   ```bash
   certbot certonly --standalone -d orchestrator.yourdomain.com
   ```

### Security Headers

```python
# security_headers.py
from fastapi import FastAPI
from starlette.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://yourdomain.com"],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["Authorization", "Content-Type"],
)

# Add security headers
@app.middleware("http")
async def add_security_headers(request, call_next):
    response = await call_next(request)
    response.headers["X-Content-Type-Options"] = "nosniff"
    response.headers["X-Frame-Options"] = "DENY"
    response.headers["X-XSS-Protection"] = "1; mode=block"
    response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
    response.headers["Referrer-Policy"] = "strict-origin-when-cross-origin"
    return response
```

### Authentication

```python
# auth.py
import jwt  # PyJWT
from datetime import datetime, timedelta, timezone

SECRET_KEY = "change-me"  # load from the JWT_SECRET environment variable in production
ALGORITHM = "HS256"

def create_access_token(data: dict, expires_delta: timedelta = None):
    to_encode = data.copy()
    if expires_delta:
        expire = datetime.now(timezone.utc) + expires_delta
    else:
        expire = datetime.now(timezone.utc) + timedelta(minutes=15)
    to_encode.update({"exp": expire})
    return jwt.encode(to_encode, SECRET_KEY, algorithm=ALGORITHM)

def verify_token(token: str):
    try:
        return jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
    except jwt.PyJWTError:
        return None
```

## Troubleshooting

### Common Deployment Issues

#### 1. Pod CrashLoopBackOff

```bash
# Check pod logs
kubectl logs -f pod-name -n mcp-orchestrator

# Check events
kubectl get events -n mcp-orchestrator --sort-by='.lastTimestamp'

# Debug pod
kubectl debug -it pod-name -n mcp-orchestrator --image=busybox
```
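A frequent cause of CrashLoopBackOff is the orchestrator starting before its database accepts connections. A hedged sketch of a startup retry loop with exponential backoff — the `connect` callable stands in for whatever the application actually uses (e.g. `asyncpg.connect`), and the flaky connection below is a stand-in for demonstration:

```python
import asyncio

async def connect_with_retry(connect, attempts: int = 5, base_delay: float = 0.5):
    """Retry an async connect callable with exponential backoff (sketch)."""
    for attempt in range(attempts):
        try:
            return await connect()
        except ConnectionError as exc:
            if attempt == attempts - 1:
                raise  # give up; Kubernetes will restart the pod
            delay = base_delay * (2 ** attempt)
            print(f"connect failed ({exc}); retrying in {delay:.1f}s")
            await asyncio.sleep(delay)

# Demo: a fake connection that fails twice, then succeeds
state = {"calls": 0}

async def flaky_connect():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("database not ready")
    return "connection"

result = asyncio.run(connect_with_retry(flaky_connect))
assert result == "connection" and state["calls"] == 3
```

With this pattern the pod survives a slow database instead of crashing immediately, while still failing (and being restarted) if the dependency never comes up.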
#### 2. Database Connection Issues

```bash
# Test database connectivity
kubectl exec -it pod-name -n mcp-orchestrator -- python -c "
import asyncpg
import asyncio

async def test():
    try:
        conn = await asyncpg.connect('postgresql://user:pass@host:5432/db')
        await conn.execute('SELECT 1')
        print('Database connection successful')
        await conn.close()
    except Exception as e:
        print(f'Database connection failed: {e}')

asyncio.run(test())
"
```

#### 3. Memory Issues

```bash
# Check resource usage
kubectl top pods -n mcp-orchestrator

# Check node resources
kubectl top nodes

# Increase memory limits
kubectl patch deployment orchestrator -n mcp-orchestrator -p '{"spec":{"template":{"spec":{"containers":[{"name":"orchestrator","resources":{"limits":{"memory":"2Gi"}}}]}}}}'
```

### Performance Tuning

#### 1. Connection Pool Optimization

```bash
# Tune connection pool settings
DATABASE_POOL_SIZE=20       # Increase for high load
DATABASE_MAX_OVERFLOW=30    # Allow overflow connections
DATABASE_POOL_TIMEOUT=30    # Timeout for acquiring a connection
```

#### 2. Cache Optimization

```bash
# Redis-backed cache settings
CACHE_TTL=3600           # Adjust based on use case
CACHE_COMPRESSION=true   # Enable for large responses
```
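The effect of `CACHE_TTL` can be illustrated with a minimal in-process TTL cache — a sketch of the expiry semantics only (the platform itself stores cached entries in Redis, which handles TTLs natively):

```python
import time

class TTLCache:
    """Minimal TTL cache illustrating CACHE_TTL semantics (sketch)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazy expiry, like a Redis key whose TTL elapsed
            return default
        return value

cache = TTLCache(ttl_seconds=0.1)
cache.set("greeting", "hello")
assert cache.get("greeting") == "hello"
time.sleep(0.15)
assert cache.get("greeting") is None  # expired after the TTL
```

A short TTL keeps data fresh at the cost of more backend hits; a long TTL does the opposite, which is the trade-off behind tuning `CACHE_TTL` per use case.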
#### 3. Horizontal Pod Autoscaling

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: orchestrator-hpa
  namespace: mcp-orchestrator
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: orchestrator
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

### Health Checks

#### Application Health Check

```python
# health_check.py
from fastapi import FastAPI
from fastapi.responses import Response
from prometheus_client import generate_latest, CONTENT_TYPE_LATEST

app = FastAPI()

@app.get("/health/live")
async def liveness_check():
    return {"status": "alive"}

@app.get("/health/ready")
async def readiness_check():
    # Check database connectivity
    # Check cache connectivity
    # Check external services
    return {"status": "ready"}

@app.get("/health/detailed")
async def detailed_health():
    # check_database / check_cache / check_external_services are
    # application-defined async helpers
    return {
        "status": "healthy",
        "checks": {
            "database": await check_database(),
            "cache": await check_cache(),
            "external_services": await check_external_services(),
        },
    }

@app.get("/metrics")
async def metrics():
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)
```

This completes the comprehensive deployment guide. The platform can now be deployed across various environments with proper configuration, monitoring, and security measures in place.