# AegisLM Container Security Hardening Guide

## Overview

This document defines the container security hardening requirements for AegisLM. All containers must meet these security standards before deployment to production.

## Container Security Checklist

### ✅ Required Security Controls
| Control | Description | Priority | Status |
|---------|-------------|----------|--------|
| Non-root user | Containers run as non-root | Required | ✅ Implemented |
| Read-only root filesystem | Root filesystem is read-only | Required | 🔄 To Implement |
| Drop capabilities | Drop all unnecessary Linux capabilities | Required | 🔄 To Implement |
| Seccomp profile | Enable seccomp restriction | Required | 🔄 To Implement |
| No privileged mode | Never run containers in privileged mode | Required | ✅ Implemented |
| Resource limits | Set CPU/memory limits | Required | ✅ Implemented |
| Health checks | Liveness and readiness probes | Required | ✅ Implemented |
| Minimal base image | Use minimal base images | Required | 🔄 To Implement |

---
## Kubernetes Pod Security

### Pod Security Standards

All AegisLM pods must use the following security context:

```yaml
# deployment/k8s/api-deployment.yaml (updated)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aegislm-api
  namespace: aegislm
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: api
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
            runAsNonRoot: true
            runAsUser: 1000
```
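The requirements in this security context can be checked mechanically before a manifest is applied. The following is a minimal sketch, not part of the AegisLM codebase: the function name `security_context_violations` and the constant `REQUIRED_UID` are hypothetical, and the input is assumed to be the already-parsed `spec.template.spec` mapping of a Deployment.

```python
from typing import Any

REQUIRED_UID = 1000  # same UID the manifest above sets in runAsUser


def security_context_violations(pod_spec: dict[str, Any]) -> list[str]:
    """Return readable violations for one pod spec (spec.template.spec)."""
    problems: list[str] = []

    # Pod-level settings apply to every container unless overridden.
    pod_ctx = pod_spec.get("securityContext") or {}
    if pod_ctx.get("runAsNonRoot") is not True:
        problems.append("pod: runAsNonRoot must be true")
    if pod_ctx.get("runAsUser") != REQUIRED_UID:
        problems.append(f"pod: runAsUser must be {REQUIRED_UID}")
    if (pod_ctx.get("seccompProfile") or {}).get("type") != "RuntimeDefault":
        problems.append("pod: seccompProfile.type must be RuntimeDefault")

    # Container-level settings must be present on each container.
    for container in pod_spec.get("containers") or []:
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext") or {}
        if ctx.get("privileged") is True:
            problems.append(f"{name}: privileged mode is not allowed")
        if ctx.get("allowPrivilegeEscalation") is not False:
            problems.append(f"{name}: allowPrivilegeEscalation must be false")
        if ctx.get("readOnlyRootFilesystem") is not True:
            problems.append(f"{name}: readOnlyRootFilesystem must be true")
        dropped = (ctx.get("capabilities") or {}).get("drop") or []
        if "ALL" not in dropped:
            problems.append(f"{name}: capabilities.drop must include ALL")
    return problems


# A spec equivalent to the manifest above, and a deliberately weak one.
hardened = {
    "securityContext": {
        "runAsNonRoot": True,
        "runAsUser": 1000,
        "seccompProfile": {"type": "RuntimeDefault"},
    },
    "containers": [
        {
            "name": "api",
            "securityContext": {
                "allowPrivilegeEscalation": False,
                "readOnlyRootFilesystem": True,
                "capabilities": {"drop": ["ALL"]},
            },
        }
    ],
}
weak = {"containers": [{"name": "api", "securityContext": {"privileged": True}}]}

print(security_context_violations(hardened))
for problem in security_context_violations(weak):
    print(problem)
```

Returning a list of messages rather than raising on the first failure lets a CI job report every missing control in one run.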
### Network Policies

Create `deployment/k8s/network-policy.yaml`:

```yaml
---
# AegisLM Network Policies
# Kubernetes NetworkPolicy definitions for micro-segmentation
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-network-policy
  namespace: aegislm
spec:
  podSelector:
    matchLabels:
      app: aegislm
      component: api
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow ingress from ingress controller
    # (the ingress namespace must carry the label name=ingress-nginx)
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8000
    # Allow ingress from dashboard
    - from:
        - podSelector:
            matchLabels:
              app: aegislm
              component: dashboard
      ports:
        - protocol: TCP
          port: 8000
  egress:
    # Allow egress to PostgreSQL
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    # Allow egress to Redis
    - to:
        - podSelector:
            matchLabels:
              app: redis
      ports:
        - protocol: TCP
          port: 6379
    # Allow egress to worker pods
    - to:
        - podSelector:
            matchLabels:
              app: aegislm
              component: worker
      ports:
        - protocol: TCP
          port: 8001
    # Allow DNS (TCP is needed for responses that do not fit in a UDP packet)
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # Allow egress to model service (internal only)
    # Note: model-service-network-policy below admits only worker pods, so this
    # rule takes effect only if that policy is extended to admit the api component.
    - to:
        - podSelector:
            matchLabels:
              app: aegislm
              component: model-service
      ports:
        - protocol: TCP
          port: 8002
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: worker-network-policy
  namespace: aegislm
spec:
  podSelector:
    matchLabels:
      app: aegislm
      component: worker
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow ingress from API
    - from:
        - podSelector:
            matchLabels:
              app: aegislm
              component: api
      ports:
        - protocol: TCP
          port: 8001
  egress:
    # Allow egress to model service
    - to:
        - podSelector:
            matchLabels:
              app: aegislm
              component: model-service
      ports:
        - protocol: TCP
          port: 8002
    # Allow egress to object storage
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              app: minio
      ports:
        - protocol: TCP
          port: 9000
    # Allow DNS
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: model-service-network-policy
  namespace: aegislm
spec:
  podSelector:
    matchLabels:
      app: aegislm
      component: model-service
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Only allow ingress from worker pods (internal only)
    - from:
        - podSelector:
            matchLabels:
              app: aegislm
              component: worker
      ports:
        - protocol: TCP
          port: 8002
  # No egress - model service is internal only (this also blocks DNS lookups)
  egress: []
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dashboard-network-policy
  namespace: aegislm
spec:
  podSelector:
    matchLabels:
      app: aegislm
      component: dashboard
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow ingress from ingress controller
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-nginx
      ports:
        - protocol: TCP
          port: 3000
  egress:
    # Allow egress to API
    - to:
        - podSelector:
            matchLabels:
              app: aegislm
              component: api
      ports:
        - protocol: TCP
          port: 8000
    # Allow DNS
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```
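A connection between two pods succeeds only when the source pod's egress rules and the destination pod's ingress rules both permit it, which is easy to get wrong when policies are edited separately. The sketch below illustrates that evaluation under simplifying assumptions: it only understands same-namespace `podSelector.matchLabels` peers, ignores `namespaceSelector`, `ipBlock`, and protocols, and uses two abbreviated policies rather than the full file. All function names are hypothetical; this is not how Kubernetes or any CNI plugin implements policy enforcement.

```python
from typing import Any

Labels = dict[str, str]
Policy = dict[str, Any]


def labels_match(selector: dict[str, Any], labels: Labels) -> bool:
    """True when every matchLabels entry is present on the pod."""
    wanted = selector.get("matchLabels") or {}
    return all(labels.get(key) == value for key, value in wanted.items())


def rule_allows(rule: dict[str, Any], peer_key: str, peer: Labels, port: int) -> bool:
    """Check one ingress/egress rule against a peer pod and a port."""
    ports = rule.get("ports")
    if ports and not any(entry.get("port") == port for entry in ports):
        return False
    peers = rule.get(peer_key)
    if not peers:  # a rule without peers admits every peer
        return True
    return any(
        "podSelector" in entry
        and "namespaceSelector" not in entry
        and labels_match(entry["podSelector"], peer)
        for entry in peers
    )


def direction_allows(policies: list[Policy], pod: Labels, peer: Labels,
                     port: int, direction: str) -> bool:
    """direction is 'Ingress' or 'Egress', evaluated from the pod's side."""
    rules_key, peer_key = ("ingress", "from") if direction == "Ingress" else ("egress", "to")
    selecting = [
        p["spec"] for p in policies
        if direction in p["spec"].get("policyTypes", [])
        and labels_match(p["spec"].get("podSelector") or {}, pod)
    ]
    if not selecting:  # no policy isolates this pod in this direction
        return True
    return any(
        rule_allows(rule, peer_key, peer, port)
        for spec in selecting
        for rule in spec.get(rules_key) or []
    )


def connection_allowed(policies: list[Policy], src: Labels, dst: Labels, port: int) -> bool:
    return (direction_allows(policies, src, dst, port, "Egress")
            and direction_allows(policies, dst, src, port, "Ingress"))


API = {"app": "aegislm", "component": "api"}
WORKER = {"app": "aegislm", "component": "worker"}
MODEL = {"app": "aegislm", "component": "model-service"}

# Abbreviated versions of two of the policies above.
policies = [
    {"spec": {
        "podSelector": {"matchLabels": WORKER},
        "policyTypes": ["Egress"],
        "egress": [{"to": [{"podSelector": {"matchLabels": MODEL}}],
                    "ports": [{"protocol": "TCP", "port": 8002}]}],
    }},
    {"spec": {
        "podSelector": {"matchLabels": MODEL},
        "policyTypes": ["Ingress", "Egress"],
        "ingress": [{"from": [{"podSelector": {"matchLabels": WORKER}}],
                     "ports": [{"protocol": "TCP", "port": 8002}]}],
        "egress": [],
    }},
]

print(connection_allowed(policies, WORKER, MODEL, 8002))  # worker -> model service
print(connection_allowed(policies, API, MODEL, 8002))     # api -> model service
print(connection_allowed(policies, MODEL, WORKER, 8001))  # model service -> worker
```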
---

## Docker Security Enhancements

### Enhanced Base Dockerfile

Update `services/base/Dockerfile` to use a minimal base image:

```dockerfile
# AegisLM Base Docker Image - Hardened Version
# Multi-Agent Adversarial LLM Evaluation Framework

# Build stage (same Python minor version, 3.11, as the runtime image)
FROM python:3.11-slim-bookworm AS build

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
    && rm -rf /var/lib/apt/lists/*

# Create virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir --require-hashes -r requirements.txt

# Final stage - distroless base image for minimal attack surface.
# python3-debian12 ships Python 3.11; python3-debian11 ships Python 3.9,
# which does not match the 3.11 build stage.
FROM gcr.io/distroless/python3-debian12:nonroot

# Set environment variables. The venv's interpreter symlink points into the
# build image, so its site-packages directory is exposed to the distroless
# interpreter through PYTHONPATH instead of PATH.
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PYTHONPATH="/opt/venv/lib/python3.11/site-packages"

# Copy virtual environment from the build stage
COPY --from=build /opt/venv /opt/venv

# Copy application
WORKDIR /app
COPY --chown=nonroot:nonroot . /app

# Switch to non-root user (built into distroless, UID 65532; the pod
# securityContext overrides this with runAsUser: 1000)
USER nonroot

# The read-only root filesystem is enforced at runtime, not in the image;
# /tmp therefore needs a tmpfs or another writable mount
VOLUME ["/tmp"]

# Build metadata (the no-new-privileges flag is likewise a runtime setting,
# see allowPrivilegeEscalation: false)
ARG BUILD_DATE
ARG VCS_REF
LABEL org.label-schema.build-date=$BUILD_DATE \
      org.label-schema.name="aegislm" \
      org.label-schema.vcs-ref=$VCS_REF \
      org.label-schema.vcs-url="https://github.com/aegislm/aegislm"

# The distroless python3 image sets the interpreter as ENTRYPOINT, so CMD
# carries only its arguments. Placeholder; service images override it.
CMD ["--version"]
```
### Enhanced API Dockerfile

```dockerfile
# AegisLM API Service Dockerfile
# Production-grade security hardened

# Build stage
FROM python:3.11-slim-bookworm AS builder
ENV PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1
WORKDIR /app

# Create the virtual environment that the final stage copies
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir --require-hashes -r requirements.txt

# Copy source
COPY . .

# Final stage (Python 3.11, matching the build stage)
FROM gcr.io/distroless/python3-debian12:nonroot

# Security: No root
USER nonroot
WORKDIR /app

# Copy from builder
COPY --from=builder /opt/venv /opt/venv
COPY --from=builder --chown=nonroot:nonroot /app /app
# Expose the venv's packages to the distroless interpreter
ENV PYTHONPATH="/opt/venv/lib/python3.11/site-packages"

# Security: Read-only filesystem preparation
# Docker does not act on these values itself; the tmpfs mount for temporary
# files is configured at runtime (docker run --tmpfs or an emptyDir volume)
ARG TMPFS_SIZE=64m
ENV TMPFS_OPTS=size=${TMPFS_SIZE},mode=1777

# Expose port
EXPOSE 8000

# Health check (exec form, because distroless images have no shell)
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD ["python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health/live')"]

# Run the API. The base image's ENTRYPOINT is the Python interpreter,
# so CMD supplies only its arguments.
CMD ["-m", "uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
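Multi-stage Dockerfiles like the two above break quietly when a `COPY --from=` names a stage that does not exist, or when a stage is defined and never used. The following is a rough sketch of a check for both cases. The function name `lint_stages` is hypothetical, the parser is deliberately naive (no line continuations, no `ARG` substitution), and a `--from=` that points at an external image rather than a stage would be reported as a problem.

```python
def lint_stages(dockerfile_text: str) -> list[str]:
    """Report COPY --from references to unknown stages and unused stages."""
    problems: list[str] = []
    stages: list[str | None] = []  # stage names in order; None if unnamed
    used: set[int] = set()         # indices of stages that are copied from

    for lineno, raw in enumerate(dockerfile_text.splitlines(), start=1):
        tokens = raw.split()
        if not tokens:
            continue
        instruction = tokens[0].upper()
        if instruction == "FROM":
            # "FROM image AS name" names the stage; otherwise it is unnamed.
            named = len(tokens) >= 4 and tokens[-2].upper() == "AS"
            stages.append(tokens[-1].lower() if named else None)
        elif instruction == "COPY":
            for token in tokens[1:]:
                if not token.startswith("--from="):
                    continue
                ref = token[len("--from="):].lower()
                earlier = stages[:-1]  # only stages before the current one
                if ref.isdigit() and int(ref) < len(earlier):
                    used.add(int(ref))
                elif ref in earlier:
                    used.add(earlier.index(ref))
                else:
                    problems.append(f"line {lineno}: --from={ref} is not an earlier stage")

    # Every stage except the final one should feed a later stage.
    for index, name in enumerate(stages[:-1]):
        if index not in used:
            problems.append(f"stage {name or '#' + str(index)} is never copied from")
    return problems


# Illustrative input only; not one of the AegisLM Dockerfiles.
SAMPLE = """\
FROM python:3.11-slim-bookworm AS builder
RUN python -m venv /opt/venv
FROM python:3.11-slim-bookworm AS unused
FROM gcr.io/distroless/python3-debian12:nonroot
COPY --from=builder /opt/venv /opt/venv
COPY --from=build /app /app
"""

for problem in lint_stages(SAMPLE):
    print(problem)
```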
---

## Container Runtime Security

### Seccomp Profile

On OpenShift, create a `SecurityContextConstraints` (SCC) object in `deployment/k8s/seccomp-profile.yaml`. This is not a seccomp profile itself: it is a cluster policy that, among other restrictions, limits pods to the runtime default seccomp profile. On other Kubernetes distributions, the `seccompProfile` field in the pod security context shown earlier provides the seccomp restriction.

```yaml
# Restricted SecurityContextConstraints for AegisLM (OpenShift only)
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: aegislm-restricted
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegedContainer: false
allowedCapabilities:
  - NET_BIND_SERVICE
defaultAddCapabilities: []
fsGroup:
  type: RunAsAny
priority: 10
readOnlyRootFilesystem: true
requiredDropCapabilities:
  - ALL
runAsUser:
  type: MustRunAs
  uid: 1000
seLinuxContext:
  type: MustRunAs
  seLinuxOptions:
    level: "s0:c123,c456"
seccompProfiles:
  - runtime/default
supplementalGroups:
  type: RunAsAny
volumes:
  - configMap
  - emptyDir
  - projected
  - secret
  - downwardAPI
  - persistentVolumeClaim
```
---

## Runtime Security

### Resource Limits (Already Implemented)

```yaml
# In each deployment
resources:
  requests:
    cpu: 500m
    memory: 1Gi
    nvidia.com/gpu: 1  # For model service
  limits:
    cpu: 1000m
    memory: 2Gi
    nvidia.com/gpu: 1
```
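Requests and limits are written as Kubernetes quantities (`500m`, `1Gi`), so comparing them requires parsing the suffixes first. The sketch below is an illustration under stated assumptions: `parse_quantity` and `resource_violations` are hypothetical helpers, only the common suffixes are handled, and any resource name containing `/` is treated as an extended resource such as `nvidia.com/gpu`, for which Kubernetes requires the request to equal the limit.

```python
from decimal import Decimal

# Multipliers for the common quantity suffixes (binary and decimal SI, plus milli).
_SUFFIXES = {
    "Ki": Decimal(2**10), "Mi": Decimal(2**20), "Gi": Decimal(2**30), "Ti": Decimal(2**40),
    "k": Decimal(10**3), "M": Decimal(10**6), "G": Decimal(10**9), "T": Decimal(10**12),
    "m": Decimal("0.001"),
}


def parse_quantity(quantity: object) -> Decimal:
    """Convert a quantity such as '500m', '1Gi' or 1 to a plain number."""
    text = str(quantity)
    # Try two-letter suffixes before one-letter ones.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def resource_violations(resources: dict) -> list[str]:
    """Check that cpu/memory limits exist and that requests fit within limits."""
    requests = resources.get("requests") or {}
    limits = resources.get("limits") or {}
    problems = [f"{name}: no limit set" for name in ("cpu", "memory") if name not in limits]
    for name, requested in requests.items():
        if name not in limits:
            continue
        req, lim = parse_quantity(requested), parse_quantity(limits[name])
        if "/" in name and req != lim:
            # Assumption: names with '/' are extended resources (e.g. GPUs).
            problems.append(f"{name}: request {requested} must equal limit {limits[name]}")
        elif req > lim:
            problems.append(f"{name}: request {requested} exceeds limit {limits[name]}")
    return problems


# The resources block shown above.
documented = {
    "requests": {"cpu": "500m", "memory": "1Gi", "nvidia.com/gpu": 1},
    "limits": {"cpu": "1000m", "memory": "2Gi", "nvidia.com/gpu": 1},
}
print(resource_violations(documented))
print(resource_violations({"requests": {"cpu": "2"}, "limits": {"cpu": "1000m"}}))
```

`Decimal` is used instead of `float` so that values such as `500m` compare exactly.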
### Security Validation Tests

Create `tests/test_container_security.py`:

```python
"""
Container Security Validation Tests
Tests that all security controls are properly configured
"""
from pathlib import Path

import pytest
import yaml


class TestContainerSecurity:
    """Test container security configurations."""

    @pytest.fixture
    def k8s_deployments(self):
        """Load Kubernetes deployment files."""
        deploy_dir = Path("deployment/k8s")
        deployments = {}
        for deploy_file in deploy_dir.glob("*-deployment.yaml"):
            with open(deploy_file) as f:
                doc = yaml.safe_load(f)
            deployments[deploy_file.stem] = doc
        return deployments

    def test_containers_run_as_non_root(self, k8s_deployments):
        """Verify all containers run as non-root user."""
        for name, deploy in k8s_deployments.items():
            spec = deploy.get("spec", {})
            template = spec.get("template", {})
            security_ctx = template.get("spec", {}).get("securityContext", {})
            assert security_ctx.get("runAsNonRoot") is True, \
                f"{name}: Container must run as non-root"
            assert security_ctx.get("runAsUser") == 1000, \
                f"{name}: Must run as user 1000"

    def test_containers_drop_all_capabilities(self, k8s_deployments):
        """Verify all containers drop all capabilities."""
        for name, deploy in k8s_deployments.items():
            spec = deploy.get("spec", {})
            template = spec.get("template", {})
            containers = template.get("spec", {}).get("containers", [])
            for container in containers:
                security_ctx = container.get("securityContext", {})
                capabilities = security_ctx.get("capabilities", {})
                dropped = capabilities.get("drop", [])
                assert "ALL" in dropped, \
                    f"{name}/{container['name']}: Must drop ALL capabilities"

    def test_no_privileged_containers(self, k8s_deployments):
        """Verify no containers run privileged or allow privilege escalation."""
        for name, deploy in k8s_deployments.items():
            spec = deploy.get("spec", {})
            template = spec.get("template", {})
            containers = template.get("spec", {}).get("containers", [])
            for container in containers:
                security_ctx = container.get("securityContext", {})
                # privileged defaults to false, so only an explicit true fails
                assert security_ctx.get("privileged") is not True, \
                    f"{name}/{container['name']}: Must not run in privileged mode"
                assert security_ctx.get("allowPrivilegeEscalation") is False, \
                    f"{name}/{container['name']}: Must not allow privilege escalation"

    def test_read_only_root_filesystem(self, k8s_deployments):
        """Verify containers have read-only root filesystem."""
        for name, deploy in k8s_deployments.items():
            spec = deploy.get("spec", {})
            template = spec.get("template", {})
            containers = template.get("spec", {}).get("containers", [])
            for container in containers:
                security_ctx = container.get("securityContext", {})
                # Note: This will require tmpfs mounts for /tmp
                assert security_ctx.get("readOnlyRootFilesystem") is True, \
                    f"{name}/{container['name']}: Must have read-only root filesystem"

    def test_seccomp_profile_set(self, k8s_deployments):
        """Verify seccomp profile is set."""
        for name, deploy in k8s_deployments.items():
            spec = deploy.get("spec", {})
            template = spec.get("template", {})
            security_ctx = template.get("spec", {}).get("securityContext", {})
            seccomp = security_ctx.get("seccompProfile", {})
            assert seccomp.get("type") == "RuntimeDefault", \
                f"{name}: Must use RuntimeDefault seccomp profile"


class TestNetworkPolicies:
    """Test network policy configurations."""

    @pytest.fixture
    def network_policies(self):
        """Load network policy files."""
        policy_dir = Path("deployment/k8s")
        policies = {}
        for policy_file in policy_dir.glob("network-policy*.yaml"):
            with open(policy_file) as f:
                # Handle multiple documents
                for doc in yaml.safe_load_all(f):
                    if doc:
                        name = doc.get("metadata", {}).get("name")
                        policies[name] = doc
        return policies

    def test_api_has_ingress_policy(self, network_policies):
        """Verify API has ingress network policy."""
        assert "api-network-policy" in network_policies, \
            "API must have network policy"
        policy = network_policies["api-network-policy"]
        # policyTypes lives under spec, not at the top level of the document
        assert "Ingress" in policy.get("spec", {}).get("policyTypes", []), \
            "API must have ingress policy"

    def test_model_service_internal_only(self, network_policies):
        """Verify model service is internal only."""
        if "model-service-network-policy" not in network_policies:
            pytest.skip("Model service network policy not defined")
        policy = network_policies["model-service-network-policy"]
        ingress = policy.get("spec", {}).get("ingress", [])
        # Model service should not have public ingress
        # Should only allow from worker pods
        assert len(ingress) <= 1, \
            "Model service should have limited ingress"

    def test_all_pods_have_policies(self, network_policies):
        """Verify all critical components have network policies."""
        required_policies = [
            "api-network-policy",
            "worker-network-policy",
            "dashboard-network-policy",
        ]
        for policy_name in required_policies:
            assert policy_name in network_policies, \
                f"Required network policy {policy_name} not found"


class TestDockerfileSecurity:
    """Test Dockerfile security configurations."""

    @pytest.fixture
    def dockerfiles(self):
        """Load Dockerfile content."""
        dockerfiles = {}
        for dockerfile_path in Path("services").rglob("Dockerfile"):
            with open(dockerfile_path) as f:
                # Key by full path; every file is named "Dockerfile"
                dockerfiles[str(dockerfile_path)] = f.read()
        return dockerfiles

    def test_no_root_user_in_dockerfile(self, dockerfiles):
        """Verify Dockerfiles don't use root user."""
        for name, content in dockerfiles.items():
            for line in content.splitlines():
                parts = line.split()
                if len(parts) >= 2 and parts[0].upper() == "USER":
                    # Compare the user exactly so "nonroot" is not rejected
                    user = parts[1].split(":")[0]
                    if user in ("root", "0"):
                        pytest.fail(f"{name}: Should not use root user")

    def test_no_latest_tag(self, dockerfiles):
        """Verify no 'latest' tag is used."""
        for name, content in dockerfiles.items():
            for line in content.splitlines():
                if line.strip().upper().startswith("FROM"):
                    if ":latest" in line.lower():
                        pytest.fail(f"{name}: Should not use :latest tag")
```
---

## Secrets Management

### External Secrets

```yaml
# deployment/k8s/external-secrets.yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aegislm-vault
spec:
  provider:
    vault:
      server: "https://vault.example.com:8200"
      path: "secret"
      version: "v2"
      auth:
        kubernetes:
          mountPath: kubernetes
          # The Vault Kubernetes auth method also needs a role here;
          # its value is deployment-specific and is not defined in this guide.
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: aegislm-secrets
  namespace: aegislm
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aegislm-vault
    kind: ClusterSecretStore
  target:
    name: aegislm-secrets
    creationPolicy: Owner
  data:
    - secretKey: DATABASE_URL
      remoteRef:
        key: aegislm/database
        property: url
    - secretKey: API_SECRET_KEY
      remoteRef:
        key: aegislm/api
        property: secret_key
```
---

## Compliance Mapping

| Control | NIST 800-53 | ISO 27001 | PCI DSS |
|---------|-------------|-----------|---------|
| Non-root containers | AC-3 | A.9.1 | Req 7.1 |
| Read-only filesystem | AC-3 | A.9.1 | Req 7.1 |
| Drop capabilities | AC-3 | A.9.1 | - |
| Network segmentation | AC-4 | A.13.1 | Req 1.3 |
| Secrets management | IA-5 | A.9.4 | Req 3.4 |

---
## Security Audit Commands

```bash
# Check container security context
kubectl get pods -n aegislm -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.securityContext.runAsNonRoot}{"\n"}{end}'

# Check network policies
kubectl get networkpolicies -n aegislm

# Check for privileged containers
kubectl get pods -n aegislm -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].securityContext.privileged}{"\n"}{end}'

# Run security tests
pytest tests/test_container_security.py -v
```
---

## References

- [NIST 800-53 Security Controls](https://csrc.nist.gov/publications/detail/sp/800-53/rev-5/final)
- [Kubernetes Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/)
- [CIS Docker Benchmark](https://www.cisecurity.org/benchmark/docker)
- [Distroless Docker Images](https://github.com/GoogleContainerTools/distroless)