AegisLM Container Security Hardening Guide
Overview
This document defines the container security hardening requirements for AegisLM. All containers must meet these security standards before deployment to production.
Container Security Checklist
Required Security Controls
| Control | Description | Priority | Status |
|---|---|---|---|
| Non-root user | Containers run as non-root | Required | Implemented |
| Read-only root filesystem | Root filesystem is read-only | Required | To Implement |
| Drop capabilities | Drop all unnecessary Linux capabilities | Required | To Implement |
| Seccomp profile | Enable seccomp restriction | Required | To Implement |
| No privileged mode | Never run containers in privileged mode | Required | Implemented |
| Resource limits | Set CPU/memory limits | Required | Implemented |
| Health checks | Liveness and readiness probes | Required | Implemented |
| Minimal base image | Use minimal base images | Required | To Implement |
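The checklist can also be expressed as a small audit function. The following is a minimal sketch, not part of the AegisLM codebase: `audit_container` is a hypothetical helper that takes a container spec (as a parsed dict) plus the pod-level security context and returns the names of the controls it fails. "Minimal base image" is left out because it cannot be verified from a pod spec.

```python
def audit_container(container, pod_security_context=None):
    """Return the names of checklist controls that a container spec fails."""
    pod_ctx = pod_security_context or {}
    ctx = container.get("securityContext", {})
    failures = []

    # Container-level settings override pod-level ones.
    if ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot")) is not True:
        failures.append("Non-root user")
    if ctx.get("readOnlyRootFilesystem") is not True:
        failures.append("Read-only root filesystem")
    if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
        failures.append("Drop capabilities")
    seccomp = ctx.get("seccompProfile") or pod_ctx.get("seccompProfile") or {}
    if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
        failures.append("Seccomp profile")
    if ctx.get("privileged") is True:
        failures.append("No privileged mode")
    limits = container.get("resources", {}).get("limits", {})
    if not {"cpu", "memory"} <= set(limits):
        failures.append("Resource limits")
    if "livenessProbe" not in container or "readinessProbe" not in container:
        failures.append("Health checks")
    return failures
```

An empty list means the container satisfies every control that can be checked from its spec.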
Kubernetes Pod Security
Pod Security Standards
All AegisLM pods must use the following security context:
```yaml
# deployment/k8s/api-deployment.yaml (updated)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aegislm-api
  namespace: aegislm
spec:
  template:
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: api
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop:
                - ALL
            runAsNonRoot: true
            runAsUser: 1000
```
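With `readOnlyRootFilesystem: true`, the application still needs a writable `/tmp`. One way to provide it is a memory-backed `emptyDir` volume. The sketch below shows the idea on a parsed pod spec; `with_writable_tmp` and the volume name `tmp` are hypothetical, and the 64Mi default size is an assumption rather than a measured requirement.

```python
import copy


def with_writable_tmp(pod_spec, size_limit="64Mi"):
    """Return a copy of pod_spec with a memory-backed emptyDir mounted at /tmp."""
    spec = copy.deepcopy(pod_spec)

    # Add the volume once, even if the function is applied repeatedly.
    volumes = spec.setdefault("volumes", [])
    if not any(v.get("name") == "tmp" for v in volumes):
        volumes.append(
            {"name": "tmp", "emptyDir": {"medium": "Memory", "sizeLimit": size_limit}}
        )

    # Mount it into every container that does not already mount /tmp.
    for container in spec.get("containers", []):
        mounts = container.setdefault("volumeMounts", [])
        if not any(m.get("mountPath") == "/tmp" for m in mounts):
            mounts.append({"name": "tmp", "mountPath": "/tmp"})
    return spec
```

The input spec is left untouched, so the function can be used in a manifest-generation step without side effects.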
Network Policies
Create deployment/k8s/network-policy.yaml:
```yaml
---
# AegisLM Network Policies
# Kubernetes NetworkPolicy definitions for micro-segmentation
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-network-policy
  namespace: aegislm
spec:
  podSelector:
    matchLabels:
      app: aegislm
      component: api
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow ingress from ingress controller.
    # kubernetes.io/metadata.name is set automatically on every namespace;
    # a plain "name" label only matches if it was added by hand.
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8000
    # Allow ingress from dashboard
    - from:
        - podSelector:
            matchLabels:
              app: aegislm
              component: dashboard
      ports:
        - protocol: TCP
          port: 8000
  egress:
    # Allow egress to PostgreSQL
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    # Allow egress to Redis
    - to:
        - podSelector:
            matchLabels:
              app: redis
      ports:
        - protocol: TCP
          port: 6379
    # Allow egress to worker pods
    - to:
        - podSelector:
            matchLabels:
              app: aegislm
              component: worker
      ports:
        - protocol: TCP
          port: 8001
    # Allow DNS (UDP, plus TCP for large responses)
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # Allow egress to model service (internal only).
    # Note: model-service-network-policy admits only worker pods, so this
    # traffic is still dropped on the model-service side unless that
    # policy is extended to allow the API.
    - to:
        - podSelector:
            matchLabels:
              app: aegislm
              component: model-service
      ports:
        - protocol: TCP
          port: 8002
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: worker-network-policy
  namespace: aegislm
spec:
  podSelector:
    matchLabels:
      app: aegislm
      component: worker
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow ingress from API
    - from:
        - podSelector:
            matchLabels:
              app: aegislm
              component: api
      ports:
        - protocol: TCP
          port: 8001
  egress:
    # Allow egress to model service
    - to:
        - podSelector:
            matchLabels:
              app: aegislm
              component: model-service
      ports:
        - protocol: TCP
          port: 8002
    # Allow egress to object storage
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              app: minio
      ports:
        - protocol: TCP
          port: 9000
    # Allow DNS (UDP, plus TCP for large responses)
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: model-service-network-policy
  namespace: aegislm
spec:
  podSelector:
    matchLabels:
      app: aegislm
      component: model-service
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Only allow ingress from worker pods (internal only)
    - from:
        - podSelector:
            matchLabels:
              app: aegislm
              component: worker
      ports:
        - protocol: TCP
          port: 8002
  # No egress - model service is internal only (this also blocks DNS)
  egress: []
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dashboard-network-policy
  namespace: aegislm
spec:
  podSelector:
    matchLabels:
      app: aegislm
      component: dashboard
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow ingress from ingress controller
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - protocol: TCP
          port: 3000
  egress:
    # Allow egress to API
    - to:
        - podSelector:
            matchLabels:
              app: aegislm
              component: api
      ports:
        - protocol: TCP
          port: 8000
    # Allow DNS (UDP, plus TCP for large responses)
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```
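A connection between two isolated pods succeeds only if the source's egress rules and the destination's ingress rules both allow it. The sketch below models that evaluation for pods in the same namespace. It is a simplification under stated assumptions: only `podSelector.matchLabels` peers are considered, protocols are ignored, and every policy is assumed to list its `policyTypes` explicitly, as the policies above do. All function names are hypothetical.

```python
def selects(selector, labels):
    """True if every matchLabels pair is in labels; an empty selector matches all pods."""
    return all(labels.get(k) == v for k, v in selector.get("matchLabels", {}).items())


def rule_allows(rule, peer_key, peer_labels, port):
    """Check one ingress/egress rule against a peer pod and a destination port."""
    ports = rule.get("ports")
    if ports and not any(p.get("port") == port for p in ports):
        return False
    peers = rule.get(peer_key)
    if not peers:
        return True  # no peer list means all peers
    return any(
        "podSelector" in peer and selects(peer["podSelector"], peer_labels)
        for peer in peers
    )


def direction_allows(policies, direction, pod_labels, peer_labels, port):
    """Evaluate all policies that isolate pod_labels in the given direction."""
    rules_key, peer_key = ("ingress", "from") if direction == "Ingress" else ("egress", "to")
    specs = [
        p["spec"] for p in policies
        if direction in p["spec"].get("policyTypes", [])
        and selects(p["spec"].get("podSelector", {}), pod_labels)
    ]
    if not specs:
        return True  # pods not selected by any policy are not isolated
    return any(
        rule_allows(rule, peer_key, peer_labels, port)
        for spec in specs
        for rule in spec.get(rules_key) or []
    )


def connection_allowed(policies, src_labels, dst_labels, port):
    """A connection needs the source's egress and the destination's ingress to agree."""
    return (
        direction_allows(policies, "Egress", src_labels, dst_labels, port)
        and direction_allows(policies, "Ingress", dst_labels, src_labels, port)
    )


API = {"app": "aegislm", "component": "api"}
WORKER = {"app": "aegislm", "component": "worker"}
MODEL = {"app": "aegislm", "component": "model-service"}

# Trimmed copies of two of the policies defined in network-policy.yaml.
POLICIES = [
    {"spec": {"podSelector": {"matchLabels": API},
              "policyTypes": ["Ingress", "Egress"],
              "egress": [{"to": [{"podSelector": {"matchLabels": MODEL}}],
                          "ports": [{"protocol": "TCP", "port": 8002}]}]}},
    {"spec": {"podSelector": {"matchLabels": MODEL},
              "policyTypes": ["Ingress", "Egress"],
              "ingress": [{"from": [{"podSelector": {"matchLabels": WORKER}}],
                           "ports": [{"protocol": "TCP", "port": 8002}]}],
              "egress": []}},
]
```

With these two policies, `connection_allowed(POLICIES, WORKER, MODEL, 8002)` is true, while `connection_allowed(POLICIES, API, MODEL, 8002)` is false: the API policy permits the egress, but the model-service policy admits only worker pods.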
Docker Security Enhancements
Enhanced Base Dockerfile
Update services/base/Dockerfile to use a minimal base image:
```dockerfile
# AegisLM Base Docker Image - Hardened Version
# Multi-Agent Adversarial LLM Evaluation Framework

# Build stage
FROM python:3.11-slim-bookworm AS build

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential \
    && rm -rf /var/lib/apt/lists/*

# Create virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir --require-hashes -r requirements.txt

# Final stage - distroless base image for minimal attack surface.
# The debian12 variant ships Python 3.11, matching the build stage
# (the debian11 variant ships Python 3.9).
FROM gcr.io/distroless/python3-debian12:nonroot

# Set environment variables.
# The venv's interpreter symlink points at the build image's Python, which
# does not exist here, so expose the installed packages via PYTHONPATH.
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PYTHONPATH="/opt/venv/lib/python3.11/site-packages"

# Copy virtual environment from the build stage
COPY --from=build /opt/venv /opt/venv

# Copy application
WORKDIR /app
COPY --chown=nonroot:nonroot . /app

# Switch to non-root user (built into distroless)
USER nonroot

# Additional security: read-only filesystem
# (Requires tmpfs for /tmp and writeable mounts)
VOLUME ["/tmp", "/var/cache/pip"]

# Build metadata labels. The no-new-privileges flag cannot be set in a
# Dockerfile; it is enforced at runtime (allowPrivilegeEscalation: false).
ARG BUILD_DATE
ARG VCS_REF
LABEL org.label-schema.build-date=$BUILD_DATE \
      org.label-schema.name="aegislm" \
      org.label-schema.vcs-ref=$VCS_REF \
      org.label-schema.vcs-url="https://github.com/aegislm/aegislm"

# No CMD: the distroless image's entrypoint is already the Python
# interpreter, so derived images only supply its arguments.
```
Enhanced API Dockerfile
```dockerfile
# AegisLM API Service Dockerfile
# Production-grade security hardened

# Build stage
FROM python:3.11-slim-bookworm AS builder

ENV PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1

WORKDIR /app

# Create the virtual environment that the final stage copies
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir --require-hashes -r requirements.txt

# Copy source
COPY . .

# Final stage (debian12 ships Python 3.11, matching the build stage)
FROM gcr.io/distroless/python3-debian12:nonroot

WORKDIR /app

# Copy from builder
COPY --from=builder /opt/venv /opt/venv
COPY --from=builder --chown=nonroot:nonroot /app /app

# Expose the venv's packages to the distroless interpreter
ENV PYTHONPATH="/opt/venv/lib/python3.11/site-packages"

# Security: Read-only filesystem preparation
# Suggested tmpfs settings for /tmp; the mount itself is configured at runtime
ARG TMPFS_SIZE=64m
ENV TMPFS_OPTS=size=${TMPFS_SIZE},mode=1777

# Expose port
EXPOSE 8000

# Health check in exec form, because distroless images have no shell.
# Kubernetes ignores HEALTHCHECK and relies on its own probes.
HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
    CMD ["python3", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8000/health/live')"]

# Run as non-root
USER nonroot

# The image's entrypoint is the Python interpreter, so CMD holds only its arguments
CMD ["-m", "uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
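In a multi-stage build, only the final stage determines the user the container runs as, so a `USER root` in a build stage is harmless while one in the final stage is not. The following sketch extracts the effective user of the last stage. `final_stage_user` and `is_root` are hypothetical helpers; the parser is deliberately naive and does not handle line continuations or `ARG` substitution.

```python
def final_stage_user(dockerfile_text):
    """Return the last USER set in the final build stage, or None if it sets none."""
    user = None
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        instruction = parts[0].upper()
        if instruction == "FROM":
            # A new stage starts with the default user of its base image.
            user = None
        elif instruction == "USER" and len(parts) > 1:
            user = parts[1]
    return user


def is_root(user):
    """True if a USER value (name or UID, optionally with :group) refers to root."""
    return user.split(":")[0] in ("root", "0")
```

A return value of `None` means the stage inherits whatever user its base image defines, which has to be checked against that image separately.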
Container Runtime Security
Seccomp Profile
On OpenShift clusters, create deployment/k8s/seccomp-profile.yaml. Despite the file name, the manifest is a SecurityContextConstraints (SCC) object rather than a seccomp profile: it restricts AegisLM pods to the runtime default seccomp profile and enforces the related restrictions below.
```yaml
# OpenShift SecurityContextConstraints for AegisLM
# (restricts pods to the runtime default seccomp profile)
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: aegislm-restricted
allowHostDirVolumePlugin: false
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegedContainer: false
allowedCapabilities:
  - NET_BIND_SERVICE
defaultAddCapabilities: []
fsGroup:
  type: RunAsAny
priority: 10
readOnlyRootFilesystem: true
requiredDropCapabilities:
  - ALL
runAsUser:
  type: MustRunAs
  uid: 1000
seLinuxContext:
  type: MustRunAs
  seLinuxOptions:
    level: "s0:c123,c456"
seccompProfiles:
  - runtime/default
supplementalGroups:
  type: RunAsAny
volumes:
  - configMap
  - emptyDir
  - projected
  - secret
  - downwardAPI
  - persistentVolumeClaim
```
Runtime Security
Resource Limits (Already Implemented)
```yaml
# In each deployment
resources:
  requests:
    cpu: 500m
    memory: 1Gi
    nvidia.com/gpu: 1  # For model service
  limits:
    cpu: 1000m
    memory: 2Gi
    nvidia.com/gpu: 1
```
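Resource quantities use Kubernetes suffixes (`500m` is half a CPU, `1Gi` is 2^30 bytes), which makes requests and limits awkward to compare as strings. The sketch below parses the common suffixes and checks that requests do not exceed limits, and that a GPU request equals its limit, as Kubernetes requires for extended resources. The helper names are hypothetical and only the suffixes listed in the code are handled.

```python
_MEMORY_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,  # binary suffixes
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,     # decimal suffixes
}


def parse_cpu(quantity):
    """Convert a CPU quantity such as '500m' or '2' to a number of cores."""
    quantity = str(quantity)
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity):
    """Convert a memory quantity such as '1Gi' or '512M' to bytes."""
    quantity = str(quantity)
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)


def check_resources(resources):
    """Return a list of problems found in a container's resources block."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    problems = []
    if "cpu" in requests and "cpu" in limits:
        if parse_cpu(requests["cpu"]) > parse_cpu(limits["cpu"]):
            problems.append("cpu request exceeds limit")
    if "memory" in requests and "memory" in limits:
        if parse_memory(requests["memory"]) > parse_memory(limits["memory"]):
            problems.append("memory request exceeds limit")
    gpu = "nvidia.com/gpu"
    if gpu in requests and requests[gpu] != limits.get(gpu):
        problems.append("gpu request must equal gpu limit")
    return problems
```

For the values shown in the manifest above, `check_resources` returns an empty list.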
Security Validation Tests
Create tests/test_container_security.py:
```python
"""
Container Security Validation Tests
Tests that all security controls are properly configured
"""
from pathlib import Path

import pytest
import yaml


def _pod_spec(deploy):
    """Return the pod spec embedded in a Deployment document."""
    return deploy.get("spec", {}).get("template", {}).get("spec", {})


class TestContainerSecurity:
    """Test container security configurations."""

    @pytest.fixture
    def k8s_deployments(self):
        """Load Kubernetes deployment files."""
        deploy_dir = Path("deployment/k8s")
        deployments = {}
        for deploy_file in deploy_dir.glob("*-deployment.yaml"):
            with open(deploy_file) as f:
                deployments[deploy_file.stem] = yaml.safe_load(f)
        return deployments

    def test_containers_run_as_non_root(self, k8s_deployments):
        """Verify all containers run as non-root user."""
        for name, deploy in k8s_deployments.items():
            security_ctx = _pod_spec(deploy).get("securityContext", {})
            assert security_ctx.get("runAsNonRoot") is True, \
                f"{name}: Container must run as non-root"
            assert security_ctx.get("runAsUser") == 1000, \
                f"{name}: Must run as user 1000"

    def test_containers_drop_all_capabilities(self, k8s_deployments):
        """Verify all containers drop all capabilities."""
        for name, deploy in k8s_deployments.items():
            for container in _pod_spec(deploy).get("containers", []):
                security_ctx = container.get("securityContext", {})
                dropped = security_ctx.get("capabilities", {}).get("drop", [])
                assert "ALL" in dropped, \
                    f"{name}/{container['name']}: Must drop ALL capabilities"

    def test_no_privileged_containers(self, k8s_deployments):
        """Verify no containers run privileged or allow privilege escalation."""
        for name, deploy in k8s_deployments.items():
            for container in _pod_spec(deploy).get("containers", []):
                security_ctx = container.get("securityContext", {})
                assert security_ctx.get("privileged") is not True, \
                    f"{name}/{container['name']}: Must not run in privileged mode"
                assert security_ctx.get("allowPrivilegeEscalation") is False, \
                    f"{name}/{container['name']}: Must not allow privilege escalation"

    def test_read_only_root_filesystem(self, k8s_deployments):
        """Verify containers have read-only root filesystem."""
        for name, deploy in k8s_deployments.items():
            for container in _pod_spec(deploy).get("containers", []):
                security_ctx = container.get("securityContext", {})
                # Note: This will require tmpfs mounts for /tmp
                assert security_ctx.get("readOnlyRootFilesystem") is True, \
                    f"{name}/{container['name']}: Must have read-only root filesystem"

    def test_seccomp_profile_set(self, k8s_deployments):
        """Verify seccomp profile is set."""
        for name, deploy in k8s_deployments.items():
            security_ctx = _pod_spec(deploy).get("securityContext", {})
            seccomp = security_ctx.get("seccompProfile", {})
            assert seccomp.get("type") == "RuntimeDefault", \
                f"{name}: Must use RuntimeDefault seccomp profile"


class TestNetworkPolicies:
    """Test network policy configurations."""

    @pytest.fixture
    def network_policies(self):
        """Load network policy files."""
        policy_dir = Path("deployment/k8s")
        policies = {}
        for policy_file in policy_dir.glob("network-policy*.yaml"):
            with open(policy_file) as f:
                # Handle multiple documents
                for doc in yaml.safe_load_all(f):
                    if doc:
                        name = doc.get("metadata", {}).get("name")
                        policies[name] = doc
        return policies

    def test_api_has_ingress_policy(self, network_policies):
        """Verify API has ingress network policy."""
        assert "api-network-policy" in network_policies, \
            "API must have network policy"
        policy = network_policies["api-network-policy"]
        # policyTypes lives under spec, not at the top level of the document
        assert "Ingress" in policy.get("spec", {}).get("policyTypes", []), \
            "API must have ingress policy"

    def test_model_service_internal_only(self, network_policies):
        """Verify model service is internal only."""
        if "model-service-network-policy" not in network_policies:
            pytest.skip("Model service network policy not defined")
        policy = network_policies["model-service-network-policy"]
        ingress = policy.get("spec", {}).get("ingress", [])
        # Model service should not have public ingress
        assert len(ingress) <= 1, \
            "Model service should have limited ingress"
        # Should only allow from worker pods
        for rule in ingress:
            for peer in rule.get("from", []):
                labels = peer.get("podSelector", {}).get("matchLabels", {})
                assert labels.get("component") == "worker", \
                    "Model service should only accept traffic from worker pods"

    def test_all_pods_have_policies(self, network_policies):
        """Verify all critical components have network policies."""
        required_policies = [
            "api-network-policy",
            "worker-network-policy",
            "dashboard-network-policy",
        ]
        for policy_name in required_policies:
            assert policy_name in network_policies, \
                f"Required network policy {policy_name} not found"


class TestDockerfileSecurity:
    """Test Dockerfile security configurations."""

    @pytest.fixture
    def dockerfiles(self):
        """Load Dockerfile content, keyed by path."""
        dockerfiles = {}
        for dockerfile_path in Path("services").rglob("Dockerfile"):
            with open(dockerfile_path) as f:
                # Key by full path: every file is named "Dockerfile", so
                # keying by name alone would keep only the last one found
                dockerfiles[str(dockerfile_path)] = f.read()
        return dockerfiles

    def test_no_root_user_in_dockerfile(self, dockerfiles):
        """Verify Dockerfiles don't use root user."""
        for name, content in dockerfiles.items():
            for line in content.splitlines():
                parts = line.split()
                if len(parts) >= 2 and parts[0].upper() == "USER":
                    # Compare the user token exactly so "nonroot" is accepted
                    user = parts[1].split(":")[0]
                    if user in ("root", "0"):
                        pytest.fail(f"{name}: Should not use root user")

    def test_no_latest_tag(self, dockerfiles):
        """Verify no 'latest' tag is used."""
        for name, content in dockerfiles.items():
            for line in content.splitlines():
                if line.strip().startswith("FROM"):
                    if ":latest" in line.lower():
                        pytest.fail(f"{name}: Should not use :latest tag")
```
Secrets Management
External Secrets
```yaml
# deployment/k8s/external-secrets.yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aegislm-vault
spec:
  provider:
    vault:
      server: "https://vault.example.com:8200"
      path: "secret"
      version: "v2"
      auth:
        kubernetes:
          mountPath: kubernetes
          # Vault's Kubernetes auth method also needs a `role` (and usually
          # a serviceAccountRef); set these to match your Vault configuration.
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: aegislm-secrets
  namespace: aegislm
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aegislm-vault
    kind: ClusterSecretStore
  target:
    name: aegislm-secrets
    creationPolicy: Owner
  data:
    - secretKey: DATABASE_URL
      remoteRef:
        key: aegislm/database
        property: url
    - secretKey: API_SECRET_KEY
      remoteRef:
        key: aegislm/api
        property: secret_key
```
Compliance Mapping
| Control | NIST 800-53 | ISO 27001 | PCI DSS |
|---|---|---|---|
| Non-root containers | AC-3 | A.9.1 | Req 7.1 |
| Read-only filesystem | AC-3 | A.9.1 | Req 7.1 |
| Drop capabilities | AC-3 | A.9.1 | - |
| Network segmentation | AC-4 | A.13.1 | Req 1.3 |
| Secrets management | IA-5 | A.9.4 | Req 3.4 |
Security Audit Commands
```bash
# Check container security context
kubectl get pods -n aegislm -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.securityContext.runAsNonRoot}{"\n"}{end}'

# Check network policies
kubectl get networkpolicies -n aegislm

# Check for privileged containers
kubectl get pods -n aegislm -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].securityContext.privileged}{"\n"}{end}'

# Run security tests
pytest tests/test_container_security.py -v
```
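The jsonpath queries above print raw values that still have to be read by eye. As a possible complement, the sketch below scans the parsed output of `kubectl get pods -n aegislm -o json` and reports only the violations. `find_violations` is a hypothetical helper; it covers just the two checks from the audit commands and does not inspect init containers.

```python
def find_violations(pod_list):
    """Return findings for privileged or possibly-root containers in a pod list."""
    findings = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        spec = pod.get("spec", {})
        pod_ctx = spec.get("securityContext", {})
        for container in spec.get("containers", []):
            ctx = container.get("securityContext", {})
            where = f"{pod_name}/{container['name']}"
            if ctx.get("privileged") is True:
                findings.append(f"{where}: privileged")
            # Container-level runAsNonRoot overrides the pod-level value.
            if ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot")) is not True:
                findings.append(f"{where}: runAsNonRoot not enforced")
    return findings
```

To use it against a live cluster, pipe the kubectl JSON output into a script that calls `find_violations(json.load(sys.stdin))` and prints each finding.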