feat(deploy): add AWS deployment scripts and configurations for AntiAtropos
- Add deploy/aws/deploy.sh to automate full AWS deployment on EKS
- Create eksctl cluster config with managed node groups and addons
- Add Kubernetes manifests for namespace, configmap, deployment, service,
secret, and optional ALB ingress
- Provide Prometheus agent Helm values for remote write to AMP workspace
- Include Grafana dashboard JSON placeholders for overview and live views
- Add IAM trust policy JSON for Grafana role setup
- Write comprehensive README.md with step-by-step AWS deployment guide,
instructions, and troubleshooting tips
- Enable IRSA setup for Prometheus and AntiAtropos service accounts
- Integrate building and pushing AntiAtropos Docker image to ECR
- Support ALB ingress with AWS Load Balancer Controller annotations
- Configure metrics scraping, liveness/readiness probes, and resource limits
feat(deploy/aws): add complete AWS deployment setup for AntiAtropos
- Add deploy script to create EKS cluster, AMP workspace, IAM roles, and deploy app
- Provide eksctl cluster config with managed node
- deploy/aws/README.md +365 -0
- deploy/aws/deploy.sh +208 -0
- deploy/aws/eksctl-cluster.yaml +65 -0
- deploy/aws/grafana-dashboard-live.json +37 -0
- deploy/aws/grafana-dashboard-overview.json +36 -0
- deploy/aws/grafana-trust-policy.json +17 -0
- deploy/aws/k8s-configmap.yaml +39 -0
- deploy/aws/k8s-deployment.yaml +82 -0
- deploy/aws/k8s-ingress.yaml +34 -0
- deploy/aws/k8s-namespace.yaml +7 -0
- deploy/aws/k8s-secret.yaml +18 -0
- deploy/aws/k8s-service.yaml +21 -0
- deploy/aws/prometheus-agent-values.yaml +86 -0
@@ -0,0 +1,365 @@
# AntiAtropos AWS Deployment Guide

Complete guide for deploying AntiAtropos on AWS with EKS, Amazon Managed Prometheus (AMP), and Amazon Managed Grafana (AMG).

## Architecture

```
AWS Region (us-east-1)
├── EKS Cluster
│   ├── AntiAtropos FastAPI pod
│   ├── Prometheus Agent pod (remote-writes to AMP)
│   └── Sample workload pods (optional, for live mode)
├── Amazon Managed Prometheus (AMP)
│   └── Workspace: antiatropos-metrics
├── Amazon Managed Grafana (AMG)
│   └── Workspace: antiatropos-dashboards
├── ALB (Application Load Balancer)
│   └── / → FastAPI, /grafana → AMG
└── ECR (Container Registry)
    └── antiatropos:latest
```

---

## Phase 0: Prerequisites

```bash
# Install CLI tools on Windows (if not already installed)
# AWS CLI v2
curl "https://awscli.amazonaws.com/AWSCLIV2.msi" -o "AWSCLIV2.msi"
msiexec /i AWSCLIV2.msi

# eksctl (EKS management)
choco install eksctl # or: winget install --id=FluxCD.eksctl

# kubectl
choco install kubernetes-cli

# Helm
choco install kubernetes-helm

# Authenticate AWS
aws configure
# Enter: Access Key ID, Secret Access Key, Region (us-east-1), Output (json)
```

---

## Phase 1: Create the EKS Cluster

### Option A: eksctl (recommended, fastest)

Review and adjust `deploy/aws/eksctl-cluster.yaml`, then run:

```bash
eksctl create cluster -f deploy/aws/eksctl-cluster.yaml
```

### Option B: AWS Console

1. Go to EKS → Create Cluster
2. Name: `antiatropos`, Kubernetes 1.30
3. Cluster service role: Create new (let EKS create it)
4. Networking: Default VPC, all AZs
5. Add node group: `linux-nodes`, t3.medium, 2-4 nodes
6. Create and wait ~15 minutes

### Verify

```bash
aws eks update-kubeconfig --name antiatropos --region us-east-1
kubectl get nodes
```

---

## Phase 2: Set Up Amazon Managed Prometheus (AMP)

### Create AMP Workspace

```bash
aws amp create-workspace \
  --alias antiatropos-metrics \
  --region us-east-1

# Note the workspace ARN and ID from the output
aws amp list-workspaces --alias antiatropos-metrics --region us-east-1
```
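
Rather than copying the ID by hand, the workspace ID can be captured in a shell variable and turned into the remote-write URL used in the next step. The following is a minimal sketch, assuming the `antiatropos-metrics` alias above; the helper names `amp_remote_write_url` and `lookup_amp_workspace_id` are illustrative and are not part of the repository.

```bash
# Hypothetical helper: build the AMP remote-write URL from a region and workspace ID.
amp_remote_write_url() {
  local region="$1" workspace_id="$2"
  printf 'https://aps-workspaces.%s.amazonaws.com/workspaces/%s/api/v1/remote_write\n' \
    "$region" "$workspace_id"
}

# Hypothetical helper: look up the workspace ID by alias.
# Requires AWS credentials, so it is only defined here, not called.
lookup_amp_workspace_id() {
  local region="$1"
  aws amp list-workspaces --alias antiatropos-metrics --region "$region" \
    --query 'workspaces[0].workspaceId' --output text
}
```

For example, `amp_remote_write_url us-east-1 "$(lookup_amp_workspace_id us-east-1)"` would print the URL to pass to Helm, assuming exactly one workspace carries that alias.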

### Install Prometheus Agent on EKS (remote-writes to AMP)

```bash
# Add the Prometheus Community Helm repo
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install the Prometheus agent with AMP remote write
# Replace WORKSPACE_ID with your AMP workspace ID
# (the prometheus-community/prometheus chart reads remote-write settings from server.remoteWrite)
helm install prometheus-agent prometheus-community/prometheus \
  --namespace monitoring --create-namespace \
  -f deploy/aws/prometheus-agent-values.yaml \
  --set "server.remoteWrite[0].url=https://aps-workspaces.us-east-1.amazonaws.com/workspaces/WORKSPACE_ID/api/v1/remote_write"
```

### Verify AMP is Receiving Data

```bash
# Confirm the workspace exists and is ACTIVE
aws amp describe-workspace --workspace-id WORKSPACE_ID --region us-east-1

# Run an instant query with awscurl (it signs the request with SigV4)
pip install awscurl
awscurl --service aps "https://aps-workspaces.us-east-1.amazonaws.com/workspaces/WORKSPACE_ID/api/v1/query?query=up" --region us-east-1
```

---

## Phase 3: Set Up Amazon Managed Grafana (AMG)

### Create AMG Workspace

```bash
# First, create the IAM role for Grafana (allows it to read AMP)
aws iam create-role \
  --role-name AntiAtroposGrafanaRole \
  --assume-role-policy-document file://deploy/aws/grafana-trust-policy.json

aws iam attach-role-policy \
  --role-name AntiAtroposGrafanaRole \
  --policy-arn arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess

aws iam attach-role-policy \
  --role-name AntiAtroposGrafanaRole \
  --policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess

# Create the Grafana workspace
aws grafana create-workspace \
  --workspace-name antiatropos-dashboards \
  --account-access-type CURRENT_ACCOUNT \
  --authentication-method AWS_SSO \
  --permission-type SERVICE_MANAGED \
  --data-sources PROMETHEUS \
  --region us-east-1

# Note the workspace URL from the output
aws grafana list-workspaces --region us-east-1
```

### Add AMP as a Data Source in AMG

1. Open the AMG workspace URL in your browser
2. Sign in with AWS SSO
3. Go to Configuration → Data Sources
4. The AMP workspace should be auto-discovered if it is in the same account and region
5. Select the `antiatropos-metrics` workspace

### Import AntiAtropos Dashboards

```bash
# Use the Grafana API to import dashboards
# Replace GRAFANA_URL and API_KEY
GRAFANA_URL="https://YOUR-WORKSPACE-ID.grafana-workspace.us-east-1.amazonaws.com"
API_KEY="YOUR-API-KEY"

# Import the overview dashboard
curl -X POST "$GRAFANA_URL/api/dashboards/db" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d @deploy/aws/grafana-dashboard-overview.json

# Import the live dashboard
curl -X POST "$GRAFANA_URL/api/dashboards/db" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d @deploy/aws/grafana-dashboard-live.json
```

---

## Phase 4: Build and Push the Docker Image

```bash
# Create ECR repository
aws ecr create-repository \
  --repository-name antiatropos \
  --region us-east-1

# Login to ECR
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin \
  "$(aws sts get-caller-identity --query Account --output text).dkr.ecr.us-east-1.amazonaws.com"

# Build and push
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
ECR_URI="$ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/antiatropos"

docker build -t antiatropos:latest .
docker tag antiatropos:latest "$ECR_URI:latest"
docker push "$ECR_URI:latest"
```

---

## Phase 5: Deploy AntiAtropos to EKS

```bash
# Apply Kubernetes manifests
kubectl apply -f deploy/aws/k8s-namespace.yaml
kubectl apply -f deploy/aws/k8s-configmap.yaml
kubectl apply -f deploy/aws/k8s-deployment.yaml
kubectl apply -f deploy/aws/k8s-service.yaml

# If using AWS Load Balancer Controller (recommended)
kubectl apply -f deploy/aws/k8s-ingress.yaml

# Check rollout
kubectl rollout status deployment/antiatropos -n antiatropos
kubectl get pods -n antiatropos
kubectl logs -f deployment/antiatropos -n antiatropos
```

### Environment Variables for Live Mode

The deployment manifest sets these to connect AntiAtropos to real infrastructure:

```yaml
env:
  - name: ANTIATROPOS_ENV_MODE
    value: "live"
  - name: PROMETHEUS_URL
    value: "https://aps-workspaces.us-east-1.amazonaws.com/workspaces/WORKSPACE_ID"
  - name: KUBECONFIG
    value: "" # Empty = use in-cluster config
  - name: ANTIATROPOS_WORKLOAD_MAP
    value: '{"node-0":{"deployment":"payments","namespace":"prod-sre"},"node-1":{"deployment":"checkout","namespace":"prod-sre"}}'
```

---

## Phase 6: Access Your Deployment

### Get the ALB URL

```bash
kubectl get ingress -n antiatropos
# Copy the ADDRESS column
```
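
Instead of copying the ADDRESS column by hand, the ALB hostname can be read with a JSONPath query and combined with the paths listed under Endpoints. This is a sketch only: it assumes the `antiatropos` namespace contains a single ingress, and the helper names `alb_hostname` and `endpoint_url` are hypothetical.

```bash
# Hypothetical helper: read the ALB hostname from the first ingress in the namespace.
# Requires cluster access, so it is only defined here, not called.
alb_hostname() {
  kubectl get ingress -n antiatropos \
    -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}'
}

# Hypothetical helper: build an endpoint URL from a host and an optional path.
endpoint_url() {
  local host="$1" path="${2:-/}"
  printf 'http://%s%s\n' "$host" "$path"
}
```

With cluster access, `curl -fsS "$(endpoint_url "$(alb_hostname)" /health)"` would then check the health endpoint. The hostname stays empty until the load balancer has finished provisioning.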

### Endpoints

| Endpoint | URL |
|---|---|
| Landing Page | `http://ALB_ADDRESS/` |
| API Health | `http://ALB_ADDRESS/health` |
| Prometheus Metrics | `http://ALB_ADDRESS/metrics` |
| Grafana Dashboards | AMG workspace URL (separate) |

### Port-Forward for Local Debugging

```bash
# FastAPI
kubectl port-forward -n antiatropos deployment/antiatropos 8000:8000

# In a second terminal: fetch metrics directly from the pod
curl http://localhost:8000/metrics
```

---

## Phase 7: IRSA (IAM Roles for Service Accounts)

This lets the AntiAtropos pod authenticate with AMP without hardcoded credentials.

```bash
# Create OIDC provider for the EKS cluster
eksctl utils associate-iam-oidc-provider \
  --cluster antiatropos --region us-east-1 --approve

# Create IAM role for the AntiAtropos service account
eksctl create iamserviceaccount \
  --cluster antiatropos \
  --namespace antiatropos \
  --name antiatropos-sa \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess \
  --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
  --approve \
  --override-existing-serviceaccounts

# Redeploy to pick up the new service account
kubectl rollout restart deployment/antiatropos -n antiatropos
```
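
To confirm that IRSA took effect, the service account should carry an `eks.amazonaws.com/role-arn` annotation. The sketch below reads that annotation and applies a loose shape check; the helper names `irsa_role_arn` and `is_role_arn` are hypothetical, and the check assumes the deployment actually runs under `antiatropos-sa`.

```bash
# Hypothetical helper: print the IAM role ARN annotated on a service account.
# Requires cluster access, so it is only defined here, not called.
irsa_role_arn() {
  local namespace="$1" service_account="$2"
  kubectl get serviceaccount "$service_account" -n "$namespace" \
    -o jsonpath='{.metadata.annotations.eks\.amazonaws\.com/role-arn}'
}

# Hypothetical helper: succeed only if the value looks like an IAM role ARN.
is_role_arn() {
  case "$1" in
    arn:aws*:iam::*:role/?*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_role_arn "$(irsa_role_arn antiatropos antiatropos-sa)" || echo "IRSA annotation missing"` flags a service account that was created without a role.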

---

## Cost Estimates

| Resource | Config | Monthly Cost (approx) |
|---|---|---|
| EKS Control Plane | 1 cluster | $73 |
| EKS Nodes | 2x t3.medium | $60 |
| AMP | <10GB ingest | ~$3-5 |
| AMG | 1 editor + viewers | Free tier or ~$9 |
| ALB | 1 load balancer | $16 |
| ECR | <1GB storage | <$1 |
| **Total** | | **~$150-160/month** |
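
The total can be cross-checked by summing the rows. The sketch below uses only the figures from the table, treating ECR as its $1 upper bound and AMG as either free or $9; the variable names are illustrative.

```bash
# Monthly figures copied from the table above (USD).
eks_control_plane=73
eks_nodes=60
alb=16
ecr=1                      # upper bound of "<$1"
amp_low=3; amp_high=5      # "~$3-5"
amg_low=0; amg_high=9      # free tier vs. one paid editor

# Sum the fixed rows with the low and high ends of the ranged rows.
total_low=$((eks_control_plane + eks_nodes + alb + ecr + amp_low + amg_low))
total_high=$((eks_control_plane + eks_nodes + alb + ecr + amp_high + amg_high))
echo "Estimated monthly total: \$${total_low}-\$${total_high}"
```

Under these assumptions the rows add up to $153 at the low end and $164 at the high end, close to the rounded total in the table.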

### Cost-Saving Tips

- Use Spot capacity for node groups (`spot: true` in the eksctl config; typically 60-70% cheaper)
- Scale the node group to 0 when not training, e.g. `eksctl scale nodegroup --cluster antiatropos --name linux-nodes --nodes 0 --nodes-min 0` (`kubectl cordon` + drain only evicts pods; the instances keep running)
- Use Fargate profiles for the AntiAtropos pod (pay-per-pod-second)
- Delete the cluster between training runs with `eksctl delete cluster`

---

## Teardown

```bash
# Delete everything in reverse order
kubectl delete -f deploy/aws/k8s-ingress.yaml
kubectl delete -f deploy/aws/k8s-service.yaml
kubectl delete -f deploy/aws/k8s-deployment.yaml
kubectl delete -f deploy/aws/k8s-configmap.yaml
kubectl delete -f deploy/aws/k8s-namespace.yaml

aws grafana delete-workspace --workspace-id AMG_WORKSPACE_ID
aws amp delete-workspace --workspace-id AMP_WORKSPACE_ID
aws ecr delete-repository --repository-name antiatropos --force

eksctl delete cluster --name antiatropos --region us-east-1
```

---

## Troubleshooting

### Pods not starting
```bash
kubectl describe pod -n antiatropos -l app=antiatropos
kubectl logs -n antiatropos -l app=antiatropos --previous
```

### AMP not receiving metrics
```bash
# Check the prometheus agent logs
kubectl logs -n monitoring -l app.kubernetes.io/name=prometheus

# Verify remote-write endpoint
aws amp describe-workspace --workspace-id WORKSPACE_ID
```

### Can't reach AMP from pod
```bash
# Verify IRSA is attached
kubectl get pod -n antiatropos -o yaml | grep -A5 serviceAccount

# Check pod can reach AMP (this request is unsigned, so an HTTP 403 still confirms network reachability)
kubectl exec -n antiatropos deployment/antiatropos -- \
  curl -s "https://aps-workspaces.us-east-1.amazonaws.com/workspaces/WORKSPACE_ID/api/v1/query?query=up"
```

### Grafana dashboard shows no data
1. Verify the data source URL in AMG points to the correct AMP workspace
2. Check time range (AMP has a retention period; default 150 days)
3. Verify the PromQL queries in dashboards match your metric names
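
For step 3, the metric names actually stored in AMP can be listed through the Prometheus-compatible label-values endpoint and compared with the names used in the dashboard queries. This is a sketch under the assumption that `awscurl` is installed (as in Phase 2); the helper names `metric_names_url` and `list_amp_metric_names` are hypothetical.

```bash
# Hypothetical helper: build the label-values URL that lists all metric names.
metric_names_url() {
  local region="$1" workspace_id="$2"
  printf 'https://aps-workspaces.%s.amazonaws.com/workspaces/%s/api/v1/label/__name__/values\n' \
    "$region" "$workspace_id"
}

# Hypothetical helper: fetch the metric names with a SigV4-signed request.
# Requires awscurl and AWS credentials, so it is only defined here, not called.
list_amp_metric_names() {
  local region="$1" workspace_id="$2"
  awscurl --service aps --region "$region" "$(metric_names_url "$region" "$workspace_id")"
}
```

Running `list_amp_metric_names us-east-1 WORKSPACE_ID` returns a JSON list of names; a dashboard query that references a name missing from that list will render as "No data".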
@@ -0,0 +1,208 @@
#!/usr/bin/env bash
# AntiAtropos AWS Quick Deploy Script
#
# Prerequisites: aws cli, eksctl, kubectl, helm, docker
#
# Usage:
#   chmod +x deploy/aws/deploy.sh
#   ./deploy/aws/deploy.sh
#
# This script creates all AWS resources and deploys AntiAtropos to EKS.
# Set these environment variables before running:
#   OPENAI_API_KEY - Your OpenAI API key (required)
#   AWS_REGION     - AWS region (default: us-east-1)
#   CLUSTER_NAME   - EKS cluster name (default: antiatropos)

set -euo pipefail

REGION="${AWS_REGION:-us-east-1}"
CLUSTER_NAME="${CLUSTER_NAME:-antiatropos}"
AWS_DIR="$(cd "$(dirname "$0")" && pwd)"

echo "=== AntiAtropos AWS Deployment ==="
echo "Region: $REGION"
echo "Cluster: $CLUSTER_NAME"
echo ""

# --- Check prerequisites ---
for cmd in aws eksctl kubectl helm docker; do
  if ! command -v "$cmd" &>/dev/null; then
    echo "ERROR: $cmd is not installed. Please install it first."
    exit 1
  fi
done

if [ -z "${OPENAI_API_KEY:-}" ]; then
  echo "ERROR: OPENAI_API_KEY environment variable is not set."
  exit 1
fi

# --- Phase 1: Create EKS Cluster ---
echo ""
echo ">>> Phase 1: Creating EKS cluster..."
if eksctl get cluster --name "$CLUSTER_NAME" --region "$REGION" &>/dev/null; then
  echo "Cluster $CLUSTER_NAME already exists, skipping creation."
else
  eksctl create cluster -f "$AWS_DIR/eksctl-cluster.yaml"
  echo "Cluster created."
fi

aws eks update-kubeconfig --name "$CLUSTER_NAME" --region "$REGION"
echo "kubeconfig updated."

# --- Phase 2: Create AMP Workspace ---
echo ""
echo ">>> Phase 2: Creating Amazon Managed Prometheus workspace..."
AMP_WS_ID=$(aws amp list-workspaces --alias antiatropos-metrics --region "$REGION" --query 'workspaces[0].workspaceId' --output text 2>/dev/null || echo "")

if [ -z "$AMP_WS_ID" ] || [ "$AMP_WS_ID" = "None" ]; then
  AMP_WS_ID=$(aws amp create-workspace \
    --alias antiatropos-metrics \
    --region "$REGION" \
    --query 'workspaceId' \
    --output text)
  echo "AMP workspace created: $AMP_WS_ID"
else
  echo "AMP workspace already exists: $AMP_WS_ID"
fi

AMP_URL="https://aps-workspaces.$REGION.amazonaws.com/workspaces/$AMP_WS_ID"
echo "AMP URL: $AMP_URL"

# --- Phase 3: Set up IAM Roles for Service Accounts (IRSA) ---
echo ""
echo ">>> Phase 3: Setting up IRSA..."
CLUSTER_OIDC=$(aws eks describe-cluster --name "$CLUSTER_NAME" --region "$REGION" --query 'cluster.identity.oidc.issuer' --output text | sed 's|https://||')
ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)

# Prometheus service account
if kubectl get serviceaccount prometheus-sa -n monitoring &>/dev/null; then
  echo "prometheus-sa already exists."
else
  eksctl create iamserviceaccount \
    --cluster "$CLUSTER_NAME" \
    --namespace monitoring \
    --name prometheus-sa \
    --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
    --approve \
    --override-existing-serviceaccounts
  echo "prometheus-sa created."
fi

# AntiAtropos service account
if kubectl get serviceaccount antiatropos-sa -n antiatropos &>/dev/null; then
  echo "antiatropos-sa already exists."
else
  eksctl create iamserviceaccount \
    --cluster "$CLUSTER_NAME" \
    --namespace antiatropos \
    --name antiatropos-sa \
    --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusQueryAccess \
    --approve \
    --override-existing-serviceaccounts
  echo "antiatropos-sa created."
fi

# --- Phase 4: Install Prometheus Agent ---
echo ""
echo ">>> Phase 4: Installing Prometheus Agent (remote-writes to AMP)..."
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts 2>/dev/null || true
helm repo update

# Note: the prometheus-community/prometheus chart reads remote-write settings from server.remoteWrite.
if helm status prometheus-agent -n monitoring &>/dev/null; then
  echo "prometheus-agent already installed, upgrading..."
  helm upgrade prometheus-agent prometheus-community/prometheus \
    --namespace monitoring \
    -f "$AWS_DIR/prometheus-agent-values.yaml" \
    --set "server.remoteWrite[0].url=$AMP_URL/api/v1/remote_write"
else
  helm install prometheus-agent prometheus-community/prometheus \
    --namespace monitoring --create-namespace \
    -f "$AWS_DIR/prometheus-agent-values.yaml" \
    --set "server.remoteWrite[0].url=$AMP_URL/api/v1/remote_write"
  echo "prometheus-agent installed."
fi

# --- Phase 5: Build and Push Docker Image ---
echo ""
echo ">>> Phase 5: Building and pushing Docker image to ECR..."
ECR_URI="$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com/antiatropos"

if aws ecr describe-repositories --repository-name antiatropos --region "$REGION" &>/dev/null; then
  echo "ECR repository already exists."
else
  aws ecr create-repository --repository-name antiatropos --region "$REGION"
  echo "ECR repository created."
fi

aws ecr get-login-password --region "$REGION" | \
  docker login --username AWS --password-stdin \
  "$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com"

docker build -t antiatropos:latest "$AWS_DIR/../.."
docker tag antiatropos:latest "$ECR_URI:latest"
docker push "$ECR_URI:latest"
echo "Image pushed to $ECR_URI:latest"

# --- Phase 6: Deploy AntiAtropos ---
echo ""
echo ">>> Phase 6: Deploying AntiAtropos to EKS..."

# Apply namespace first
kubectl apply -f "$AWS_DIR/k8s-namespace.yaml"

# Create/update configmap with the real AMP URL
kubectl create configmap antiatropos-config \
  --from-literal=ANTIATROPOS_ENV_MODE=live \
  --from-literal=ANTIATROPOS_STRICT_REAL=false \
  --from-literal="PROMETHEUS_URL=$AMP_URL" \
  --from-literal=KUBECONFIG="" \
  --from-literal=ANTIATROPOS_K8S_NAMESPACE=default \
  --from-literal=ANTIATROPOS_DEPLOYMENT_PREFIX="" \
  --from-literal=ANTIATROPOS_MIN_REPLICAS=1 \
  --from-literal=ANTIATROPOS_MAX_REPLICAS=20 \
  --from-literal=ANTIATROPOS_SCALE_STEP=3 \
  --from-literal=ANTIATROPOS_WORKLOAD_MAP='{}' \
  --from-literal=ANTIATROPOS_NODE_DEPLOYMENT_MAP='{}' \
  --from-literal=ANTIATROPOS_PROM_TIMEOUT_S=5.0 \
  --from-literal=ANTIATROPOS_METRIC_AGGREGATION=sum \
  -n antiatropos \
  --dry-run=client -o yaml | kubectl apply -f -

# Create secret with OpenAI API key
kubectl create secret generic antiatropos-secrets \
  --from-literal=openai-api-key="$OPENAI_API_KEY" \
  -n antiatropos \
  --dry-run=client -o yaml | kubectl apply -f -

# Apply deployment with the correct ECR image
sed "s|ACCOUNT_ID|$ACCOUNT_ID|g; s|us-east-1|$REGION|g" "$AWS_DIR/k8s-deployment.yaml" | kubectl apply -f -
kubectl apply -f "$AWS_DIR/k8s-service.yaml"

# Wait for rollout
echo "Waiting for deployment to be ready..."
kubectl rollout status deployment/antiatropos -n antiatropos --timeout=300s

# --- Done ---
echo ""
echo "=========================================="
echo " AntiAtropos AWS Deployment Complete!"
echo "=========================================="
echo ""
echo "AMP Workspace ID: $AMP_WS_ID"
echo "AMP URL: $AMP_URL"
echo ""
echo "Next steps:"
echo "  1. Create an Amazon Managed Grafana workspace:"
echo "     aws grafana create-workspace --workspace-name antiatropos-dashboards \\"
echo "       --account-access-type CURRENT_ACCOUNT --authentication-method AWS_SSO \\"
echo "       --permission-type SERVICE_MANAGED --data-sources PROMETHEUS --region $REGION"
echo ""
echo "  2. Add AMP as a data source in AMG and import dashboards from:"
echo "     deploy/grafana/provisioning/dashboards/json/"
echo ""
echo "  3. Get the AntiAtropos service URL:"
echo "     kubectl get svc -n antiatropos antiatropos"
echo ""
echo "  4. Port-forward for local testing:"
echo "     kubectl port-forward -n antiatropos deployment/antiatropos 8000:8000"
@@ -0,0 +1,65 @@
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: antiatropos
  region: us-east-1
  version: "1.30"
  tags:
    Project: AntiAtropos
    Environment: production

iam:
  withOIDC: true

addons:
  - name: vpc-cni
    version: latest
  - name: coredns
    version: latest
  - name: kube-proxy
    version: latest
  - name: aws-ebs-csi-driver
    version: latest
    wellKnownPolicies:
      ebsCSIController: true

managedNodeGroups:
  - name: linux-nodes
    instanceType: t3.medium
    desiredCapacity: 2
    minSize: 1
    maxSize: 4
    volumeSize: 50
    volumeType: gp3
    labels:
      role: worker
    tags:
      Project: AntiAtropos
      NodeGroup: linux-nodes
    iam:
      withAddonPolicies:
        ebs: true
        cloudWatch: true
        autoScaler: true

  # Optional: Spot instance group for cost savings
  # - name: spot-nodes
  #   instanceType: t3.medium
  #   desiredCapacity: 1
  #   minSize: 0
  #   maxSize: 3
  #   volumeSize: 30
  #   volumeType: gp3
  #   spot: true
  #   labels:
  #     role: spot-worker
  #   tags:
  #     Project: AntiAtropos

cloudWatch:
  clusterLogging:
    enableTypes:
      - api
      - audit
      - authenticator
@@ -0,0 +1,37 @@
| 1 |
+
{
|
| 2 |
+
"dashboard": {
|
| 3 |
+
"__inputs": [],
|
| 4 |
+
"__requires": [],
|
| 5 |
+
"annotations": {
|
| 6 |
+
"list": []
|
| 7 |
+
},
|
| 8 |
+
"description": "AntiAtropos SRE Environment Live Dashboard - AWS Deployment. Import the full dashboard from deploy/grafana/provisioning/dashboards/json/antiatropos-live.json via the AMG UI (Dashboards > Import > Upload JSON file). After import, change the data source from the local Prometheus (UID: PBFA97CFB590B2093) to your AMP workspace data source.",
|
| 9 |
+
"editable": true,
|
| 10 |
+
"fiscalYearStartMonth": 0,
|
| 11 |
+
"graphTooltip": 1,
|
| 12 |
+
"id": null,
|
| 13 |
+
"links": [],
|
| 14 |
+
"panels": [
|
| 15 |
+
{
|
| 16 |
+
"title": "Import the full dashboard",
|
| 17 |
+
"type": "text",
|
| 18 |
+
"gridPos": {"h": 4, "w": 24, "x": 0, "y": 0},
|
| 19 |
+
"options": {
|
| 20 |
+
"mode": "markdown",
|
| 21 |
+
"content": "## Import Instructions\n\nThis is a placeholder panel. Import the full dashboard from:\n\n```\ndeploy/grafana/provisioning/dashboards/json/antiatropos-live.json\n```\n\nvia the AMG UI: **Dashboards > Import > Upload JSON file**\n\nAfter import, change the data source from the local Prometheus (UID: `PBFA97CFB590B2093`) to your AMP workspace data source."
|
| 22 |
+
}
|
| 23 |
+
}
|
| 24 |
+
],
|
| 25 |
+
"schemaVersion": 39,
|
| 26 |
+
"tags": ["antiatropos", "sre", "aws", "live"],
|
| 27 |
+
"templating": {"list": []},
|
| 28 |
+
"time": {"from": "now-5m", "to": "now"},
|
| 29 |
+
"timepicker": {},
|
| 30 |
+
"timezone": "utc",
|
| 31 |
+
"title": "AntiAtropos Live (AWS)",
|
| 32 |
+
"uid": "antiatropos-live-aws",
|
| 33 |
+
"version": 0,
|
| 34 |
+
"refresh": "5s"
|
| 35 |
+
},
|
| 36 |
+
"overwrite": true
|
| 37 |
+
}
|
|
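The placeholder panel asks for the data source to be switched from the local Prometheus UID to the AMP workspace data source after import. A minimal sketch of doing that swap on the dashboard JSON before upload, assuming panels and targets reference the data source through `"datasource": {"uid": ...}` objects. The helper `retarget_datasource`, the sample panels, and the UID `amp-workspace-uid` are hypothetical and not part of this commit.

```python
import json

LOCAL_UID = "PBFA97CFB590B2093"  # local Prometheus UID named in the dashboard description


def retarget_datasource(node, new_uid, old_uid=LOCAL_UID):
    """Recursively replace datasource UIDs equal to old_uid with new_uid (in place)."""
    if isinstance(node, dict):
        ds = node.get("datasource")
        if isinstance(ds, dict) and ds.get("uid") == old_uid:
            ds["uid"] = new_uid
        for value in node.values():
            retarget_datasource(value, new_uid, old_uid)
    elif isinstance(node, list):
        for item in node:
            retarget_datasource(item, new_uid, old_uid)
    return node


# Tiny stand-in for the full dashboard JSON (illustrative only).
dashboard = {
    "panels": [
        {
            "title": "Example panel",
            "datasource": {"type": "prometheus", "uid": LOCAL_UID},
            "targets": [
                {"datasource": {"type": "prometheus", "uid": LOCAL_UID}, "expr": "up"}
            ],
        },
        {"title": "Notes", "type": "text"},
    ]
}

# "amp-workspace-uid" is a hypothetical value; use the UID AMG assigns to your AMP data source.
updated = retarget_datasource(dashboard, "amp-workspace-uid")
print(json.dumps(updated, indent=2))
```

Rewriting the file up front avoids editing each panel by hand in the AMG UI; panels without a `datasource` key, such as text panels, are left untouched.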
@@ -0,0 +1,36 @@
+{
+  "dashboard": {
+    "__inputs": [],
+    "__requires": [],
+    "annotations": {
+      "list": []
+    },
+    "description": "AntiAtropos SRE Environment Overview - AWS Deployment. Import the full dashboard from deploy/grafana/provisioning/dashboards/json/antiatropos-overview.json via the AMG UI (Dashboards > Import > Upload JSON file). After import, change the data source from the local Prometheus (UID: PBFA97CFB590B2093) to your AMP workspace data source.",
+    "editable": true,
+    "fiscalYearStartMonth": 0,
+    "graphTooltip": 1,
+    "id": null,
+    "links": [],
+    "panels": [
+      {
+        "title": "Import the full dashboard",
+        "type": "text",
+        "gridPos": {"h": 4, "w": 24, "x": 0, "y": 0},
+        "options": {
+          "mode": "markdown",
+          "content": "## Import Instructions\n\nThis is a placeholder panel. Import the full dashboard from:\n\n```\ndeploy/grafana/provisioning/dashboards/json/antiatropos-overview.json\n```\n\nvia the AMG UI: **Dashboards > Import > Upload JSON file**\n\nAfter import, change the data source from the local Prometheus (UID: `PBFA97CFB590B2093`) to your AMP workspace data source."
+        }
+      }
+    ],
+    "schemaVersion": 39,
+    "tags": ["antiatropos", "sre", "aws"],
+    "templating": {"list": []},
+    "time": {"from": "now-1h", "to": "now"},
+    "timepicker": {},
+    "timezone": "utc",
+    "title": "AntiAtropos Overview (AWS)",
+    "uid": "antiatropos-overview-aws",
+    "version": 0
+  },
+  "overwrite": true
+}
@@ -0,0 +1,17 @@
+{
+  "Version": "2012-10-17",
+  "Statement": [
+    {
+      "Effect": "Allow",
+      "Principal": {
+        "Service": "grafana.amazonaws.com"
+      },
+      "Action": "sts:AssumeRole",
+      "Condition": {
+        "StringEquals": {
+          "aws:SourceAccount": "YOUR_ACCOUNT_ID"
+        }
+      }
+    }
+  ]
+}
@@ -0,0 +1,39 @@
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: antiatropos-config
+  namespace: antiatropos
+data:
+  # Environment mode: "simulated" for mock, "live" for real K8s + Prometheus
+  ANTIATROPOS_ENV_MODE: "live"
+  ANTIATROPOS_STRICT_REAL: "false"
+
+  # Prometheus URL: replace WORKSPACE_ID with your AMP workspace ID
+  # The pod uses IRSA to authenticate; no API key needed
+  PROMETHEUS_URL: "https://aps-workspaces.us-east-1.amazonaws.com/workspaces/WORKSPACE_ID"
+
+  # Kubernetes executor config: empty KUBECONFIG = in-cluster config
+  KUBECONFIG: ""
+  ANTIATROPOS_K8S_NAMESPACE: "default"
+  ANTIATROPOS_DEPLOYMENT_PREFIX: ""
+  ANTIATROPOS_MIN_REPLICAS: "1"
+  ANTIATROPOS_MAX_REPLICAS: "20"
+  ANTIATROPOS_SCALE_STEP: "3"
+
+  # Workload mapping: maps simulator nodes to real K8s deployments
+  # Customize this to match your actual deployments
+  ANTIATROPOS_WORKLOAD_MAP: |
+    {
+      "node-0": {"deployment": "payments", "namespace": "prod-sre"},
+      "node-1": {"deployment": "checkout", "namespace": "prod-sre"},
+      "node-2": {"deployment": "catalog", "namespace": "prod-sre"},
+      "node-3": {"deployment": "cart", "namespace": "prod-sre"},
+      "node-4": {"deployment": "auth", "namespace": "prod-sre"}
+    }
+  ANTIATROPOS_NODE_DEPLOYMENT_MAP: "{}"
+
+  # Prometheus query timeout
+  ANTIATROPOS_PROM_TIMEOUT_S: "5.0"
+
+  # Metric aggregation strategy
+  ANTIATROPOS_METRIC_AGGREGATION: "sum"
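The `ANTIATROPOS_WORKLOAD_MAP` value in k8s-configmap.yaml is a JSON object keyed by simulator node ID. A minimal sketch of how such a value could be resolved to a Kubernetes target. How AntiAtropos itself consumes the variable is not shown in this commit, so `resolve_target` and its fallback to the `ANTIATROPOS_K8S_NAMESPACE` default are assumptions made for illustration.

```python
import json

# Value of ANTIATROPOS_WORKLOAD_MAP as it appears in k8s-configmap.yaml.
WORKLOAD_MAP_JSON = """
{
  "node-0": {"deployment": "payments", "namespace": "prod-sre"},
  "node-1": {"deployment": "checkout", "namespace": "prod-sre"},
  "node-2": {"deployment": "catalog", "namespace": "prod-sre"},
  "node-3": {"deployment": "cart", "namespace": "prod-sre"},
  "node-4": {"deployment": "auth", "namespace": "prod-sre"}
}
"""

DEFAULT_NAMESPACE = "default"  # mirrors ANTIATROPOS_K8S_NAMESPACE in the ConfigMap


def resolve_target(node_id, raw_map=WORKLOAD_MAP_JSON, default_namespace=DEFAULT_NAMESPACE):
    """Return (namespace, deployment) for a simulator node, or None if unmapped.

    Hypothetical helper: the namespace fallback is an assumption, not
    behavior confirmed by this commit.
    """
    entry = json.loads(raw_map).get(node_id)
    if entry is None:
        return None
    return entry.get("namespace", default_namespace), entry["deployment"]


for node in ("node-0", "node-4", "node-9"):
    print(node, "->", resolve_target(node))
```

Parsing the map once at startup and failing fast on malformed JSON would surface ConfigMap typos before any scaling action is attempted.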
@@ -0,0 +1,82 @@
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: antiatropos
+  namespace: antiatropos
+  labels:
+    app: antiatropos
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: antiatropos
+  template:
+    metadata:
+      labels:
+        app: antiatropos
+      annotations:
+        prometheus.io/scrape: "true"
+        prometheus.io/port: "8000"
+        prometheus.io/path: "/metrics"
+    spec:
+      serviceAccountName: antiatropos-sa
+      containers:
+        - name: antiatropos
+          # Replace ACCOUNT_ID with your AWS account ID
+          image: ACCOUNT_ID.dkr.ecr.us-east-1.amazonaws.com/antiatropos:latest
+          ports:
+            - containerPort: 8000
+              name: http
+              protocol: TCP
+          envFrom:
+            - configMapRef:
+                name: antiatropos-config
+          env:
+            # Secrets should come from AWS Secrets Manager or Kubernetes Secrets
+            - name: OPENAI_API_KEY
+              valueFrom:
+                secretKeyRef:
+                  name: antiatropos-secrets
+                  key: openai-api-key
+            - name: ANTIATROPOS_TASK
+              value: "task-1"
+          resources:
+            requests:
+              cpu: 500m
+              memory: 512Mi
+            limits:
+              cpu: 2000m
+              memory: 2Gi
+          livenessProbe:
+            httpGet:
+              path: /health
+              port: 8000
+            initialDelaySeconds: 15
+            periodSeconds: 30
+            timeoutSeconds: 5
+            failureThreshold: 3
+          readinessProbe:
+            httpGet:
+              path: /health
+              port: 8000
+            initialDelaySeconds: 10
+            periodSeconds: 10
+            timeoutSeconds: 3
+            failureThreshold: 3
+      # Allow scheduling on spot instances too
+      tolerations:
+        - key: "spot"
+          operator: "Equal"
+          value: "true"
+          effect: "NoSchedule"
+      affinity:
+        nodeAffinity:
+          preferredDuringSchedulingIgnoredDuringExecution:
+            - weight: 50
+              preference:
+                matchExpressions:
+                  - key: role
+                    operator: In
+                    values:
+                      - worker
+                      - spot-worker
@@ -0,0 +1,34 @@
+# ALB Ingress: requires AWS Load Balancer Controller installed on the cluster
+# Install it first: https://docs.aws.amazon.com/eks/latest/userguide/aws-load-balancer-controller.html
+#
+# If you prefer a simpler setup, just use the k8s-service.yaml LoadBalancer
+# and skip this Ingress entirely.
+
+apiVersion: networking.k8s.io/v1
+kind: Ingress
+metadata:
+  name: antiatropos-ingress
+  namespace: antiatropos
+  annotations:
+    alb.ingress.kubernetes.io/scheme: internet-facing
+    alb.ingress.kubernetes.io/target-type: ip
+    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}]'
+    # Uncomment for HTTPS (requires ACM certificate):
+    # alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
+    # alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:ACCOUNT_ID:certificate/CERT_ID
+    alb.ingress.kubernetes.io/healthcheck-path: /health
+    alb.ingress.kubernetes.io/healthcheck-interval-seconds: "30"
+    alb.ingress.kubernetes.io/success-codes: "200"
+    alb.ingress.kubernetes.io/tags: Project=AntiAtropos
+spec:
+  ingressClassName: alb
+  rules:
+    - http:
+        paths:
+          - path: /
+            pathType: Prefix
+            backend:
+              service:
+                name: antiatropos
+                port:
+                  number: 80
@@ -0,0 +1,7 @@
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: antiatropos
+  labels:
+    app.kubernetes.io/name: antiatropos
+    app.kubernetes.io/part-of: antiatropos
@@ -0,0 +1,18 @@
+# Create this secret before deploying AntiAtropos
+# Replace the value with your actual OpenAI API key (base64-encoded)
+#
+# To encode your key:
+#   echo -n 'sk-your-key-here' | base64
+#
+# Then apply:
+#   kubectl apply -f k8s-secret.yaml
+
+apiVersion: v1
+kind: Secret
+metadata:
+  name: antiatropos-secrets
+  namespace: antiatropos
+type: Opaque
+data:
+  # base64-encoded OpenAI API key
+  openai-api-key: U0tZT1VSX09QRU5BSV9BUElfS0VZX0hFUkU=  # placeholder, replace!
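The `data` field of a Kubernetes Secret holds base64-encoded values, which is what the `echo -n ... | base64` comment in k8s-secret.yaml produces. A minimal Python sketch of the same encoding, plus a decode step to check what a value actually contains. The helper names are hypothetical; only the placeholder string comes from the manifest.

```python
import base64


def encode_secret_value(plaintext: str) -> str:
    """Base64-encode a value without a trailing newline, like `echo -n '...' | base64`."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Decode a value taken from a Secret's `data` field."""
    return base64.b64decode(encoded).decode("utf-8")


# Placeholder value committed in k8s-secret.yaml.
PLACEHOLDER = "U0tZT1VSX09QRU5BSV9BUElfS0VZX0hFUkU="

print(decode_secret_value(PLACEHOLDER))         # shows what the placeholder stands for
print(encode_secret_value("sk-your-key-here"))  # value to paste under data.openai-api-key
```

Base64 is an encoding, not encryption, so the decoded placeholder is readable by anyone with access to the manifest; real keys should not be committed in this file.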
@@ -0,0 +1,21 @@
+apiVersion: v1
+kind: Service
+metadata:
+  name: antiatropos
+  namespace: antiatropos
+  labels:
+    app: antiatropos
+  annotations:
+    # Required for AWS Load Balancer Controller to create an NLB
+    service.beta.kubernetes.io/aws-load-balancer-type: external
+    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
+    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
+spec:
+  type: LoadBalancer
+  selector:
+    app: antiatropos
+  ports:
+    - name: http
+      port: 80
+      targetPort: 8000
+      protocol: TCP
@@ -0,0 +1,86 @@
+# Helm values for Prometheus Agent that remote-writes to Amazon Managed Service for Prometheus (AMP)
+#
+# Usage:
+#   helm install prometheus-agent prometheus-community/prometheus \
+#     --namespace monitoring --create-namespace \
+#     -f prometheus-agent-values.yaml \
+#     --set "prometheus.prometheusSpec.remoteWrite[0].url=https://aps-workspaces.us-east-1.amazonaws.com/workspaces/WORKSPACE_ID/api/v1/remote_write"
+#
+# Prerequisite: Create an IAM service account for the prometheus pod
+#   eksctl create iamserviceaccount \
+#     --cluster antiatropos \
+#     --namespace monitoring \
+#     --name prometheus-sa \
+#     --attach-policy-arn arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess \
+#     --approve
+
+prometheus:
+  prometheusSpec:
+    # Run as agent mode (remote-write only, no local query API)
+    agentMode: true
+
+    # Remote write: override via --set on the command line
+    remoteWrite:
+      - url: "https://aps-workspaces.us-east-1.amazonaws.com/workspaces/REPLACE_WORKSPACE_ID/api/v1/remote_write"
+        sigv4:
+          region: us-east-1
+
+    # Scrape the AntiAtropos FastAPI /metrics endpoint
+    additionalScrapeConfigs:
+      - job_name: antiatropos
+        metrics_path: /metrics
+        scrape_interval: 15s
+        kubernetes_sd_configs:
+          - role: pod
+            namespaces:
+              names:
+                - antiatropos
+        relabel_configs:
+          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
+            action: keep
+            regex: true
+          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
+            action: replace
+            target_label: __metrics_path__
+            regex: (.+)
+          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
+            action: replace
+            regex: ([^:]+)(?::\d+)?;(\d+)
+            replacement: $1:$2
+            target_label: __address__
+          - action: labelmap
+            regex: __meta_kubernetes_pod_label_(.+)
+          - source_labels: [__meta_kubernetes_namespace]
+            action: replace
+            target_label: namespace
+          - source_labels: [__meta_kubernetes_pod_name]
+            action: replace
+            target_label: pod
+
+    resources:
+      requests:
+        cpu: 100m
+        memory: 256Mi
+      limits:
+        cpu: 500m
+        memory: 512Mi
+
+    # Short retention since we're remote-writing everything to AMP
+    retention: 2h
+
+  # Use the IAM service account for AMP authentication
+  serviceAccount:
+    name: prometheus-sa
+    create: false
+
+# Disable alertmanager (AMP handles alerting if needed)
+alertmanager:
+  enabled: false
+
+# Disable pushgateway
+pushgateway:
+  enabled: false
+
+# Disable server (we only need the agent)
+server:
+  enabled: false
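The `__address__` relabel rule in prometheus-agent-values.yaml rewrites each pod's scrape address to use the port from the `prometheus.io/port` annotation. Prometheus joins the `source_labels` values with `;` and anchors the regex at both ends before applying the replacement. A small Python approximation of that one rule, assuming IPv4 addresses or hostnames; `relabel_address` is a hypothetical helper, and Python's `re` stands in for the RE2 engine Prometheus uses.

```python
import re

# Pattern from the relabel rule; the replacement there is "$1:$2".
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def relabel_address(address: str, annotated_port: str) -> str:
    """Approximate the `replace` action that targets __address__."""
    joined = f"{address};{annotated_port}"  # source label values joined with ";"
    match = ADDRESS_RE.fullmatch(joined)    # relabel regexes are fully anchored
    if match is None:
        # When the regex does not match, a replace action leaves the label unchanged.
        return address
    return f"{match.group(1)}:{match.group(2)}"


print(relabel_address("10.0.1.5:9090", "8000"))  # discovered port swapped for the annotation
print(relabel_address("10.0.1.5", "8000"))       # port appended when none was discovered
print(relabel_address("10.0.1.5:9090", ""))      # no annotation: address kept as-is
```

With the Deployment's `prometheus.io/port: "8000"` annotation, the scrape target ends up on the container's HTTP port regardless of which port service discovery reported.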