# SECURITY: arac-hasar-v2

Owner: Security Engineer

Scope: pilot-production. Stores customer PII (vehicle images, user emails) and produces damage / cost estimates that may flow into invoice / claim workflows.

---

## 1. Threat Model

### 1.1 System overview

| Layer | Component | Notes |
|---|---|---|
| Edge | TLS terminator (Render / Cloudflare / nginx) | HTTPS only; no plaintext listener in prod |
| API | FastAPI (`services/backend`) | JWT-authenticated REST + WebSocket; this document covers it |
| ML | YOLO inference service (`services/ml`) | Internal; reachable only from backend |
| Storage | PostgreSQL (managed), Redis (rate-limit + pubsub), S3/MinIO (images) | Network-isolated; no public exposure |
| Clients | Next.js web, Tauri 2 desktop, React Native mobile | All consume the same API |

### 1.2 Trust boundaries

```
Public Internet
      |  (TLS)
[Edge / CDN]
      |  (private network)
[FastAPI]
      |  (private network, IAM)
[Postgres] [Redis] [S3] [ML service]
```

Every arrow crossing a boundary is an authentication checkpoint.

### 1.3 Sensitive data inventory

| Data | Classification | Where it lives | Controls |
|---|---|---|---|
| User email | PII | Postgres `users.email`, access logs (redacted on auth paths) | TLS in transit; encrypted-at-rest (managed Postgres) |
| Password | secret | Postgres `users.password_hash` (bcrypt cost 12) | Never logged; never returned |
| Vehicle images | PII (may contain plates, faces, location via EXIF) | S3 bucket | EXIF stripped on upload; private bucket; signed URLs only |
| JWT access/refresh | secret | Client-held; never persisted server-side | Short TTL (30 min / 7 d); HS256 signed |
| API keys (pilot integrations) | secret | Postgres `api_keys.key_hash` (sha256) | Shown plaintext once on issue; revocable |
| ML inference results / cost estimates | business data | Postgres + S3 reports | Tenant isolation enforced at handler |

### 1.4 STRIDE summary

| Threat | Vector | Risk | Mitigation |
|---|---|---|---|
| Spoofing | Stolen credentials, token replay | High | Bcrypt cost 12, short access-token TTL, refresh rotation (TODO: not yet wired in the backend, see section 6), per-route rate limits on `/auth/login` |
| Tampering | Modified upload, tampered cost estimate | Med | Server-side decode + revalidation of images; cost computed server-side from `cost_table.yaml`; never trust client-supplied totals |
| Repudiation | "I never uploaded that" / "I never approved that estimate" | Med | Structured JSON access log with `request_id`, `user_id`, and sha256 of the uploaded image |
| Information disclosure | IDOR on `/api/v1/inspect/{id}`, EXIF GPS leak | High | Mandatory ownership check pattern (section 3); EXIF stripped before storage |
| Denial of Service | Image bomb, hot loop on `/inspect`, brute-force login | High | 20 MB cap, decompression-bomb guard (`Image.MAX_IMAGE_PIXELS`), slowapi limits |
| Elevation of privilege | `role` claim tampering, missing admin check | Crit | JWT signature verification; `require_admin` dependency; role re-read from DB on refresh |

### 1.5 Out of scope (for now)

- Multi-region failover
- DDoS at the transport layer (delegated to CDN)
- Hardware security modules / KMS-managed JWT signing keys (flagged for production-scale)
- SSO / SAML (pilot uses local accounts + API keys)

---

## 2. OWASP Top 10 (2021): Mitigations

### A01: Broken Access Control

- Every protected route depends on `require_user` (or `require_admin`).
- IDOR pattern is mandatory; see section 3.
- WebSocket connections must authenticate within 5 s of `accept()` (Backend Architect owns the WS handler; flagged in section 6).
- Default policy is deny: a route without an explicit auth dependency is treated as a review failure.

### A02: Cryptographic Failures

- Passwords: bcrypt (passlib), cost factor 12, `BCRYPT_ROUNDS` env-tunable.
- JWT: HS256 (acceptable for monolithic backend; migrate to RS256 if signing moves to a separate service).
- API keys: 256 bits of entropy, prefixed `ahv2_`, stored as sha256 hash, compared with `hmac.compare_digest`.
- Secrets exclusively via env vars; `.env` is gitignored.
- TLS terminated at edge; HSTS sent in staging/prod by `SecurityHeadersMiddleware`.
- No custom crypto. Period.

### A03: Injection

- **SQL**: SQLAlchemy ORM + parameterized `text()` for any raw SQL. Never f-string user input into queries. Reviewed in PR template.
- **Command**: no `subprocess` with `shell=True`. Image processing stays in-process (PIL).
- **Path**: `sanitize_filename` strips `..`, backslashes, control chars, and prefixes a uuid4. S3 keys are never user-supplied raw.
- **Header**: request IDs whitelisted to `[A-Za-z0-9_-]`, capped at 128 chars (CRLF injection guard).

### A04: Insecure Design

- Threat model (section 1) reviewed before each release.
- Cost estimates computed server-side from `cost_table.yaml`; clients cannot override.
- Refresh tokens carry `role="user"` by design: privilege is re-derived from the DB on refresh, so a leaked refresh token cannot escalate.

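The last A04 bullet (privilege re-derived from the DB on refresh) can be sketched as follows. This is a minimal illustration, not the backend's actual refresh handler: `RefreshClaims`, `access_claims_on_refresh`, and the `lookup_role` callback are hypothetical names, and token signing / verification is deliberately left out so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class RefreshClaims:
    """Hypothetical shape of an already-verified refresh-token payload."""
    user_id: str
    role: str  # expected to be "user" by design; never trusted for privilege


def access_claims_on_refresh(
    refresh: RefreshClaims,
    lookup_role: Callable[[str], Optional[str]],
) -> dict:
    """Build claims for a new access token.

    The role is read from the database (via the assumed `lookup_role`
    callback), so whatever role the refresh token carries is ignored.
    """
    db_role = lookup_role(refresh.user_id)
    if db_role is None:
        # Unknown or deleted user: refuse to mint a new access token.
        raise PermissionError("unknown user")
    return {"user_id": refresh.user_id, "role": db_role, "type": "access"}


# Usage: a dict stands in for the users table in this sketch.
roles_in_db = {"u1": "user", "u2": "admin"}
tampered = RefreshClaims(user_id="u1", role="admin")
new_claims = access_claims_on_refresh(tampered, roles_in_db.get)
```

Because the role comes only from the lookup, a refresh token that somehow carried `role="admin"` for a regular user would still yield an access token with the role stored in the DB.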
### A05: Security Misconfiguration

- `_validate_config()` hard-fails at import time if `JWT_SECRET_KEY` is < 32 chars in staging/production.
- Default-deny CSP (`default-src 'none'`) on all API responses.
- CORS allowlist (`ALLOWED_ORIGINS`); `allow_credentials=False` because we use bearer tokens, not cookies.
- `Server` header stripped.
- Debug / docs (`/docs`, `/redoc`) must be disabled in production (flagged for Backend Architect; see section 6).

### A06: Vulnerable & Outdated Components

- `requirements.txt` is the canonical lock; CI must run `pip-audit` (or `trivy fs`) on every PR.
- Renovate / Dependabot recommended for weekly updates.

### A07: Identification & Authentication Failures

- `/auth/login` rate-limited to **5/min per IP** via slowapi.
- Generic error messages on bad credentials ("invalid email or password"), so no user enumeration.
- Access tokens: 30 min. Refresh tokens: 7 d, single-use rotation (Backend Architect to implement `jti` blocklist in Redis; see section 6).
- Password requirements (length / complexity) are owned by the user model layer (Database Optimizer); flagged.

### A08: Software & Data Integrity Failures

- Pinned dependencies in `requirements.txt`.
- Container images built from pinned base + reproducible build.
- ML model weights checksummed at load time (Backend Architect owns `ml_service.py`; flagged in section 6).

### A09: Security Logging & Monitoring Failures

- `AccessLogMiddleware` emits structured JSON: `ts, method, path, status, duration_ms, user_id, request_id, ip, ua`.
- Auth paths (`/auth/*`, `/login`, `/token`, `/refresh`, `/password`) suppress query string from logs.
- Request bodies are never logged.
- Bcrypt / JWT failures log at INFO with **reason class only**, never the input.
- Recommend shipping access log to a SIEM / log aggregator (Loki / CloudWatch) with retention >= 90 days.

### A10: Server-Side Request Forgery (SSRF)

- The backend never fetches user-supplied URLs.
- Image uploads are received as multipart bytes; no fetch-by-URL path exists.
- If a "fetch from URL" feature is added later, it MUST:
  1. Resolve DNS server-side once and reject private / link-local ranges.
  2. Disallow redirects to private ranges.
  3. Run in a dedicated egress-restricted network namespace.

---

## 3. Authorization pattern (mandatory)

Every endpoint that touches a tenant-scoped resource MUST follow this shape:

```python
from uuid import UUID

from fastapi import APIRouter, Depends, HTTPException, status
from sqlalchemy.ext.asyncio import AsyncSession

from security import require_user, TokenPayload
# Assumed project-local imports; adjust the module paths to the actual layout.
from db import get_db
from models import Inspection

router = APIRouter()


@router.get("/api/v1/inspect/{inspection_id}")
async def get_inspection(
    inspection_id: UUID,
    user: TokenPayload = Depends(require_user),
    db: AsyncSession = Depends(get_db),
):
    row = await db.get(Inspection, inspection_id)
    if row is None:
        # 404, not 403, to avoid leaking existence
        raise HTTPException(status.HTTP_404_NOT_FOUND)
    if row.user_id != user.user_id and user.role != "admin":
        # IDOR check. Same 404 to prevent enumeration.
        raise HTTPException(status.HTTP_404_NOT_FOUND)
    return row
```

Rules:

1. **Always** check ownership before returning a row.
2. **Always** return `404`, never `403`, when the user isn't the owner (no existence oracle).
3. Admin override goes through `user.role == "admin"`, never a query param.
4. Bulk endpoints (e.g. `GET /api/v1/inspect`) MUST filter `WHERE user_id = :uid` in the query, never in Python.

---

## 4. File upload pipeline

```
multipart bytes
  -> validate_image_upload(buf)
       size cap (20 MB)
       magic-byte MIME sniff (jpeg / png / webp only)
       PIL decode + verify()
       decompression-bomb guard
       EXIF orientation applied
       EXIF metadata stripped (PII: GPS, camera serial, timestamps)
       dimension cap (10000 x 10000)
  -> sanitize_filename(orig_name)
  -> upload to S3 with server-generated key
       Content-Type forced to sniffed MIME
       bucket policy: private, no public-read
       served via short-lived presigned URLs
```

Hard rules:

- Never trust client-supplied `Content-Type`.
- Never store the raw user-supplied filename as the S3 key.
- Never serve images from a domain that can execute scripts (use a separate static / signed-URL domain).
- S3 bucket policy must deny `s3:GetObject` to the public.

---

## 5. CSRF

The API is bearer-token only (`Authorization: Bearer <token>`). Browsers do **not** automatically attach `Authorization` headers cross-origin, so the classic CSRF vector (auto-submit a form, browser attaches cookie) does not apply. This is enforced by:

- `allow_credentials=False` on CORS.
- No `Set-Cookie` issued anywhere in the backend.
- Tight `ALLOWED_ORIGINS`.

If cookie-based sessions are ever introduced (e.g. SSR Next.js with httponly cookies), CSRF tokens become mandatory; flagged in section 6.

---

## 6. Open items for follow-up (NOT owned by Security)

Items below are flagged for the corresponding owner; Security has not modified those files.

| # | Item | Owner | Severity |
|---|---|---|---|
| 1 | Refresh-token rotation: persist used `jti` in Redis with TTL = refresh lifetime; reject reuse | Backend Architect | High |
| 2 | Disable `/docs` and `/redoc` in production (`docs_url=None` when `ENVIRONMENT=production`) | Backend Architect | Med |
| 3 | WebSocket auth: enforce JWT within 5 s of `accept()`, close 4401 otherwise | Backend Architect (`ws.py`) | High |
| 4 | Password policy (min 12 chars, breach check via HIBP k-anonymity) at registration | Database Optimizer (`models.py`) + Backend Architect (handler) | Med |
| 5 | ML weights integrity: sha256 manifest verified before load in `ml_service.py` | Backend Architect | Med |
| 6 | S3 bucket policy review: confirm `BlockPublicAcls`, `IgnorePublicAcls`, `BlockPublicPolicy`, `RestrictPublicBuckets` all true; encryption at rest enabled | Backend Architect (`storage.py`) + Infra | High |
| 7 | Audit log: separate immutable stream for security events (login success/failure, role change, api-key issue/revoke) | Backend Architect | Med |
| 8 | Secret rotation runbook (JWT key, DB password, S3 keys) | Infra / Ops | Med |
| 9 | Penetration test before GA (target: OWASP ASVS L2) | External | High |
| 10 | KMS-managed JWT signing (migrate HS256 -> RS256/EdDSA) at scale | Backend Architect | Low (deferred) |
| 11 | CI security gates: `pip-audit`, `gitleaks`, `semgrep` on every PR | DevOps | High |
| 12 | Brute-force / credential-stuffing detection beyond simple rate limit (e.g. account lockout after N failures with cool-down) | Backend Architect | Med |

---

## 7. Deploy checklist

Before tagging a release that goes to staging or production:

- [ ] `JWT_SECRET_KEY` is set, >= 32 chars, unique per environment.
- [ ] `ENVIRONMENT` set to `staging` or `production` (enables HSTS + strict config validation).
- [ ] `ALLOWED_ORIGINS` populated with the exact production origins.
- [ ] `RATE_LIMIT_REDIS_URL` points to a managed Redis (not `memory://`).
- [ ] `BCRYPT_ROUNDS=12` (or higher; benchmark target ~250 ms per hash on prod CPU).
- [ ] `/docs` and `/redoc` disabled.
- [ ] Postgres / Redis / S3 reachable only over private network.
- [ ] S3 bucket: private, encryption-at-rest, lifecycle rule to purge after retention window.
- [ ] TLS cert valid; HSTS preload submitted if appropriate.
- [ ] `pip-audit` and `gitleaks` green on the build SHA.
- [ ] Access log is shipping to the aggregator and is searchable by `request_id`.
- [ ] Incident-response runbook (who to page, how to revoke a leaked JWT secret, how to rotate API keys) is current.
- [ ] Backup + restore tested in the last 30 days.

---

## 8. Reporting a vulnerability

Send to security@ (mailbox TBD). Include reproduction steps, impact, and a proposed disclosure timeline. We commit to acknowledging within 72 h and patching critical issues within 7 days.