| # SECURITY: arac-hasar-v2 | |
| Owner: Security Engineer | |
| Scope: pilot-production. Stores customer PII (vehicle images, user emails) and produces damage / cost estimates that may flow into invoice / claim workflows. | |
| --- | |
| ## 1. Threat Model | |
| ### 1.1 System overview | |
| | Layer | Component | Notes | | |
| |---|---|---| | |
| | Edge | TLS terminator (Render / Cloudflare / nginx) | HTTPS only; no plaintext listener in prod | | |
| | API | FastAPI (`services/backend`) | JWT-authenticated REST + WebSocket; this document covers it | | |
| | ML | YOLO inference service (`services/ml`) | Internal; reachable only from backend | | |
| | Storage | PostgreSQL (managed), Redis (rate-limit + pubsub), S3/MinIO (images) | Network-isolated; no public exposure | | |
| | Clients | Next.js web, Tauri 2 desktop, React Native mobile | All consume the same API | | |
| ### 1.2 Trust boundaries | |
| ``` | |
| Public Internet | |
| | (TLS) | |
| [Edge / CDN] | |
| | (private network) | |
| [FastAPI] | |
| | (private network, IAM) | |
| [Postgres] [Redis] [S3] [ML service] | |
| ``` | |
| Every arrow crossing a boundary is an authentication checkpoint. | |
| ### 1.3 Sensitive data inventory | |
| | Data | Classification | Where it lives | Controls | | |
| |---|---|---|---| | |
| | User email | PII | Postgres `users.email`, access logs (redacted on auth paths) | TLS in transit; encrypted-at-rest (managed Postgres) | | |
| | Password | secret | Postgres `users.password_hash` (bcrypt cost 12) | Never logged; never returned | | |
| | Vehicle images | PII (may contain plates, faces, location via EXIF) | S3 bucket | EXIF stripped on upload; private bucket; signed URLs only | | |
| | JWT access/refresh | secret | Client-held; never persisted server-side | Short TTL (30 min / 7 d); HS256 signed | | |
| | API keys (pilot integrations) | secret | Postgres `api_keys.key_hash` (sha256) | Shown plaintext once on issue; revocable | | |
| | ML inference results / cost estimates | business data | Postgres + S3 reports | Tenant isolation enforced at handler | | |
| ### 1.4 STRIDE summary | |
| | Threat | Vector | Risk | Mitigation | | |
| |---|---|---|---| | |
| | Spoofing | Stolen credentials, token replay | High | Bcrypt cost 12, short access-token TTL, refresh rotation (TODO: not yet wired in the backend; see section 6), per-route rate limits on `/auth/login` | |
| | Tampering | Modified upload, tampered cost estimate | Med | Server-side decode + revalidation of images; cost computed server-side from `cost_table.yaml`; never trust client-supplied totals | | |
| | Repudiation | "I never uploaded that" / "I never approved that estimate" | Med | Structured JSON access log with `request_id`, `user_id`, and sha256 of the uploaded image | |
| | Information disclosure | IDOR on `/api/v1/inspect/{id}`, EXIF GPS leak | High | Mandatory ownership check pattern (section 3); EXIF stripped before storage | | |
| | Denial of Service | Image bomb, hot loop on `/inspect`, brute-force login | High | 20 MB cap, decompression-bomb guard (`Image.MAX_IMAGE_PIXELS`), slowapi limits | | |
| | Elevation of privilege | `role` claim tampering, missing admin check | Crit | JWT signature verification; `require_admin` dependency; role re-read from DB on refresh | | |
| ### 1.5 Out of scope (for now) | |
| - Multi-region failover | |
| - DDoS at the transport layer (delegated to CDN) | |
| - Hardware security modules / KMS-managed JWT signing keys (flagged for production-scale) | |
| - SSO / SAML (pilot uses local accounts + API keys) | |
| --- | |
| ## 2. OWASP Top 10 (2021): Mitigations | |
| ### A01: Broken Access Control | |
| - Every protected route depends on `require_user` (or `require_admin`). | |
| - IDOR pattern is mandatory; see section 3. | |
| - WebSocket connections must authenticate within 5 s of `accept()` (Backend Architect owns the WS handler; flagged in section 6). | |
| - Default policy is deny: a route without an explicit auth dependency is treated as a review failure. | |
| ### A02: Cryptographic Failures | |
| - Passwords: bcrypt (passlib), cost factor 12, `BCRYPT_ROUNDS` env-tunable. | |
| - JWT: HS256 (acceptable for monolithic backend; migrate to RS256 if signing moves to a separate service). | |
| - API keys: 256 bits of entropy, prefixed `ahv2_`, stored as sha256 hash, compared with `hmac.compare_digest`. | |
| - Secrets exclusively via env vars; `.env` is gitignored. | |
| - TLS terminated at edge; HSTS sent in staging/prod by `SecurityHeadersMiddleware`. | |
| - No custom crypto. Period. | |
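The API-key handling described above can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: the function names `issue_api_key` and `verify_api_key` are hypothetical, and hex encoding of the random bytes is an assumption (the document only specifies 256 bits of entropy, the `ahv2_` prefix, sha256 storage, and `hmac.compare_digest`).

```python
import hashlib
import hmac
import secrets

API_KEY_PREFIX = "ahv2_"  # prefix named in this document


def issue_api_key() -> tuple[str, str]:
    """Return (plaintext_key, key_hash). Only the hash should be stored."""
    # 32 random bytes = 256 bits of entropy; hex encoding is an assumption.
    plaintext = API_KEY_PREFIX + secrets.token_hex(32)
    key_hash = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, key_hash


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored hash in constant time."""
    if not presented.startswith(API_KEY_PREFIX):
        return False
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

A fast hash such as sha256 is reasonable here only because the key itself is high-entropy random data; user-chosen passwords still need bcrypt as stated above.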
| ### A03: Injection | |
| - **SQL**: SQLAlchemy ORM + parameterized `text()` for any raw SQL. Never f-string user input into queries. Reviewed in PR template. | |
| - **Command**: no `subprocess` with `shell=True`. Image processing stays in-process (PIL). | |
| - **Path**: `sanitize_filename` strips `..`, backslashes, control chars, and prefixes a uuid4. S3 keys are never user-supplied raw. | |
| - **Header**: request IDs allowlisted to `[A-Za-z0-9_-]`, capped at 128 chars (CRLF injection guard). | |
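As an illustration of the header rule above, the request-ID guard might look like the sketch below. The helper name `safe_request_id` is hypothetical, and replacing a rejected value with a fresh UUID (rather than returning an error) is an assumption; the character set and the 128-character cap come from this document.

```python
import re
import uuid
from typing import Optional

# Allowlist from this document: letters, digits, underscore, hyphen; 1 to 128 chars.
_REQUEST_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,128}")


def safe_request_id(raw: Optional[str]) -> str:
    """Return the client-supplied request ID if it is safe, else a new one."""
    # fullmatch rejects anything outside the allowlist, including CR and LF,
    # so the value can be echoed into a response header or a log line.
    if raw is not None and _REQUEST_ID_RE.fullmatch(raw):
        return raw
    # Assumption: unsafe or missing IDs are replaced, not rejected.
    return uuid.uuid4().hex
```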
| ### A04: Insecure Design | |
| - Threat model (section 1) reviewed before each release. | |
| - Cost estimates computed server-side from `cost_table.yaml`; clients cannot override. | |
| - Refresh tokens carry `role="user"` by design: privilege is re-derived from the DB on refresh, so a leaked refresh token cannot escalate. | |
| ### A05: Security Misconfiguration | |
| - `_validate_config()` hard-fails at import time if `JWT_SECRET_KEY` is < 32 chars in staging/production. | |
| - Default-deny CSP (`default-src 'none'`) on all API responses. | |
| - CORS allowlist (`ALLOWED_ORIGINS`); `allow_credentials=False` because we use bearer tokens, not cookies. | |
| - `Server` header stripped. | |
| - Debug / docs (`/docs`, `/redoc`) must be disabled in production (flagged for Backend Architect; see section 6). | |
| ### A06: Vulnerable & Outdated Components | |
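The fail-fast configuration check and the docs toggle above could be sketched as follows. This is a hedged illustration: `validate_config` and `docs_kwargs` are hypothetical names (the real function is `_validate_config()`, whose body is not shown in this document), and treating any unset `ENVIRONMENT` as `development` is an assumption. `docs_url` and `redoc_url` are real FastAPI constructor parameters.

```python
from typing import Mapping

STRICT_ENVIRONMENTS = {"staging", "production"}
MIN_SECRET_LENGTH = 32  # threshold stated in this document


def validate_config(env: Mapping[str, str]) -> None:
    """Raise at startup if the JWT secret is too short in a strict environment."""
    # Assumption: an unset ENVIRONMENT means local development.
    environment = env.get("ENVIRONMENT", "development")
    if environment not in STRICT_ENVIRONMENTS:
        return
    if len(env.get("JWT_SECRET_KEY", "")) < MIN_SECRET_LENGTH:
        raise RuntimeError(
            f"JWT_SECRET_KEY must be at least {MIN_SECRET_LENGTH} characters "
            f"when ENVIRONMENT={environment}"
        )


def docs_kwargs(environment: str) -> dict:
    """Keyword arguments for FastAPI(...) that disable /docs and /redoc in production."""
    if environment == "production":
        return {"docs_url": None, "redoc_url": None}
    return {}


# Typical wiring at import time (not executed here):
#   validate_config(os.environ)
#   app = FastAPI(**docs_kwargs(os.environ.get("ENVIRONMENT", "development")))
```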
| - `requirements.txt` is the canonical lock; CI must run `pip-audit` (or `trivy fs`) on every PR. | |
| - Renovate / Dependabot recommended for weekly updates. | |
| ### A07: Identification & Authentication Failures | |
| - `/auth/login` rate-limited to **5/min per IP** via slowapi. | |
| - Generic error messages on bad credentials ("invalid email or password"), so responses do not enable user enumeration. | |
| - Access tokens: 30 min. Refresh tokens: 7 d, single-use rotation (Backend Architect to implement `jti` blocklist in Redis; see section 6). | |
| - Password requirements (length / complexity) are owned by the user model layer (Database Optimizer); flagged in section 6. | |
| ### A08: Software & Data Integrity Failures | |
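Single-use refresh rotation with a `jti` blocklist can be illustrated without Redis. The sketch below uses an in-memory dictionary as a stand-in; in Redis the same "first writer wins" behaviour is available through `SET key value NX EX ttl`. The class `UsedJtiStore` and the function `rotate_refresh` are hypothetical names, and signature verification of the token is assumed to have happened before this step.

```python
import time
import uuid

REFRESH_TTL_SECONDS = 7 * 24 * 3600  # 7 d refresh lifetime from this document


class UsedJtiStore:
    """In-memory stand-in for a Redis blocklist of already-used refresh jti values."""

    def __init__(self, clock=time.monotonic):
        self._expiry = {}  # jti -> time after which the entry can be dropped
        self._clock = clock

    def mark_used(self, jti: str, ttl: float) -> bool:
        """Record jti as used. Return True on first use, False on replay."""
        now = self._clock()
        # Drop entries whose refresh token would have expired anyway.
        self._expiry = {k: v for k, v in self._expiry.items() if v > now}
        if jti in self._expiry:
            return False
        self._expiry[jti] = now + ttl
        return True


def rotate_refresh(presented_jti: str, store: UsedJtiStore) -> str:
    """Consume a refresh token's jti and return the jti for its replacement."""
    if not store.mark_used(presented_jti, REFRESH_TTL_SECONDS):
        raise PermissionError("refresh token reuse detected")
    return uuid.uuid4().hex
```

Setting the entry's TTL to the refresh lifetime keeps the blocklist bounded: once the original token has expired, its `jti` no longer needs to be remembered.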
| - Pinned dependencies in `requirements.txt`. | |
| - Container images built from pinned base + reproducible build. | |
| - ML model weights checksummed at load time (Backend Architect owns `ml_service.py`; flagged in section 6). | |
| ### A09: Security Logging & Monitoring Failures | |
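A minimal sketch of the weight-checksum step, assuming the expected sha256 is available as a hex string (for example from a manifest file whose format this document does not specify). The names `sha256_file` and `verify_weights` are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through sha256 so large weight files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: Path, expected_sha256: str) -> None:
    """Raise before the model is loaded if the file does not match the expected digest."""
    actual = sha256_file(path)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise RuntimeError(f"checksum mismatch for {path.name}: got {actual}")
```

The check belongs before any deserialisation of the weights file, since loading untrusted model files can itself execute code in some formats.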
| - `AccessLogMiddleware` emits structured JSON: `ts, method, path, status, duration_ms, user_id, request_id, ip, ua`. | |
| - Auth paths (`/auth/*`, `/login`, `/token`, `/refresh`, `/password`) suppress query string from logs. | |
| - Request bodies are never logged. | |
| - Bcrypt / JWT failures log at INFO with **reason class only**, never the input. | |
| - Recommend shipping access log to a SIEM / log aggregator (Loki / CloudWatch) with retention >= 90 days. | |
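The log shape and the query-string suppression above might be sketched as below. This is not the `AccessLogMiddleware` source: the helper names are hypothetical, the auth paths are matched as prefixes (an assumption; the document does not say how the patterns are matched), and `ts` as a Unix timestamp is also an assumption.

```python
import json
import time
from typing import Optional
from urllib.parse import urlsplit

# Auth paths listed in this document, treated here as prefixes (assumption).
AUTH_PATH_PREFIXES = ("/auth/", "/login", "/token", "/refresh", "/password")


def loggable_path(url: str) -> str:
    """Return the path to log, dropping the query string on auth paths."""
    parts = urlsplit(url)
    if not parts.query or parts.path.startswith(AUTH_PATH_PREFIXES):
        return parts.path
    return f"{parts.path}?{parts.query}"


def access_log_line(*, method: str, url: str, status: int, duration_ms: float,
                    user_id: Optional[str], request_id: str, ip: str, ua: str) -> str:
    """Build one structured JSON access-log line; the request body is never included."""
    record = {
        "ts": time.time(),  # assumption: Unix timestamp
        "method": method,
        "path": loggable_path(url),
        "status": status,
        "duration_ms": duration_ms,
        "user_id": user_id,
        "request_id": request_id,
        "ip": ip,
        "ua": ua,
    }
    return json.dumps(record)
```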
| ### A10: Server-Side Request Forgery (SSRF) | |
| - The backend never fetches user-supplied URLs. | |
| - Image uploads are received as multipart bytes; no fetch-by-URL path exists. | |
| - If a "fetch from URL" feature is added later, it MUST: | |
| 1. Resolve DNS server-side once and reject private / link-local ranges. | |
| 2. Disallow redirects to private ranges. | |
| 3. Run in a dedicated egress-restricted network namespace. | |
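For requirement 1 above, a resolve-once check could be sketched with the standard `ipaddress` and `socket` modules. No such feature exists in the backend today, so every name here (`is_public_ip`, `resolve_public_addresses`) is hypothetical. The sketch rejects a host if any resolved address is non-public, which is a deliberately strict assumption.

```python
import ipaddress
import socket


def is_public_ip(address: str) -> bool:
    """True only for globally routable addresses; unparsable input is rejected."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    # is_global is False for private, loopback, link-local and reserved ranges.
    return ip.is_global


def resolve_public_addresses(host: str, port: int = 443) -> list[str]:
    """Resolve once and return the addresses, or raise if any is non-public."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = sorted({info[4][0] for info in infos})
    if not addresses or any(not is_public_ip(a) for a in addresses):
        raise ValueError(f"refusing to fetch from {host!r}: non-public address")
    return addresses
```

The caller should then connect to one of the returned addresses directly instead of resolving the hostname a second time; otherwise a DNS answer that changes between the check and the fetch defeats the guard.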
| --- | |
| ## 3. Authorization pattern (mandatory) | |
| Every endpoint that touches a tenant-scoped resource MUST follow this shape: | |
| ```python | |
| from uuid import UUID | |
| from fastapi import APIRouter, Depends, HTTPException, status | |
| from sqlalchemy.ext.asyncio import AsyncSession | |
| from security import require_user, TokenPayload | |
| # get_db (session dependency) and Inspection (ORM model) are project-local; | |
| # import them from wherever the backend defines them. | |
| router = APIRouter() | |
| @router.get("/api/v1/inspect/{inspection_id}") | |
| async def get_inspection( | |
|     inspection_id: UUID, | |
|     user: TokenPayload = Depends(require_user), | |
|     db: AsyncSession = Depends(get_db), | |
| ): | |
|     row = await db.get(Inspection, inspection_id) | |
|     if row is None: | |
|         # 404, not 403, to avoid leaking existence | |
|         raise HTTPException(status.HTTP_404_NOT_FOUND) | |
|     if row.user_id != user.user_id and user.role != "admin": | |
|         # IDOR check. Same 404 to prevent enumeration. | |
|         raise HTTPException(status.HTTP_404_NOT_FOUND) | |
|     return row | |
| ``` | |
| Rules: | |
| 1. **Always** check ownership before returning a row. | |
| 2. **Always** return `404`, never `403`, when the user isn't the owner (no existence oracle). | |
| 3. Admin override goes through `user.role == "admin"`, never a query param. | |
| 4. Bulk endpoints (e.g. `GET /api/v1/inspect`) MUST filter `WHERE user_id = :uid` in the query, never in Python. | |
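Rule 4 can be shown in miniature with the standard `sqlite3` module standing in for Postgres and SQLAlchemy. The table layout and the helper names `make_demo_db` and `list_inspections` are hypothetical; the point is only that the tenant filter is a bound parameter inside the SQL, so other users' rows never leave the database.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory table with rows for two users (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inspections (id INTEGER PRIMARY KEY, user_id TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO inspections (user_id, status) VALUES (?, ?)",
        [("alice", "done"), ("bob", "done"), ("alice", "pending")],
    )
    return conn


def list_inspections(conn: sqlite3.Connection, user_id: str) -> list:
    """Return only the caller's rows; the filter is applied by the database."""
    cursor = conn.execute(
        "SELECT id, user_id, status FROM inspections "
        "WHERE user_id = :uid ORDER BY id",
        {"uid": user_id},  # bound parameter, never string formatting
    )
    return cursor.fetchall()
```

Filtering in Python after a `SELECT *` would load every tenant's rows into the process first, which is both a data-exposure risk and a performance problem.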
| --- | |
| ## 4. File upload pipeline | |
| ``` | |
| multipart bytes | |
|   -> validate_image_upload(buf) | |
|        size cap (20 MB) | |
|        magic-byte MIME sniff (jpeg / png / webp only) | |
|        PIL decode + verify() | |
|        decompression-bomb guard | |
|        EXIF orientation applied | |
|        EXIF metadata stripped (PII: GPS, camera serial, timestamps) | |
|        dimension cap (10000 x 10000) | |
|   -> sanitize_filename(orig_name) | |
|   -> upload to S3 with server-generated key | |
|        Content-Type forced to sniffed MIME | |
|        bucket policy: private, no public-read | |
|        served via short-lived presigned URLs | |
| ``` | |
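The first two checks in the pipeline above (size cap and magic-byte sniff) need nothing beyond the standard library, as in the sketch below. The function name `sniff_image_mime` is hypothetical, and interpreting 20 MB as 20 * 1024 * 1024 bytes is an assumption. The PIL-based steps (decode, bomb guard, EXIF handling, dimension cap) are not shown.

```python
MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # assumption: 20 MB read as 20 MiB


def sniff_image_mime(buf: bytes) -> str:
    """Return the MIME type implied by the leading bytes, or raise ValueError."""
    if len(buf) > MAX_UPLOAD_BYTES:
        raise ValueError("upload exceeds size cap")
    # JPEG files start with the SOI marker followed by another marker byte.
    if buf.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    # PNG has a fixed 8-byte signature.
    if buf.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    # WebP is a RIFF container: "RIFF", 4-byte length, then "WEBP".
    if len(buf) >= 12 and buf[:4] == b"RIFF" and buf[8:12] == b"WEBP":
        return "image/webp"
    raise ValueError("unsupported or unrecognised image type")
```

The returned value, not the client's `Content-Type` header, is what would be passed on as the object's content type at upload.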
| Hard rules: | |
| - Never trust client-supplied `Content-Type`. | |
| - Never store the raw user-supplied filename as the S3 key. | |
| - Never serve images from a domain that can execute scripts (use a separate static / signed-URL domain). | |
| - S3 bucket policy must deny `s3:GetObject` to the public (anonymous principals). | |
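One possible shape for the filename step is sketched below. It is not the project's `sanitize_filename` source: keeping only the last path component, the 100-character cap, and the `upload` fallback are assumptions. Stripping `..`, backslashes, and control characters and prefixing a uuid4 follow the description in section 2 (A03).

```python
import re
import uuid


def sanitize_filename(original: str, max_len: int = 100) -> str:
    """Return a server-generated, traversal-free name derived from the upload's filename."""
    # Treat backslashes as separators, then keep only the last path component.
    name = original.replace("\\", "/").rsplit("/", 1)[-1]
    # Remove ASCII control characters, including NUL, CR and LF.
    name = re.sub(r"[\x00-\x1f\x7f]", "", name)
    # Collapse any remaining ".." runs so no parent-directory token survives.
    while ".." in name:
        name = name.replace("..", ".")
    # Assumption: fall back to a fixed stem when nothing usable is left.
    name = name.strip(". ") or "upload"
    # The uuid4 prefix makes the key unique and unguessable from the original name.
    return f"{uuid.uuid4().hex}_{name[:max_len]}"
```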
| --- | |
| ## 5. CSRF | |
| The API is bearer-token only (`Authorization: Bearer <jwt>`). Browsers never attach a bearer token on their own; client code must set the `Authorization` header explicitly on each request. The classic CSRF vector (auto-submit a form, browser attaches a cookie) therefore does not apply. | |
| This is enforced by: | |
| - `allow_credentials=False` on CORS. | |
| - No `Set-Cookie` issued anywhere in the backend. | |
| - Tight `ALLOWED_ORIGINS`. | |
| If cookie-based sessions are ever introduced (e.g. SSR Next.js with httponly cookies), CSRF tokens become mandatory; flagged in section 6. | |
| --- | |
| ## 6. Open items for follow-up (NOT owned by Security) | |
| Items below are flagged for the corresponding owner; Security has not modified those files. | |
| | # | Item | Owner | Severity | | |
| |---|---|---|---| | |
| | 1 | Refresh-token rotation: persist used `jti` in Redis with TTL = refresh lifetime; reject reuse | Backend Architect | High | | |
| | 2 | Disable `/docs` and `/redoc` in production (`docs_url=None` when `ENVIRONMENT=production`) | Backend Architect | Med | | |
| | 3 | WebSocket auth: enforce JWT within 5 s of `accept()`, close 4401 otherwise | Backend Architect (`ws.py`) | High | | |
| | 4 | Password policy (min 12 chars, breach check via HIBP k-anonymity) at registration | Database Optimizer (`models.py`) + Backend Architect (handler) | Med | | |
| | 5 | ML weights integrity: sha256 manifest verified before load in `ml_service.py` | Backend Architect | Med | | |
| | 6 | S3 bucket policy review: confirm `BlockPublicAcls`, `IgnorePublicAcls`, `BlockPublicPolicy`, `RestrictPublicBuckets` all true; encryption at rest enabled | Backend Architect (`storage.py`) + Infra | High | | |
| | 7 | Audit log: separate immutable stream for security events (login success/failure, role change, api-key issue/revoke) | Backend Architect | Med | | |
| | 8 | Secret rotation runbook (JWT key, DB password, S3 keys) | Infra / Ops | Med | | |
| | 9 | Penetration test before GA (target: OWASP ASVS L2) | External | High | | |
| | 10 | KMS-managed JWT signing (migrate HS256 -> RS256/EdDSA) at scale | Backend Architect | Low (deferred) | | |
| | 11 | CI security gates: `pip-audit`, `gitleaks`, `semgrep` on every PR | DevOps | High | | |
| | 12 | Brute-force / credential-stuffing detection beyond simple rate limit (e.g. account lockout after N failures with cool-down) | Backend Architect | Med | | |
| --- | |
| ## 7. Deploy checklist | |
| Before tagging a release that goes to staging or production: | |
| - [ ] `JWT_SECRET_KEY` is set, >= 32 chars, unique per environment. | |
| - [ ] `ENVIRONMENT` set to `staging` or `production` (enables HSTS + strict config validation). | |
| - [ ] `ALLOWED_ORIGINS` populated with the exact production origins. | |
| - [ ] `RATE_LIMIT_REDIS_URL` points to a managed Redis (not memory://). | |
| - [ ] `BCRYPT_ROUNDS=12` (or higher; benchmark target ~250 ms per hash on prod CPU). | |
| - [ ] `/docs` and `/redoc` disabled. | |
| - [ ] Postgres / Redis / S3 reachable only over private network. | |
| - [ ] S3 bucket: private, encryption-at-rest, lifecycle rule to purge after retention window. | |
| - [ ] TLS cert valid; HSTS preload submitted if appropriate. | |
| - [ ] `pip-audit` and `gitleaks` green on the build SHA. | |
| - [ ] Access log is shipping to the aggregator and is searchable by `request_id`. | |
| - [ ] Incident-response runbook (who to page, how to revoke a leaked JWT secret, how to rotate API keys) is current. | |
| - [ ] Backup + restore tested in the last 30 days. | |
| --- | |
| ## 8. Reporting a vulnerability | |
| Send to security@ (mailbox TBD). Include reproduction steps, impact, and a proposed disclosure timeline. We commit to acknowledging within 72 h and patching critical issues within 7 days. | |