apoorvrajdev committed on
Commit
131c45c
·
1 Parent(s): 8a3f3b1

docs(readme): finalize Phase 2B frontend documentation

Files changed (1)
  1. README.md +130 -3
README.md CHANGED
@@ -9,6 +9,13 @@
9
  <img alt="FastAPI ready" src="https://img.shields.io/badge/FastAPI-ready-009688?logo=fastapi&logoColor=white">
10
  </p>
11
 
12
  <p align="left">
13
  <img alt="Ruff" src="https://img.shields.io/badge/lint-ruff-261230?logo=ruff&logoColor=white">
14
  <img alt="mypy" src="https://img.shields.io/badge/typed-mypy-1F5082">
@@ -21,7 +28,7 @@
21
  <img alt="License: MIT" src="https://img.shields.io/badge/license-MIT-lightgrey">
22
  <img alt="Phase 1" src="https://img.shields.io/badge/Phase%201-complete-brightgreen">
23
  <img alt="Phase 2A" src="https://img.shields.io/badge/Phase%202A-complete-brightgreen">
24
- <img alt="Phase 2B" src="https://img.shields.io/badge/Phase%202B-planned-blue">
25
  </p>
26
 
27
  ---
@@ -30,6 +37,8 @@
30
 
31
  This repository implements an **end-to-end image-captioning pipeline** built around an InceptionV3 visual encoder and a custom multi-head Transformer decoder. The architecture is the basis of the IEEE-published paper *“AI Narratives: Bridging Visual Content and Linguistic Expression”*; this codebase lifts the original Kaggle research notebook into a typed, tested, configuration-driven Python package that can be reused from CLI, scripts, or a future serving layer.
32
 
33
  The repository is structured in deliberate phases:
34
 
35
  | Phase | Focus | Status |
@@ -37,7 +46,7 @@ The repository is structured in deliberate phases:
37
  | 0 — Bootstrap | Tooling, packaging, freeze policy | ✅ complete |
38
  | 1 — Modularisation | Notebook → typed Python package, parity audit, unit tests | ✅ complete |
39
  | 2A — Backend Infrastructure | FastAPI inference API, structured logging, schemas, health checks, Swagger/OpenAPI, predictor lifecycle | ✅ complete |
40
- | 2B — Frontend UI | React/Vite frontend + upload UX + API integration | ⏳ planned |
41
  | 3 — Multimodal baselines | BLIP / ViT-GPT2 / GIT side-by-side comparison | ⏳ planned |
42
  | 4 — Observability | Sentry, Prometheus metrics, ADRs | ⏳ planned |
43
 
@@ -137,6 +146,28 @@ image-captioning-system/
137
  │   ├── services/                 # PredictorService — image bytes → caption + latency
138
  │   └── utils/                    # Image decoding + content-type guards
139
  │
140
  ├── configs/
141
  │   ├── base.yaml                 # IEEE hyperparameters (cell 6 mirror)
142
  │   └── train/debug.yaml          # CI smoke override
@@ -277,6 +308,20 @@ curl -X POST http://localhost:8000/v1/captions \
277
 
278
  Interactive Swagger UI is auto-generated at [`/docs`](http://localhost:8000/docs); the raw schema lives at [`/openapi.json`](http://localhost:8000/openapi.json).
279
 
280
  ---
281
 
282
  ## FastAPI backend
@@ -314,6 +359,80 @@ python -m scripts.bootstrap_dev_artifacts \
314
 
315
  ---
316
 
317
  ## Configuration system
318
 
319
  Hyperparameters are not globals. They live in YAML files validated by Pydantic v2 `BaseSettings`:
@@ -406,7 +525,8 @@ These are explicitly tracked rather than hidden; full list in [`docs/PHASE_1_NOT
406
 
407
  - **Phase 1b** — beam search, CIDEr / METEOR / ROUGE-L, masked accuracy parity-fix, label smoothing, warmup + cosine LR schedule.
408
  - **Phase 2A** ✅ — FastAPI backend, lifespan-managed predictor singleton, multipart inference endpoint, structured logging + request IDs, Pydantic schemas, Swagger/OpenAPI docs, health/readiness probe.
409
- - **Phase 2B** — React/Vite frontend with Tailwind UI, drag/drop image uploads, live API integration against `POST /v1/captions`, deployment integration (HuggingFace Spaces backend + Vercel-hosted frontend), GitHub Actions CI/CD.
410
  - **Phase 3** — Tier-1 multimodal upgrades: BLIP-base / ViT-GPT2 / GIT-base-coco side-by-side comparison demo with per-model BLEU + latency.
411
  - **Phase 4** — Sentry, Prometheus, DagsHub-hosted MLflow link, Architecture Decision Records (`docs/adr/`).
412
  - **Future work** — ViT + Transformer fine-tune on COCO; VLM API integration (Anthropic Claude vision) behind a feature flag; VQA endpoint.
@@ -421,6 +541,13 @@ Detailed plan: [`docs/restructure-plan.md`](docs/restructure-plan.md).
421
  - Swagger/OpenAPI testing — interactive `/docs` UI for hand-testing every endpoint, raw `/openapi.json` for client codegen.
422
  - Structured logging — JSON in production, pretty in dev; per-request UUIDs threaded through every log line.
423
  - End-to-end image upload → caption flow — multipart upload → content-type guard → image decode → predictor → typed response with latency + request ID.
424
 
425
  ---
426
 
 
9
  <img alt="FastAPI ready" src="https://img.shields.io/badge/FastAPI-ready-009688?logo=fastapi&logoColor=white">
10
  </p>
11
 
12
+ <p align="left">
13
+ <img alt="React 19" src="https://img.shields.io/badge/React-19-61DAFB?logo=react&logoColor=black">
14
+ <img alt="Vite 8" src="https://img.shields.io/badge/Vite-8-646CFF?logo=vite&logoColor=white">
15
+ <img alt="Frontend integrated" src="https://img.shields.io/badge/frontend-integrated-brightgreen">
16
+ <img alt="API connected" src="https://img.shields.io/badge/API-connected-009688">
17
+ </p>
18
+
19
  <p align="left">
20
  <img alt="Ruff" src="https://img.shields.io/badge/lint-ruff-261230?logo=ruff&logoColor=white">
21
  <img alt="mypy" src="https://img.shields.io/badge/typed-mypy-1F5082">
 
28
  <img alt="License: MIT" src="https://img.shields.io/badge/license-MIT-lightgrey">
29
  <img alt="Phase 1" src="https://img.shields.io/badge/Phase%201-complete-brightgreen">
30
  <img alt="Phase 2A" src="https://img.shields.io/badge/Phase%202A-complete-brightgreen">
31
+ <img alt="Phase 2B" src="https://img.shields.io/badge/Phase%202B-complete-brightgreen">
32
  </p>
33
 
34
  ---
 
37
 
38
  This repository implements an **end-to-end image-captioning pipeline** built around an InceptionV3 visual encoder and a custom multi-head Transformer decoder. The architecture is the basis of the IEEE-published paper *“AI Narratives: Bridging Visual Content and Linguistic Expression”*; this codebase lifts the original Kaggle research notebook into a typed, tested, configuration-driven Python package that can be reused from CLI, scripts, or a future serving layer.
39
 
40
+ With Phase 2B complete, the system now runs as a **full-stack inference workflow**: a React/Vite frontend issues multipart uploads to the FastAPI `POST /v1/captions` endpoint, the backend predictor returns a typed response, and the end-to-end image-to-caption interaction is operational in the browser.
41
+
42
  The repository is structured in deliberate phases:
43
 
44
  | Phase | Focus | Status |
 
46
  | 0 — Bootstrap | Tooling, packaging, freeze policy | ✅ complete |
47
  | 1 — Modularisation | Notebook → typed Python package, parity audit, unit tests | ✅ complete |
48
  | 2A — Backend Infrastructure | FastAPI inference API, structured logging, schemas, health checks, Swagger/OpenAPI, predictor lifecycle | ✅ complete |
49
+ | 2B — Frontend UI | React/Vite frontend + upload UX + API integration | ✅ complete |
50
  | 3 — Multimodal baselines | BLIP / ViT-GPT2 / GIT side-by-side comparison | ⏳ planned |
51
  | 4 — Observability | Sentry, Prometheus metrics, ADRs | ⏳ planned |
52
 
 
146
  │   ├── services/                 # PredictorService — image bytes → caption + latency
147
  │   └── utils/                    # Image decoding + content-type guards
148
  │
149
+ ├── frontend/                     # Phase 2B — React 19 + Vite 8 + Tailwind v4 SPA
150
+ │   ├── index.html                # Vite entry; mounts <App /> into #root
151
+ │   ├── vite.config.js            # Vite + @vitejs/plugin-react + Tailwind v4 plugin
152
+ │   ├── eslint.config.js          # Flat ESLint config (React + Hooks + React Refresh)
153
+ │   ├── package.json              # React 19, Vite 8, Tailwind v4
154
+ │   ├── .env.example              # VITE_API_BASE — env-driven backend origin
155
+ │   ├── public/                   # Static assets served verbatim (favicon, icons)
156
+ │   └── src/
157
+ │       ├── main.jsx              # React root + StrictMode bootstrap
158
+ │       ├── App.jsx               # Page composition + upload → generate flow
159
+ │       ├── index.css             # Tailwind v4 entry (single @import)
160
+ │       ├── services/
161
+ │       │   └── api.js            # checkHealth / captionImage — AbortController + typed ApiError
162
+ │       └── components/
163
+ │           ├── Header.jsx        # Brand bar + StatusBadge slot
164
+ │           ├── StatusBadge.jsx   # /healthz poller (10 s) — checking/online/offline state machine
165
+ │           ├── UploadZone.jsx    # Drag/drop + click-to-browse + client-side validation
166
+ │           ├── ImagePreview.jsx  # Selected-file preview + size/format meta + clear
167
+ │           ├── CaptionResult.jsx # Caption + model_version / decode / latency / request_id
168
+ │           ├── ErrorBanner.jsx   # Dismissible error display (network / timeout / HTTP)
169
+ │           └── Spinner.jsx       # Shared loading indicator (sm / md / lg)
170
+ │
171
  ├── configs/
172
  │   ├── base.yaml                 # IEEE hyperparameters (cell 6 mirror)
173
  │   └── train/debug.yaml          # CI smoke override
 
308
 
309
  Interactive Swagger UI is auto-generated at [`/docs`](http://localhost:8000/docs); the raw schema lives at [`/openapi.json`](http://localhost:8000/openapi.json).
310
 
311
+ ### Frontend (Phase 2B — operational)
312
+
313
+ A React 19 + Vite 8 + Tailwind v4 single-page app under [`frontend/`](frontend/) drives the same endpoints from the browser. The SPA posts multipart `FormData` to `POST /v1/captions`, polls `GET /healthz` every 10 seconds for a live status badge, consumes the typed `CaptionResponse` schema, and renders caption + `model_version` + `decode_strategy` + `latency_ms` + `request_id` exactly as the backend returns them. Loading, error, and success states are surfaced through dedicated components; network failures, request timeouts (3 s health / 60 s caption), CORS rejections, and non-2xx responses are all classified into a single typed `ApiError` shape so the UI shows actionable copy instead of a raw `Failed to fetch`.
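The classification step described above can be sketched as follows. This is a minimal illustration, not the contents of `frontend/src/services/api.js`: the `ApiError` constructor signature and the helper names `classifyFailure` and `classifyResponse` are assumptions, and only the four kinds (`timeout`, `network`, `http`, `unknown`) and the two user-facing messages come from this README.

```javascript
// Hypothetical sketch; constructor shape and helper names are assumptions.
class ApiError extends Error {
  constructor(kind, message, status = null) {
    super(message);
    this.name = "ApiError";
    this.kind = kind;     // "timeout" | "network" | "http" | "unknown"
    this.status = status; // HTTP status when kind === "http", otherwise null
  }
}

// Map a value thrown by fetch into the single ApiError shape.
function classifyFailure(err) {
  if (err instanceof ApiError) return err;
  // A fetch aborted through AbortController rejects with name "AbortError".
  if (err && err.name === "AbortError") {
    return new ApiError("timeout", "Request timed out");
  }
  // Browsers reject with a TypeError when no response arrives at all
  // (server down, DNS failure, or a CORS rejection).
  if (err instanceof TypeError) {
    return new ApiError("network", "Cannot reach backend");
  }
  return new ApiError("unknown", "Unexpected error");
}

// Map a non-2xx Response-like object into the same shape.
function classifyResponse(res) {
  return new ApiError("http", `Backend returned HTTP ${res.status}`, res.status);
}
```

Because every path returns the same shape, a component can branch on `error.kind` alone and never needs to inspect a raw `TypeError`.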
314
+
315
+ ```bash
316
+ # Boot the frontend dev server
317
+ cd frontend
318
+ npm install
319
+ npm run dev
320
+ # Defaults to http://localhost:5173 (Vite picks the next free port if 5173 is busy)
321
+ ```
322
+
323
+ `VITE_API_BASE` (see [`frontend/.env.example`](frontend/.env.example)) points the SPA at any backend origin; absent the env var, the client falls back to `http://127.0.0.1:8000`. The dev origins `localhost:5173/5174` and `127.0.0.1:5173/5174` are pre-allowed in [`configs/base.yaml`](configs/base.yaml) under `serve.cors_allowed_origins` so the browser accepts cross-origin responses end-to-end.
324
+
325
  ---
326
 
327
  ## FastAPI backend
 
359
 
360
  ---
361
 
362
+ ## Frontend UI (Phase 2B)
363
+
364
+ Phase 2B ships a single-page inference UI under [`frontend/`](frontend/), not a styled demo. The split mirrors the backend's separation between transport, service, and presentation:
365
+
366
+ - **Application shell** — [`frontend/src/App.jsx`](frontend/src/App.jsx). Owns the request lifecycle (selected file → preview → generate → result). The preview `URL.createObjectURL` is `useMemo`-derived and revoked through an effect cleanup so previews never leak memory across uploads. Four `useState` slots (`file`, `result`, `error`, `loading`) cover every UI state — no Redux, no React Query, no context.
367
+ - **API service layer** — [`frontend/src/services/api.js`](frontend/src/services/api.js). Single boundary for every backend call. Reads `import.meta.env.VITE_API_BASE` once at module load (falls back to `http://127.0.0.1:8000`), wraps `fetch` with `AbortController`-driven timeouts (3 s for `/healthz`, 60 s for `/v1/captions`), and classifies failures into `timeout` / `network` / `http` / `unknown` kinds on a typed `ApiError` so components never see a raw `TypeError`.
368
+ - **Upload zone** — [`frontend/src/components/UploadZone.jsx`](frontend/src/components/UploadZone.jsx). Drag/drop + click-to-browse + keyboard activation (`Enter` / `Space`). Validates content-type (JPEG / PNG / WebP) and size (10 MB) before the file ever touches the network — invalid uploads are rejected client-side with the same wording the backend would have returned, so the user experience is consistent whether validation fires locally or remotely.
369
+ - **Image preview** — [`frontend/src/components/ImagePreview.jsx`](frontend/src/components/ImagePreview.jsx). Renders the selected file via its object URL with size/format metadata and a clear button. Disabled while a request is in flight so re-drops cannot race the POST.
370
+ - **Caption result** — [`frontend/src/components/CaptionResult.jsx`](frontend/src/components/CaptionResult.jsx). Consumes the backend's typed `CaptionResponse` directly: caption text plus model version, decode strategy, latency in milliseconds, and the request ID echoed from the `x-request-id` header. Copy-to-clipboard is built in for log correlation during debugging.
371
+ - **Status badge** — [`frontend/src/components/StatusBadge.jsx`](frontend/src/components/StatusBadge.jsx). Polls `/healthz` every 10 seconds and on window focus, runs a three-state machine (`checking` / `online` / `offline`), and recovers automatically when the backend comes back — no page reload required.
372
+ - **Error banner** — [`frontend/src/components/ErrorBanner.jsx`](frontend/src/components/ErrorBanner.jsx). Single surface for every failure class. Reads `ApiError.message` so the user sees "Cannot reach backend" or "Request timed out" instead of a raw browser error.
373
+ - **Spinner / Header** — [`frontend/src/components/Spinner.jsx`](frontend/src/components/Spinner.jsx) and [`frontend/src/components/Header.jsx`](frontend/src/components/Header.jsx). Shared loading indicator and the sticky brand bar that hosts the status badge.
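The client-side checks in the upload zone item above could look roughly like this. It is a sketch under stated assumptions: the function name `validateUpload`, the error strings, and reading "10 MB" as 10 MiB are all illustrative, and the real wording is whatever the backend returns.

```javascript
// Hypothetical sketch of the pre-upload checks; not the real UploadZone.jsx.
const ALLOWED_TYPES = ["image/jpeg", "image/png", "image/webp"];
const MAX_BYTES = 10 * 1024 * 1024; // assumption: "10 MB" is read as 10 MiB

// Accepts any object with `type` and `size`, such as a browser File.
// Returns null when the file is acceptable, otherwise an error string.
function validateUpload(file) {
  if (!file) return "No file selected.";
  if (!ALLOWED_TYPES.includes(file.type)) {
    return "Unsupported format. Use JPEG, PNG, or WebP."; // illustrative wording
  }
  if (file.size > MAX_BYTES) {
    return "File is larger than 10 MB."; // illustrative wording
  }
  return null;
}
```

Returning a message instead of throwing lets the same string flow into the error banner whether the rejection happened locally or on the server.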
374
+
375
+ ### Upload flow
376
+
377
+ ```
378
+ ┌──────────────┐  drag/drop   ┌─────────────┐  validate   ┌──────────────┐
379
+ │  UploadZone  │ ───────────▶ │  App state  │ ──────────▶ │ ImagePreview │
380
+ └──────────────┘              └─────────────┘             └──────────────┘
381
+                                      │ click "Generate"
382
+                                      ▼
383
+                             ┌─────────────────┐  multipart POST /v1/captions
384
+                             │ services/api.js │ ───────────▶ FastAPI backend
385
+                             └─────────────────┘
386
+                                      │ typed CaptionResponse / ApiError
387
+                                      ▼
388
+                           ┌──────────────────────┐
389
+                           │   CaptionResult /    │
390
+                           │     ErrorBanner      │
391
+                           └──────────────────────┘
392
+ ```
393
+
394
+ ### State, transport, and frontend/backend separation
395
+
396
+ State management is intentionally local: four `useState` slots in `App.jsx` (`file`, `result`, `error`, `loading`) plus a `useMemo`-derived preview URL. The data flow is shallow enough that an extra abstraction would obscure rather than help. All cross-cutting concerns — timeouts, error classification, env-driven base URL — live in the API service layer so components stay declarative and lift no transport details into JSX.
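One way the timeout concern could be isolated in the service layer is a small `AbortController` wrapper. The helper name `fetchWithTimeout` and the injectable `fetchImpl` parameter are assumptions made so the sketch runs outside a browser; the README only states that `fetch` is wrapped with `AbortController`-driven timeouts.

```javascript
// Hypothetical sketch of an AbortController timeout wrapper.
// `fetchImpl` defaults to the global fetch and is injectable for testing.
async function fetchWithTimeout(url, options = {}, timeoutMs = 60000, fetchImpl = fetch) {
  const controller = new AbortController();
  // Abort the request if it has not settled within timeoutMs.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { ...options, signal: controller.signal });
  } finally {
    // Always clear the timer so a finished request is never aborted late.
    clearTimeout(timer);
  }
}
```

With this shape, a health probe might pass `3000` and a caption request `60000`, matching the 3 s and 60 s budgets stated elsewhere in this README. An aborted request rejects with an `AbortError`, which the service layer can then classify as a timeout.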
397
+
398
+ Frontend and backend are deployed independently. The SPA only knows the backend's origin via `VITE_API_BASE`; the backend only trusts SPAs whose origin appears in `serve.cors_allowed_origins`. Dev origins (`localhost:5173/5174`, `127.0.0.1:5173/5174`) are pre-allowed in [`configs/base.yaml`](configs/base.yaml); production origins join the same list at deploy time. No shared build, no shared runtime, no shared state — only the typed Pydantic schemas defined in [`backend/app/schemas/caption.py`](backend/app/schemas/caption.py) cross the wire.
399
+
400
+ ### UX, error handling, and loading states
401
+
402
+ - **Loading** — the Generate button shows the shared [`Spinner`](frontend/src/components/Spinner.jsx) and disables itself for the entire request; the upload zone is locked in parallel so a re-drop cannot race the in-flight POST.
403
+ - **Errors** — every failure surfaces through `ErrorBanner` with copy specific to its `ApiError.kind`. Network/CORS failures, request timeouts, and `4xx` / `5xx` payloads each map to a distinct, actionable message.
404
+ - **Status awareness** — when the backend is down, `StatusBadge` flips to red within one poll cycle; when it comes back, the badge recovers automatically without a page reload, and a fresh `/healthz` is also fired on window focus.
405
+ - **Responsive layout** — Tailwind v4's grid (`lg:grid-cols-5`) drops to a single column under the `lg` breakpoint, preserving the upload → preview → result flow on tablet and phone widths. The sticky header keeps the live status badge visible while scrolling.
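The status-awareness behaviour in the list above amounts to a small state machine, which can be sketched as a pure transition function. The event names `probe_ok` and `probe_failed`, the function name `reduceStatus`, and the choice to keep the previous state while a re-probe is in flight are assumptions; only the three states and the 10-second interval come from this README.

```javascript
// Hypothetical sketch of the badge state machine; not the real StatusBadge.jsx.
const POLL_INTERVAL_MS = 10000;    // 10 s between /healthz probes
const INITIAL_STATUS = "checking"; // shown until the first probe settles

// Pure transition: the next state depends only on the latest probe outcome.
function reduceStatus(status, event) {
  switch (event) {
    case "probe_ok":
      return "online";
    case "probe_failed":
      return "offline";
    default:
      return status; // unknown events leave the badge unchanged
  }
}
```

Because any successful probe moves the badge to `online` regardless of the previous state, recovery after an outage needs no special case: the next poll, or a probe fired on window focus, flips it back.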
406
+
407
+ ### Environment configuration
408
+
409
+ ```bash
410
+ # frontend/.env (gitignored) — overrides the default backend origin
411
+ VITE_API_BASE=http://127.0.0.1:8000
412
+ ```
413
+
414
+ The variable is read once at module load and stripped of any trailing slash. Absent the variable, the client falls back to `http://127.0.0.1:8000`; production builds set the variable at build time so the SPA can ship as static assets to Vercel, Cloudflare Pages, HuggingFace Spaces, or any CDN.
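The fallback and trailing-slash handling described above can be sketched as a pure function. The name `resolveApiBase` and the `env` parameter (standing in for `import.meta.env`) are assumptions so the example runs outside Vite; treating an empty or whitespace-only value as absent is also an assumption.

```javascript
// Hypothetical sketch; in the SPA the argument would be import.meta.env.
const DEFAULT_API_BASE = "http://127.0.0.1:8000";

function resolveApiBase(env) {
  const raw =
    env && typeof env.VITE_API_BASE === "string" ? env.VITE_API_BASE.trim() : "";
  // Assumption: an empty value is treated the same as a missing one.
  const base = raw === "" ? DEFAULT_API_BASE : raw;
  // Strip trailing slashes so `${base}/v1/captions` never contains "//".
  return base.replace(/\/+$/, "");
}

// Example: a configured origin with a trailing slash is normalised.
const API_BASE = resolveApiBase({ VITE_API_BASE: "https://api.example.com/" });
// API_BASE === "https://api.example.com"
```

Resolving the value once at module load keeps every request builder in the service layer working from the same normalised origin.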
415
+
416
+ ### Production deployment readiness
417
+
418
+ - **Static-asset build** — `npm run build` emits a hash-named bundle under `frontend/dist/` that any static host can serve; no runtime Node process is required.
419
+ - **Origin pinning** — the CORS allow-list in `configs/base.yaml` plus `VITE_API_BASE` at build time tie a given SPA build to a specific backend origin without a runtime config endpoint.
420
+ - **No secrets in the client** — the SPA carries no API keys; the only network surface it depends on is `/healthz` and `/v1/captions` on the configured backend.
421
+ - **Lint-clean** — `npm run lint` (flat ESLint config with `eslint-plugin-react-hooks` and `eslint-plugin-react-refresh`) runs alongside the Python tooling.
422
+
423
+ ```bash
424
+ # Development server (Vite + HMR on :5173)
425
+ cd frontend
426
+ npm install
427
+ npm run dev
428
+
429
+ # Production build + local preview of the built bundle
430
+ npm run build
431
+ npm run preview
432
+ ```
433
+
434
+ ---
435
+
436
  ## Configuration system
437
 
438
  Hyperparameters are not globals. They live in YAML files validated by Pydantic v2 `BaseSettings`:
 
525
 
526
  - **Phase 1b** — beam search, CIDEr / METEOR / ROUGE-L, masked accuracy parity-fix, label smoothing, warmup + cosine LR schedule.
527
  - **Phase 2A** ✅ — FastAPI backend, lifespan-managed predictor singleton, multipart inference endpoint, structured logging + request IDs, Pydantic schemas, Swagger/OpenAPI docs, health/readiness probe.
528
+ - **Phase 2B** ✅ — React 19 + Vite 8 + Tailwind v4 SPA, drag/drop upload UX, live API integration against `POST /v1/captions`, env-driven `VITE_API_BASE`, `AbortController` timeouts, typed `ApiError` classification, polled health badge with auto-recovery, CORS allow-list wired through the backend YAML config.
529
+ - **Phase 2C** — Deployment integration: HuggingFace Spaces backend, Vercel-hosted frontend, production CORS allow-list, GitHub Actions CI/CD across both packages.
530
  - **Phase 3** — Tier-1 multimodal upgrades: BLIP-base / ViT-GPT2 / GIT-base-coco side-by-side comparison demo with per-model BLEU + latency.
531
  - **Phase 4** — Sentry, Prometheus, DagsHub-hosted MLflow link, Architecture Decision Records (`docs/adr/`).
532
  - **Future work** — ViT + Transformer fine-tune on COCO; VLM API integration (Anthropic Claude vision) behind a feature flag; VQA endpoint.
 
541
  - Swagger/OpenAPI testing — interactive `/docs` UI for hand-testing every endpoint, raw `/openapi.json` for client codegen.
542
  - Structured logging — JSON in production, pretty in dev; per-request UUIDs threaded through every log line.
543
  - End-to-end image upload → caption flow — multipart upload → content-type guard → image decode → predictor → typed response with latency + request ID.
544
+ - End-to-end browser inference workflow — React 19 + Vite 8 SPA under [`frontend/`](frontend/) wired to `POST /v1/captions`; drag/drop or click-to-browse upload, live caption + latency + request ID display.
545
+ - Drag/drop upload UI — JPEG / PNG / WebP, 10 MB cap, keyboard-activatable (`Enter` / `Space`), client-side validation mirrored from the backend so error wording stays consistent.
546
+ - Live frontend-backend integration — typed `ApiError` boundary, `AbortController` timeouts (3 s health / 60 s caption), CORS allow-list aligned with `serve.cors_allowed_origins`.
547
+ - Polled health surface — `StatusBadge` reads `/healthz` every 10 s plus on window focus; recovers automatically without page reload when the backend comes back.
548
+ - Responsive Tailwind v4 inference interface — single-column layout under the `lg` breakpoint, sticky header with live status, modular component split under [`frontend/src/components/`](frontend/src/components/).
549
+ - Typed API communication — SPA consumes the same Pydantic `CaptionResponse` shape the backend emits; caption, `model_version`, `decode_strategy`, `latency_ms`, and `request_id` render directly from the wire payload.
550
+ - Production-style frontend architecture — dedicated [`services/api.js`](frontend/src/services/api.js) boundary, env-driven `VITE_API_BASE` with safe fallback, lint-clean flat ESLint config, static-asset build via `npm run build`.
551
 
552
  ---
553