fix: pin dropdown label backgrounds light (no dark FMS Test header)

by BladeSzaSza - opened 22 days ago

base: refs/heads/main ← from: refs/pr/8


+7852

-6906

Files changed (48)

.hfignore +37 -37
CLAUDE.md +199 -192
README.md +118 -118
app.py +11 -11
docs/superpowers/plans/2026-06-09-pose-model-selector.md +734 -734
docs/superpowers/plans/2026-06-09-pose-visualizer.md +914 -914
docs/superpowers/plans/2026-06-13-full-fms-session-pdf.md +1209 -1209
docs/superpowers/specs/2026-06-09-pose-model-selector-design.md +171 -171
docs/superpowers/specs/2026-06-09-pose-visualizer-design.md +197 -197
docs/superpowers/specs/2026-06-13-full-fms-session-pdf-design.md +154 -154
formscout/agents/classifier.py +102 -102
formscout/agents/ingest.py +7 -28
formscout/agents/judge.py +125 -136
formscout/agents/pdf_report.py +175 -115
formscout/agents/pose2d.py +232 -232
formscout/agents/report.py +139 -139
formscout/agents/visualizer.py +418 -435
formscout/analysis/__init__.py +1 -0
formscout/analysis/charts.py +171 -0
formscout/analysis/laban.py +127 -0
formscout/analysis/relevant_joints.py +122 -0
formscout/analysis/timeseries.py +49 -0
formscout/config.py +15 -3
formscout/pipeline.py +111 -111
formscout/rubric/__init__.py +32 -32
formscout/rubric/active_slr.py +51 -51
formscout/rubric/hurdle_step.py +60 -60
formscout/rubric/inline_lunge.py +58 -58
formscout/rubric/rotary_stability.py +56 -56
formscout/rubric/shoulder_mobility.py +46 -46
formscout/rubric/trunk_stability_pushup.py +55 -55
formscout/serving/__init__.py +20 -0
formscout/serving/llama_cpp.py +148 -174
formscout/serving/transformers_vlm.py +116 -0
formscout/session.py +283 -194
formscout/startup.py +47 -47
formscout/types.py +3 -0
formscout/ui/theme.py +272 -250
requirements.txt +3 -1
scripts/hf_upload.sh +97 -97
scripts/serve_judge.sh +35 -35
tests/test_analysis.py +145 -0
tests/test_judge_backend.py +75 -0
tests/test_keyframe.py +37 -37
tests/test_pdf_report.py +51 -51
tests/test_phase2.py +354 -354
tests/test_session.py +94 -94
tests/test_visualizer.py +176 -176

.hfignore CHANGED

@@ -1,37 +1,37 @@
-# Python
-__pycache__/
-*.py[cod]
-*.egg-info/
-dist/
-build/
-.eggs/
-*.egg
-# Virtual environments
-.venv/
-venv/
-env/
-# Secrets / local config
-.env
-.env.*
-# Model weights (managed separately)
-checkpoints/
-*.pt
-*.pth
-*.gguf
-*.bin
-# Run artifacts
-traces/
-*.mp4
-# Dev tooling
-.pytest_cache/
-.ruff_cache/
-.DS_Store
-.claude/
-# Git
-.git/

+# Python
+__pycache__/
+*.py[cod]
+*.egg-info/
+dist/
+build/
+.eggs/
+*.egg
+# Virtual environments
+.venv/
+venv/
+env/
+# Secrets / local config
+.env
+.env.*
+# Model weights (managed separately)
+checkpoints/
+*.pt
+*.pth
+*.gguf
+*.bin
+# Run artifacts
+traces/
+*.mp4
+# Dev tooling
+.pytest_cache/
+.ruff_cache/
+.DS_Store
+.claude/
+# Git
+.git/
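
Reviewer sketch (not part of this PR): CLAUDE.md in this same PR states that the `hf` CLI does not read `.hfignore` and that `scripts/hf_upload.sh` parses the file into `--exclude` globs. The Python below is a minimal illustration of that idea under stated assumptions. The function names `hfignore_to_exclude_globs` and `is_excluded` are hypothetical, the rule that a trailing `/` becomes `dir/*` is an assumption, and matching is approximated with the standard library's `fnmatch`. It is not the shell script's actual logic.

```python
from fnmatch import fnmatchcase


def hfignore_to_exclude_globs(text):
    """Turn .hfignore-style text into a list of exclude globs (hypothetical helper)."""
    globs = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines such as "# Python".
        if not line or line.startswith("#"):
            continue
        # Assumption: a directory pattern like ".venv/" excludes everything under it.
        globs.append(line + "*" if line.endswith("/") else line)
    return globs


def is_excluded(path, globs):
    """Return True if a relative path matches any glob (fnmatch-style, assumed)."""
    return any(fnmatchcase(path, g) for g in globs)


sample = "# Python\n__pycache__/\n*.py[cod]\n\n.env\n.venv/\n"
globs = hfignore_to_exclude_globs(sample)
print(globs)
```

A preflight like the one CLAUDE.md describes could then count the files for which `is_excluded` is false before choosing an upload mode.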

CLAUDE.md CHANGED

@@ -1,192 +1,199 @@
-# CLAUDE.md
-This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
-## Project overview
-FormScout is a Gradio app (Hugging Face Space) that scores Functional Movement Screen (FMS) videos 0–3 per test with a written rationale and an annotated overlay. It is a **screening aid** — not a diagnosis, not an injury predictor. Built for the Build Small Hackathon (Backyard AI track). Full product spec is in `docs/FormScout-FMS-Spec.md`; the engineering contract is in `docs/plans/FormScout-Build-Prompt.md`.
-**Current status:** Phase 2 complete. All 7 FMS test rubric scorers, JudgeAgent, MovementClassifierAgent, ReportAgent, PoseVisualizer (overlay video), and a user-selectable pose-model registry are implemented and tested (86/87 passing). Phase 3 is next (ST-GCN fine-tune + RAG retrieval).
-## Common commands
-```bash
-# Run the Gradio app locally
-python3 app.py
-# Headless pipeline test (no Gradio)
-python3 -m formscout.run sample.mp4
-# Run all tests
-pytest tests/
-# Run a single test file or test
-pytest tests/test_phase2.py
-pytest tests/test_biomechanics.py::TestBiomechanicsAgent::test_deep_squat_score
-# Lint / format
-ruff check . && ruff format .
-# Start the local VLM judge server (llama.cpp, port 8080)
-./scripts/serve_judge.sh
-# Push source tree to the HF model repo + Space (PRs; message from last commit)
-./scripts/hf_upload.sh
-# Run Svelte component tests (when frontend work is added)
-npx vitest run
-```
-## Architecture
-The pipeline is a sequence of **typed specialist agents**. Each agent accepts and returns a frozen dataclass from `formscout/types.py`. The Director in `formscout/pipeline.py` orchestrates them as a deterministic state machine (not an LLM).
-### Agent pipeline
-```
-IngestAgent → Pose2DAgent → [Body3DAgent — optional]
-→ MovementClassifierAgent → BiomechanicsAgent
-→ rubric/score_test() → JudgeAgent → ReportAgent
-```
-The **Director** (`pipeline.py`) owns the flow. `app.py` creates one `Director()` instance and calls `director.run(video_path, test_name, side, model_key)` per submission. The Gradio UI passes `test_name` directly (from dropdown), bypassing the classifier; `model_key` selects the pose backend from `config.POSE_MODELS`.
-`PoseVisualizer` (`formscout/agents/visualizer.py`) renders the annotated overlay video (skeleton, trails, velocity arrows) from `IngestResult` + `Pose2DResult`. It is called from `app.py` after the pipeline run — it is a UI-layer component, not a Director stage. It returns `None` on failure, never raises.
-### The tiering rule (most important invariant)
-**The 2D path is the default and must stand alone as a complete, functional pipeline.** `Body3DAgent` is only activated when `config.ENABLE_3D == True` AND the checkpoint loads successfully. If 3D is off or fails, `Body3DResult(used=False, ...)` is returned — this is a normal success path, not an error. `BiomechFeatures.view` is `"2d"` or `"3d"` so the `JudgeAgent` can caveat its rationale appropriately. Never put `Body3DAgent` on the critical path.
-### Feature flags in `config.py` and their current state
-| Flag | Default | Meaning |
-|------|---------|---------|
-| `ENABLE_JUDGE` | `True` | Judge/Classifier call Qwen3-VL via llama-server; graceful rubric fallback when the server is down |
-| `ENABLE_3D` | `False` | When False, Body3DAgent returns `used=False` immediately |
-| `ENABLE_STGCN` | `False` | Phase 3 — ST-GCN learned scoring head |
-| `ENABLE_RAG` | `False` | Phase 3 — RetrievalAgent exemplar lookup |
-All model IDs, thresholds, k-values, and feature flags live in `config.py` — never scattered literals.
-### Fallback chain (important for local dev and Spaces)
-1. `ENABLE_JUDGE=False` → JudgeAgent returns rubric score wrapped as JudgeResult (no VLM needed)
-2. `ENABLE_JUDGE=True` + llama.cpp server unreachable → same fallback, logs a warning
-3. `ENABLE_JUDGE=True` + server available → calls Qwen3-VL-8B-Instruct at `127.0.0.1:8080`
-Start the VLM server with `scripts/serve_judge.sh` (downloads live in `checkpoints/qwen3-vl/`, gitignored). To use a fine-tuned GGUF, set `FORMSCOUT_JUDGE_GGUF` (and `FORMSCOUT_JUDGE_MMPROJ` if it ships its own projector) — no code change needed. Multimodal requests go through the OpenAI-compatible `/v1/chat/completions` endpoint (the legacy `/completion` + `image_data` path does not work with modern llama-server).
-This means the app is **fully functional without any GPU or llama.cpp** — rubric scoring is pure Python.
-### Rubric scorers
-Each FMS test has a pure-function scorer in `formscout/rubric/`:
-```
-score_deep_squat / score_hurdle_step / score_inline_lunge /
-score_shoulder_mobility / score_active_slr /
-score_trunk_stability_pushup / score_rotary_stability
-```
-All accept `BiomechFeatures` and return `ScoreResult`. Dispatch via `rubric.score_test(features)`. **Rubric functions must remain pure** — no model calls, no I/O.
-### Bilateral tests
-`hurdle_step`, `inline_lunge`, `shoulder_mobility`, `active_slr` are bilateral. `ReportAgent` groups them by test name, takes the **lower** score, and always emits the asymmetry delta even when scores are equal. `composite` is `None` when any test is unscored.
-### Types contract
-Every agent I/O is a frozen dataclass from `formscout/types.py`. Key types:
-- `IngestResult` — decoded frames (np.ndarray list), fps, duration, dimensions
-- `Pose2DResult` — per-frame keypoints as `dict[int, {x, y, conf}]` (COCO 17 joints)
-- `Body3DResult` — optional 3D joints, always has `used: bool`
-- `MovementResult` — `test_name` (validated enum), `side` ("left"|"right"|"na")
-- `BiomechFeatures` — `angles: dict`, `alignments: dict`, `view: "2d"|"3d"`, `symmetry_delta`
-- `ScoreResult` — `score: int` (0–3), `rationale`, `needs_human`
-- `JudgeResult` — same as ScoreResult + `compensation_tags`, `corrective_hint`; `score=None` when `needs_human=True`
-- `PipelineState` — mutable accumulator threaded through the Director
-`MovementResult` and `JudgeResult` validate their fields in `__post_init__` — passing invalid values raises immediately.
-### Pose model selection and checkpoints
-`config.POSE_MODELS` is a registry of pose backends: MediaPipe (CPU-friendly), five YOLO26 sizes (n/s/m/l/x), and Sapiens2 variants (Phase 3, need the custom `sapiens` repo installed). `config.DEFAULT_POSE_MODEL` is YOLO26n. The Gradio UI exposes a dropdown built from `config.available_pose_models()` (filters to checkpoints actually present) and passes the chosen `model_key` through `Director.run` to `Pose2DAgent`. `config.YOLO_POSE_MODEL` is a backward-compat alias only.
-Checkpoints are **not** committed (`checkpoints/` is gitignored). `formscout/startup.py:ensure_checkpoints()` downloads missing YOLO26/MediaPipe files from the `silas-therapy/formscout-checkpoints` HF repo once at app startup. Models load once per process and are cached — never inside the inference hot path.
-### llama.cpp serving
-`formscout/serving/llama_cpp.py` provides `LlamaCppClient` (VLM, port 8080) and `EmbeddingClient` (embeddings, port 8081). Both check `/health` before use and return safe error dicts when unavailable. Only active when the corresponding `ENABLE_*` flag is True.
-### Deploying to Hugging Face
-The repo deploys to both `silas-therapy/small-functional-movement-screening` (model repo) and the Space of the same name (README frontmatter is the Space config). Use `./scripts/hf_upload.sh` — never raw `hf upload .`: the `hf` CLI does **not** read `.hfignore`, so a raw upload hashes the entire `.venv` (~44k files) and pushes torch binaries. The script parses `.hfignore` into `--exclude` globs, preflights the file count, creates PRs on both repos, and auto-switches to `hf upload-large-folder` (resumable, but no PR / no commit message) above 500 files.
-## Key constraints and invariants
-- **No cloud model APIs.** All inference runs on-Space (ZeroGPU). No OpenAI/Anthropic/Gemini calls.
-- **Pain is never auto-scored.** Any clearing test or visible distress sets `needs_human=True` — enforced in rubric functions and JudgeAgent. `JudgeResult.score` must be `None` when `needs_human=True`.
-- **Quality gates (Director, never silently skip):**
-  - Any agent `confidence < config.MIN_CONFIDENCE` (0.6) → warn or stop
-  - `|rubric.score - judge.score| >= 1` → flag disagreement
-  - `MovementResult.test_name == "unknown"` → stop pipeline, surface manual override
-  - `JudgeAgent.needs_human == True` → no numeric score emitted
-- **Composite is null** when any test is unscored. Never show a partial 0–21 as complete.
-- **Pipeline runs headless.** No Gradio imports in any agent file.
-- **Safety banner** ("Screening aid — not a diagnosis…") must always be visible in the UI — appears at top and bottom of `app.py`.
-## Engineering standards
-- Every agent: one public entrypoint, typed dataclass I/O from `types.py`, `confidence: float` and `notes: str` on every result.
-- Models load once at module/instance init — never inside the inference hot path.
-- Every agent module docstring states: purpose, inputs, outputs, failure behavior, model param count, license, and gated status.
-- `tracing.py` records structured per-agent I/O for any run; one full run gets exported to the Hub.
-- Every agent ships with a pytest in `tests/` that runs without model downloads and asserts the typed contract.
-## Model stack (~17.6B total — stay under 32B)
-| Component | Model | Params | Status |
-|---|---|---|---|
-| 2D pose (primary) | YOLO26-Pose n/s/m/l/x (default: n) | 0.0007–0.058B | Ready (auto-downloaded at startup) |
-| 2D pose (CPU alt) | MediaPipe Pose Landmarker (full) | ~0.004B | Ready (auto-downloaded at startup) |
-| 2D pose (HQ alt) | `facebook/sapiens2-pose-0.4b/0.8b/1b/5b` | 0.4–5B | Phase 3 — needs custom `sapiens` repo |
-| Segmentation | SAM 3.1 base | ~0.85B | Access accepted |
-| 3D biomechanics | `facebook/sam-3d-body-dinov3` | ~0.84B | **Access ACCEPTED Jun 4 2026** |
-| Learned scoring | ST-GCN (pyskl) | ~0.03B | Phase 3 |
-| Judge + Classifier | Qwen3-VL-8B-Instruct (llama.cpp) | 8B | **Online** — `scripts/serve_judge.sh`, ENABLE_JUDGE=True |
-| Retrieval | Qwen3-VL-Embedding-8B (llama.cpp) | 8B | Phase 3 |
-Track the running sum in `MODEL_BUDGET.md`. The two Qwen3-VL-8B models share a backbone.
-## Gradio + Svelte UI guidance
-The UI uses **Gradio `gr.Blocks`** with custom CSS/theme (`formscout/ui/theme.py`). Custom Svelte components for score dial, asymmetry bars, rubric drawer are planned for Phase 4. Use `gradio-svelte-expert` agent for Svelte component work.
-- ZeroGPU: wrap heavy inference (`Pose2DAgent.run`, `Body3DAgent.run`) in `@spaces.GPU` before deploying to Spaces.
-- Verify Gradio APIs against current docs before use — pin exact versions in `requirements.txt`.
-## Build phases
-1. **Phase 0 — Recon:** ✅ Complete. See `RECON.md`.
-2. **Phase 1 — Spine:** ✅ Complete. Deep Squat end-to-end.
-3. **Phase 2 — All 7 tests:** ✅ Complete. Classifier, Judge, Report agents; all rubric scorers; Gradio UI.
-4. **Phase 3 — Learned scoring + retrieval:** ST-GCN fine-tune on physio clips, publish to Hub. RetrievalAgent with embedding index.
-5. **Phase 4 — Polish + ship:** Custom Svelte UI components, agent trace to Hub, blog post. (Overlay video done via `PoseVisualizer`; full 7-test session + PDF export done via `formscout/session.py` + `PdfReportAgent`.)
-## Known issues
-- `tests/test_biomechanics.py::TestBiomechanicsAgent::test_unimplemented_test_returns_low_confidence` fails: expects `"not yet implemented"` in `result.notes` but biomechanics returns empty string. Minor — low priority.
-## Badge checklist (definition of done)
-- [ ] Space runs green; upload → scorecard works on real clips
-- [ ] Param sum verified ≤ 32B in `MODEL_BUDGET.md`
-- [ ] 🔌 **Off the Grid** — no cloud model APIs anywhere in the pipeline
-- [ ] 🎯 **Well-Tuned** — fine-tuned ST-GCN head published to Hub with honest model card
-- [ ] 🎨 **Off-Brand** — custom, non-default Gradio UI (scout/trail theme)
-- [ ] 🦙 **Llama Champion** — VLM + embedder served via llama.cpp (GGUF)
-- [ ] 📡 **Sharing is Caring** — one full agent trace (all I/O) published to Hub
-- [ ] 📓 **Field Notes** — blog post written, honesty section (FMS limitations) front-and-center
-- [ ] Demo video + social post recorded
-- [ ] Safety banner present; pain/clearing never auto-scored; low-confidence flagged

+# CLAUDE.md
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+## Project overview
+FormScout is a Gradio app (Hugging Face Space) that scores Functional Movement Screen (FMS) videos 0–3 per test with a written rationale and an annotated overlay. It is a **screening aid** — not a diagnosis, not an injury predictor. Built for the Build Small Hackathon (Backyard AI track). Full product spec is in `docs/FormScout-FMS-Spec.md`; the engineering contract is in `docs/plans/FormScout-Build-Prompt.md`.
+**Current status:** Phase 2 complete. All 7 FMS test rubric scorers, JudgeAgent, MovementClassifierAgent, ReportAgent, PoseVisualizer (overlay video), and a user-selectable pose-model registry are implemented and tested (86/87 passing). Phase 3 is next (ST-GCN fine-tune + RAG retrieval).
+## Common commands
+```bash
+# Run the Gradio app locally
+python3 app.py
+# Headless pipeline test (no Gradio)
+python3 -m formscout.run sample.mp4
+# Run all tests
+pytest tests/
+# Run a single test file or test
+pytest tests/test_phase2.py
+pytest tests/test_biomechanics.py::TestBiomechanicsAgent::test_deep_squat_score
+# Lint / format
+ruff check . && ruff format .
+# Start the local VLM judge server (llama.cpp, port 8080)
+./scripts/serve_judge.sh
+# Push source tree to the HF model repo + Space (PRs; message from last commit)
+./scripts/hf_upload.sh
+# Run Svelte component tests (when frontend work is added)
+npx vitest run
+```
+## Architecture
+The pipeline is a sequence of **typed specialist agents**. Each agent accepts and returns a frozen dataclass from `formscout/types.py`. The Director in `formscout/pipeline.py` orchestrates them as a deterministic state machine (not an LLM).
+### Agent pipeline
+```
+IngestAgent → Pose2DAgent → [Body3DAgent — optional]
+→ MovementClassifierAgent → BiomechanicsAgent
+→ rubric/score_test() → JudgeAgent → ReportAgent
+```
+The **Director** (`pipeline.py`) owns the flow. `app.py` creates one `Director()` instance and calls `director.run(video_path, test_name, side, model_key)` per submission. The Gradio UI passes `test_name` directly (from dropdown), bypassing the classifier; `model_key` selects the pose backend from `config.POSE_MODELS`.
+`PoseVisualizer` (`formscout/agents/visualizer.py`) renders the annotated overlay video (skeleton, trails, velocity arrows) from `IngestResult` + `Pose2DResult`. It is called from `app.py` after the pipeline run — it is a UI-layer component, not a Director stage. It returns `None` on failure, never raises.
+### The tiering rule (most important invariant)
+**The 2D path is the default and must stand alone as a complete, functional pipeline.** `Body3DAgent` is only activated when `config.ENABLE_3D == True` AND the checkpoint loads successfully. If 3D is off or fails, `Body3DResult(used=False, ...)` is returned — this is a normal success path, not an error. `BiomechFeatures.view` is `"2d"` or `"3d"` so the `JudgeAgent` can caveat its rationale appropriately. Never put `Body3DAgent` on the critical path.
+### Feature flags in `config.py` and their current state
+| Flag | Default | Meaning |
+|------|---------|---------|
+| `ENABLE_JUDGE` | `True` | Judge/Classifier call Qwen3-VL via llama-server; graceful rubric fallback when the server is down |
+| `ENABLE_3D` | `False` | When False, Body3DAgent returns `used=False` immediately |
+| `ENABLE_STGCN` | `False` | Phase 3 — ST-GCN learned scoring head |
+| `ENABLE_RAG` | `False` | Phase 3 — RetrievalAgent exemplar lookup |
+All model IDs, thresholds, k-values, and feature flags live in `config.py` — never scattered literals.
+### Judge backend selection (local vs Space)
+`config.resolve_judge_backend()` picks the VLM backend via `FORMSCOUT_JUDGE_BACKEND` (`llama_cpp` | `transformers` | `auto`). `auto` (default) uses **llama-server locally** and the **in-process transformers backend on a Space** (detected via `SPACE_ID`). `JudgeAgent` gets its client from `serving.get_vlm_client()`.
+- **`llama_cpp`** — `LlamaCppClient` → llama-server at `127.0.0.1:8080` (start with `scripts/serve_judge.sh`). The local path; works perfectly.
+- **`transformers`** — `TransformersVLMClient` loads Qwen3-VL-8B via transformers, GPU-wrapped with `spaces.GPU` (ZeroGPU). Lazy model load, cached per process. On any load/inference failure it returns `{"fallback": True}` and the Judge falls back to the rubric. **Needs validation on real ZeroGPU hardware** — not exercised in CPU tests.
+### Fallback chain (important for local dev and Spaces)
+1. `ENABLE_JUDGE=False` → JudgeAgent returns rubric score wrapped as JudgeResult (no VLM needed)
+2. `ENABLE_JUDGE=True` + selected backend unavailable / transformers load fails → same rubric fallback, logs a warning
+3. `ENABLE_JUDGE=True` + backend available → calls Qwen3-VL-8B-Instruct (llama-server locally, transformers/ZeroGPU on a Space)
+Start the VLM server with `scripts/serve_judge.sh` (downloads live in `checkpoints/qwen3-vl/`, gitignored). To use a fine-tuned GGUF, set `FORMSCOUT_JUDGE_GGUF` (and `FORMSCOUT_JUDGE_MMPROJ` if it ships its own projector) — no code change needed. Multimodal requests go through the OpenAI-compatible `/v1/chat/completions` endpoint (the legacy `/completion` + `image_data` path does not work with modern llama-server).
+This means the app is **fully functional without any GPU or llama.cpp** — rubric scoring is pure Python.
+### Rubric scorers
+Each FMS test has a pure-function scorer in `formscout/rubric/`:
+```
+score_deep_squat / score_hurdle_step / score_inline_lunge /
+score_shoulder_mobility / score_active_slr /
+score_trunk_stability_pushup / score_rotary_stability
+```
+All accept `BiomechFeatures` and return `ScoreResult`. Dispatch via `rubric.score_test(features)`. **Rubric functions must remain pure** — no model calls, no I/O.
+### Bilateral tests
+`hurdle_step`, `inline_lunge`, `shoulder_mobility`, `active_slr` are bilateral. `ReportAgent` groups them by test name, takes the **lower** score, and always emits the asymmetry delta even when scores are equal. `composite` is `None` when any test is unscored.
+### Types contract
+Every agent I/O is a frozen dataclass from `formscout/types.py`. Key types:
+- `IngestResult` — decoded frames (np.ndarray list), fps, duration, dimensions
+- `Pose2DResult` — per-frame keypoints as `dict[int, {x, y, conf}]` (COCO 17 joints)
+- `Body3DResult` — optional 3D joints, always has `used: bool`
+- `MovementResult` — `test_name` (validated enum), `side` ("left"|"right"|"na")
+- `BiomechFeatures` — `angles: dict`, `alignments: dict`, `view: "2d"|"3d"`, `symmetry_delta`
+- `ScoreResult` — `score: int` (0–3), `rationale`, `needs_human`
+- `JudgeResult` — same as ScoreResult + `compensation_tags`, `corrective_hint`; `score=None` when `needs_human=True`
+- `PipelineState` — mutable accumulator threaded through the Director
+`MovementResult` and `JudgeResult` validate their fields in `__post_init__` — passing invalid values raises immediately.
+### Pose model selection and checkpoints
+`config.POSE_MODELS` is a registry of pose backends: MediaPipe (CPU-friendly), five YOLO26 sizes (n/s/m/l/x), and Sapiens2 variants (Phase 3, need the custom `sapiens` repo installed). `config.DEFAULT_POSE_MODEL` is YOLO26n. The Gradio UI exposes a dropdown built from `config.available_pose_models()` (filters to checkpoints actually present) and passes the chosen `model_key` through `Director.run` to `Pose2DAgent`. `config.YOLO_POSE_MODEL` is a backward-compat alias only.
+Checkpoints are **not** committed (`checkpoints/` is gitignored). `formscout/startup.py:ensure_checkpoints()` downloads missing YOLO26/MediaPipe files from the `silas-therapy/formscout-checkpoints` HF repo once at app startup. Models load once per process and are cached — never inside the inference hot path.
+### llama.cpp serving
+`formscout/serving/llama_cpp.py` provides `LlamaCppClient` (VLM, port 8080) and `EmbeddingClient` (embeddings, port 8081). Both check `/health` before use and return safe error dicts when unavailable. Only active when the corresponding `ENABLE_*` flag is True.
+### Deploying to Hugging Face
+The repo deploys to both `silas-therapy/small-functional-movement-screening` (model repo) and the Space of the same name (README frontmatter is the Space config). Use `./scripts/hf_upload.sh` — never raw `hf upload .`: the `hf` CLI does **not** read `.hfignore`, so a raw upload hashes the entire `.venv` (~44k files) and pushes torch binaries. The script parses `.hfignore` into `--exclude` globs, preflights the file count, creates PRs on both repos, and auto-switches to `hf upload-large-folder` (resumable, but no PR / no commit message) above 500 files.
+## Key constraints and invariants
+- **No cloud model APIs.** All inference runs on-Space (ZeroGPU). No OpenAI/Anthropic/Gemini calls.
+- **Pain is never auto-scored.** Any clearing test or visible distress sets `needs_human=True` — enforced in rubric functions and JudgeAgent. `JudgeResult.score` must be `None` when `needs_human=True`.
+- **Quality gates (Director, never silently skip):**
+  - Any agent `confidence < config.MIN_CONFIDENCE` (0.6) → warn or stop
+  - `|rubric.score - judge.score| >= 1` → flag disagreement
+  - `MovementResult.test_name == "unknown"` → stop pipeline, surface manual override
+  - `JudgeAgent.needs_human == True` → no numeric score emitted
+- **Composite is null** when any test is unscored. Never show a partial 0–21 as complete.
+- **Pipeline runs headless.** No Gradio imports in any agent file.
+- **Safety banner** ("Screening aid — not a diagnosis…") must always be visible in the UI — appears at top and bottom of `app.py`.
+## Engineering standards
+- Every agent: one public entrypoint, typed dataclass I/O from `types.py`, `confidence: float` and `notes: str` on every result.
+- Models load once at module/instance init — never inside the inference hot path.
+- Every agent module docstring states: purpose, inputs, outputs, failure behavior, model param count, license, and gated status.
+- `tracing.py` records structured per-agent I/O for any run; one full run gets exported to the Hub.
+- Every agent ships with a pytest in `tests/` that runs without model downloads and asserts the typed contract.
+## Model stack (~17.6B total — stay under 32B)
+| Component | Model | Params | Status |
+|---|---|---|---|
+| 2D pose (primary) | YOLO26-Pose n/s/m/l/x (default: n) | 0.0007–0.058B | Ready (auto-downloaded at startup) |
+| 2D pose (CPU alt) | MediaPipe Pose Landmarker (full) | ~0.004B | Ready (auto-downloaded at startup) |
+| 2D pose (HQ alt) | `facebook/sapiens2-pose-0.4b/0.8b/1b/5b` | 0.4–5B | Phase 3 — needs custom `sapiens` repo |
+| Segmentation | SAM 3.1 base | ~0.85B | Access accepted |
+| 3D biomechanics | `facebook/sam-3d-body-dinov3` | ~0.84B | **Access ACCEPTED Jun 4 2026** |
+| Learned scoring | ST-GCN (pyskl) | ~0.03B | Phase 3 |
+| Judge + Classifier | Qwen3-VL-8B-Instruct (llama.cpp) | 8B | **Online** — `scripts/serve_judge.sh`, ENABLE_JUDGE=True |
+| Retrieval | Qwen3-VL-Embedding-8B (llama.cpp) | 8B | Phase 3 |
+Track the running sum in `MODEL_BUDGET.md`. The two Qwen3-VL-8B models share a backbone.
+## Gradio + Svelte UI guidance
+The UI uses **Gradio `gr.Blocks`** with custom CSS/theme (`formscout/ui/theme.py`). Custom Svelte components for score dial, asymmetry bars, rubric drawer are planned for Phase 4. Use `gradio-svelte-expert` agent for Svelte component work.
+- ZeroGPU: wrap heavy inference (`Pose2DAgent.run`, `Body3DAgent.run`) in `@spaces.GPU` before deploying to Spaces.
+- Verify Gradio APIs against current docs before use — pin exact versions in `requirements.txt`.
+## Build phases
+1. **Phase 0 — Recon:** ✅ Complete. See `RECON.md`.
+2. **Phase 1 — Spine:** ✅ Complete. Deep Squat end-to-end.
+3. **Phase 2 — All 7 tests:** ✅ Complete. Classifier, Judge, Report agents; all rubric scorers; Gradio UI.
+4. **Phase 3 — Learned scoring + retrieval:** ST-GCN fine-tune on physio clips, publish to Hub. RetrievalAgent with embedding index.
+5. **Phase 4 — Polish + ship:** Custom Svelte UI components, agent trace to Hub, blog post. (Overlay video done via `PoseVisualizer`; full 7-test session + PDF export done via `formscout/session.py` + `PdfReportAgent`.)
+## Known issues
+- `tests/test_biomechanics.py::TestBiomechanicsAgent::test_unimplemented_test_returns_low_confidence` fails: expects `"not yet implemented"` in `result.notes` but biomechanics returns empty string. Minor — low priority.
+## Badge checklist (definition of done)
+- [ ] Space runs green; upload → scorecard works on real clips
+- [ ] Param sum verified ≤ 32B in `MODEL_BUDGET.md`
+- [ ] 🔌 **Off the Grid** — no cloud model APIs anywhere in the pipeline
+- [ ] 🎯 **Well-Tuned** — fine-tuned ST-GCN head published to Hub with honest model card
+- [ ] 🎨 **Off-Brand** — custom, non-default Gradio UI (scout/trail theme)
+- [ ] 🦙 **Llama Champion** — VLM + embedder served via llama.cpp (GGUF)
+- [ ] 📡 **Sharing is Caring** — one full agent trace (all I/O) published to Hub
+- [ ] 📓 **Field Notes** — blog post written, honesty section (FMS limitations) front-and-center
+- [ ] Demo video + social post recorded
+- [ ] Safety banner present; pain/clearing never auto-scored; low-confidence flagged
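
Reviewer sketch (not part of this PR): the new "Judge backend selection" section says `config.resolve_judge_backend()` reads `FORMSCOUT_JUDGE_BACKEND` (`llama_cpp` | `transformers` | `auto`) and that `auto` picks llama-server locally and the transformers backend on a Space, detected via `SPACE_ID`. The snippet below is one way that rule could be written, assuming nothing about the real implementation in `formscout/config.py`. The function name `resolve_judge_backend_sketch`, the injectable `env` parameter, and the choice to treat unknown values as `auto` are all assumptions made for illustration.

```python
import os

VALID_BACKENDS = ("llama_cpp", "transformers", "auto")


def resolve_judge_backend_sketch(env=None):
    """Pick a judge backend name from environment variables (hypothetical sketch)."""
    env = os.environ if env is None else env
    choice = (env.get("FORMSCOUT_JUDGE_BACKEND") or "auto").strip().lower()
    # Assumption: an unrecognised value is treated like "auto" rather than raising.
    if choice not in VALID_BACKENDS:
        choice = "auto"
    if choice != "auto":
        return choice
    # "auto": a Space is detected by the presence of SPACE_ID, as CLAUDE.md describes.
    return "transformers" if env.get("SPACE_ID") else "llama_cpp"


print(resolve_judge_backend_sketch({}))
print(resolve_judge_backend_sketch({"SPACE_ID": "user/space"}))
```

Passing a plain dict as `env` keeps the rule testable without touching the real process environment. An explicit `FORMSCOUT_JUDGE_BACKEND` value overrides the `SPACE_ID` check in this sketch.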

README.md CHANGED

@@ -1,118 +1,118 @@
----
-title: FormScout
-emoji: 🏔️
-colorFrom: green
-colorTo: green
-sdk: gradio
-app_file: app.py
-pinned: false
-license: apache-2.0
-short_description: FMS video scoring — movement screen aid
----
-# FormScout
-FMS (Functional Movement Screen) scoring pipeline — a screening aid that scores movement videos 0–3 per test with a written rationale and annotated overlay.
-**⚠️ Screening aid — not a diagnosis. Pain or clearing tests require a clinician.**
-## Running locally
-### 1. Clone and install
-```bash
-git clone https://huggingface.co/silas-therapy/small-functional-movement-screening
-cd small-functional-movement-screening
-python3 -m venv .venv && source .venv/bin/activate
-pip install -r requirements.txt
-```
-### 2. Start the VLM judge (optional but recommended)
-The judge uses Qwen3-VL-8B-Instruct via llama.cpp. Without it the app falls back to the deterministic rubric score — fully functional, no GPU needed.
-```bash
-# Install llama.cpp once
-brew install llama.cpp
-# Download the model (one-time, ~6 GB)
-python3 -c "
-from huggingface_hub import hf_hub_download
-for f in ['Qwen3VL-8B-Instruct-Q4_K_M.gguf', 'mmproj-Qwen3VL-8B-Instruct-F16.gguf']:
-    hf_hub_download('Qwen/Qwen3-VL-8B-Instruct-GGUF', f, local_dir='checkpoints/qwen3-vl')
-"
-# Start the server (keep this terminal open)
-./scripts/serve_judge.sh
-```
-To use a fine-tuned GGUF instead of the default:
-```bash
-FORMSCOUT_JUDGE_GGUF=/path/to/finetuned.gguf ./scripts/serve_judge.sh
-```
-### 3. Launch the Gradio app
-```bash
-python3 app.py
-# → http://127.0.0.1:7860
-```
-Upload a video, select the FMS test from the dropdown, and click **Analyze**.
-### 4. Headless pipeline (no Gradio)
-```bash
-python3 -m formscout.run sample.mp4
-```
-### 5. Tests
-```bash
-pytest tests/ -v
-```
-### 6. Upload to Hugging Face
-```bash
-# Pushes source to both model repo and Space, opens a PR on each
-./scripts/hf_upload.sh
-# Or with a custom commit message
-./scripts/hf_upload.sh "feat: my change"
-```
-## Architecture
-Typed specialist agents orchestrated by a deterministic Director:
-```
-Ingest → Pose2D → [Body3D optional] → Biomechanics → Rubric Score → [Judge] → Report
-```
-| Agent | Model | Status |
-|---|---|---|
-| Pose2D | YOLO26l-Pose (0.026B) + MediaPipe fallback | ✅ |
-| Body3D | SAM 3D Body DINOv3 (0.84B) | gated, off by default |
-| Judge + Classifier | Qwen3-VL-8B-Instruct via llama.cpp (8B) | ✅ |
-| Scoring Head | ST-GCN (0.03B) | Phase 3 |
-| Retrieval | Qwen3-VL-Embedding-8B (8B) | Phase 3 |
-See [CLAUDE.md](CLAUDE.md) for full architecture and invariants.
-## Feature flags (`formscout/config.py`)
-| Flag | Default | Meaning |
-|---|---|---|
-| `ENABLE_JUDGE` | `True` | VLM judge via llama-server; rubric fallback when server is down |
-| `ENABLE_3D` | `False` | SAM 3D Body — off until integrated |
-| `ENABLE_STGCN` | `False` | Phase 3 |
-| `ENABLE_RAG` | `False` | Phase 3 |
-## Model budget
-~18B params total (under 32B cap). See [MODEL_BUDGET.md](MODEL_BUDGET.md).
-## License
-Apache-2.0. Built for the Build Small Hackathon (Backyard AI track).

+---
+title: FormScout
+emoji: 🏔️
+colorFrom: green
+colorTo: yellow
+sdk: gradio
+app_file: app.py
+pinned: false
+license: apache-2.0
+short_description: FMS video scoring — movement screen aid
+---
+# FormScout
+FMS (Functional Movement Screen) scoring pipeline — a screening aid that scores movement videos 0–3 per test with a written rationale and annotated overlay.
+**⚠️ Screening aid — not a diagnosis. Pain or clearing tests require a clinician.**
+## Running locally
+### 1. Clone and install
+```bash
+git clone https://huggingface.co/silas-therapy/small-functional-movement-screening
+cd small-functional-movement-screening
+python3 -m venv .venv && source .venv/bin/activate
+pip install -r requirements.txt
+```
+### 2. Start the VLM judge (optional but recommended)
+The judge uses Qwen3-VL-8B-Instruct via llama.cpp. Without it the app falls back to the deterministic rubric score — fully functional, no GPU needed.
+```bash
+# Install llama.cpp once
+brew install llama.cpp
+# Download the model (one-time, ~6 GB)
+python3 -c "
+from huggingface_hub import hf_hub_download
+for f in ['Qwen3VL-8B-Instruct-Q4_K_M.gguf', 'mmproj-Qwen3VL-8B-Instruct-F16.gguf']:
+    hf_hub_download('Qwen/Qwen3-VL-8B-Instruct-GGUF', f, local_dir='checkpoints/qwen3-vl')
+"
+# Start the server (keep this terminal open)
+./scripts/serve_judge.sh
+```
+To use a fine-tuned GGUF instead of the default:
+```bash
+FORMSCOUT_JUDGE_GGUF=/path/to/finetuned.gguf ./scripts/serve_judge.sh
+```
+### 3. Launch the Gradio app
+```bash
+python3 app.py
+# → http://127.0.0.1:7860
+```
+Upload a video, select the FMS test from the dropdown, and click **Analyze**.
+### 4. Headless pipeline (no Gradio)
+```bash
+python3 -m formscout.run sample.mp4
+```
+### 5. Tests
+```bash
+pytest tests/ -v
+```
+### 6. Upload to Hugging Face
+```bash
+# Pushes source to both model repo and Space, opens a PR on each
+./scripts/hf_upload.sh
+# Or with a custom commit message
+./scripts/hf_upload.sh "feat: my change"
+```
+## Architecture
+Typed specialist agents orchestrated by a deterministic Director:
+```
+Ingest → Pose2D → [Body3D optional] → Biomechanics → Rubric Score → [Judge] → Report
+```
+| Agent | Model | Status |
+|---|---|---|
+| Pose2D | YOLO26l-Pose (0.026B) + MediaPipe fallback | ✅ |
+| Body3D | SAM 3D Body DINOv3 (0.84B) | gated, off by default |
+| Judge + Classifier | Qwen3-VL-8B-Instruct via llama.cpp (8B) | ✅ |
+| Scoring Head | ST-GCN (0.03B) | Phase 3 |
+| Retrieval | Qwen3-VL-Embedding-8B (8B) | Phase 3 |
+See [CLAUDE.md](CLAUDE.md) for full architecture and invariants.
+## Feature flags (`formscout/config.py`)
+| Flag | Default | Meaning |
+|---|---|---|
+| `ENABLE_JUDGE` | `True` | VLM judge via llama-server; rubric fallback when server is down |
+| `ENABLE_3D` | `False` | SAM 3D Body — off until integrated |
+| `ENABLE_STGCN` | `False` | Phase 3 |
+| `ENABLE_RAG` | `False` | Phase 3 |
+## Model budget
+~18B params total (under 32B cap). See [MODEL_BUDGET.md](MODEL_BUDGET.md).
+## License
+Apache-2.0. Built for the Build Small Hackathon (Backyard AI track).

app.py CHANGED

@@ -125,22 +125,22 @@ def _render_score_card(score: int, confidence: float, needs_human: bool) -> str:
     if needs_human:
         return """
         <div class="score-card needs-review">
-            <div style="font-size: 1.2em; color: #fbbf24; margin-bottom: 8px;">⚠️ Needs Clinician Review</div>
-            <div style="font-size: 0.9em; color: #94a3b8;">Pain or clearing test detected — cannot auto-score</div>
         </div>
         """
     conf_pct = int(confidence * 100)
-    conf_color = "#059669" if confidence >= 0.7 else "#f59e0b" if confidence >= 0.4 else "#ef4444"
     return f"""
     <div class="score-card">
         <div class="score-value">{score}/3</div>
-        <div style="font-size: 0.95em; color: #94a3b8; margin-top: 4px;">
             {SCORE_DESCRIPTIONS.get(score, '')}
         </div>
         <div style="margin-top: 12px;">
-            <div style="display: flex; justify-content: space-between; font-size: 0.8em; color: #64748b;">
                 <span>Confidence</span>
                 <span style="color: {conf_color};">{conf_pct}%</span>
             </div>
@@ -155,9 +155,9 @@ def _render_score_card(score: int, confidence: float, needs_human: bool) -> str:
 def _render_empty_state() -> str:
     """Render placeholder when no video processed yet."""
     return """
-    <div class="score-card" style="opacity: 0.5;">
         <div style="font-size: 2em; margin-bottom: 8px;">🏔️</div>
-        <div style="color: #64748b;">Upload a video to begin</div>
     </div>
     """
@@ -319,7 +319,7 @@ def build_app() -> gr.Blocks:
         gr.HTML("""
         <div class="formscout-header">
             <h1>🏔️ FormScout</h1>
-            <p style="color: #94a3b8; font-size: 0.95em;">
                 Functional Movement Screen · Automated Scoring Aid
             </p>
         </div>
@@ -362,7 +362,7 @@ def build_app() -> gr.Blocks:
                 overlay_layers = gr.CheckboxGroup(
                     choices=["Skeleton", "Trails", "Velocity arrows"],
-                    value=["Skeleton", "Trails"],
                     label="Overlay Layers",
                 )
@@ -414,9 +414,9 @@ def build_app() -> gr.Blocks:
         gr.HTML(f'<div class="safety-banner" style="margin-top: 20px;">{DISCLAIMER}</div>')
         gr.Markdown(
-            "<center style='color: #64748b; font-size: 0.8em; margin-top: 12px;'>"
             "FormScout · ~18B params · Off the Grid · "
-            "<a href='https://github.com/' style='color: #86efac;'>Built for Build Small Hackathon</a>"
             "</center>"
         )

     if needs_human:
         return """
         <div class="score-card needs-review">
+            <div style="font-size: 1.2em; color: #cf922a; margin-bottom: 8px;">⚠️ Needs Clinician Review</div>
+            <div style="font-size: 0.9em; color: #4a5f57;">Pain or clearing test detected — cannot auto-score</div>
         </div>
         """
     conf_pct = int(confidence * 100)
+    conf_color = "#2b8a8a" if confidence >= 0.7 else "#cf922a" if confidence >= 0.4 else "#d9534f"
     return f"""
     <div class="score-card">
         <div class="score-value">{score}/3</div>
+        <div style="font-size: 0.95em; color: #4a5f57; margin-top: 4px;">
             {SCORE_DESCRIPTIONS.get(score, '')}
         </div>
         <div style="margin-top: 12px;">
+            <div style="display: flex; justify-content: space-between; font-size: 0.8em; color: #6b7d75;">
                 <span>Confidence</span>
                 <span style="color: {conf_color};">{conf_pct}%</span>
             </div>
 def _render_empty_state() -> str:
     """Render placeholder when no video processed yet."""
     return """
+    <div class="score-card" style="opacity: 0.6;">
         <div style="font-size: 2em; margin-bottom: 8px;">🏔️</div>
+        <div style="color: #6b7d75;">Upload a video to begin</div>
     </div>
     """
         gr.HTML("""
         <div class="formscout-header">
             <h1>🏔️ FormScout</h1>
+            <p style="color: #4a5f57; font-size: 0.95em;">
                 Functional Movement Screen · Automated Scoring Aid
             </p>
         </div>
                 overlay_layers = gr.CheckboxGroup(
                     choices=["Skeleton", "Trails", "Velocity arrows"],
+                    value=["Skeleton", "Trails", "Velocity arrows"],
                     label="Overlay Layers",
                 )
         gr.HTML(f'<div class="safety-banner" style="margin-top: 20px;">{DISCLAIMER}</div>')
         gr.Markdown(
+            "<center style='color: #6b7d75; font-size: 0.8em; margin-top: 12px;'>"
             "FormScout · ~18B params · Off the Grid · "
+            "<a href='https://silastherapy.sk' style='color: #1f6e6e;'>Silas Therapy · Build Small Hackathon</a>"
             "</center>"
         )

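Review note: the confidence colour in `_render_score_card` is picked by a chained conditional expression. A minimal sketch of the same thresholds as a standalone helper, using the new palette from this diff. The function name `confidence_color` is hypothetical and does not exist in `app.py`; the colour names in the comments are a reading of the hex values, not something the PR states.

```python
def confidence_color(confidence: float) -> str:
    """Return the hex colour used for a confidence value in [0, 1].

    Hypothetical helper: thresholds and colours are copied from the
    `conf_color` expression in the diff above.
    """
    if confidence >= 0.7:
        return "#2b8a8a"  # teal: high confidence
    if confidence >= 0.4:
        return "#cf922a"  # amber: medium confidence
    return "#d9534f"  # red: low confidence


if __name__ == "__main__":
    for value in (0.9, 0.5, 0.1):
        print(value, confidence_color(value))
```

Pulling the thresholds into one function would make them unit-testable without rendering the HTML card, at the cost of one more name in the module.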
docs/superpowers/plans/2026-06-09-pose-model-selector.md CHANGED

@@ -1,734 +1,734 @@
-# Pose Model Selector Implementation Plan
-> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
-**Goal:** Replace the hard-coded YOLO26l default with a 10-model dropdown (MediaPipe, YOLO26 n→x, Sapiens2 0.4B→5B) wired end-to-end from UI through the Director to `Pose2DAgent`.
-**Architecture:** Unified `POSE_MODELS` registry in `config.py` drives a `gr.Dropdown` in `app.py`; the selected key flows through `Director.run()` into `Pose2DAgent.run(model_key)`, which dispatches to one of three private sub-runners (`_run_yolo`, `_run_mediapipe`, `_run_sapiens2`), all producing the same COCO-17 `list[dict]` contract.
-**Tech Stack:** `ultralytics` (YOLO), `onnxruntime` + `huggingface_hub` (MediaPipe), `transformers` (Sapiens2), `gradio` (UI).
----
-## File map
-| File | Change |
-|---|---|
-| `formscout/config.py` | Replace `YOLO_POSE_MODELS` with `POSE_MODELS` dict + `DEFAULT_POSE_MODEL` |
-| `formscout/agents/pose2d.py` | Add `_run_yolo`, `_run_mediapipe`, `_run_sapiens2`; update `run()` signature |
-| `formscout/pipeline.py` | Change `pose_model_path` param to `model_key` |
-| `app.py` | Add `pose_model_dropdown`, fix `_map_inputs` + `process_video` |
-| `requirements.txt` | Add `onnxruntime>=1.18` |
-| `tests/test_pose2d.py` | Add mocked tests for each backend |
----
-## Task 1: Add unified `POSE_MODELS` registry to `config.py`
-**Files:**
-- Modify: `formscout/config.py`
-- [ ] **Step 1: Open `formscout/config.py` and replace the `YOLO_POSE_MODELS` block**
-Replace lines 12–20 (the `YOLO_POSE_MODELS` dict and `YOLO_POSE_MODEL` / `YOLO_POSE_MODEL_HQ` lines) with:
-```python
-_YOLO_DIR = ROOT / "checkpoints" / "yolo26"
-POSE_MODELS: dict[str, dict] = {
-    # ── MediaPipe (Qualcomm HF, ONNX Runtime) ──────────────────────────────
-    "MediaPipe-Pose ⬇ ~16 MB, CPU-friendly": {
-        "backend": "mediapipe",
-        "hf_id": "qualcomm/MediaPipe-Pose-Estimation",
-        "params_m": 4.2,
-    },
-    # ── YOLO26 (local checkpoints) ─────────────────────────────────────────
-    "YOLO26n — nano (0.7M, fastest)": {
-        "backend": "yolo",
-        "path": str(_YOLO_DIR / "yolo26n-pose.pt"),
-        "params_m": 0.7,
-    },
-    "YOLO26s — small (3.5M)": {
-        "backend": "yolo",
-        "path": str(_YOLO_DIR / "yolo26s-pose.pt"),
-        "params_m": 3.5,
-    },
-    "YOLO26m — medium (9M)": {
-        "backend": "yolo",
-        "path": str(_YOLO_DIR / "yolo26m-pose.pt"),
-        "params_m": 9.0,
-    },
-    "YOLO26l — large (25.9M)": {
-        "backend": "yolo",
-        "path": str(_YOLO_DIR / "yolo26l-pose.pt"),
-        "params_m": 25.9,
-    },
-    "YOLO26x — extra-large (57.6M)": {
-        "backend": "yolo",
-        "path": str(_YOLO_DIR / "yolo26x-pose.pt"),
-        "params_m": 57.6,
-    },
-    # ── Sapiens2 (HF download, transformers) ───────────────────────────────
-    "Sapiens2-0.4B ⬇ ~1.6 GB": {
-        "backend": "sapiens2",
-        "hf_id": "facebook/sapiens2-pose-0.4b",
-        "params_m": 400,
-    },
-    "Sapiens2-0.8B ⬇ ~3.2 GB": {
-        "backend": "sapiens2",
-        "hf_id": "facebook/sapiens2-pose-0.8b",
-        "params_m": 800,
-    },
-    "Sapiens2-1B ⬇ ~4 GB": {
-        "backend": "sapiens2",
-        "hf_id": "facebook/sapiens2-pose-1b",
-        "params_m": 1000,
-    },
-    "Sapiens2-5B ⬇ ~20 GB, large GPU": {
-        "backend": "sapiens2",
-        "hf_id": "facebook/sapiens2-pose-5b",
-        "params_m": 5000,
-    },
-}
-DEFAULT_POSE_MODEL = "YOLO26n — nano (0.7M, fastest)"
-# Backward-compat aliases — kept for any direct references outside the agent
-YOLO_POSE_MODEL = str(_YOLO_DIR / "yolo26l-pose.pt")
-YOLO_POSE_MODEL_HQ = str(_YOLO_DIR / "yolo26x-pose.pt")
-```
-- [ ] **Step 2: Verify import is clean**
-```bash
-python3 -c "from formscout import config; print(list(config.POSE_MODELS.keys()))"
-```
-Expected: list of 10 model labels, starting with `MediaPipe-Pose...`
-- [ ] **Step 3: Commit**
-```bash
-git add formscout/config.py
-git commit -m "feat: unified POSE_MODELS registry with MediaPipe, YOLO26 n-x, Sapiens2 0.4-5B"
-git push
-```
----
-## Task 2: Refactor `Pose2DAgent` — YOLO sub-runner + new `run()` signature
-**Files:**
-- Modify: `formscout/agents/pose2d.py`
-- Modify: `tests/test_pose2d.py`
-- [ ] **Step 1: Write failing test for the new `model_key` signature**
-Add to `tests/test_pose2d.py`:
-```python
-def test_run_accepts_model_key(pose2d_agent):
-    """run() must accept model_key kwarg, not model_path."""
-    import inspect
-    sig = inspect.signature(pose2d_agent.run)
-    assert "model_key" in sig.parameters
-    assert "model_path" not in sig.parameters
-```
-- [ ] **Step 2: Run to confirm it fails**
-```bash
-pytest tests/test_pose2d.py::TestPose2DAgent::test_run_accepts_model_key -v
-```
-Expected: FAIL — `model_path` still present in signature.
-- [ ] **Step 3: Rewrite `formscout/agents/pose2d.py`**
-Replace the entire file with:
-```python
-"""
-Pose2DAgent — 2D per-frame keypoint extraction.
-Backends: yolo (local ultralytics .pt checkpoints), mediapipe (Qualcomm HF/ONNX Runtime),
-          sapiens2 (Meta HF/transformers).
-All backends output COCO-17 keypoints: dict[int, {x, y, conf}] per frame.
-Input:  IngestResult
-Output: Pose2DResult(keypoints per frame, fps, confidence)
-Failure: Pose2DResult(confidence=0.0, notes=<reason>) — never raises.
-"""
-from __future__ import annotations
-import logging
-import numpy as np
-from formscout import config
-from formscout.types import IngestResult, Pose2DResult
-logger = logging.getLogger(__name__)
-COCO_KEYPOINTS = [
-    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
-    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
-    "left_wrist", "right_wrist", "left_hip", "right_hip",
-    "left_knee", "right_knee", "left_ankle", "right_ankle",
-]
-# BlazePose-33 → COCO-17 index mapping (BlazePose source index: COCO target index)
-_BLAZEPOSE_TO_COCO: dict[int, int] = {
-    0: 0,    # nose
-    2: 1,    # left_eye
-    5: 2,    # right_eye
-    7: 3,    # left_ear
-    8: 4,    # right_ear
-    11: 5,   # left_shoulder
-    12: 6,   # right_shoulder
-    13: 7,   # left_elbow
-    14: 8,   # right_elbow
-    15: 9,   # left_wrist
-    16: 10,  # right_wrist
-    23: 11,  # left_hip
-    24: 12,  # right_hip
-    25: 13,  # left_knee
-    26: 14,  # right_knee
-    27: 15,  # left_ankle
-    28: 16,  # right_ankle
-}
-# The same mapping as parallel lists: BlazePose source indices and their COCO targets
-_BP_SRC = [0, 2, 5, 7, 8, 11, 12, 13, 14, 15, 16, 23, 24, 25, 26, 27, 28]
-_BP_DST = list(range(17))  # COCO 0..16
-_model_cache: dict[str, object] = {}
-# ── YOLO backend ─────────────────────────────────────────────────────────────
-def _get_yolo(path: str) -> object:
-    if path not in _model_cache:
-        from ultralytics import YOLO
-        _model_cache[path] = YOLO(path)
-    return _model_cache[path]
-def _run_yolo(frames: list, path: str) -> list[dict]:
-    model = _get_yolo(path)
-    out = []
-    for frame in frames:
-        try:
-            results = model(frame, verbose=False)
-            kps: dict[int, dict] = {}
-            if results and results[0].keypoints is not None:
-                kp = results[0].keypoints
-                if kp.xy is not None and len(kp.xy) > 0:
-                    xy = kp.xy[0].cpu().numpy()
-                    conf = kp.conf[0].cpu().numpy()
-                    for j in range(min(len(xy), 17)):
-                        kps[j] = {"x": float(xy[j, 0]), "y": float(xy[j, 1]), "conf": float(conf[j])}
-            out.append(kps)
-        except Exception:
-            out.append({})
-    return out
-# ── MediaPipe backend ────────────────────────────────────────────────────────
-def _get_mediapipe_sessions(hf_id: str):
-    """Return (detector_session, landmark_session) cached by hf_id."""
-    cache_key = f"mp:{hf_id}"
-    if cache_key not in _model_cache:
-        from huggingface_hub import snapshot_download
-        import onnxruntime as ort
-        from pathlib import Path
-        snap = Path(snapshot_download(hf_id))
-        onnx_files = sorted(snap.glob("**/*.onnx"), key=lambda p: p.stat().st_size)
-        if len(onnx_files) < 2:
-            raise RuntimeError(f"Expected 2 ONNX files in {snap}, found {len(onnx_files)}")
-        # Smaller file = pose detector; larger = pose landmark detector
-        det_sess = ort.InferenceSession(str(onnx_files[0]))
-        lmk_sess = ort.InferenceSession(str(onnx_files[-1]))
-        _model_cache[cache_key] = (det_sess, lmk_sess)
-    return _model_cache[cache_key]
-def _preprocess_mediapipe(frame: np.ndarray, size: int = 256) -> np.ndarray:
-    """Resize to size×size, normalize to [0,1], add batch dim → (1,3,H,W)."""
-    import cv2
-    img = cv2.resize(frame, (size, size)).astype(np.float32) / 255.0
-    return img.transpose(2, 0, 1)[None]  # (1, 3, 256, 256)
-def _run_mediapipe(frames: list, hf_id: str) -> list[dict]:
-    try:
-        det_sess, lmk_sess = _get_mediapipe_sessions(hf_id)
-    except Exception as e:
-        logger.warning("mediapipe load failed: %s", e)
-        return [{} for _ in frames]
-    import cv2
-    h_orig, w_orig = frames[0].shape[:2] if frames else (480, 640)
-    out = []
-    for frame in frames:
-        try:
-            h, w = frame.shape[:2]
-            inp = _preprocess_mediapipe(frame)
-            # Run landmark detector directly on full frame (single-person FMS use-case)
-            lmk_input_name = lmk_sess.get_inputs()[0].name
-            lmk_out = lmk_sess.run(None, {lmk_input_name: inp})
-            # lmk_out[0] shape: (1, 33, 3) — [x, y, visibility] normalized 0..1
-            landmarks = lmk_out[0][0]  # (33, 3)
-            kps: dict[int, dict] = {}
-            for coco_idx, bp_idx in zip(_BP_DST, _BP_SRC):
-                if bp_idx < len(landmarks):
-                    lm = landmarks[bp_idx]
-                    kps[coco_idx] = {
-                        "x": float(lm[0] * w),
-                        "y": float(lm[1] * h),
-                        "conf": float(lm[2]),  # visibility score
-                    }
-            out.append(kps)
-        except Exception:
-            out.append({})
-    return out
-# ── Sapiens2 backend ─────────────────────────────────────────────────────────
-# COCO-17 keypoint names in order (used to map Sapiens2 named output → COCO index)
-_COCO_NAMES = [
-    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
-    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
-    "left_wrist", "right_wrist", "left_hip", "right_hip",
-    "left_knee", "right_knee", "left_ankle", "right_ankle",
-]
-def _get_sapiens2(hf_id: str) -> object:
-    if hf_id not in _model_cache:
-        from transformers import pipeline as hf_pipeline
-        _model_cache[hf_id] = hf_pipeline("pose-estimation", model=hf_id)
-    return _model_cache[hf_id]
-def _run_sapiens2(frames: list, hf_id: str) -> list[dict]:
-    try:
-        pipe = _get_sapiens2(hf_id)
-    except Exception as e:
-        logger.warning("sapiens2 load failed: %s", e)
-        return [{} for _ in frames]
-    from PIL import Image
-    out = []
-    for frame in frames:
-        try:
-            pil_img = Image.fromarray(frame)
-            result = pipe(pil_img)
-            # result is a list of person dicts; take the first (highest confidence)
-            if not result:
-                out.append({})
-                continue
-            person = result[0]
-            keypoints = person.get("keypoints", [])
-            scores = person.get("keypoint_scores", [])
-            # Build name→(x,y,score) lookup from pipeline output
-            kp_lookup: dict[str, tuple] = {}
-            for i, kp in enumerate(keypoints):
-                name = kp.get("label", "") if isinstance(kp, dict) else ""
-                x = kp.get("x", 0.0) if isinstance(kp, dict) else float(kp[0])
-                y = kp.get("y", 0.0) if isinstance(kp, dict) else float(kp[1])
-                score = scores[i] if i < len(scores) else 0.0
-                if name:
-                    kp_lookup[name] = (x, y, float(score))
-            kps: dict[int, dict] = {}
-            for coco_idx, name in enumerate(_COCO_NAMES):
-                if name in kp_lookup:
-                    x, y, s = kp_lookup[name]
-                    kps[coco_idx] = {"x": x, "y": y, "conf": s}
-            out.append(kps)
-        except Exception:
-            out.append({})
-    return out
-# ── Agent ────────────────────────────────────────────────────────────────────
-class Pose2DAgent:
-    """Extracts COCO-17 keypoints per frame; dispatches to YOLO, MediaPipe, or Sapiens2."""
-    def run(self, ingest: IngestResult, model_key: str | None = None) -> Pose2DResult:
-        if not ingest.frames:
-            return Pose2DResult(keypoints=[], fps=ingest.fps, confidence=0.0, notes="no frames in ingest")
-        key = model_key or config.DEFAULT_POSE_MODEL
-        spec = config.POSE_MODELS.get(key)
-        if spec is None:
-            logger.warning("Unknown model_key %r — falling back to %s", key, config.DEFAULT_POSE_MODEL)
-            spec = config.POSE_MODELS[config.DEFAULT_POSE_MODEL]
-        backend = spec["backend"]
-        try:
-            if backend == "yolo":
-                kps_per_frame = _run_yolo(ingest.frames, spec["path"])
-            elif backend == "mediapipe":
-                kps_per_frame = _run_mediapipe(ingest.frames, spec["hf_id"])
-            elif backend == "sapiens2":
-                kps_per_frame = _run_sapiens2(ingest.frames, spec["hf_id"])
-            else:
-                return Pose2DResult(
-                    keypoints=[{} for _ in ingest.frames],
-                    fps=ingest.fps, confidence=0.0,
-                    notes=f"unknown backend: {backend}",
-                )
-        except Exception as e:
-            return Pose2DResult(
-                keypoints=[{} for _ in ingest.frames],
-                fps=ingest.fps, confidence=0.0,
-                notes=str(e),
-            )
-        n_detected = sum(1 for f in kps_per_frame if f)
-        total_conf = sum(
-            sum(kp["conf"] for kp in f.values()) / len(f)
-            for f in kps_per_frame if f
-        )
-        overall_conf = (total_conf / n_detected) if n_detected > 0 else 0.0
-        notes = "" if n_detected > 0 else "no person detected in any frame"
-        return Pose2DResult(
-            keypoints=kps_per_frame,
-            fps=ingest.fps,
-            confidence=overall_conf,
-            notes=notes,
-        )
-```
-- [ ] **Step 4: Run the new signature test**
-```bash
-pytest tests/test_pose2d.py::TestPose2DAgent::test_run_accepts_model_key -v
-```
-Expected: PASS
-- [ ] **Step 5: Run full existing pose2d test suite**
-```bash
-pytest tests/test_pose2d.py -v
-```
-Expected: all existing tests pass (they will skip if YOLO model unavailable in env — that's OK).
-- [ ] **Step 6: Commit and push**
-```bash
-git add formscout/agents/pose2d.py tests/test_pose2d.py
-git commit -m "feat: Pose2DAgent — three backends (yolo/mediapipe/sapiens2), model_key dispatch"
-git push
-```
----
-## Task 3: Add `onnxruntime` to requirements
-**Files:**
-- Modify: `requirements.txt`
-- [ ] **Step 1: Add onnxruntime**
-Open `requirements.txt` and add after the existing `transformers` line:
-```
-onnxruntime>=1.18
-```
-- [ ] **Step 2: Verify it installs**
-```bash
-pip install "onnxruntime>=1.18" --quiet && python3 -c "import onnxruntime; print(onnxruntime.__version__)"
-```
-Expected: version string printed, no errors.
-- [ ] **Step 3: Commit and push**
-```bash
-git add requirements.txt
-git commit -m "chore: add onnxruntime for MediaPipe ONNX backend"
-git push
-```
----
-## Task 4: Update `Director.run()` — `pose_model_path` → `model_key`
-**Files:**
-- Modify: `formscout/pipeline.py`
-- [ ] **Step 1: Update the signature and the `pose2d` call**
-In `formscout/pipeline.py`, change `Director.run()`:
-```python
-def run(self, video_path: str, test_name: str = "deep_squat", side: str = "na", model_key: str | None = None) -> PipelineState:
-    """
-    Run the full pipeline on a single video.
-    test_name/side serve as manual override when provided (skips classifier).
-    model_key selects the pose backend (see config.POSE_MODELS).
-    """
-    state = PipelineState(video_path=video_path)
-    # ─── Ingest ───
-    state.ingest = self._ingest.run(video_path)
-    if state.ingest.confidence < config.MIN_CONFIDENCE:
-        state.errors.append("ingest: low confidence — video may be corrupt")
-        return state
-    # ─── Pose 2D ───
-    state.pose2d = self._pose2d.run(state.ingest, model_key=model_key)
-    # ... rest of method unchanged
-```
-(Only the signature line and the `self._pose2d.run(...)` call change — everything else stays the same.)
-- [ ] **Step 2: Verify import is clean**
-```bash
-python3 -c "from formscout.pipeline import Director; d = Director(); print('ok')"
-```
-Expected: `ok` (models load lazily so no crash here).
-- [ ] **Step 3: Commit and push**
-```bash
-git add formscout/pipeline.py
-git commit -m "feat: Director.run() accepts model_key, threads to Pose2DAgent"
-git push
-```
----
-## Task 5: Wire the UI — pose model dropdown in `app.py`
-**Files:**
-- Modify: `app.py`
-- [ ] **Step 1: Update `process_video` to use `model_key` and the unified registry**
-Replace the existing `process_video` function signature and the old `YOLO_POSE_MODELS.get()` lookup:
-```python
-def process_video(video_path: str, test_name: str, side: str, model_key: str):
-    """Process an uploaded video through the FormScout pipeline."""
-    if not video_path:
-        return (
-            _render_empty_state(),
-            "Upload a video to begin analysis.",
-            "",
-            "",
-        )
-    director = Director()
-    state = director.run(video_path, test_name=test_name, side=side, model_key=model_key)
-```
-(Remove the `pose_model_path = config.YOLO_POSE_MODELS.get(...)` line entirely.)
-- [ ] **Step 2: Add the `pose_model_dropdown` in `build_app()`**
-Inside `build_app()`, after the `side_dropdown` block (around line 265) and before `submit_btn`, add:
-```python
-pose_model_dropdown = gr.Dropdown(
-    choices=list(config.POSE_MODELS.keys()),
-    value=config.DEFAULT_POSE_MODEL,
-    label="Pose Model",
-)
-```
-- [ ] **Step 3: Update `_map_inputs` to pass the model key**
-Replace the existing `_map_inputs` closure:
-```python
-def _map_inputs(video, test_display_name, side_display, pose_model_key):
-    """Map UI display values to internal values."""
-    test_map = {name: val for name, val in FMS_TESTS}
-    test_name = test_map.get(test_display_name, "deep_squat")
-    side = {"N/A": "na", "Left": "left", "Right": "right"}.get(side_display, "na")
-    return process_video(video, test_name, side, pose_model_key)
-```
-- [ ] **Step 4: Update `submit_btn.click` to include `pose_model_dropdown`**
-Replace the existing `.click(...)` call:
-```python
-submit_btn.click(
-    fn=_map_inputs,
-    inputs=[video_input, test_dropdown, side_dropdown, pose_model_dropdown],
-    outputs=[score_html, pipeline_md, score_details, alerts_md],
-)
-```
-- [ ] **Step 5: Smoke-test the app starts**
-```bash
-python3 -c "from app import build_app; app = build_app(); print('app built ok')"
-```
-Expected: `app built ok` — no import or config errors.
-- [ ] **Step 6: Commit and push**
-```bash
-git add app.py
-git commit -m "feat: pose model dropdown in UI, wired through process_video → Director"
-git push
-```
----
-## Task 6: Add mocked backend tests
-**Files:**
-- Modify: `tests/test_pose2d.py`
-- [ ] **Step 1: Add mocked YOLO test**
-Append to `tests/test_pose2d.py`:
-```python
-import unittest.mock as mock
-import numpy as np
-from formscout.types import IngestResult, Pose2DResult
-def _blank_ingest_3():
-    frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(3)]
-    return IngestResult(frames=frames, fps=30.0, duration=0.1, n_people=1, width=640, height=480)
-class TestPose2DBackendsMocked:
-    """Backend dispatch tests — no real model downloads."""
-    def test_yolo_backend_dispatches(self):
-        from formscout.agents.pose2d import Pose2DAgent, _run_yolo
-        fake_kps = [{0: {"x": 10.0, "y": 20.0, "conf": 0.9}} for _ in range(3)]
-        with mock.patch("formscout.agents.pose2d._run_yolo", return_value=fake_kps) as m:
-            agent = Pose2DAgent()
-            result = agent.run(_blank_ingest_3(), model_key="YOLO26n — nano (0.7M, fastest)")
-        m.assert_called_once()
-        assert isinstance(result, Pose2DResult)
-        assert len(result.keypoints) == 3
-        assert result.confidence > 0.0
-    def test_mediapipe_backend_dispatches(self):
-        from formscout.agents.pose2d import Pose2DAgent
-        fake_kps = [{i: {"x": float(i), "y": float(i), "conf": 0.8} for i in range(17)} for _ in range(3)]
-        with mock.patch("formscout.agents.pose2d._run_mediapipe", return_value=fake_kps) as m:
-            agent = Pose2DAgent()
-            result = agent.run(_blank_ingest_3(), model_key="MediaPipe-Pose ⬇ ~16 MB, CPU-friendly")
-        m.assert_called_once()
-        assert isinstance(result, Pose2DResult)
-        assert len(result.keypoints) == 3
-        assert all(len(f) == 17 for f in result.keypoints)
-    def test_sapiens2_backend_dispatches(self):
-        from formscout.agents.pose2d import Pose2DAgent
-        fake_kps = [{i: {"x": float(i), "y": float(i), "conf": 0.85} for i in range(17)} for _ in range(3)]
-        with mock.patch("formscout.agents.pose2d._run_sapiens2", return_value=fake_kps) as m:
-            agent = Pose2DAgent()
-            result = agent.run(_blank_ingest_3(), model_key="Sapiens2-0.4B ⬇ ~1.6 GB")
-        m.assert_called_once()
-        assert isinstance(result, Pose2DResult)
-        assert len(result.keypoints) == 3
-    def test_unknown_model_key_falls_back(self):
-        from formscout.agents.pose2d import Pose2DAgent
-        fake_kps = [{0: {"x": 1.0, "y": 2.0, "conf": 0.7}} for _ in range(3)]
-        with mock.patch("formscout.agents.pose2d._run_yolo", return_value=fake_kps):
-            agent = Pose2DAgent()
-            result = agent.run(_blank_ingest_3(), model_key="nonexistent-model-xyz")
-        assert isinstance(result, Pose2DResult)  # graceful fallback, no crash
-    def test_confidence_zero_on_empty_keypoints(self):
-        from formscout.agents.pose2d import Pose2DAgent
-        with mock.patch("formscout.agents.pose2d._run_yolo", return_value=[{}, {}, {}]):
-            agent = Pose2DAgent()
-            result = agent.run(_blank_ingest_3(), model_key="YOLO26n — nano (0.7M, fastest)")
-        assert result.confidence == 0.0
-        assert "no person" in result.notes.lower()
-```
-- [ ] **Step 2: Run the new tests**
-```bash
-pytest tests/test_pose2d.py::TestPose2DBackendsMocked -v
-```
-Expected: all 5 tests PASS.
-- [ ] **Step 3: Run the full test suite to check for regressions**
-```bash
-pytest tests/ -v --tb=short 2>&1 | tail -30
-```
-Expected: same pass/fail ratio as before (45/46 known passing). The one known failure (`test_unimplemented_test_returns_low_confidence`) is pre-existing — ignore it.
-- [ ] **Step 4: Commit and push**
-```bash
-git add tests/test_pose2d.py
-git commit -m "test: mocked backend dispatch tests for YOLO, MediaPipe, Sapiens2"
-git push
-```
----
-## Self-review
-**Spec coverage:**
-- ✅ Unified `POSE_MODELS` registry (Task 1)
-- ✅ `DEFAULT_POSE_MODEL = YOLO26n` (Task 1)
-- ✅ Backward-compat `YOLO_POSE_MODEL` / `YOLO_POSE_MODEL_HQ` aliases (Task 1)
-- ✅ `_run_yolo` sub-runner (Task 2)
-- ✅ `_run_mediapipe` with ONNX Runtime + BlazePose→COCO-17 mapping (Task 2)
-- ✅ `_run_sapiens2` with transformers pipeline + named-keypoint→COCO-17 mapping (Task 2)
-- ✅ `Pose2DAgent.run(model_key)` dispatch + fallback on unknown key (Task 2)
-- ✅ `onnxruntime` added to requirements (Task 3)
-- ✅ `Director.run(model_key)` threads key to agent (Task 4)
-- ✅ `pose_model_dropdown` in UI (Task 5)
-- ✅ `_map_inputs` + `submit_btn.click` wired (Task 5)
-- ✅ Error handling: unknown key → warning + fallback; download failure → confidence=0 (Task 2)
-- ✅ Mocked tests for all three backends (Task 6)
-**Placeholder scan:** None found.
-**Type consistency:** `model_key: str | None` used consistently across `Pose2DAgent.run`, `Director.run`, `process_video`. `config.POSE_MODELS` and `config.DEFAULT_POSE_MODEL` referenced consistently.
-**Note on Sapiens2 keypoint format:** The `_run_sapiens2` implementation uses **named keypoint lookup** (by label string) rather than assuming fixed indices 0–16 = COCO. This is the safe approach — the transformers pipeline returns labeled keypoints and the code maps by name. If the pipeline returns unnamed keypoints (index-only), the `kp_lookup` will be empty and the frame will gracefully return `{}`.

+# Pose Model Selector Implementation Plan
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+**Goal:** Replace the hard-coded YOLO26l default with a 10-model dropdown (MediaPipe, YOLO26 n→x, Sapiens2 0.4B→5B) wired end-to-end from UI through the Director to `Pose2DAgent`.
+**Architecture:** Unified `POSE_MODELS` registry in `config.py` drives a `gr.Dropdown` in `app.py`; the selected key flows through `Director.run()` into `Pose2DAgent.run(model_key)`, which dispatches to one of three private sub-runners (`_run_yolo`, `_run_mediapipe`, `_run_sapiens2`), all producing the same COCO-17 `list[dict]` contract.
+**Tech Stack:** `ultralytics` (YOLO), `onnxruntime` + `huggingface_hub` (MediaPipe), `transformers` (Sapiens2), `gradio` (UI).
+---
+## File map
+| File | Change |
+|---|---|
+| `formscout/config.py` | Replace `YOLO_POSE_MODELS` with `POSE_MODELS` dict + `DEFAULT_POSE_MODEL` |
+| `formscout/agents/pose2d.py` | Add `_run_yolo`, `_run_mediapipe`, `_run_sapiens2`; update `run()` signature |
+| `formscout/pipeline.py` | Change `pose_model_path` param to `model_key` |
+| `app.py` | Add `pose_model_dropdown`, fix `_map_inputs` + `process_video` |
+| `requirements.txt` | Add `onnxruntime>=1.18` |
+| `tests/test_pose2d.py` | Add mocked tests for each backend |
+---
+## Task 1: Add unified `POSE_MODELS` registry to `config.py`
+**Files:**
+- Modify: `formscout/config.py`
+- [ ] **Step 1: Open `formscout/config.py` and replace the `YOLO_POSE_MODELS` block**
+Replace lines 12–20 (the `YOLO_POSE_MODELS` dict and `YOLO_POSE_MODEL` / `YOLO_POSE_MODEL_HQ` lines) with:
+```python
+_YOLO_DIR = ROOT / "checkpoints" / "yolo26"
+POSE_MODELS: dict[str, dict] = {
+    # ── MediaPipe (Qualcomm HF, ONNX Runtime) ──────────────────────────────
+    "MediaPipe-Pose ⬇ ~16 MB, CPU-friendly": {
+        "backend": "mediapipe",
+        "hf_id": "qualcomm/MediaPipe-Pose-Estimation",
+        "params_m": 4.2,
+    },
+    # ── YOLO26 (local checkpoints) ─────────────────────────────────────────
+    "YOLO26n — nano (0.7M, fastest)": {
+        "backend": "yolo",
+        "path": str(_YOLO_DIR / "yolo26n-pose.pt"),
+        "params_m": 0.7,
+    },
+    "YOLO26s — small (3.5M)": {
+        "backend": "yolo",
+        "path": str(_YOLO_DIR / "yolo26s-pose.pt"),
+        "params_m": 3.5,
+    },
+    "YOLO26m — medium (9M)": {
+        "backend": "yolo",
+        "path": str(_YOLO_DIR / "yolo26m-pose.pt"),
+        "params_m": 9.0,
+    },
+    "YOLO26l — large (25.9M)": {
+        "backend": "yolo",
+        "path": str(_YOLO_DIR / "yolo26l-pose.pt"),
+        "params_m": 25.9,
+    },
+    "YOLO26x — extra-large (57.6M)": {
+        "backend": "yolo",
+        "path": str(_YOLO_DIR / "yolo26x-pose.pt"),
+        "params_m": 57.6,
+    },
+    # ── Sapiens2 (HF download, transformers) ───────────────────────────────
+    "Sapiens2-0.4B ⬇ ~1.6 GB": {
+        "backend": "sapiens2",
+        "hf_id": "facebook/sapiens2-pose-0.4b",
+        "params_m": 400,
+    },
+    "Sapiens2-0.8B ⬇ ~3.2 GB": {
+        "backend": "sapiens2",
+        "hf_id": "facebook/sapiens2-pose-0.8b",
+        "params_m": 800,
+    },
+    "Sapiens2-1B ⬇ ~4 GB": {
+        "backend": "sapiens2",
+        "hf_id": "facebook/sapiens2-pose-1b",
+        "params_m": 1000,
+    },
+    "Sapiens2-5B ⬇ ~20 GB, large GPU": {
+        "backend": "sapiens2",
+        "hf_id": "facebook/sapiens2-pose-5b",
+        "params_m": 5000,
+    },
+}
+DEFAULT_POSE_MODEL = "YOLO26n — nano (0.7M, fastest)"
+# Backward-compat aliases — kept for any direct references outside the agent
+YOLO_POSE_MODEL = str(_YOLO_DIR / "yolo26l-pose.pt")
+YOLO_POSE_MODEL_HQ = str(_YOLO_DIR / "yolo26x-pose.pt")
+```
+- [ ] **Step 2: Verify import is clean**
+```bash
+python3 -c "from formscout import config; print(list(config.POSE_MODELS.keys()))"
+```
+Expected: list of 10 model labels, starting with `MediaPipe-Pose...`
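A structural check can catch registry typos before any model loads. The sketch below is a minimal, hypothetical helper (`validate_registry` is not part of FormScout) run against a two-entry toy copy of the registry. It assumes only the entry shape shown in Step 1: `yolo` entries carry `path`, the other backends carry `hf_id`.

```python
# Hypothetical helper, not part of FormScout: checks the entry shape used in Step 1.
REQUIRED_FIELD = {"yolo": "path", "mediapipe": "hf_id", "sapiens2": "hf_id"}


def validate_registry(models: dict[str, dict], default_key: str) -> list[str]:
    """Return human-readable problems; an empty list means the registry is consistent."""
    problems = []
    if default_key not in models:
        problems.append(f"default {default_key!r} is not a registry key")
    for label, spec in models.items():
        backend = spec.get("backend")
        if backend not in REQUIRED_FIELD:
            problems.append(f"{label!r}: unknown backend {backend!r}")
            continue
        field = REQUIRED_FIELD[backend]
        if not spec.get(field):
            problems.append(f"{label!r}: backend {backend!r} needs {field!r}")
    return problems


# Toy registry with the same shape as POSE_MODELS (labels shortened for illustration).
toy_models = {
    "YOLO26n": {"backend": "yolo", "path": "checkpoints/yolo26/yolo26n-pose.pt", "params_m": 0.7},
    "MediaPipe-Pose": {"backend": "mediapipe", "hf_id": "qualcomm/MediaPipe-Pose-Estimation", "params_m": 4.2},
}
print(validate_registry(toy_models, "YOLO26n"))
```

In the real project the same function could be pointed at `config.POSE_MODELS` and `config.DEFAULT_POSE_MODEL`, for example from a unit test.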
+- [ ] **Step 3: Commit**
+```bash
+git add formscout/config.py
+git commit -m "feat: unified POSE_MODELS registry with MediaPipe, YOLO26 n-x, Sapiens2 0.4-5B"
+git push
+```
+---
+## Task 2: Refactor `Pose2DAgent` — YOLO sub-runner + new `run()` signature
+**Files:**
+- Modify: `formscout/agents/pose2d.py`
+- Modify: `tests/test_pose2d.py`
+- [ ] **Step 1: Write failing test for the new `model_key` signature**
+Add to `tests/test_pose2d.py`:
+```python
+def test_run_accepts_model_key(pose2d_agent):
+    """run() must accept model_key kwarg, not model_path."""
+    import inspect
+    sig = inspect.signature(pose2d_agent.run)
+    assert "model_key" in sig.parameters
+    assert "model_path" not in sig.parameters
+```
+- [ ] **Step 2: Run to confirm it fails**
+```bash
+pytest tests/test_pose2d.py::test_run_accepts_model_key -v
+```
+Expected: FAIL — `model_path` still present in signature.
+- [ ] **Step 3: Rewrite `formscout/agents/pose2d.py`**
+Replace the entire file with:
+```python
+"""
+Pose2DAgent — 2D per-frame keypoint extraction.
+Backends: yolo (local ultralytics .pt checkpoints), mediapipe (Qualcomm HF/ONNX Runtime),
+          sapiens2 (Meta HF/transformers).
+All backends output COCO-17 keypoints: dict[int, {x, y, conf}] per frame.
+Input:  IngestResult
+Output: Pose2DResult(keypoints per frame, fps, confidence)
+Failure: Pose2DResult(confidence=0.0, notes=<reason>) — never raises.
+"""
+from __future__ import annotations
+import logging
+import numpy as np
+from formscout import config
+from formscout.types import IngestResult, Pose2DResult
+logger = logging.getLogger(__name__)
+COCO_KEYPOINTS = [
+    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
+    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
+    "left_wrist", "right_wrist", "left_hip", "right_hip",
+    "left_knee", "right_knee", "left_ankle", "right_ankle",
+]
+# BlazePose-33 → COCO-17 index mapping (BlazePose source index → COCO target index).
+# BlazePose indices 1, 3, 4, 6 (inner/outer eye corners), 9-10 (mouth), 17-22 (hands)
+# and 29-32 (heels, foot tips) have no COCO-17 counterpart and are dropped.
+_BLAZEPOSE_TO_COCO: dict[int, int] = {
+    0: 0,    # nose
+    2: 1,    # left_eye
+    5: 2,    # right_eye
+    7: 3,    # left_ear
+    8: 4,    # right_ear
+    11: 5,   # left_shoulder
+    12: 6,   # right_shoulder
+    13: 7,   # left_elbow
+    14: 8,   # right_elbow
+    15: 9,   # left_wrist
+    16: 10,  # right_wrist
+    23: 11,  # left_hip
+    24: 12,  # right_hip
+    25: 13,  # left_knee
+    26: 14,  # right_knee
+    27: 15,  # left_ankle
+    28: 16,  # right_ankle
+}
+# BlazePose source index → COCO target index (correct mapping, no duplicates)
+_BP_SRC = [0, 2, 5, 7, 8, 11, 12, 13, 14, 15, 16, 23, 24, 25, 26, 27, 28]
+_BP_DST = list(range(17))  # COCO 0..16
+_model_cache: dict[str, object] = {}
+# ── YOLO backend ─────────────────────────────────────────────────────────────
+def _get_yolo(path: str) -> object:
+    if path not in _model_cache:
+        from ultralytics import YOLO
+        _model_cache[path] = YOLO(path)
+    return _model_cache[path]
+def _run_yolo(frames: list, path: str) -> list[dict]:
+    model = _get_yolo(path)
+    out = []
+    for frame in frames:
+        try:
+            results = model(frame, verbose=False)
+            kps: dict[int, dict] = {}
+            if results and results[0].keypoints is not None:
+                kp = results[0].keypoints
+                if kp.xy is not None and len(kp.xy) > 0:
+                    xy = kp.xy[0].cpu().numpy()
+                    conf = kp.conf[0].cpu().numpy()
+                    for j in range(min(len(xy), 17)):
+                        kps[j] = {"x": float(xy[j, 0]), "y": float(xy[j, 1]), "conf": float(conf[j])}
+            out.append(kps)
+        except Exception:
+            out.append({})
+    return out
+# ── MediaPipe backend ────────────────────────────────────────────────────────
+def _get_mediapipe_sessions(hf_id: str):
+    """Return (detector_session, landmark_session) cached by hf_id."""
+    cache_key = f"mp:{hf_id}"
+    if cache_key not in _model_cache:
+        from huggingface_hub import snapshot_download
+        import onnxruntime as ort
+        from pathlib import Path
+        snap = Path(snapshot_download(hf_id))
+        onnx_files = sorted(snap.glob("**/*.onnx"), key=lambda p: p.stat().st_size)
+        if len(onnx_files) < 2:
+            raise RuntimeError(f"Expected 2 ONNX files in {snap}, found {len(onnx_files)}")
+        # Smaller file = pose detector; larger = pose landmark detector
+        det_sess = ort.InferenceSession(str(onnx_files[0]))
+        lmk_sess = ort.InferenceSession(str(onnx_files[-1]))
+        _model_cache[cache_key] = (det_sess, lmk_sess)
+    return _model_cache[cache_key]
+def _preprocess_mediapipe(frame: np.ndarray, size: int = 256) -> np.ndarray:
+    """Resize to size×size, normalize to [0,1], add batch dim → (1,3,H,W)."""
+    import cv2
+    img = cv2.resize(frame, (size, size)).astype(np.float32) / 255.0
+    return img.transpose(2, 0, 1)[None]  # (1, 3, 256, 256)
+def _run_mediapipe(frames: list, hf_id: str) -> list[dict]:
+    try:
+        det_sess, lmk_sess = _get_mediapipe_sessions(hf_id)
+    except Exception as e:
+        logger.warning("mediapipe load failed: %s", e)
+        return [{} for _ in frames]
+    import cv2
+    h_orig, w_orig = frames[0].shape[:2] if frames else (480, 640)
+    out = []
+    for frame in frames:
+        try:
+            h, w = frame.shape[:2]
+            inp = _preprocess_mediapipe(frame)
+            # Run landmark detector directly on full frame (single-person FMS use-case)
+            lmk_input_name = lmk_sess.get_inputs()[0].name
+            lmk_out = lmk_sess.run(None, {lmk_input_name: inp})
+            # lmk_out[0] shape: (1, 33, 3) — [x, y, visibility] normalized 0..1
+            landmarks = lmk_out[0][0]  # (33, 3)
+            kps: dict[int, dict] = {}
+            for coco_idx, bp_idx in zip(_BP_DST, _BP_SRC):
+                if bp_idx < len(landmarks):
+                    lm = landmarks[bp_idx]
+                    kps[coco_idx] = {
+                        "x": float(lm[0] * w),
+                        "y": float(lm[1] * h),
+                        "conf": float(lm[2]),  # visibility score
+                    }
+            out.append(kps)
+        except Exception:
+            out.append({})
+    return out
+# ── Sapiens2 backend ─────────────────────────────────────────────────────────
+# COCO-17 keypoint names in order (used to map Sapiens2 named output → COCO index)
+_COCO_NAMES = [
+    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
+    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
+    "left_wrist", "right_wrist", "left_hip", "right_hip",
+    "left_knee", "right_knee", "left_ankle", "right_ankle",
+]
+def _get_sapiens2(hf_id: str) -> object:
+    if hf_id not in _model_cache:
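+        from transformers import pipeline as hf_pipeline
+        # Assumption: the installed transformers release registers a "pose-estimation"
+        # pipeline task for this checkpoint. If it does not, this call raises and
+        # _run_sapiens2 logs a warning and returns empty frames.
+        _model_cache[hf_id] = hf_pipeline("pose-estimation", model=hf_id)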
+    return _model_cache[hf_id]
+def _run_sapiens2(frames: list, hf_id: str) -> list[dict]:
+    try:
+        pipe = _get_sapiens2(hf_id)
+    except Exception as e:
+        logger.warning("sapiens2 load failed: %s", e)
+        return [{} for _ in frames]
+    from PIL import Image
+    out = []
+    for frame in frames:
+        try:
+            pil_img = Image.fromarray(frame)
+            result = pipe(pil_img)
+            # result is a list of person dicts; take the first (highest confidence)
+            if not result:
+                out.append({})
+                continue
+            person = result[0]
+            keypoints = person.get("keypoints", [])
+            scores = person.get("keypoint_scores", [])
+            # Build name→(x,y,score) lookup from pipeline output
+            kp_lookup: dict[str, tuple] = {}
+            for i, kp in enumerate(keypoints):
+                name = kp.get("label", "") if isinstance(kp, dict) else ""
+                x = kp.get("x", 0.0) if isinstance(kp, dict) else float(kp[0])
+                y = kp.get("y", 0.0) if isinstance(kp, dict) else float(kp[1])
+                score = scores[i] if i < len(scores) else 0.0
+                if name:
+                    kp_lookup[name] = (x, y, float(score))
+            kps: dict[int, dict] = {}
+            for coco_idx, name in enumerate(_COCO_NAMES):
+                if name in kp_lookup:
+                    x, y, s = kp_lookup[name]
+                    kps[coco_idx] = {"x": x, "y": y, "conf": s}
+            out.append(kps)
+        except Exception:
+            out.append({})
+    return out
+# ── Agent ────────────────────────────────────────────────────────────────────
+class Pose2DAgent:
+    """Extracts COCO-17 keypoints per frame; dispatches to YOLO, MediaPipe, or Sapiens2."""
+    def run(self, ingest: IngestResult, model_key: str | None = None) -> Pose2DResult:
+        if not ingest.frames:
+            return Pose2DResult(keypoints=[], fps=ingest.fps, confidence=0.0, notes="no frames in ingest")
+        key = model_key or config.DEFAULT_POSE_MODEL
+        spec = config.POSE_MODELS.get(key)
+        if spec is None:
+            logger.warning("Unknown model_key %r — falling back to %s", key, config.DEFAULT_POSE_MODEL)
+            spec = config.POSE_MODELS[config.DEFAULT_POSE_MODEL]
+        backend = spec["backend"]
+        try:
+            if backend == "yolo":
+                kps_per_frame = _run_yolo(ingest.frames, spec["path"])
+            elif backend == "mediapipe":
+                kps_per_frame = _run_mediapipe(ingest.frames, spec["hf_id"])
+            elif backend == "sapiens2":
+                kps_per_frame = _run_sapiens2(ingest.frames, spec["hf_id"])
+            else:
+                return Pose2DResult(
+                    keypoints=[{} for _ in ingest.frames],
+                    fps=ingest.fps, confidence=0.0,
+                    notes=f"unknown backend: {backend}",
+                )
+        except Exception as e:
+            return Pose2DResult(
+                keypoints=[{} for _ in ingest.frames],
+                fps=ingest.fps, confidence=0.0,
+                notes=str(e),
+            )
+        n_detected = sum(1 for f in kps_per_frame if f)
+        total_conf = sum(
+            sum(kp["conf"] for kp in f.values()) / len(f)
+            for f in kps_per_frame if f
+        )
+        overall_conf = (total_conf / n_detected) if n_detected > 0 else 0.0
+        notes = "" if n_detected > 0 else "no person detected in any frame"
+        return Pose2DResult(
+            keypoints=kps_per_frame,
+            fps=ingest.fps,
+            confidence=overall_conf,
+            notes=notes,
+        )
+```
+- [ ] **Step 4: Run the new signature test**
+```bash
+pytest tests/test_pose2d.py::test_run_accepts_model_key -v
+```
+Expected: PASS
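The BlazePose-to-COCO remap inside `_run_mediapipe` can be checked in isolation. The sketch below assumes the landmark output is a `(33, 3)` array of normalized `[x, y, visibility]` rows, as the inline comment in Step 3 states; `remap_blazepose` is an illustrative name, not a function in the module.

```python
import numpy as np

# BlazePose source indices in COCO-17 order (same list as _BP_SRC in Step 3).
BP_SRC = [0, 2, 5, 7, 8, 11, 12, 13, 14, 15, 16, 23, 24, 25, 26, 27, 28]


def remap_blazepose(landmarks: np.ndarray, width: int, height: int) -> dict[int, dict]:
    """Map (33, 3) normalized landmarks to COCO-17 keypoints in pixel coordinates."""
    kps = {}
    for coco_idx, bp_idx in enumerate(BP_SRC):
        x, y, vis = landmarks[bp_idx]
        kps[coco_idx] = {"x": float(x * width), "y": float(y * height), "conf": float(vis)}
    return kps


# Synthetic landmarks: every row sits at the image centre with visibility 0.5,
# except BlazePose index 11 (left_shoulder), which is moved to the top-left quarter.
landmarks = np.full((33, 3), 0.5)
landmarks[11] = [0.25, 0.25, 0.9]
kps = remap_blazepose(landmarks, width=640, height=480)
print(kps[5])  # COCO index 5 is left_shoulder
```

Scaling by the original frame size rather than the 256×256 network input is what keeps the keypoints in the same pixel space as the YOLO backend.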
+- [ ] **Step 5: Run full existing pose2d test suite**
+```bash
+pytest tests/test_pose2d.py -v
+```
+Expected: all existing tests pass (they will skip if YOLO model unavailable in env — that's OK).
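The confidence reported by `Pose2DAgent.run` is a mean of per-frame means, taken only over frames with at least one keypoint, so empty frames do not pull the score down. A minimal standalone sketch of that arithmetic follows (`overall_confidence` is an illustrative name, not a function in the module):

```python
def overall_confidence(kps_per_frame: list[dict]) -> float:
    """Mean of per-frame mean keypoint confidences, skipping frames with no detections."""
    frame_means = [
        sum(kp["conf"] for kp in frame.values()) / len(frame)
        for frame in kps_per_frame
        if frame
    ]
    return sum(frame_means) / len(frame_means) if frame_means else 0.0


frames = [
    {0: {"x": 1.0, "y": 2.0, "conf": 0.5}, 1: {"x": 3.0, "y": 4.0, "conf": 1.0}},  # frame mean 0.75
    {},                                                                            # skipped
    {0: {"x": 1.0, "y": 2.0, "conf": 0.25}},                                       # frame mean 0.25
]
print(overall_confidence(frames))
```

One consequence worth knowing: a clip where the person is detected in a single frame scores the same as one where the person is detected in every frame at that confidence, so callers that care about coverage may want to look at the number of non-empty frames as well.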
+- [ ] **Step 6: Commit and push**
+```bash
+git add formscout/agents/pose2d.py tests/test_pose2d.py
+git commit -m "feat: Pose2DAgent — three backends (yolo/mediapipe/sapiens2), model_key dispatch"
+git push
+```
+---
+## Task 3: Add `onnxruntime` to requirements
+**Files:**
+- Modify: `requirements.txt`
+- [ ] **Step 1: Add onnxruntime**
+Open `requirements.txt` and add after the existing `transformers` line:
+```
+onnxruntime>=1.18
+```
+- [ ] **Step 2: Verify it installs**
+```bash
+pip install "onnxruntime>=1.18" --quiet && python3 -c "import onnxruntime; print(onnxruntime.__version__)"
+```
+Expected: version string printed, no errors.
+- [ ] **Step 3: Commit and push**
+```bash
+git add requirements.txt
+git commit -m "chore: add onnxruntime for MediaPipe ONNX backend"
+git push
+```
+---
+## Task 4: Update `Director.run()` — `pose_model_path` → `model_key`
+**Files:**
+- Modify: `formscout/pipeline.py`
+- [ ] **Step 1: Update the signature and the `pose2d` call**
+In `formscout/pipeline.py`, change `Director.run()`:
+```python
+def run(self, video_path: str, test_name: str = "deep_squat", side: str = "na", model_key: str | None = None) -> PipelineState:
+    """
+    Run the full pipeline on a single video.
+    test_name/side serve as manual override when provided (skips classifier).
+    model_key selects the pose backend (see config.POSE_MODELS).
+    """
+    state = PipelineState(video_path=video_path)
+    # ─── Ingest ───
+    state.ingest = self._ingest.run(video_path)
+    if state.ingest.confidence < config.MIN_CONFIDENCE:
+        state.errors.append("ingest: low confidence — video may be corrupt")
+        return state
+    # ─── Pose 2D ───
+    state.pose2d = self._pose2d.run(state.ingest, model_key=model_key)
+    # ... rest of method unchanged
+```
+(Only the signature line and the `self._pose2d.run(...)` call change — everything else stays the same.)
+- [ ] **Step 2: Verify import is clean**
+```bash
+python3 -c "from formscout.pipeline import Director; d = Director(); print('ok')"
+```
+Expected: `ok` (models load lazily so no crash here).
+- [ ] **Step 3: Commit and push**
+```bash
+git add formscout/pipeline.py
+git commit -m "feat: Director.run() accepts model_key, threads to Pose2DAgent"
+git push
+```
+---
+## Task 5: Wire the UI — pose model dropdown in `app.py`
+**Files:**
+- Modify: `app.py`
+- [ ] **Step 1: Update `process_video` to use `model_key` and the unified registry**
+Replace the existing `process_video` function signature and the old `YOLO_POSE_MODELS.get()` lookup:
+```python
+def process_video(video_path: str, test_name: str, side: str, model_key: str):
+    """Process an uploaded video through the FormScout pipeline."""
+    if not video_path:
+        return (
+            _render_empty_state(),
+            "Upload a video to begin analysis.",
+            "",
+            "",
+        )
+    director = Director()
+    state = director.run(video_path, test_name=test_name, side=side, model_key=model_key)
+```
+(Remove the `pose_model_path = config.YOLO_POSE_MODELS.get(...)` line entirely.)
+- [ ] **Step 2: Add the `pose_model_dropdown` in `build_app()`**
+Inside `build_app()`, after the `side_dropdown` block (around line 265) and before `submit_btn`, add:
+```python
+pose_model_dropdown = gr.Dropdown(
+    choices=list(config.POSE_MODELS.keys()),
+    value=config.DEFAULT_POSE_MODEL,
+    label="Pose Model",
+)
+```
+- [ ] **Step 3: Update `_map_inputs` to pass the model key**
+Replace the existing `_map_inputs` closure:
+```python
+def _map_inputs(video, test_display_name, side_display, pose_model_key):
+    """Map UI display values to internal values."""
+    test_map = {name: val for name, val in FMS_TESTS}
+    test_name = test_map.get(test_display_name, "deep_squat")
+    side = {"N/A": "na", "Left": "left", "Right": "right"}.get(side_display, "na")
+    return process_video(video, test_name, side, pose_model_key)
+```
+- [ ] **Step 4: Update `submit_btn.click` to include `pose_model_dropdown`**
+Replace the existing `.click(...)` call:
+```python
+submit_btn.click(
+    fn=_map_inputs,
+    inputs=[video_input, test_dropdown, side_dropdown, pose_model_dropdown],
+    outputs=[score_html, pipeline_md, score_details, alerts_md],
+)
+```
+- [ ] **Step 5: Smoke-test the app starts**
+```bash
+python3 -c "from app import build_app; app = build_app(); print('app built ok')"
+```
+Expected: `app built ok` — no import or config errors.
+- [ ] **Step 6: Commit and push**
+```bash
+git add app.py
+git commit -m "feat: pose model dropdown in UI, wired through process_video → Director"
+git push
+```
+---
+## Task 6: Add mocked backend tests
+**Files:**
+- Modify: `tests/test_pose2d.py`
+- [ ] **Step 1: Add mocked backend tests**
+Append to `tests/test_pose2d.py`:
+```python
+import unittest.mock as mock
+import numpy as np
+from formscout.types import IngestResult, Pose2DResult
+def _blank_ingest_3():
+    frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(3)]
+    return IngestResult(frames=frames, fps=30.0, duration=0.1, n_people=1, width=640, height=480)
+class TestPose2DBackendsMocked:
+    """Backend dispatch tests — no real model downloads."""
+    def test_yolo_backend_dispatches(self):
+        from formscout.agents.pose2d import Pose2DAgent
+        fake_kps = [{0: {"x": 10.0, "y": 20.0, "conf": 0.9}} for _ in range(3)]
+        with mock.patch("formscout.agents.pose2d._run_yolo", return_value=fake_kps) as m:
+            agent = Pose2DAgent()
+            result = agent.run(_blank_ingest_3(), model_key="YOLO26n — nano (0.7M, fastest)")
+        m.assert_called_once()
+        assert isinstance(result, Pose2DResult)
+        assert len(result.keypoints) == 3
+        assert result.confidence > 0.0
+    def test_mediapipe_backend_dispatches(self):
+        from formscout.agents.pose2d import Pose2DAgent
+        fake_kps = [{i: {"x": float(i), "y": float(i), "conf": 0.8} for i in range(17)} for _ in range(3)]
+        with mock.patch("formscout.agents.pose2d._run_mediapipe", return_value=fake_kps) as m:
+            agent = Pose2DAgent()
+            result = agent.run(_blank_ingest_3(), model_key="MediaPipe-Pose ⬇ ~16 MB, CPU-friendly")
+        m.assert_called_once()
+        assert isinstance(result, Pose2DResult)
+        assert len(result.keypoints) == 3
+        assert all(len(f) == 17 for f in result.keypoints)
+    def test_sapiens2_backend_dispatches(self):
+        from formscout.agents.pose2d import Pose2DAgent
+        fake_kps = [{i: {"x": float(i), "y": float(i), "conf": 0.85} for i in range(17)} for _ in range(3)]
+        with mock.patch("formscout.agents.pose2d._run_sapiens2", return_value=fake_kps) as m:
+            agent = Pose2DAgent()
+            result = agent.run(_blank_ingest_3(), model_key="Sapiens2-0.4B ⬇ ~1.6 GB")
+        m.assert_called_once()
+        assert isinstance(result, Pose2DResult)
+        assert len(result.keypoints) == 3
+    def test_unknown_model_key_falls_back(self):
+        from formscout.agents.pose2d import Pose2DAgent
+        fake_kps = [{0: {"x": 1.0, "y": 2.0, "conf": 0.7}} for _ in range(3)]
+        with mock.patch("formscout.agents.pose2d._run_yolo", return_value=fake_kps):
+            agent = Pose2DAgent()
+            result = agent.run(_blank_ingest_3(), model_key="nonexistent-model-xyz")
+        assert isinstance(result, Pose2DResult)  # graceful fallback, no crash
+    def test_confidence_zero_on_empty_keypoints(self):
+        from formscout.agents.pose2d import Pose2DAgent
+        with mock.patch("formscout.agents.pose2d._run_yolo", return_value=[{}, {}, {}]):
+            agent = Pose2DAgent()
+            result = agent.run(_blank_ingest_3(), model_key="YOLO26n — nano (0.7M, fastest)")
+        assert result.confidence == 0.0
+        assert "no person" in result.notes.lower()
+```
+- [ ] **Step 2: Run the new tests**
+```bash
+pytest tests/test_pose2d.py::TestPose2DBackendsMocked -v
+```
+Expected: all 5 tests PASS.
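The fallback path exercised by `test_unknown_model_key_falls_back` reduces to a dictionary lookup with a logged default. The sketch below reproduces that logic against toy stand-ins for `config.POSE_MODELS` and `config.DEFAULT_POSE_MODEL`; `resolve_spec`, `MODELS`, and `DEFAULT` are illustrative names only.

```python
import logging

logger = logging.getLogger("pose2d_sketch")

# Toy stand-ins for config.POSE_MODELS and config.DEFAULT_POSE_MODEL.
MODELS = {"small": {"backend": "yolo"}, "big": {"backend": "sapiens2"}}
DEFAULT = "small"


def resolve_spec(model_key):
    """Return the registry entry for model_key, or the default entry if the key is unknown."""
    key = model_key or DEFAULT  # None or "" selects the default
    spec = MODELS.get(key)
    if spec is None:
        logger.warning("Unknown model_key %r, falling back to %s", key, DEFAULT)
        spec = MODELS[DEFAULT]
    return spec


print(resolve_spec("big")["backend"])
print(resolve_spec("nonexistent-model-xyz")["backend"])
```

The fallback test in Step 1 only asserts the result type. Binding the patch (`as m`) and asserting that the mocked `_run_yolo` was called would pin the fallback target more tightly.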
+- [ ] **Step 3: Run the full test suite to check for regressions**
+```bash
+pytest tests/ -v --tb=short 2>&1 | tail -30
+```
+Expected: same pass/fail ratio as before (45/46 known passing). The one known failure (`test_unimplemented_test_returns_low_confidence`) is pre-existing — ignore it.
+- [ ] **Step 4: Commit and push**
+```bash
+git add tests/test_pose2d.py
+git commit -m "test: mocked backend dispatch tests for YOLO, MediaPipe, Sapiens2"
+git push
+```
+---
+## Self-review
+**Spec coverage:**
+- ✅ Unified `POSE_MODELS` registry (Task 1)
+- ✅ `DEFAULT_POSE_MODEL = YOLO26n` (Task 1)
+- ✅ Backward-compat `YOLO_POSE_MODEL` / `YOLO_POSE_MODEL_HQ` aliases (Task 1)
+- ✅ `_run_yolo` sub-runner (Task 2)
+- ✅ `_run_mediapipe` with ONNX Runtime + BlazePose→COCO-17 mapping (Task 2)
+- ✅ `_run_sapiens2` with transformers pipeline + named-keypoint→COCO-17 mapping (Task 2)
+- ✅ `Pose2DAgent.run(model_key)` dispatch + fallback on unknown key (Task 2)
+- ✅ `onnxruntime` added to requirements (Task 3)
+- ✅ `Director.run(model_key)` threads key to agent (Task 4)
+- ✅ `pose_model_dropdown` in UI (Task 5)
+- ✅ `_map_inputs` + `submit_btn.click` wired (Task 5)
+- ✅ Error handling: unknown key → warning + fallback; download failure → confidence=0 (Task 2)
+- ✅ Mocked tests for all three backends (Task 6)
+**Placeholder scan:** None found.
+**Type consistency:** `model_key: str | None` is used in `Pose2DAgent.run` and `Director.run`; `process_video` declares `model_key: str`, since the dropdown has a default value. `config.POSE_MODELS` and `config.DEFAULT_POSE_MODEL` are referenced consistently.
+**Note on Sapiens2 keypoint format:** The `_run_sapiens2` implementation uses **named keypoint lookup** (by label string) rather than assuming fixed indices 0–16 = COCO. This is the safe approach — the transformers pipeline returns labeled keypoints and the code maps by name. If the pipeline returns unnamed keypoints (index-only), the `kp_lookup` will be empty and the frame will gracefully return `{}`.
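The named-lookup behaviour described in the note can be illustrated without loading a model. The sketch below assumes the per-person output shape that `_run_sapiens2` expects: a `keypoints` list of dicts with `label`, `x`, `y`, plus a parallel `keypoint_scores` list. That shape is this plan's assumption, not a documented transformers contract, and `to_coco` is an illustrative name.

```python
COCO_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def to_coco(person: dict) -> dict[int, dict]:
    """Map labelled keypoints to COCO-17 indices; unlabelled keypoints are ignored."""
    scores = person.get("keypoint_scores", [])
    by_name = {}
    for i, kp in enumerate(person.get("keypoints", [])):
        if isinstance(kp, dict) and kp.get("label"):
            score = float(scores[i]) if i < len(scores) else 0.0
            by_name[kp["label"]] = (float(kp["x"]), float(kp["y"]), score)
    kps = {}
    for idx, name in enumerate(COCO_NAMES):
        if name in by_name:
            x, y, s = by_name[name]
            kps[idx] = {"x": x, "y": y, "conf": s}
    return kps


labelled = {
    "keypoints": [{"label": "left_knee", "x": 10.0, "y": 20.0}, {"label": "tail", "x": 0.0, "y": 0.0}],
    "keypoint_scores": [0.8, 0.1],
}
unlabelled = {"keypoints": [[10.0, 20.0], [30.0, 40.0]], "keypoint_scores": [0.8, 0.9]}
print(to_coco(labelled))    # only left_knee (COCO index 13) survives
print(to_coco(unlabelled))  # no labels, so the frame maps to {}
```

An empty frame from unlabelled output is indistinguishable from "no person detected" downstream, so a warning log in that branch may make a format mismatch easier to diagnose.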

docs/superpowers/plans/2026-06-09-pose-visualizer.md CHANGED

@@ -1,914 +1,914 @@
-# Pose Overlay Visualizer Implementation Plan
-> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
-**Goal:** Add a pose overlay video output to FormScout with skeleton, motion trails, and velocity arrows, plus a per-joint velocity summary table.
-**Architecture:** A new `formscout/agents/visualizer.py` runs after `director.run()` in `process_video()`; it uses Kalman-filtered per-joint velocity and OpenCV rendering. `app.py` gains a `gr.CheckboxGroup` for layer selection, a new `gr.Video` output tab, and a `gr.Markdown` velocity summary.
-**Tech Stack:** `opencv-python`, `numpy`, `colorsys` (stdlib), `gradio`.
----
-## File map
-| File | Change |
-|---|---|
-| `formscout/agents/visualizer.py` | Create — Kalman filter, velocity, PoseVisualizer, summary |
-| `tests/test_visualizer.py` | Create — all visualizer tests |
-| `app.py` | Modify — overlay_layers checkbox, new tab, wiring |
----
-## Task 1: `SimpleKalmanFilter` + `compute_joint_velocity`
-**Files:**
-- Create: `formscout/agents/visualizer.py`
-- Create: `tests/test_visualizer.py`
-- [ ] **Step 1: Write failing tests**
-Create `tests/test_visualizer.py`:
-```python
-"""Tests for PoseVisualizer — no GPU, no model downloads."""
-import numpy as np
-import pytest
-from formscout.types import IngestResult, Pose2DResult
-def _make_ingest(n=5, h=480, w=640, fps=30.0):
-    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
-    return IngestResult(frames=frames, fps=fps, duration=n/fps, n_people=1, width=w, height=h)
-def _make_pose(n=5, w=640, h=480):
-    """Synthetic Pose2DResult: 17 joints at fixed pixel positions, conf=0.9."""
-    kps_per_frame = []
-    for i in range(n):
-        frame_kps = {}
-        for j in range(17):
-            frame_kps[j] = {
-                "x": float(50 + j * 30 + i * 2),  # slight movement each frame
-                "y": float(100 + j * 20),
-                "conf": 0.9,
-            }
-        kps_per_frame.append(frame_kps)
-    return Pose2DResult(keypoints=kps_per_frame, fps=30.0, confidence=0.9, notes="")
-class TestComputeJointVelocity:
-    def test_returns_17_joints(self):
-        from formscout.agents.visualizer import compute_joint_velocity
-        pose = _make_pose(n=5)
-        result = compute_joint_velocity(pose.keypoints, fps=30.0)
-        assert len(result) == 17
-    def test_each_list_has_n_frames(self):
-        from formscout.agents.visualizer import compute_joint_velocity
-        pose = _make_pose(n=5)
-        result = compute_joint_velocity(pose.keypoints, fps=30.0)
-        for joint_idx, speeds in result.items():
-            assert len(speeds) == 5, f"joint {joint_idx} has {len(speeds)} speeds, expected 5"
-    def test_speeds_are_non_negative(self):
-        from formscout.agents.visualizer import compute_joint_velocity
-        pose = _make_pose(n=5)
-        result = compute_joint_velocity(pose.keypoints, fps=30.0)
-        for speeds in result.values():
-            assert all(s >= 0.0 for s in speeds)
-    def test_missing_keypoints_give_zero_speed(self):
-        from formscout.agents.visualizer import compute_joint_velocity
-        # All frames empty
-        empty_kps = [{} for _ in range(5)]
-        result = compute_joint_velocity(empty_kps, fps=30.0)
-        for speeds in result.values():
-            assert all(s == 0.0 for s in speeds)
-```
-- [ ] **Step 2: Run to confirm failure**
-```bash
-pytest tests/test_visualizer.py::TestComputeJointVelocity -v
-```
-Expected: `ERROR` — `ModuleNotFoundError: No module named 'formscout.agents.visualizer'`
-- [ ] **Step 3: Create `formscout/agents/visualizer.py` with Kalman + velocity**
-```python
-"""
-PoseVisualizer — annotated overlay video with skeleton, trails, velocity arrows.
-Input:  IngestResult + Pose2DResult
-Output: .mp4 path (or None on failure/empty layers)
-Failure: returns None, never raises.
-"""
-from __future__ import annotations
-import colorsys
-import logging
-import math
-import tempfile
-from collections import deque
-import cv2
-import numpy as np
-logger = logging.getLogger(__name__)
-# ── COCO constants ────────────────────────────────────────────────────────────
-COCO_KEYPOINTS = [
-    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
-    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
-    "left_wrist", "right_wrist", "left_hip", "right_hip",
-    "left_knee", "right_knee", "left_ankle", "right_ankle",
-]
-COCO_SKELETON = [
-    (0, 1), (0, 2), (1, 3), (2, 4),          # face
-    (5, 6), (5, 7), (7, 9), (6, 8), (8, 10), # arms
-    (5, 11), (6, 12), (11, 12),               # torso
-    (11, 13), (13, 15), (12, 14), (14, 16),  # legs
-]
-TRAIL_LENGTH = 10
-MAX_ARROW_PX = 40
-CONF_THRESHOLD = 0.3
-# ── Kalman filter ─────────────────────────────────────────────────────────────
-class SimpleKalmanFilter:
-    """4-state Kalman filter (x, y, vx, vy) for joint tracking."""
-    def __init__(self, process_noise: float = 0.01, measurement_noise: float = 0.1):
-        self.is_initialized = False
-        self.state = np.zeros(4)
-        self.cov = np.eye(4) * 0.1
-        self.Q = np.eye(4) * process_noise
-        self.R = np.eye(2) * measurement_noise
-        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
-    def predict(self, dt: float = 1.0):
-        F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
-        self.state = F @ self.state
-        self.cov = F @ self.cov @ F.T + self.Q
-    def update(self, x: float, y: float):
-        z = np.array([x, y])
-        if not self.is_initialized:
-            self.state[:2] = z
-            self.is_initialized = True
-            return
-        S = self.H @ self.cov @ self.H.T + self.R
-        K = self.cov @ self.H.T @ np.linalg.inv(S)
-        self.state = self.state + K @ (z - self.H @ self.state)
-        self.cov = (np.eye(4) - K @ self.H) @ self.cov
-    def velocity_magnitude(self) -> float:
-        vx, vy = self.state[2], self.state[3]
-        return math.sqrt(vx * vx + vy * vy)
-    def velocity_vector(self) -> tuple[float, float]:
-        return float(self.state[2]), float(self.state[3])
-# ── Velocity computation ──────────────────────────────────────────────────────
-def compute_joint_velocity(
-    keypoints_per_frame: list[dict],
-    fps: float,
-) -> dict[int, list[float]]:
-    """
-    Compute Kalman-filtered per-joint speed (px/s) for each frame.
-    Returns dict[joint_idx, [speed_frame0, speed_frame1, ...]] for all 17 COCO joints.
-    Missing/low-confidence keypoints yield speed=0.0 for that frame.
-    """
-    dt = 1.0 / fps if fps > 0 else 1.0
-    filters: dict[int, SimpleKalmanFilter] = {j: SimpleKalmanFilter() for j in range(17)}
-    result: dict[int, list[float]] = {j: [] for j in range(17)}
-    for frame_kps in keypoints_per_frame:
-        for j in range(17):
-            kf = filters[j]
-            kp = frame_kps.get(j)
-            kf.predict(dt)
-            if kp and kp.get("conf", 0.0) >= CONF_THRESHOLD:
-                kf.update(kp["x"], kp["y"])
-                speed = kf.velocity_magnitude()
-            else:
-                speed = 0.0
-            result[j].append(speed)
-    return result
-```
-- [ ] **Step 4: Run tests**
-```bash
-pytest tests/test_visualizer.py::TestComputeJointVelocity -v
-```
-Expected: 4 PASS
-- [ ] **Step 5: Commit**
-```bash
-git add formscout/agents/visualizer.py tests/test_visualizer.py
-git commit -m "feat: SimpleKalmanFilter + compute_joint_velocity (4 tests pass)"
-```
----
-## Task 2: `PoseVisualizer._draw_skeleton`
-**Files:**
-- Modify: `formscout/agents/visualizer.py`
-- Modify: `tests/test_visualizer.py`
-- [ ] **Step 1: Write failing test**
-Append to `tests/test_visualizer.py`:
-```python
-class TestDrawSkeleton:
-    def test_skeleton_draws_without_error(self):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        kps = {j: {"x": float(50 + j * 30), "y": float(100 + j * 20), "conf": 0.9}
-               for j in range(17)}
-        result = vis._draw_skeleton(frame.copy(), kps)
-        assert result.shape == frame.shape
-        # Frame must be modified (not all zeros after drawing)
-        assert not np.array_equal(result, frame)
-    def test_low_confidence_keypoints_not_drawn(self):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        # All keypoints below threshold
-        kps = {j: {"x": float(50 + j * 30), "y": 100.0, "conf": 0.1} for j in range(17)}
-        result = vis._draw_skeleton(frame.copy(), kps)
-        # Nothing drawn — frame stays all zeros
-        assert np.array_equal(result, frame)
-```
-- [ ] **Step 2: Run to confirm failure**
-```bash
-pytest tests/test_visualizer.py::TestDrawSkeleton -v
-```
-Expected: FAIL — `AttributeError: 'PoseVisualizer' object has no attribute '_draw_skeleton'`
-- [ ] **Step 3: Add `PoseVisualizer` class with `_draw_skeleton` to `visualizer.py`**
-Append after `compute_joint_velocity`:
-```python
-# ── Helpers ───────────────────────────────────────────────────────────────────
-def _conf_to_bgr(conf: float) -> tuple[int, int, int]:
-    """Map confidence 0→1 to BGR color red→green via HSV."""
-    hue = conf * 120.0 / 360.0
-    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
-    return (int(b * 255), int(g * 255), int(r * 255))
-# ── PoseVisualizer ────────────────────────────────────────────────────────────
-class PoseVisualizer:
-    """Renders skeleton, trails, and velocity arrows onto video frames."""
-    def __init__(self):
-        self.last_velocities: dict[int, list[float]] = {}
-    # ── Skeleton ──────────────────────────────────────────────────────────────
-    def _draw_skeleton(self, frame: np.ndarray, kps: dict) -> np.ndarray:
-        """Draw COCO-17 bones (white) and joints (confidence-colored) onto frame."""
-        visible = {j: kp for j, kp in kps.items() if kp.get("conf", 0.0) >= CONF_THRESHOLD}
-        # Bones
-        for j1, j2 in COCO_SKELETON:
-            if j1 in visible and j2 in visible:
-                p1 = (int(visible[j1]["x"]), int(visible[j1]["y"]))
-                p2 = (int(visible[j2]["x"]), int(visible[j2]["y"]))
-                cv2.line(frame, p1, p2, (255, 255, 255), 2)
-        # Joints
-        for j, kp in visible.items():
-            pt = (int(kp["x"]), int(kp["y"]))
-            color = _conf_to_bgr(kp["conf"])
-            cv2.circle(frame, pt, 4, color, -1)
-            cv2.circle(frame, pt, 5, (255, 255, 255), 1)
-        return frame
-```
-- [ ] **Step 4: Run tests**
-```bash
-pytest tests/test_visualizer.py::TestDrawSkeleton -v
-```
-Expected: 2 PASS
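The confidence-to-colour mapping used for the joint markers can be checked in isolation. The following is a minimal sketch using only the standard library; the name `conf_to_bgr` and the clamping step are assumptions added for illustration and are not part of the plan's `_conf_to_bgr`.

```python
import colorsys


def conf_to_bgr(conf):
    """Map confidence in [0, 1] to a BGR tuple running red -> yellow -> green."""
    # Clamping is an assumption added here; the plan's helper does not clamp.
    conf = max(0.0, min(1.0, conf))
    # Hue 0 degrees is red, 120 degrees is green; colorsys expects hue in [0, 1].
    hue = conf * 120.0 / 360.0
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    # Reverse to BGR because OpenCV drawing calls take colours in that order.
    return (int(b * 255), int(g * 255), int(r * 255))


for c in (0.0, 0.5, 1.0):
    print(c, conf_to_bgr(c))
```

Because the tuple is returned in BGR order, a low-confidence joint comes out red and a high-confidence joint green when passed straight to `cv2.circle`.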
-- [ ] **Step 5: Commit**
-```bash
-git add formscout/agents/visualizer.py tests/test_visualizer.py
-git commit -m "feat: PoseVisualizer._draw_skeleton with confidence-colored joints"
-```
----
-## Task 3: `PoseVisualizer._draw_trails`
-**Files:**
-- Modify: `formscout/agents/visualizer.py`
-- Modify: `tests/test_visualizer.py`
-- [ ] **Step 1: Write failing test**
-Append to `tests/test_visualizer.py`:
-```python
-class TestDrawTrails:
-    def test_trails_draw_without_error(self):
-        from formscout.agents.visualizer import PoseVisualizer, TRAIL_LENGTH
-        from collections import deque
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        # Build a trail history for joint 0 with 5 positions
-        trail_history = {
-            0: deque([(100 + i * 5, 200 + i * 3) for i in range(5)], maxlen=TRAIL_LENGTH)
-        }
-        result = vis._draw_trails(frame.copy(), trail_history)
-        assert result.shape == frame.shape
-        # Trail should modify at least some pixels
-        assert not np.array_equal(result, frame)
-    def test_short_trail_no_crash(self):
-        from formscout.agents.visualizer import PoseVisualizer, TRAIL_LENGTH
-        from collections import deque
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        # Only one point — no line possible
-        trail_history = {0: deque([(100, 200)], maxlen=TRAIL_LENGTH)}
-        result = vis._draw_trails(frame.copy(), trail_history)
-        # No crash, frame unchanged (single point = no segment)
-        assert np.array_equal(result, frame)
-```
-- [ ] **Step 2: Run to confirm failure**
-```bash
-pytest tests/test_visualizer.py::TestDrawTrails -v
-```
-Expected: FAIL — `AttributeError: 'PoseVisualizer' object has no attribute '_draw_trails'`
-- [ ] **Step 3: Add `_draw_trails` to `PoseVisualizer`**
-Inside the `PoseVisualizer` class, after `_draw_skeleton`:
-```python
-    # ── Trails ───────────────────────────────────────────────────────────────
-    def _draw_trails(self, frame: np.ndarray, trail_history: dict) -> np.ndarray:
-        """Draw fading motion trails for each joint."""
-        for joint_idx, trail in trail_history.items():
-            pts = list(trail)
-            if len(pts) < 2:
-                continue
-            for i in range(1, len(pts)):
-                alpha = i / len(pts)
-                brightness = int(255 * alpha)
-                color = (brightness, brightness, brightness)
-                thickness = max(1, int(3 * alpha))
-                p1 = (int(pts[i - 1][0]), int(pts[i - 1][1]))
-                p2 = (int(pts[i][0]), int(pts[i][1]))
-                cv2.line(frame, p1, p2, color, thickness)
-        return frame
-```
-- [ ] **Step 4: Run tests**
-```bash
-pytest tests/test_visualizer.py::TestDrawTrails -v
-```
-Expected: 2 PASS
-- [ ] **Step 5: Commit**
-```bash
-git add formscout/agents/visualizer.py tests/test_visualizer.py
-git commit -m "feat: PoseVisualizer._draw_trails with fading alpha"
-```
----
-## Task 4: `PoseVisualizer._draw_velocity_arrows`
-**Files:**
-- Modify: `formscout/agents/visualizer.py`
-- Modify: `tests/test_visualizer.py`
-- [ ] **Step 1: Write failing test**
-Append to `tests/test_visualizer.py`:
-```python
-class TestDrawVelocityArrows:
-    def test_arrows_draw_without_error(self):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        kps = {j: {"x": float(50 + j * 30), "y": float(100 + j * 20), "conf": 0.9}
-               for j in range(17)}
-        prev_kps = {j: {"x": float(48 + j * 30), "y": float(98 + j * 20), "conf": 0.9}
-                    for j in range(17)}
-        # velocities: joint 5 moving fast
-        velocities = {j: [0.0] * 5 for j in range(17)}
-        velocities[5] = [0.0, 10.0, 50.0, 80.0, 120.0]
-        result = vis._draw_velocity_arrows(frame.copy(), kps, prev_kps, velocities, frame_idx=4)
-        assert result.shape == frame.shape
-    def test_no_prev_kps_no_crash(self):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        kps = {j: {"x": float(50 + j * 30), "y": 100.0, "conf": 0.9} for j in range(17)}
-        velocities = {j: [50.0] * 5 for j in range(17)}
-        # prev_kps is None — should skip without crash
-        result = vis._draw_velocity_arrows(frame.copy(), kps, None, velocities, frame_idx=0)
-        assert result.shape == frame.shape
-```
-- [ ] **Step 2: Run to confirm failure**
-```bash
-pytest tests/test_visualizer.py::TestDrawVelocityArrows -v
-```
-Expected: FAIL — `AttributeError: 'PoseVisualizer' object has no attribute '_draw_velocity_arrows'`
-- [ ] **Step 3: Add `_draw_velocity_arrows` to `PoseVisualizer`**
-Inside the `PoseVisualizer` class, after `_draw_trails`:
-```python
-    # ── Velocity arrows ───────────────────────────────────────────────────────
-    def _draw_velocity_arrows(
-        self,
-        frame: np.ndarray,
-        kps: dict,
-        prev_kps: dict | None,
-        velocities: dict[int, list[float]],
-        frame_idx: int,
-    ) -> np.ndarray:
-        """Draw per-joint velocity arrows scaled by speed."""
-        if prev_kps is None:
-            return frame
-        all_speeds = [velocities[j][frame_idx] for j in range(17) if frame_idx < len(velocities.get(j, []))]
-        peak = max(all_speeds) if all_speeds else 1.0
-        if peak == 0.0:
-            return frame
-        for j in range(17):
-            kp = kps.get(j)
-            pk = prev_kps.get(j)
-            if not kp or not pk:
-                continue
-            if kp.get("conf", 0.0) < CONF_THRESHOLD:
-                continue
-            speeds = velocities.get(j, [])
-            if frame_idx >= len(speeds):
-                continue
-            speed = speeds[frame_idx]
-            if speed == 0.0:
-                continue
-            dx = kp["x"] - pk["x"]
-            dy = kp["y"] - pk["y"]
-            mag = math.sqrt(dx * dx + dy * dy)
-            if mag < 1e-6:
-                continue
-            # Normalize direction, scale to arrow length
-            length = min(speed / peak * MAX_ARROW_PX, MAX_ARROW_PX)
-            nx, ny = dx / mag, dy / mag
-            start = (int(kp["x"]), int(kp["y"]))
-            end = (int(kp["x"] + nx * length), int(kp["y"] + ny * length))
-            ratio = speed / peak
-            if ratio < 0.33:
-                color = (0, 200, 0)     # green
-            elif ratio < 0.66:
-                color = (0, 140, 255)   # orange
-            else:
-                color = (0, 0, 255)     # red
-            cv2.arrowedLine(frame, start, end, color, 2, tipLength=0.35)
-        return frame
-```
-- [ ] **Step 4: Run tests**
-```bash
-pytest tests/test_visualizer.py::TestDrawVelocityArrows -v
-```
-Expected: 2 PASS
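The arrow geometry in this task (direction from the frame-to-frame displacement, length and colour from speed relative to the frame's peak) can be sketched as a pure function with no OpenCV dependency. The helper name `arrow_for_joint` and the colour names are hypothetical; the thresholds and the 40 px cap mirror the plan's code.

```python
import math

MAX_ARROW_PX = 40  # same cap as the plan's constant


def arrow_for_joint(curr, prev, speed, peak):
    """Return (end_point, colour_name), or None when no arrow should be drawn."""
    dx, dy = curr[0] - prev[0], curr[1] - prev[1]
    mag = math.hypot(dx, dy)
    # Skip stationary joints and frames where nothing is moving.
    if peak <= 0.0 or speed <= 0.0 or mag < 1e-6:
        return None
    ratio = min(speed / peak, 1.0)
    length = ratio * MAX_ARROW_PX
    # Unit direction vector scaled to the arrow length.
    end = (int(curr[0] + dx / mag * length), int(curr[1] + dy / mag * length))
    if ratio < 0.33:
        colour = "green"
    elif ratio < 0.66:
        colour = "orange"
    else:
        colour = "red"
    return end, colour


print(arrow_for_joint((100.0, 100.0), (90.0, 100.0), speed=120.0, peak=120.0))
```

Note that the length is relative to the fastest joint in the current frame, so arrow sizes are comparable within a frame but not across frames.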
-- [ ] **Step 5: Commit**
-```bash
-git add formscout/agents/visualizer.py tests/test_visualizer.py
-git commit -m "feat: PoseVisualizer._draw_velocity_arrows speed-colored"
-```
----
-## Task 5: `render_video` + `build_velocity_summary`
-**Files:**
-- Modify: `formscout/agents/visualizer.py`
-- Modify: `tests/test_visualizer.py`
-- [ ] **Step 1: Write failing tests**
-Append to `tests/test_visualizer.py`:
-```python
-class TestRenderVideo:
-    def test_creates_mp4_file(self, tmp_path):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        ingest = _make_ingest(n=5)
-        pose = _make_pose(n=5)
-        out = str(tmp_path / "out.mp4")
-        result = vis.render_video(ingest, pose, {"skeleton"}, out)
-        assert result is not None
-        import os
-        assert os.path.exists(result)
-        assert os.path.getsize(result) > 0
-    def test_empty_layers_returns_none(self, tmp_path):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        out = str(tmp_path / "out.mp4")
-        result = vis.render_video(_make_ingest(), _make_pose(), set(), out)
-        assert result is None
-    def test_no_detections_returns_none(self, tmp_path):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        ingest = _make_ingest(n=5)
-        empty_pose = Pose2DResult(
-            keypoints=[{} for _ in range(5)], fps=30.0, confidence=0.0, notes=""
-        )
-        out = str(tmp_path / "out.mp4")
-        result = vis.render_video(ingest, empty_pose, {"skeleton"}, out)
-        assert result is None
-    def test_last_velocities_set_after_render(self, tmp_path):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        out = str(tmp_path / "out.mp4")
-        vis.render_video(_make_ingest(n=5), _make_pose(n=5), {"skeleton"}, out)
-        assert len(vis.last_velocities) == 17
-class TestBuildVelocitySummary:
-    def test_returns_markdown_table(self):
-        from formscout.agents.visualizer import build_velocity_summary, compute_joint_velocity
-        pose = _make_pose(n=10)
-        vels = compute_joint_velocity(pose.keypoints, fps=30.0)
-        result = build_velocity_summary(pose.keypoints, vels)
-        assert "|" in result
-        # At least one COCO joint name appears
-        assert any(name in result for name in ["knee", "shoulder", "hip", "ankle"])
-    def test_empty_keypoints_returns_empty_string(self):
-        from formscout.agents.visualizer import build_velocity_summary
-        empty_kps = [{} for _ in range(5)]
-        vels = {j: [0.0] * 5 for j in range(17)}
-        result = build_velocity_summary(empty_kps, vels)
-        assert result == ""
-```
-- [ ] **Step 2: Run to confirm failure**
-```bash
-pytest tests/test_visualizer.py::TestRenderVideo tests/test_visualizer.py::TestBuildVelocitySummary -v
-```
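-Expected: FAIL. `TestRenderVideo` fails with `AttributeError: 'PoseVisualizer' object has no attribute 'render_video'`; `TestBuildVelocitySummary` fails with `ImportError: cannot import name 'build_velocity_summary'`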
-- [ ] **Step 3: Add `render_video` to `PoseVisualizer`**
-Inside the `PoseVisualizer` class, after `_draw_velocity_arrows`:
-```python
-    # ── Public ────────────────────────────────────────────────────────────────
-    def render_video(
-        self,
-        ingest,
-        pose2d,
-        layers: set[str],
-        output_path: str,
-    ) -> str | None:
-        """
-        Render annotated video. Returns output_path on success, None otherwise.
-        layers: subset of {"skeleton", "trails", "velocity_arrows"}
-        """
-        if not layers:
-            return None
-        # Require at least one detected frame
-        if not any(pose2d.keypoints):
-            return None
-        writer = None
-        try:
-            velocities = compute_joint_velocity(pose2d.keypoints, ingest.fps)
-            self.last_velocities = velocities
-            frames = ingest.frames
-            h, w = frames[0].shape[:2]
-            fps = ingest.fps or 30.0
-            fourcc = cv2.VideoWriter_fourcc(*"mp4v")
-            writer = cv2.VideoWriter(output_path, fourcc, fps, (w, h))
-            if not writer.isOpened():
-                logger.warning("VideoWriter failed to open: %s", output_path)
-                return None
-            trail_history: dict[int, deque] = {j: deque(maxlen=TRAIL_LENGTH) for j in range(17)}
-            prev_kps: dict | None = None
-            for frame_idx, (frame, kps) in enumerate(zip(frames, pose2d.keypoints)):
-                out_frame = frame.copy()
-                if "trails" in layers:
-                    # Update trail history before drawing
-                    for j, kp in kps.items():
-                        if kp.get("conf", 0.0) >= CONF_THRESHOLD:
-                            trail_history[j].append((kp["x"], kp["y"]))
-                    out_frame = self._draw_trails(out_frame, trail_history)
-                if "skeleton" in layers:
-                    out_frame = self._draw_skeleton(out_frame, kps)
-                if "velocity_arrows" in layers:
-                    out_frame = self._draw_velocity_arrows(
-                        out_frame, kps, prev_kps, velocities, frame_idx
-                    )
-                writer.write(out_frame)
-                prev_kps = kps
-            return output_path
-        except Exception as e:
-            logger.warning("render_video failed: %s", e)
-            return None
-        finally:
-            # Release on every path so a mid-render exception does not leak the writer
-            if writer is not None:
-                writer.release()
-```
-- [ ] **Step 4: Add `build_velocity_summary` after the class**
-After the `PoseVisualizer` class definition, add:
-```python
-# ── Velocity summary ──────────────────────────────────────────────────────────
-def build_velocity_summary(
-    keypoints_per_frame: list[dict],
-    velocities: dict[int, list[float]],
-) -> str:
-    """Return markdown table of per-joint avg/peak velocity. Empty string if no valid joints."""
-    n_frames = len(keypoints_per_frame)
-    if n_frames == 0:
-        return ""
-    rows = []
-    for j in range(17):
-        # Count frames where this joint is detected
-        detected = sum(
-            1 for kps in keypoints_per_frame
-            if kps.get(j, {}).get("conf", 0.0) >= CONF_THRESHOLD
-        )
-        if detected < n_frames * 0.5:
-            continue  # skip joints present in <50% of frames
-        speeds = velocities.get(j, [])
-        if not speeds:
-            continue
-        avg_speed = sum(speeds) / len(speeds)
-        peak_speed = max(speeds)
-        rows.append((COCO_KEYPOINTS[j], avg_speed, peak_speed))
-    if not rows:
-        return ""
-    rows.sort(key=lambda r: r[2], reverse=True)  # sort by peak descending
-    lines = [
-        "| Joint | Avg (px/s) | Peak (px/s) |",
-        "|---|---|---|",
-    ]
-    for name, avg, peak in rows:
-        lines.append(f"| {name} | {avg:.1f} | {peak:.1f} |")
-    return "\n".join(lines)
-```
-- [ ] **Step 5: Run all visualizer tests**
-```bash
-pytest tests/test_visualizer.py -v
-```
-Expected: all tests PASS (4 + 2 + 2 + 2 + 4 + 2 = 16 total)
-- [ ] **Step 6: Commit**
-```bash
-git add formscout/agents/visualizer.py tests/test_visualizer.py
-git commit -m "feat: PoseVisualizer.render_video + build_velocity_summary (16 tests pass)"
-```
----
-## Task 6: Wire `app.py`
-**Files:**
-- Modify: `app.py`
-- [ ] **Step 1: Add `import tempfile` if not present**
-Check the top of `app.py` for `import tempfile`. If it is missing, add it to the existing block of stdlib imports. The visualizer itself is imported lazily inside `process_video` (see Step 2), so no module-level import is needed for it.
-- [ ] **Step 2: Update `process_video()` signature and body**
-Replace the existing `process_video` function (lines 46–83) with:
-```python
-def process_video(video_path: str, test_name: str, side: str, model_key: str, layers: list[str]):
-    """Process an uploaded video through the FormScout pipeline."""
-    if not video_path:
-        return (
-            _render_empty_state(),
-            "Upload a video to begin analysis.",
-            "",
-            "",
-            None,
-            "",
-        )
-    director = Director()
-    state = director.run(video_path, test_name=test_name, side=side, model_key=model_key)
-    # ─── Score card ───
-    score_html = _render_empty_state()
-    score_details = ""
-    if state.features:
-        result = score_test(state.features)
-        judge = state.judge
-        if judge and judge.score is not None:
-            score_html = _render_score_card(judge.score, judge.confidence, judge.needs_human)
-            score_details = _render_score_details_judge(judge, result, state.features)
-        elif judge and judge.needs_human:
-            score_html = _render_score_card(0, 0, True)
-            score_details = f"### Needs Clinician Review\n{judge.rationale}"
-        else:
-            score_html = _render_score_card(result.score, result.confidence, result.needs_human)
-            score_details = _render_score_details(result, state.features)
-    # ─── Pipeline info ───
-    pipeline_md = _render_pipeline_status(state)
-    # ─── Warnings/errors ───
-    alerts = _render_alerts(state)
-    # ─── Overlay video ───
-    overlay_path = None
-    vel_summary = ""
-    layer_set = {lbl.lower().replace(" ", "_") for lbl in (layers or [])}
-    if layer_set and state.ingest and state.pose2d:
-        try:
-            from formscout.agents.visualizer import PoseVisualizer, build_velocity_summary
-            vis = PoseVisualizer()
-            with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as f:
-                out_path = f.name
-            overlay_path = vis.render_video(state.ingest, state.pose2d, layer_set, out_path)
-            if overlay_path:
-                vel_summary = build_velocity_summary(state.pose2d.keypoints, vis.last_velocities)
-        except Exception as e:
-            alerts = (alerts or "") + f"\n⚠️ Visualizer error: {e}"
-    return score_html, pipeline_md, score_details, alerts, overlay_path, vel_summary
-```
-- [ ] **Step 3: Add `overlay_layers` CheckboxGroup in `build_app()`**
-After the `pose_model_dropdown` block (around line 270), and before `submit_btn`:
-```python
-                overlay_layers = gr.CheckboxGroup(
-                    choices=["Skeleton", "Trails", "Velocity arrows"],
-                    value=["Skeleton", "Trails"],
-                    label="Overlay Layers",
-                )
-```
-- [ ] **Step 4: Add overlay tab in the results panel**
-Inside the `with gr.Tabs():` block (after the `⚠️ Alerts` tab):
-```python
-                    with gr.TabItem("🎬 Overlay Video"):
-                        overlay_video = gr.Video(label="Annotated Movement")
-                        velocity_md = gr.Markdown("")
-```
-- [ ] **Step 5: Update `_map_inputs` and `submit_btn.click`**
-Replace the `_map_inputs` closure and `submit_btn.click` call:
-```python
-        def _map_inputs(video, test_display_name, side_display, pose_model_key, overlay_layers):
-            """Map UI display values to internal values."""
-            test_map = {name: val for name, val in FMS_TESTS}
-            test_name = test_map.get(test_display_name, "deep_squat")
-            side = {"N/A": "na", "Left": "left", "Right": "right"}.get(side_display, "na")
-            return process_video(video, test_name, side, pose_model_key, overlay_layers)
-        submit_btn.click(
-            fn=_map_inputs,
-            inputs=[video_input, test_dropdown, side_dropdown, pose_model_dropdown, overlay_layers],
-            outputs=[score_html, pipeline_md, score_details, alerts_md, overlay_video, velocity_md],
-        )
-```
-- [ ] **Step 6: Smoke-test the app builds**
-```bash
-python3 -c "from app import build_app; build_app(); print('ok')"
-```
-Expected: `ok` (Gradio UserWarning about theme is fine, not an error)
-- [ ] **Step 7: Run full test suite to check for regressions**
-```bash
-pytest tests/ -v --tb=short 2>&1 | tail -15
-```
-Expected: all previous tests still pass (62 passing, 1 pre-existing fail in biomechanics), plus 16 new visualizer tests = 78 passing.
-- [ ] **Step 8: Commit**
-```bash
-git add app.py
-git commit -m "feat: overlay video tab + velocity summary wired in Gradio UI"
-```
----
-## Self-review
-**Spec coverage:**
-- ✅ `SimpleKalmanFilter` 4-state (Task 1)
-- ✅ `compute_joint_velocity` Kalman-filtered px/s (Task 1)
-- ✅ `_draw_skeleton` COCO bones, confidence-colored joints (Task 2)
-- ✅ `_draw_trails` fading deque-based trails (Task 3)
-- ✅ `_draw_velocity_arrows` speed-colored, direction from consecutive frames (Task 4)
-- ✅ `render_video` layer dispatch, trail history, VideoWriter (Task 5)
-- ✅ `build_velocity_summary` markdown table, keeps joints detected in ≥50% of frames (Task 5)
-- ✅ `overlay_layers` CheckboxGroup in UI (Task 6)
-- ✅ New `🎬 Overlay Video` tab with `gr.Video` + `gr.Markdown` (Task 6)
-- ✅ `process_video` wired with layers param (Task 6)
-- ✅ `vis.last_velocities` stored on instance after `render_video` (Task 5)
-- ✅ Error handling: empty layers → None, empty detections → None, exception → alerts (Task 5 + 6)
-- ✅ All 5 spec test cases covered across Tasks 1–5
-**Placeholder scan:** None found. All code blocks are complete.
-**Type consistency:**
-- `compute_joint_velocity` returns `dict[int, list[float]]` — used identically in `render_video`, `_draw_velocity_arrows`, and `build_velocity_summary`. ✓
-- `layers: set[str]` in `render_video`; converted from `list[str]` in `process_video` via set comprehension. ✓
-- `vis.last_velocities` set in `render_video`, read in `process_video`. ✓
-- `_draw_velocity_arrows(frame, kps, prev_kps, velocities, frame_idx)` — signature matches call in `render_video`. ✓

+# Pose Overlay Visualizer Implementation Plan
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+**Goal:** Add a pose overlay video output to FormScout with skeleton, motion trails, and velocity arrows, plus a per-joint velocity summary table.
+**Architecture:** A new `formscout/agents/visualizer.py` runs after `director.run()` in `process_video()`; it uses Kalman-filtered per-joint velocity and OpenCV rendering. `app.py` gains a `gr.CheckboxGroup` for layer selection, a new `gr.Video` output tab, and a `gr.Markdown` velocity summary.
+**Tech Stack:** `opencv-python`, `numpy`, `colorsys` (stdlib), `gradio`.
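The layer selection described in the architecture note reaches the renderer as UI display labels, while the renderer matches on lowercase keys. A minimal sketch of that conversion follows; the helper name `normalize_layers`, the `KNOWN_LAYERS` set, and the dropping of unknown labels are assumptions for illustration, and the labels are the ones used later in the plan's UI task.

```python
# Layer keys the renderer is assumed to understand.
KNOWN_LAYERS = {"skeleton", "trails", "velocity_arrows"}


def normalize_layers(labels):
    """Map UI labels such as "Velocity arrows" to keys such as "velocity_arrows"."""
    # `labels` may be None when no checkbox is ticked, so fall back to an empty list.
    keys = {label.strip().lower().replace(" ", "_") for label in (labels or [])}
    # Dropping unknown labels is an assumption made for this sketch.
    return keys & KNOWN_LAYERS


print(sorted(normalize_layers(["Skeleton", "Velocity arrows"])))
```

Returning a set keeps the later `"skeleton" in layers` style checks cheap and makes an empty selection easy to detect.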
+---
+## File map
+| File | Change |
+|---|---|
+| `formscout/agents/visualizer.py` | Create — Kalman filter, velocity, PoseVisualizer, summary |
+| `tests/test_visualizer.py` | Create — all visualizer tests |
+| `app.py` | Modify — overlay_layers checkbox, new tab, wiring |
+---
+## Task 1: `SimpleKalmanFilter` + `compute_joint_velocity`
+**Files:**
+- Create: `formscout/agents/visualizer.py`
+- Create: `tests/test_visualizer.py`
+- [ ] **Step 1: Write failing tests**
+Create `tests/test_visualizer.py`:
+```python
+"""Tests for PoseVisualizer — no GPU, no model downloads."""
+import numpy as np
+import pytest
+from formscout.types import IngestResult, Pose2DResult
+def _make_ingest(n=5, h=480, w=640, fps=30.0):
+    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
+    return IngestResult(frames=frames, fps=fps, duration=n/fps, n_people=1, width=w, height=h)
+def _make_pose(n=5, w=640, h=480):
+    """Synthetic Pose2DResult: 17 joints at fixed pixel positions, conf=0.9."""
+    kps_per_frame = []
+    for i in range(n):
+        frame_kps = {}
+        for j in range(17):
+            frame_kps[j] = {
+                "x": float(50 + j * 30 + i * 2),  # slight movement each frame
+                "y": float(100 + j * 20),
+                "conf": 0.9,
+            }
+        kps_per_frame.append(frame_kps)
+    return Pose2DResult(keypoints=kps_per_frame, fps=30.0, confidence=0.9, notes="")
+class TestComputeJointVelocity:
+    def test_returns_17_joints(self):
+        from formscout.agents.visualizer import compute_joint_velocity
+        pose = _make_pose(n=5)
+        result = compute_joint_velocity(pose.keypoints, fps=30.0)
+        assert len(result) == 17
+    def test_each_list_has_n_frames(self):
+        from formscout.agents.visualizer import compute_joint_velocity
+        pose = _make_pose(n=5)
+        result = compute_joint_velocity(pose.keypoints, fps=30.0)
+        for joint_idx, speeds in result.items():
+            assert len(speeds) == 5, f"joint {joint_idx} has {len(speeds)} speeds, expected 5"
+    def test_speeds_are_non_negative(self):
+        from formscout.agents.visualizer import compute_joint_velocity
+        pose = _make_pose(n=5)
+        result = compute_joint_velocity(pose.keypoints, fps=30.0)
+        for speeds in result.values():
+            assert all(s >= 0.0 for s in speeds)
+    def test_missing_keypoints_give_zero_speed(self):
+        from formscout.agents.visualizer import compute_joint_velocity
+        # All frames empty
+        empty_kps = [{} for _ in range(5)]
+        result = compute_joint_velocity(empty_kps, fps=30.0)
+        for speeds in result.values():
+            assert all(s == 0.0 for s in speeds)
+```
+- [ ] **Step 2: Run to confirm failure**
+```bash
+pytest tests/test_visualizer.py::TestComputeJointVelocity -v
+```
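+Expected: 4 `FAILED`, each with `ModuleNotFoundError: No module named 'formscout.agents.visualizer'` (the import sits inside each test body, so pytest reports test failures rather than a collection error)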
+- [ ] **Step 3: Create `formscout/agents/visualizer.py` with Kalman + velocity**
+```python
+"""
+PoseVisualizer — annotated overlay video with skeleton, trails, velocity arrows.
+Input:  IngestResult + Pose2DResult
+Output: .mp4 path (or None on failure/empty layers)
+Failure: returns None, never raises.
+"""
+from __future__ import annotations
+import colorsys
+import logging
+import math
+import tempfile
+from collections import deque
+import cv2
+import numpy as np
+logger = logging.getLogger(__name__)
+# ── COCO constants ────────────────────────────────────────────────────────────
+COCO_KEYPOINTS = [
+    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
+    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
+    "left_wrist", "right_wrist", "left_hip", "right_hip",
+    "left_knee", "right_knee", "left_ankle", "right_ankle",
+]
+COCO_SKELETON = [
+    (0, 1), (0, 2), (1, 3), (2, 4),          # face
+    (5, 6), (5, 7), (7, 9), (6, 8), (8, 10), # arms
+    (5, 11), (6, 12), (11, 12),               # torso
+    (11, 13), (13, 15), (12, 14), (14, 16),  # legs
+]
+TRAIL_LENGTH = 10
+MAX_ARROW_PX = 40
+CONF_THRESHOLD = 0.3
+# ── Kalman filter ─────────────────────────────────────────────────────────────
+class SimpleKalmanFilter:
+    """4-state Kalman filter (x, y, vx, vy) for joint tracking."""
+    def __init__(self, process_noise: float = 0.01, measurement_noise: float = 0.1):
+        self.is_initialized = False
+        self.state = np.zeros(4)
+        self.cov = np.eye(4) * 0.1
+        self.Q = np.eye(4) * process_noise
+        self.R = np.eye(2) * measurement_noise
+        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
+    def predict(self, dt: float = 1.0):
+        F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
+        self.state = F @ self.state
+        self.cov = F @ self.cov @ F.T + self.Q
+    def update(self, x: float, y: float):
+        z = np.array([x, y])
+        if not self.is_initialized:
+            self.state[:2] = z
+            self.is_initialized = True
+            return
+        S = self.H @ self.cov @ self.H.T + self.R
+        K = self.cov @ self.H.T @ np.linalg.inv(S)
+        self.state = self.state + K @ (z - self.H @ self.state)
+        self.cov = (np.eye(4) - K @ self.H) @ self.cov
+    def velocity_magnitude(self) -> float:
+        vx, vy = self.state[2], self.state[3]
+        return math.sqrt(vx * vx + vy * vy)
+    def velocity_vector(self) -> tuple[float, float]:
+        return float(self.state[2]), float(self.state[3])
+# ── Velocity computation ──────────────────────────────────────────────────────
+def compute_joint_velocity(
+    keypoints_per_frame: list[dict],
+    fps: float,
+) -> dict[int, list[float]]:
+    """
+    Compute Kalman-filtered per-joint speed (px/s) for each frame.
+    Returns dict[joint_idx, [speed_frame0, speed_frame1, ...]] for all 17 COCO joints.
+    Missing/low-confidence keypoints yield speed=0.0 for that frame.
+    """
+    dt = 1.0 / fps if fps > 0 else 1.0
+    filters: dict[int, SimpleKalmanFilter] = {j: SimpleKalmanFilter() for j in range(17)}
+    result: dict[int, list[float]] = {j: [] for j in range(17)}
+    for frame_kps in keypoints_per_frame:
+        for j in range(17):
+            kf = filters[j]
+            kp = frame_kps.get(j)
+            kf.predict(dt)
+            if kp and kp.get("conf", 0.0) >= CONF_THRESHOLD:
+                kf.update(kp["x"], kp["y"])
+                speed = kf.velocity_magnitude()
+            else:
+                speed = 0.0
+            result[j].append(speed)
+    return result
+```
+- [ ] **Step 4: Run tests**
+```bash
+pytest tests/test_visualizer.py::TestComputeJointVelocity -v
+```
+Expected: 4 PASS
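The predict/update cycle in `SimpleKalmanFilter` can be traced by hand more easily on a single axis. The following is a minimal pure-Python sketch, assuming a 2-state (position, velocity) reduction of the plan's 4-state filter; `Kalman1D` and `speeds` are hypothetical names used only for illustration, and the noise defaults simply mirror the plan's values.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Kalman1D:
    """Constant-velocity Kalman filter on one axis: state is (position, velocity)."""
    q: float = 0.01   # process noise added to each diagonal term per predict
    r: float = 0.1    # measurement noise variance
    x: float = 0.0    # position estimate
    v: float = 0.0    # velocity estimate
    p: list = field(default_factory=lambda: [[0.1, 0.0], [0.0, 0.1]])
    initialized: bool = False

    def predict(self, dt: float) -> None:
        # State transition F = [[1, dt], [0, 1]]
        self.x += self.v * dt
        (p00, p01), (p10, p11) = self.p
        # P <- F P F^T + Q, written out for the 2x2 case
        self.p = [
            [p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]

    def update(self, z: float) -> None:
        if not self.initialized:
            # First measurement seeds the position; velocity stays at zero
            self.x = z
            self.initialized = True
            return
        (p00, p01), (p10, p11) = self.p
        s = p00 + self.r            # innovation variance, with H = [1, 0]
        k0, k1 = p00 / s, p10 / s   # Kalman gain for position and velocity
        y = z - self.x              # innovation
        self.x += k0 * y
        self.v += k1 * y
        # P <- (I - K H) P
        self.p = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]


def speeds(positions, fps):
    """Return |velocity| after each frame; None marks a missed detection."""
    dt = 1.0 / fps if fps > 0 else 1.0
    kf = Kalman1D()
    out = []
    for z in positions:
        kf.predict(dt)
        if z is None:
            out.append(0.0)  # same convention as the plan: no detection, speed 0.0
            continue
        kf.update(z)
        out.append(abs(kf.v))
    return out
```

Because the first measurement only seeds the position, the first frame always reports a speed of 0.0. How quickly the velocity estimate reacts afterwards depends on the `q` and `r` settings.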
+- [ ] **Step 5: Commit**
+```bash
+git add formscout/agents/visualizer.py tests/test_visualizer.py
+git commit -m "feat: SimpleKalmanFilter + compute_joint_velocity (4 tests pass)"
+```
+---
+## Task 2: `PoseVisualizer._draw_skeleton`
+**Files:**
+- Modify: `formscout/agents/visualizer.py`
+- Modify: `tests/test_visualizer.py`
+- [ ] **Step 1: Write failing test**
+Append to `tests/test_visualizer.py`:
+```python
+class TestDrawSkeleton:
+    def test_skeleton_draws_without_error(self):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        kps = {j: {"x": float(50 + j * 30), "y": float(100 + j * 20), "conf": 0.9}
+               for j in range(17)}
+        result = vis._draw_skeleton(frame.copy(), kps)
+        assert result.shape == frame.shape
+        # Frame must be modified (not all zeros after drawing)
+        assert not np.array_equal(result, frame)
+    def test_low_confidence_keypoints_not_drawn(self):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        # All keypoints below threshold
+        kps = {j: {"x": float(50 + j * 30), "y": 100.0, "conf": 0.1} for j in range(17)}
+        result = vis._draw_skeleton(frame.copy(), kps)
+        # Nothing drawn — frame stays all zeros
+        assert np.array_equal(result, frame)
+```
+- [ ] **Step 2: Run to confirm failure**
+```bash
+pytest tests/test_visualizer.py::TestDrawSkeleton -v
+```
+Expected: FAIL with `ImportError: cannot import name 'PoseVisualizer'` (the class does not exist until Step 3)
+- [ ] **Step 3: Add `PoseVisualizer` class with `_draw_skeleton` to `visualizer.py`**
+Append after `compute_joint_velocity`:
+```python
+# ── Helpers ───────────────────────────────────────────────────────────────────
+def _conf_to_bgr(conf: float) -> tuple[int, int, int]:
+    """Map confidence 0→1 to BGR color red→green via HSV."""
+    hue = conf * 120.0 / 360.0
+    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
+    return (int(b * 255), int(g * 255), int(r * 255))
+# ── PoseVisualizer ────────────────────────────────────────────────────────────
+class PoseVisualizer:
+    """Renders skeleton, trails, and velocity arrows onto video frames."""
+    def __init__(self):
+        self.last_velocities: dict[int, list[float]] = {}
+    # ── Skeleton ──────────────────────────────────────────────────────────────
+    def _draw_skeleton(self, frame: np.ndarray, kps: dict) -> np.ndarray:
+        """Draw COCO-17 bones (white) and joints (confidence-colored) onto frame."""
+        visible = {j: kp for j, kp in kps.items() if kp.get("conf", 0.0) >= CONF_THRESHOLD}
+        # Bones
+        for j1, j2 in COCO_SKELETON:
+            if j1 in visible and j2 in visible:
+                p1 = (int(visible[j1]["x"]), int(visible[j1]["y"]))
+                p2 = (int(visible[j2]["x"]), int(visible[j2]["y"]))
+                cv2.line(frame, p1, p2, (255, 255, 255), 2)
+        # Joints
+        for j, kp in visible.items():
+            pt = (int(kp["x"]), int(kp["y"]))
+            color = _conf_to_bgr(kp["conf"])
+            cv2.circle(frame, pt, 4, color, -1)
+            cv2.circle(frame, pt, 5, (255, 255, 255), 1)
+        return frame
+```
+- [ ] **Step 4: Run tests**
+```bash
+pytest tests/test_visualizer.py::TestDrawSkeleton -v
+```
+Expected: 2 PASS
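The confidence-to-colour mapping used for joints can be checked in isolation, without OpenCV. This is a small sketch of the same HSV sweep from red to green; the clamping line is an added assumption (the plan's `_conf_to_bgr` does not clamp), and `conf_to_bgr` is a stand-in name.

```python
from __future__ import annotations

import colorsys


def conf_to_bgr(conf: float) -> tuple[int, int, int]:
    """Map confidence in [0, 1] to a BGR colour from red (0) to green (1)."""
    conf = min(1.0, max(0.0, conf))   # clamp: an assumption, not in the plan
    hue = conf * 120.0 / 360.0        # 0 degrees = red, 120 degrees = green
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (int(b * 255), int(g * 255), int(r * 255))  # OpenCV expects B, G, R
```

Clamping keeps out-of-range confidences from wrapping the hue past green, which may or may not matter depending on what the pose model emits.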
+- [ ] **Step 5: Commit**
+```bash
+git add formscout/agents/visualizer.py tests/test_visualizer.py
+git commit -m "feat: PoseVisualizer._draw_skeleton with confidence-colored joints"
+```
+---
+## Task 3: `PoseVisualizer._draw_trails`
+**Files:**
+- Modify: `formscout/agents/visualizer.py`
+- Modify: `tests/test_visualizer.py`
+- [ ] **Step 1: Write failing test**
+Append to `tests/test_visualizer.py`:
+```python
+class TestDrawTrails:
+    def test_trails_draw_without_error(self):
+        from formscout.agents.visualizer import PoseVisualizer, TRAIL_LENGTH
+        from collections import deque
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        # Build a trail history for joint 0 with 5 positions
+        trail_history = {
+            0: deque([(100 + i * 5, 200 + i * 3) for i in range(5)], maxlen=TRAIL_LENGTH)
+        }
+        result = vis._draw_trails(frame.copy(), trail_history)
+        assert result.shape == frame.shape
+        # Trail should modify at least some pixels
+        assert not np.array_equal(result, frame)
+    def test_short_trail_no_crash(self):
+        from formscout.agents.visualizer import PoseVisualizer, TRAIL_LENGTH
+        from collections import deque
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        # Only one point — no line possible
+        trail_history = {0: deque([(100, 200)], maxlen=TRAIL_LENGTH)}
+        result = vis._draw_trails(frame.copy(), trail_history)
+        # No crash, frame unchanged (single point = no segment)
+        assert np.array_equal(result, frame)
+```
+- [ ] **Step 2: Run to confirm failure**
+```bash
+pytest tests/test_visualizer.py::TestDrawTrails -v
+```
+Expected: FAIL — `AttributeError: 'PoseVisualizer' object has no attribute '_draw_trails'`
+- [ ] **Step 3: Add `_draw_trails` to `PoseVisualizer`**
+Inside the `PoseVisualizer` class, after `_draw_skeleton`:
+```python
+    # ── Trails ───────────────────────────────────────────────────────────────
+    def _draw_trails(self, frame: np.ndarray, trail_history: dict) -> np.ndarray:
+        """Draw fading motion trails for each joint."""
+        for joint_idx, trail in trail_history.items():
+            pts = list(trail)
+            if len(pts) < 2:
+                continue
+            for i in range(1, len(pts)):
+                alpha = i / len(pts)
+                brightness = int(255 * alpha)
+                color = (brightness, brightness, brightness)
+                thickness = max(1, int(3 * alpha))
+                p1 = (int(pts[i - 1][0]), int(pts[i - 1][1]))
+                p2 = (int(pts[i][0]), int(pts[i][1]))
+                cv2.line(frame, p1, p2, color, thickness)
+        return frame
+```
+- [ ] **Step 4: Run tests**
+```bash
+pytest tests/test_visualizer.py::TestDrawTrails -v
+```
+Expected: 2 PASS
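The fade schedule in `_draw_trails` is easier to reason about when separated from the drawing call. The sketch below computes the per-segment brightness and thickness with the same formulas; `trail_segments` is a hypothetical helper, not part of the plan.

```python
def trail_segments(points):
    """Return (p1, p2, brightness, thickness) per segment, oldest first."""
    n = len(points)
    segments = []
    for i in range(1, n):
        alpha = i / n                         # grows toward 1 for newer segments
        brightness = int(255 * alpha)         # grey level used for all three channels
        thickness = max(1, int(3 * alpha))    # never thinner than 1 px
        segments.append((points[i - 1], points[i], brightness, thickness))
    return segments
```

Since `alpha` tops out at `(n - 1) / n`, the newest segment never reaches full white, and a single-point trail yields no segments at all, which matches the "frame unchanged" expectation in `test_short_trail_no_crash`.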
+- [ ] **Step 5: Commit**
+```bash
+git add formscout/agents/visualizer.py tests/test_visualizer.py
+git commit -m "feat: PoseVisualizer._draw_trails with fading alpha"
+```
+---
+## Task 4: `PoseVisualizer._draw_velocity_arrows`
+**Files:**
+- Modify: `formscout/agents/visualizer.py`
+- Modify: `tests/test_visualizer.py`
+- [ ] **Step 1: Write failing test**
+Append to `tests/test_visualizer.py`:
+```python
+class TestDrawVelocityArrows:
+    def test_arrows_draw_without_error(self):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        kps = {j: {"x": float(50 + j * 30), "y": float(100 + j * 20), "conf": 0.9}
+               for j in range(17)}
+        prev_kps = {j: {"x": float(48 + j * 30), "y": float(98 + j * 20), "conf": 0.9}
+                    for j in range(17)}
+        # velocities: joint 5 moving fast
+        velocities = {j: [0.0] * 5 for j in range(17)}
+        velocities[5] = [0.0, 10.0, 50.0, 80.0, 120.0]
+        result = vis._draw_velocity_arrows(frame.copy(), kps, prev_kps, velocities, frame_idx=4)
+        assert result.shape == frame.shape
+    def test_no_prev_kps_no_crash(self):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        kps = {j: {"x": float(50 + j * 30), "y": 100.0, "conf": 0.9} for j in range(17)}
+        velocities = {j: [50.0] * 5 for j in range(17)}
+        # prev_kps is None — should skip without crash
+        result = vis._draw_velocity_arrows(frame.copy(), kps, None, velocities, frame_idx=0)
+        assert result.shape == frame.shape
+```
+- [ ] **Step 2: Run to confirm failure**
+```bash
+pytest tests/test_visualizer.py::TestDrawVelocityArrows -v
+```
+Expected: FAIL — `AttributeError: 'PoseVisualizer' object has no attribute '_draw_velocity_arrows'`
+- [ ] **Step 3: Add `_draw_velocity_arrows` to `PoseVisualizer`**
+Inside the `PoseVisualizer` class, after `_draw_trails`:
+```python
+    # ── Velocity arrows ───────────────────────────────────────────────────────
+    def _draw_velocity_arrows(
+        self,
+        frame: np.ndarray,
+        kps: dict,
+        prev_kps: dict | None,
+        velocities: dict[int, list[float]],
+        frame_idx: int,
+    ) -> np.ndarray:
+        """Draw per-joint velocity arrows scaled by speed."""
+        if prev_kps is None:
+            return frame
+        all_speeds = [velocities[j][frame_idx] for j in range(17) if frame_idx < len(velocities.get(j, []))]
+        peak = max(all_speeds) if all_speeds else 1.0
+        if peak == 0.0:
+            return frame
+        for j in range(17):
+            kp = kps.get(j)
+            pk = prev_kps.get(j)
+            if not kp or not pk:
+                continue
+            if kp.get("conf", 0.0) < CONF_THRESHOLD:
+                continue
+            speeds = velocities.get(j, [])
+            if frame_idx >= len(speeds):
+                continue
+            speed = speeds[frame_idx]
+            if speed == 0.0:
+                continue
+            dx = kp["x"] - pk["x"]
+            dy = kp["y"] - pk["y"]
+            mag = math.sqrt(dx * dx + dy * dy)
+            if mag < 1e-6:
+                continue
+            # Normalize direction, scale to arrow length
+            length = min(speed / peak * MAX_ARROW_PX, MAX_ARROW_PX)
+            nx, ny = dx / mag, dy / mag
+            start = (int(kp["x"]), int(kp["y"]))
+            end = (int(kp["x"] + nx * length), int(kp["y"] + ny * length))
+            ratio = speed / peak
+            if ratio < 0.33:
+                color = (0, 200, 0)     # green
+            elif ratio < 0.66:
+                color = (0, 140, 255)   # orange
+            else:
+                color = (0, 0, 255)     # red
+            cv2.arrowedLine(frame, start, end, color, 2, tipLength=0.35)
+        return frame
+```
+- [ ] **Step 4: Run tests**
+```bash
+pytest tests/test_visualizer.py::TestDrawVelocityArrows -v
+```
+Expected: 2 PASS
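The arrow geometry combines two independent inputs: direction from the raw frame-to-frame displacement, and length and colour from the filtered speed relative to the frame's peak. A minimal sketch of that computation for one joint follows; `arrow_for_joint` is a hypothetical name, and colours are returned as labels rather than BGR tuples to keep the example free of OpenCV.

```python
import math

MAX_ARROW_PX = 40


def arrow_for_joint(curr, prev, speed, peak):
    """Return (start, end, colour_name) for one joint, or None if no arrow is drawn."""
    dx, dy = curr[0] - prev[0], curr[1] - prev[1]
    mag = math.hypot(dx, dy)
    if peak <= 0.0 or speed <= 0.0 or mag < 1e-6:
        return None
    ratio = speed / peak                              # 0..1 relative to the fastest joint
    length = min(ratio * MAX_ARROW_PX, MAX_ARROW_PX)
    end = (int(curr[0] + dx / mag * length), int(curr[1] + dy / mag * length))
    if ratio < 0.33:
        colour = "green"
    elif ratio < 0.66:
        colour = "orange"
    else:
        colour = "red"
    return (int(curr[0]), int(curr[1])), end, colour
```

Because the ratio is taken against the per-frame peak, the fastest joint in any frame always gets a full-length red arrow, even when the whole body is moving slowly.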
+- [ ] **Step 5: Commit**
+```bash
+git add formscout/agents/visualizer.py tests/test_visualizer.py
+git commit -m "feat: PoseVisualizer._draw_velocity_arrows speed-colored"
+```
+---
+## Task 5: `render_video` + `build_velocity_summary`
+**Files:**
+- Modify: `formscout/agents/visualizer.py`
+- Modify: `tests/test_visualizer.py`
+- [ ] **Step 1: Write failing tests**
+Append to `tests/test_visualizer.py`:
+```python
+class TestRenderVideo:
+    def test_creates_mp4_file(self, tmp_path):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        ingest = _make_ingest(n=5)
+        pose = _make_pose(n=5)
+        out = str(tmp_path / "out.mp4")
+        result = vis.render_video(ingest, pose, {"skeleton"}, out)
+        assert result is not None
+        import os
+        assert os.path.exists(result)
+        assert os.path.getsize(result) > 0
+    def test_empty_layers_returns_none(self, tmp_path):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        out = str(tmp_path / "out.mp4")
+        result = vis.render_video(_make_ingest(), _make_pose(), set(), out)
+        assert result is None
+    def test_no_detections_returns_none(self, tmp_path):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        ingest = _make_ingest(n=5)
+        empty_pose = Pose2DResult(
+            keypoints=[{} for _ in range(5)], fps=30.0, confidence=0.0, notes=""
+        )
+        out = str(tmp_path / "out.mp4")
+        result = vis.render_video(ingest, empty_pose, {"skeleton"}, out)
+        assert result is None
+    def test_last_velocities_set_after_render(self, tmp_path):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        out = str(tmp_path / "out.mp4")
+        vis.render_video(_make_ingest(n=5), _make_pose(n=5), {"skeleton"}, out)
+        assert len(vis.last_velocities) == 17
+class TestBuildVelocitySummary:
+    def test_returns_markdown_table(self):
+        from formscout.agents.visualizer import build_velocity_summary, compute_joint_velocity
+        pose = _make_pose(n=10)
+        vels = compute_joint_velocity(pose.keypoints, fps=30.0)
+        result = build_velocity_summary(pose.keypoints, vels)
+        assert "|" in result
+        # At least one COCO joint name appears
+        assert any(name in result for name in ["knee", "shoulder", "hip", "ankle"])
+    def test_empty_keypoints_returns_empty_string(self):
+        from formscout.agents.visualizer import build_velocity_summary
+        empty_kps = [{} for _ in range(5)]
+        vels = {j: [0.0] * 5 for j in range(17)}
+        result = build_velocity_summary(empty_kps, vels)
+        assert result == ""
+```
+- [ ] **Step 2: Run to confirm failure**
+```bash
+pytest tests/test_visualizer.py::TestRenderVideo tests/test_visualizer.py::TestBuildVelocitySummary -v
+```
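+Expected: FAIL. `TestRenderVideo` fails with `AttributeError: 'PoseVisualizer' object has no attribute 'render_video'`; `TestBuildVelocitySummary` fails with `ImportError: cannot import name 'build_velocity_summary'`.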
+- [ ] **Step 3: Add `render_video` to `PoseVisualizer`**
+Inside the `PoseVisualizer` class, after `_draw_velocity_arrows`:
+```python
+    # ── Public ────────────────────────────────────────────────────────────────
+    def render_video(
+        self,
+        ingest,
+        pose2d,
+        layers: set[str],
+        output_path: str,
+    ) -> str | None:
+        """
+        Render annotated video. Returns output_path on success, None otherwise.
+        layers: subset of {"skeleton", "trails", "velocity_arrows"}
+        """
+        if not layers:
+            return None
+        # Require at least one detected frame
+        if not any(pose2d.keypoints):
+            return None
+        try:
+            velocities = compute_joint_velocity(pose2d.keypoints, ingest.fps)
+            self.last_velocities = velocities
+            frames = ingest.frames
+            h, w = frames[0].shape[:2]
+            fps = ingest.fps or 30.0
+            fourcc = cv2.VideoWriter_fourcc(*"mp4v")
+            writer = cv2.VideoWriter(output_path, fourcc, fps, (w, h))
+            if not writer.isOpened():
+                logger.warning("VideoWriter failed to open: %s", output_path)
+                return None
+            trail_history: dict[int, deque] = {j: deque(maxlen=TRAIL_LENGTH) for j in range(17)}
+            prev_kps: dict | None = None
+            try:
+                for frame_idx, (frame, kps) in enumerate(zip(frames, pose2d.keypoints)):
+                    out_frame = frame.copy()
+                    if "trails" in layers:
+                        # Update trail history before drawing
+                        for j, kp in kps.items():
+                            if kp.get("conf", 0.0) >= CONF_THRESHOLD:
+                                trail_history[j].append((kp["x"], kp["y"]))
+                        out_frame = self._draw_trails(out_frame, trail_history)
+                    if "skeleton" in layers:
+                        out_frame = self._draw_skeleton(out_frame, kps)
+                    if "velocity_arrows" in layers:
+                        out_frame = self._draw_velocity_arrows(
+                            out_frame, kps, prev_kps, velocities, frame_idx
+                        )
+                    writer.write(out_frame)
+                    prev_kps = kps
+            finally:
+                # Release the writer even if a frame fails mid-loop
+                writer.release()
+            return output_path
+        except Exception as e:
+            logger.warning("render_video failed: %s", e)
+            return None
+```
+- [ ] **Step 4: Add `build_velocity_summary` after the class**
+After the `PoseVisualizer` class definition, add:
+```python
+# ── Velocity summary ──────────────────────────────────────────────────────────
+def build_velocity_summary(
+    keypoints_per_frame: list[dict],
+    velocities: dict[int, list[float]],
+) -> str:
+    """Return markdown table of per-joint avg/peak velocity. Empty string if no valid joints."""
+    n_frames = len(keypoints_per_frame)
+    if n_frames == 0:
+        return ""
+    rows = []
+    for j in range(17):
+        # Count frames where this joint is detected
+        detected = sum(
+            1 for kps in keypoints_per_frame
+            if kps.get(j, {}).get("conf", 0.0) >= CONF_THRESHOLD
+        )
+        if detected < n_frames * 0.5:
+            continue  # skip joints present in <50% of frames
+        speeds = velocities.get(j, [])
+        if not speeds:
+            continue
+        avg_speed = sum(speeds) / len(speeds)
+        peak_speed = max(speeds)
+        rows.append((COCO_KEYPOINTS[j], avg_speed, peak_speed))
+    if not rows:
+        return ""
+    rows.sort(key=lambda r: r[2], reverse=True)  # sort by peak descending
+    lines = [
+        "| Joint | Avg (px/s) | Peak (px/s) |",
+        "|---|---|---|",
+    ]
+    for name, avg, peak in rows:
+        lines.append(f"| {name} | {avg:.1f} | {peak:.1f} |")
+    return "\n".join(lines)
+```
+- [ ] **Step 5: Run all visualizer tests**
+```bash
+pytest tests/test_visualizer.py -v
+```
+Expected: all tests PASS (4 + 2 + 2 + 2 + 4 + 2 = 16 total)
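One consequence of the summary's averaging is worth seeing with numbers: `build_velocity_summary` divides by every frame, including frames where a missing detection contributed a speed of 0.0. The sketch below, using a hypothetical `joint_stats` helper, compares that average with one taken over detected frames only. It is an illustration of the trade-off, not a proposed change to the plan.

```python
CONF_THRESHOLD = 0.3


def joint_stats(keypoints_per_frame, speeds, joint):
    """Return (avg_all, avg_detected, peak) for one joint's speed series."""
    detected = [
        s for kps, s in zip(keypoints_per_frame, speeds)
        if kps.get(joint, {}).get("conf", 0.0) >= CONF_THRESHOLD
    ]
    avg_all = sum(speeds) / len(speeds) if speeds else 0.0            # the plan's average
    avg_detected = sum(detected) / len(detected) if detected else 0.0  # alternative
    peak = max(speeds) if speeds else 0.0
    return avg_all, avg_detected, peak
```

With the 50% detection filter in place the two averages cannot drift arbitrarily far apart, but a joint detected in just over half the frames can still report an average close to half of its detected-frame value.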
+- [ ] **Step 6: Commit**
+```bash
+git add formscout/agents/visualizer.py tests/test_visualizer.py
+git commit -m "feat: PoseVisualizer.render_video + build_velocity_summary (16 tests pass)"
+```
+---
+## Task 6: Wire `app.py`
+**Files:**
+- Modify: `app.py`
+- [ ] **Step 1: Ensure `import tempfile` is present in `app.py`**
+Check the import block at the top of `app.py`. If `import tempfile` is missing, add it alongside the other stdlib imports. The visualizer itself needs no top-level import: Step 2 imports it lazily inside `process_video`.
+- [ ] **Step 2: Update `process_video()` signature and body**
+Replace the existing `process_video` function (lines 46–83) with:
+```python
+def process_video(video_path: str, test_name: str, side: str, model_key: str, layers: list[str]):
+    """Process an uploaded video through the FormScout pipeline."""
+    if not video_path:
+        return (
+            _render_empty_state(),
+            "Upload a video to begin analysis.",
+            "",
+            "",
+            None,
+            "",
+        )
+    director = Director()
+    state = director.run(video_path, test_name=test_name, side=side, model_key=model_key)
+    # ─── Score card ───
+    score_html = _render_empty_state()
+    score_details = ""
+    if state.features:
+        result = score_test(state.features)
+        judge = state.judge
+        if judge and judge.score is not None:
+            score_html = _render_score_card(judge.score, judge.confidence, judge.needs_human)
+            score_details = _render_score_details_judge(judge, result, state.features)
+        elif judge and judge.needs_human:
+            score_html = _render_score_card(0, 0, True)
+            score_details = f"### Needs Clinician Review\n{judge.rationale}"
+        else:
+            score_html = _render_score_card(result.score, result.confidence, result.needs_human)
+            score_details = _render_score_details(result, state.features)
+    # ─── Pipeline info ───
+    pipeline_md = _render_pipeline_status(state)
+    # ─── Warnings/errors ───
+    alerts = _render_alerts(state)
+    # ─── Overlay video ───
+    overlay_path = None
+    vel_summary = ""
+    layer_set = {lbl.lower().replace(" ", "_") for lbl in (layers or [])}
+    if layer_set and state.ingest and state.pose2d:
+        try:
+            from formscout.agents.visualizer import PoseVisualizer, build_velocity_summary
+            vis = PoseVisualizer()
+            with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as f:
+                out_path = f.name
+            overlay_path = vis.render_video(state.ingest, state.pose2d, layer_set, out_path)
+            if overlay_path:
+                vel_summary = build_velocity_summary(state.pose2d.keypoints, vis.last_velocities)
+        except Exception as e:
+            alerts = (alerts or "") + f"\n⚠️ Visualizer error: {e}"
+    return score_html, pipeline_md, score_details, alerts, overlay_path, vel_summary
+```
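The set comprehension in `process_video` is what bridges the UI labels ("Velocity arrows") and the layer keys that `render_video` checks (`"velocity_arrows"`). A slightly more defensive variant is sketched below; `normalize_layers` and `KNOWN_LAYERS` are hypothetical names, and dropping unknown labels is an added assumption rather than behaviour the plan specifies.

```python
KNOWN_LAYERS = {"skeleton", "trails", "velocity_arrows"}


def normalize_layers(labels):
    """Convert UI labels such as "Velocity arrows" to layer keys; drop unknown ones."""
    keys = {label.strip().lower().replace(" ", "_") for label in (labels or [])}
    return keys & KNOWN_LAYERS
```

Filtering against the known set means a renamed checkbox label produces an empty set, so `render_video` returns `None` instead of writing a video with no overlays drawn.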
+- [ ] **Step 3: Add `overlay_layers` CheckboxGroup in `build_app()`**
+After the `pose_model_dropdown` block (around line 270), and before `submit_btn`:
+```python
+                overlay_layers = gr.CheckboxGroup(
+                    choices=["Skeleton", "Trails", "Velocity arrows"],
+                    value=["Skeleton", "Trails"],
+                    label="Overlay Layers",
+                )
+```
+- [ ] **Step 4: Add overlay tab in the results panel**
+Inside the `with gr.Tabs():` block (after the `⚠️ Alerts` tab):
+```python
+                    with gr.TabItem("🎬 Overlay Video"):
+                        overlay_video = gr.Video(label="Annotated Movement")
+                        velocity_md = gr.Markdown("")
+```
+- [ ] **Step 5: Update `_map_inputs` and `submit_btn.click`**
+Replace the `_map_inputs` closure and `submit_btn.click` call:
+```python
+        def _map_inputs(video, test_display_name, side_display, pose_model_key, overlay_layers):
+            """Map UI display values to internal values."""
+            test_map = {name: val for name, val in FMS_TESTS}
+            test_name = test_map.get(test_display_name, "deep_squat")
+            side = {"N/A": "na", "Left": "left", "Right": "right"}.get(side_display, "na")
+            return process_video(video, test_name, side, pose_model_key, overlay_layers)
+        submit_btn.click(
+            fn=_map_inputs,
+            inputs=[video_input, test_dropdown, side_dropdown, pose_model_dropdown, overlay_layers],
+            outputs=[score_html, pipeline_md, score_details, alerts_md, overlay_video, velocity_md],
+        )
+```
+- [ ] **Step 6: Smoke-test the app builds**
+```bash
+python3 -c "from app import build_app; build_app(); print('ok')"
+```
+Expected: `ok` (Gradio UserWarning about theme is fine, not an error)
+- [ ] **Step 7: Run full test suite to check for regressions**
+```bash
+pytest tests/ -v --tb=short 2>&1 | tail -15
+```
+Expected: all previous tests still pass (62 passing, 1 pre-existing fail in biomechanics), plus 16 new visualizer tests = 78 passing.
+- [ ] **Step 8: Commit**
+```bash
+git add app.py
+git commit -m "feat: overlay video tab + velocity summary wired in Gradio UI"
+```
+---
+## Self-review
+**Spec coverage:**
+- ✅ `SimpleKalmanFilter` 4-state (Task 1)
+- ✅ `compute_joint_velocity` Kalman-filtered px/s (Task 1)
+- ✅ `_draw_skeleton` COCO bones, confidence-colored joints (Task 2)
+- ✅ `_draw_trails` fading deque-based trails (Task 3)
+- ✅ `_draw_velocity_arrows` speed-colored, direction from consecutive frames (Task 4)
+- ✅ `render_video` layer dispatch, trail history, VideoWriter (Task 5)
+- ✅ `build_velocity_summary` markdown table, >50% detection filter (Task 5)
+- ✅ `overlay_layers` CheckboxGroup in UI (Task 6)
+- ✅ New `🎬 Overlay Video` tab with `gr.Video` + `gr.Markdown` (Task 6)
+- ✅ `process_video` wired with layers param (Task 6)
+- ✅ `vis.last_velocities` stored on instance after `render_video` (Task 5)
+- ✅ Error handling: empty layers → None, empty detections → None, exception → alerts (Task 5 + 6)
+- ✅ All 5 spec test cases covered across Tasks 1–5
+**Placeholder scan:** None found. All code blocks are complete.
+**Type consistency:**
+- `compute_joint_velocity` returns `dict[int, list[float]]` — used identically in `render_video`, `_draw_velocity_arrows`, and `build_velocity_summary`. ✓
+- `layers: set[str]` in `render_video`; converted from `list[str]` in `process_video` via set comprehension. ✓
+- `vis.last_velocities` set in `render_video`, read in `process_video`. ✓
+- `_draw_velocity_arrows(frame, kps, prev_kps, velocities, frame_idx)` — signature matches call in `render_video`. ✓

docs/superpowers/plans/2026-06-13-full-fms-session-pdf.md CHANGED

@@ -1,1209 +1,1209 @@
-# Full FMS Session + PDF Report — Implementation Plan
-> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
-**Goal:** Turn FormScout's one-clip scorer into a screening session that accumulates analyzed clips into a composite 0–21 report and exports a branded PDF with annotated worst-moment key-frame stills.
-**Architecture:** A new `formscout/session.py` accumulates typed `SessionEntry` objects (one per analyzed clip), persisting each to a temp session dir. `PoseVisualizer.render_frame()` captures the governing frame (already computed by `BiomechanicsAgent` and stored in `features.timing`) as an annotated PNG. On "Finish", the existing `ReportAgent` computes composite + asymmetries, and a new `PdfReportAgent` renders a ReportLab PDF. The UI (`app.py`) gains `gr.State` session accumulation with "Analyse new clip" / "Finish & generate PDF" buttons.
-**Tech Stack:** Python 3.13, ReportLab (new dep), OpenCV (existing), Gradio 5, pytest. No model downloads in tests.
----
-## File Structure
-- `requirements.txt` — add `reportlab`.
-- `formscout/types.py` — add `SessionEntry` frozen dataclass.
-- `formscout/agents/biomechanics.py` — add `max_sag_frame` to `trunk_stability_pushup` timing (rotary already has `peak_extension_frame`).
-- `formscout/agents/visualizer.py` — add `PoseVisualizer.render_frame()`.
-- `formscout/session.py` — **new**: session accumulator (new/add/finish + persistence + key-frame helpers).
-- `formscout/agents/pdf_report.py` — **new**: `PdfReportAgent` (ReportLab).
-- `app.py` — wire `gr.State`, two buttons, "Session so far" table, finish handler.
-- `tests/test_session.py`, `tests/test_keyframe.py`, `tests/test_pdf_report.py` — **new**.
----
-## Task 1: Add ReportLab dependency
-**Files:**
-- Modify: `requirements.txt`
-- [ ] **Step 1: Add the dependency**
-Add this line to `requirements.txt` (after `pillow>=10.3`):
-```
-reportlab>=4.0
-```
-- [ ] **Step 2: Install it**
-Run: `pip install 'reportlab>=4.0'`
-Expected: `Successfully installed reportlab-4.x.x`
-- [ ] **Step 3: Verify import**
-Run: `python3 -c "import reportlab; print(reportlab.Version)"`
-Expected: prints a version like `4.x.x`
-- [ ] **Step 4: Commit**
-```bash
-git add requirements.txt
-git commit -m "build: add reportlab for PDF report generation"
-```
----
-## Task 2: Add `SessionEntry` dataclass
-**Files:**
-- Modify: `formscout/types.py` (after `ReportResult`, before `PipelineState`)
-- Test: `tests/test_session.py`
-- [ ] **Step 1: Write the failing test**
-Create `tests/test_session.py` with:
-```python
-"""Tests for the FMS session accumulator — no GPU, no model downloads."""
-import numpy as np
-from formscout.types import (
-    IngestResult, Pose2DResult, BiomechFeatures, ScoreResult, JudgeResult,
-    MovementResult, SessionEntry,
-)
-def test_session_entry_holds_typed_objects():
-    movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
-    features = BiomechFeatures(
-        test_name="deep_squat", view="2d", side="na",
-        angles={"left_knee_flexion_deg": 95.0}, alignments={"knees_tracking_over_feet": True},
-        symmetry_delta=None, timing={"deepest_frame": 2}, confidence=0.9,
-    )
-    rubric = ScoreResult(score=2, rationale="ok", confidence=0.8)
-    judge = JudgeResult(score=2, rationale="ok", compensation_tags=["heels elevated"],
-                        corrective_hint="ankle mobility", confidence=0.85)
-    entry = SessionEntry(
-        test_name="deep_squat", side="na", score=2, needs_human=False,
-        rationale="ok", compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
-        measurements={"left_knee_flexion_deg": 95.0}, confidence=0.85, view="2d",
-        keyframe_path=None, movement=movement, features=features,
-        rubric_score=rubric, judge=judge,
-    )
-    assert entry.score == 2
-    assert entry.movement.test_name == "deep_squat"
-    assert entry.rubric_score.score == 2
-    assert entry.judge.compensation_tags == ["heels elevated"]
-```
-- [ ] **Step 2: Run test to verify it fails**
-Run: `pytest tests/test_session.py::test_session_entry_holds_typed_objects -v`
-Expected: FAIL with `ImportError: cannot import name 'SessionEntry'`
-- [ ] **Step 3: Add the dataclass**
-In `formscout/types.py`, insert after the `ReportResult` class (line ~142) and before `PipelineState`:
-```python
-@dataclass(frozen=True)
-class SessionEntry:
-    """One accumulated analysis in a screening session.
-    Display fields (test_name…keyframe_path) feed the PDF/JSON/MD artifacts;
-    the trailing typed objects (movement…judge) feed ReportAgent.run().
-    """
-    test_name: str
-    side: str
-    score: int | None
-    needs_human: bool
-    rationale: str
-    compensation_tags: list
-    corrective_hint: str
-    measurements: dict
-    confidence: float
-    view: str
-    keyframe_path: str | None
-    movement: MovementResult
-    features: BiomechFeatures
-    rubric_score: ScoreResult
-    judge: JudgeResult | None
-```
-- [ ] **Step 4: Run test to verify it passes**
-Run: `pytest tests/test_session.py::test_session_entry_holds_typed_objects -v`
-Expected: PASS
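One property of `@dataclass(frozen=True)` is relevant to `SessionEntry`: freezing blocks rebinding of fields, but it does not make `list` or `dict` field values immutable. The sketch below demonstrates this with a hypothetical two-field stand-in, `Entry`, rather than the full dataclass.

```python
from __future__ import annotations

from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical two-field stand-in for SessionEntry."""
    score: int | None
    compensation_tags: list


entry = Entry(score=2, compensation_tags=["heels elevated"])

try:
    entry.score = 3            # rebinding a field on a frozen dataclass is rejected
    rebind_blocked = False
except FrozenInstanceError:
    rebind_blocked = True

# The list object itself is still mutable, frozen or not
entry.compensation_tags.append("knee valgus")
```

If entries need to be fully immutable once accumulated, storing tags as a `tuple` would be one option; whether that is worthwhile here is a design choice the plan leaves open.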
-- [ ] **Step 5: Commit**
-```bash
-git add formscout/types.py tests/test_session.py
-git commit -m "feat: add SessionEntry typed contract for screening sessions"
-```
----
-## Task 3: Add governing-frame index to push-up biomechanics
-**Files:**
-- Modify: `formscout/agents/biomechanics.py:468-529` (`_trunk_stability_pushup`)
-- Test: `tests/test_biomechanics.py` (append a test)
-The other six tests already store a governing frame index in `features.timing`
-(`deepest_frame`, `peak_step_frame`, `deepest_lunge_frame`, `measure_frame`,
-`peak_raise_frame`, `peak_extension_frame`). Only `trunk_stability_pushup` is missing one.
-- [ ] **Step 1: Write the failing test**
-Append to `tests/test_biomechanics.py`:
-```python
-def test_pushup_timing_has_max_sag_frame():
-    from formscout.agents.biomechanics import BiomechanicsAgent
-    from formscout.types import Pose2DResult, Body3DResult, MovementResult
-    # 4 frames; frame 2 has the largest hip sag (hip far below shoulder/ankle midline)
-    def kps(hip_y):
-        base = {
-            5: {"x": 200, "y": 200, "conf": 0.9},   # L shoulder
-            6: {"x": 220, "y": 200, "conf": 0.9},   # R shoulder
-            11: {"x": 300, "y": hip_y, "conf": 0.9}, # L hip
-            12: {"x": 320, "y": hip_y, "conf": 0.9}, # R hip
-            15: {"x": 400, "y": 200, "conf": 0.9},   # L ankle
-            16: {"x": 420, "y": 200, "conf": 0.9},   # R ankle
-        }
-        return base
-    frames = [kps(200), kps(210), kps(260), kps(205)]
-    pose = Pose2DResult(keypoints=frames, fps=30.0, confidence=0.9)
-    body3d = Body3DResult(used=False, joints_3d=[])
-    movement = MovementResult(test_name="trunk_stability_pushup", side="na", confidence=1.0)
-    feats = BiomechanicsAgent().run(pose, body3d, movement)
-    assert "max_sag_frame" in feats.timing
-    assert feats.timing["max_sag_frame"] == 2
-```
-- [ ] **Step 2: Run test to verify it fails**
-Run: `pytest tests/test_biomechanics.py::test_pushup_timing_has_max_sag_frame -v`
-Expected: FAIL with `assert 'max_sag_frame' in {...}` (an `AssertionError`, because `timing` does not yet contain the key)
-- [ ] **Step 3: Track the max-sag frame index**
-In `formscout/agents/biomechanics.py`, replace the body of `_trunk_stability_pushup` from the
-`trunk_angles_over_time = []` loop through the `if trunk_angles_over_time:` block. Replace:
-```python
-        # Analyze multiple frames to detect sag/lag
-        trunk_angles_over_time = []
-        for i, kps in enumerate(pose2d.keypoints):
-```
-…down to and including the `alignments["no_sag"] = max_sag < 30` line, with:
-```python
-        # Analyze multiple frames to detect sag/lag
-        trunk_sags: list[tuple[int, float]] = []  # (frame_idx, sag_px)
-        for i, kps in enumerate(pose2d.keypoints):
-            l_sh = _get_joint(kps, L_SHOULDER)
-            r_sh = _get_joint(kps, R_SHOULDER)
-            l_hip = _get_joint(kps, L_HIP)
-            r_hip = _get_joint(kps, R_HIP)
-            l_ankle = _get_joint(kps, L_ANKLE)
-            r_ankle = _get_joint(kps, R_ANKLE)
-            if l_sh and r_sh and l_hip and r_hip and l_ankle and r_ankle:
-                sh_y = (l_sh[1] + r_sh[1]) / 2
-                hip_y = (l_hip[1] + r_hip[1]) / 2
-                ankle_y = (l_ankle[1] + r_ankle[1]) / 2
-                expected_hip_y = (sh_y + ankle_y) / 2
-                sag_px = hip_y - expected_hip_y
-                trunk_sags.append((i, sag_px))
-        max_sag_frame = 0
-        if trunk_sags:
-            sags = [s for _, s in trunk_sags]
-            max_sag_frame = max(trunk_sags, key=lambda t: t[1])[0]
-            mean = sum(sags) / len(sags)
-            # Note: despite the name, this is the population standard deviation (sqrt of variance)
-            variance = (sum((x - mean) ** 2 for x in sags) / len(sags)) ** 0.5
-            max_sag = max(sags)
-            angles["max_sag_px"] = max_sag
-            angles["trunk_variance_px"] = variance
-            alignments["body_rigid"] = max_sag < 30 and variance < 15
-            alignments["no_sag"] = max_sag < 30
-        else:
-            notes_parts.append("insufficient landmarks for trunk analysis")
-```
-Then update the `return BiomechFeatures(...)` `timing=` argument at the end of the method from:
-```python
-            timing={"n_frames_analyzed": len(trunk_angles_over_time)},
-```
-to:
-```python
-            timing={"n_frames_analyzed": len(trunk_sags), "max_sag_frame": max_sag_frame},
-```
-- [ ] **Step 4: Run test to verify it passes**
-Run: `pytest tests/test_biomechanics.py::test_pushup_timing_has_max_sag_frame -v`
-Expected: PASS
-- [ ] **Step 5: Run the full biomechanics suite (no regressions)**
-Run: `pytest tests/test_biomechanics.py -v`
-Expected: all previously-passing tests still pass (the pre-existing `test_unimplemented_test_returns_low_confidence` known-failure may remain failing — that is unrelated and documented in CLAUDE.md).
-- [ ] **Step 6: Commit**
-```bash
-git add formscout/agents/biomechanics.py tests/test_biomechanics.py
-git commit -m "feat: track max-sag frame index in push-up biomechanics for key-frame capture"
-```
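The sag computation in Task 3 can be sketched in isolation. This is a minimal version under simplifying assumptions: each frame is reduced to three averaged y-coordinates (the `shoulder_y`, `hip_y`, `ankle_y` keys and the `max_sag_frame` helper name are hypothetical), and image y grows downward, so a positive value means the hips sit below the shoulder-ankle midline.

```python
import math


def max_sag_frame(frames):
    """Return (frame_idx, max_sag_px, spread_px), or None when there are no frames.

    Each frame is a dict with "shoulder_y", "hip_y" and "ankle_y" in pixels,
    a simplified stand-in for the per-joint keypoint dicts used in the plan.
    """
    sags = []
    for i, f in enumerate(frames):
        # The hip is expected halfway between shoulder and ankle height.
        expected_hip_y = (f["shoulder_y"] + f["ankle_y"]) / 2
        sags.append((i, f["hip_y"] - expected_hip_y))
    if not sags:
        return None
    values = [s for _, s in sags]
    mean = sum(values) / len(values)
    # Population standard deviation of the per-frame sag.
    spread = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    idx, worst = max(sags, key=lambda t: t[1])
    return idx, worst, spread


# Same hip heights as the failing test: frame 2 sags the most.
frames = [{"shoulder_y": 200, "hip_y": h, "ankle_y": 200} for h in (200, 210, 260, 205)]
result = max_sag_frame(frames)
```

Keeping the frame index paired with each sag value is what lets the argmax return a position in the original clip, even when some frames are skipped for missing landmarks.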
----
-## Task 4: Add `PoseVisualizer.render_frame()`
-**Files:**
-- Modify: `formscout/agents/visualizer.py` (add method to `PoseVisualizer`, after `render_video`)
-- Test: `tests/test_keyframe.py`
-- [ ] **Step 1: Write the failing test**
-Create `tests/test_keyframe.py`:
-```python
-"""Tests for PoseVisualizer.render_frame — single annotated still."""
-import os
-import numpy as np
-from formscout.types import IngestResult, Pose2DResult
-def _ingest(n=5, h=480, w=640):
-    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
-    return IngestResult(frames=frames, fps=30.0, duration=n / 30.0, n_people=1, width=w, height=h)
-def _pose(n=5):
-    kps = []
-    for i in range(n):
-        kps.append({j: {"x": float(50 + j * 25), "y": float(80 + j * 18), "conf": 0.9}
-                    for j in range(17)})
-    return Pose2DResult(keypoints=kps, fps=30.0, confidence=0.9)
-def test_render_frame_writes_png(tmp_path):
-    from formscout.agents.visualizer import PoseVisualizer
-    out = str(tmp_path / "key.png")
-    path = PoseVisualizer().render_frame(_ingest(), _pose(), frame_idx=2,
-                                         layers={"skeleton"}, caption="Deep Squat — heels elevated",
-                                         out_png=out)
-    assert path == out
-    assert os.path.exists(out)
-    assert os.path.getsize(out) > 0
-def test_render_frame_bad_index_returns_none(tmp_path):
-    from formscout.agents.visualizer import PoseVisualizer
-    out = str(tmp_path / "key.png")
-    path = PoseVisualizer().render_frame(_ingest(n=3), _pose(n=3), frame_idx=99,
-                                         layers={"skeleton"}, caption="", out_png=out)
-    assert path is None
-```
-- [ ] **Step 2: Run test to verify it fails**
-Run: `pytest tests/test_keyframe.py -v`
-Expected: FAIL with `AttributeError: 'PoseVisualizer' object has no attribute 'render_frame'`
-- [ ] **Step 3: Add the method**
-In `formscout/agents/visualizer.py`, inside the `PoseVisualizer` class, add this method
-immediately after `render_video` (before the closing of the class / the module-level
-`build_velocity_summary`):
-```python
-    def render_frame(
-        self,
-        ingest,
-        pose2d,
-        frame_idx: int,
-        layers: set[str],
-        caption: str = "",
-        out_png: str | None = None,
-    ) -> str | None:
-        """Render a single annotated still (skeleton + optional trails + caption).
-        frame_idx is typically the governing frame from BiomechFeatures.timing.
-        Returns the PNG path on success, None on any failure. Never raises.
-        """
-        try:
-            if not (0 <= frame_idx < len(ingest.frames)) or frame_idx >= len(pose2d.keypoints):
-                return None
-            frame = ingest.frames[frame_idx].copy()
-            kps = pose2d.keypoints[frame_idx]
-            if "trails" in layers:
-                trail: dict[int, deque] = {j: deque(maxlen=TRAIL_LENGTH) for j in range(17)}
-                start = max(0, frame_idx - TRAIL_LENGTH)
-                for fi in range(start, frame_idx + 1):
-                    for j, kp in pose2d.keypoints[fi].items():
-                        if kp.get("conf", 0.0) >= CONF_THRESHOLD:
-                            trail[j].append((kp["x"], kp["y"]))
-                frame = self._draw_trails(frame, trail)
-            if "skeleton" in layers:
-                frame = self._draw_skeleton(frame, kps)
-            if caption:
-                cv2.rectangle(frame, (0, 0), (frame.shape[1], 28), (0, 0, 0), -1)
-                cv2.putText(frame, caption[:80], (8, 20), cv2.FONT_HERSHEY_SIMPLEX,
-                            0.55, (255, 255, 255), 1, cv2.LINE_AA)
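-            if out_png is None:
-                # Close the temp-file handle right away; only its path is needed by cv2.imwrite
-                with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
-                    out_png = tmp.name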
-            ok = cv2.imwrite(out_png, frame)
-            return out_png if ok else None
-        except Exception as e:
-            logger.warning("render_frame failed: %s", e)
-            return None
-```
-(`deque`, `cv2`, `tempfile`, `logger`, `TRAIL_LENGTH`, `CONF_THRESHOLD` are all already imported at the top of this file.)
-- [ ] **Step 4: Run test to verify it passes**
-Run: `pytest tests/test_keyframe.py -v`
-Expected: both tests PASS
-- [ ] **Step 5: Commit**
-```bash
-git add formscout/agents/visualizer.py tests/test_keyframe.py
-git commit -m "feat: add PoseVisualizer.render_frame for annotated key-frame stills"
-```
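The trail window used by `render_frame` can be looked at without OpenCV. The sketch below assumes illustrative values for `TRAIL_LENGTH` and `CONF_THRESHOLD` (the real constants live in `visualizer.py` and may differ) and a hypothetical `build_trail` helper. It collects recent joint positions ending at the chosen frame and skips low-confidence detections.

```python
from collections import deque

TRAIL_LENGTH = 3       # assumed value, for illustration only
CONF_THRESHOLD = 0.3   # assumed value, for illustration only


def build_trail(keypoints, frame_idx, n_joints=17):
    """Collect recent (x, y) positions per joint, ending at frame_idx."""
    trail = {j: deque(maxlen=TRAIL_LENGTH) for j in range(n_joints)}
    start = max(0, frame_idx - TRAIL_LENGTH)
    for fi in range(start, frame_idx + 1):
        for j, kp in keypoints[fi].items():
            # Only confident detections contribute to the trail.
            if kp.get("conf", 0.0) >= CONF_THRESHOLD:
                trail[j].append((kp["x"], kp["y"]))
    return trail


# Six frames, one joint moving right; frame 3 is low-confidence and gets skipped.
keypoints = [{0: {"x": float(i), "y": 0.0, "conf": 0.1 if i == 3 else 0.9}} for i in range(6)]
trail = build_trail(keypoints, frame_idx=5, n_joints=1)
early = build_trail(keypoints, frame_idx=1, n_joints=1)
```

The `deque(maxlen=...)` caps the trail length automatically, and `max(0, ...)` keeps the window valid for frames near the start of the clip.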
----
-## Task 5: Create the session accumulator
-**Files:**
-- Create: `formscout/session.py`
-- Test: `tests/test_session.py` (append tests)
-- [ ] **Step 1: Write the failing tests**
-Append to `tests/test_session.py`:
-```python
-def _ingest(n=5, h=480, w=640):
-    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
-    return IngestResult(frames=frames, fps=30.0, duration=n / 30.0, n_people=1, width=w, height=h)
-def _pose(n=5):
-    kps = []
-    for i in range(n):
-        kps.append({j: {"x": float(50 + j * 25), "y": float(80 + j * 18), "conf": 0.9}
-                    for j in range(17)})
-    return Pose2DResult(keypoints=kps, fps=30.0, confidence=0.9)
-def _features(test_name="deep_squat", side="na", frame_key="deepest_frame"):
-    return BiomechFeatures(
-        test_name=test_name, view="2d", side=side,
-        angles={"left_knee_flexion_deg": 95.0},
-        alignments={"knees_tracking_over_feet": False},
-        symmetry_delta=None, timing={frame_key: 2}, confidence=0.9,
-    )
-def _judge(score=2, needs_human=False):
-    return JudgeResult(
-        score=None if needs_human else score, rationale="r",
-        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
-        confidence=0.85, needs_human=needs_human,
-    )
-def test_add_analysis_appends_entry_and_writes_files():
-    import os
-    from formscout import session as S
-    sess = S.new_session()
-    entry = S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(),
-                           features=_features(), judge=_judge(), test_name="deep_squat", side="na")
-    assert len(sess.entries) == 1
-    assert entry.score == 2
-    assert os.path.exists(os.path.join(sess.session_dir, "session.json"))
-    assert os.path.exists(os.path.join(sess.session_dir, "analysis.md"))
-    # key-frame still written (deepest_frame=2 is valid)
-    assert entry.keyframe_path and os.path.exists(entry.keyframe_path)
-def test_finish_composite_null_when_needs_human():
-    from formscout import session as S
-    sess = S.new_session()
-    S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(), features=_features(),
-                   judge=_judge(score=3), test_name="deep_squat", side="na")
-    S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(),
-                   features=_features("trunk_stability_pushup", frame_key="max_sag_frame"),
-                   judge=_judge(needs_human=True), test_name="trunk_stability_pushup", side="na")
-    report, pdf_path = S.finish_session(sess)
-    assert report is not None
-    assert report.composite is None  # one test needs_human
-def test_finish_empty_session_returns_none():
-    from formscout import session as S
-    sess = S.new_session()
-    report, pdf_path = S.finish_session(sess)
-    assert report is None and pdf_path is None
-```
-- [ ] **Step 2: Run tests to verify they fail**
-Run: `pytest tests/test_session.py -v`
-Expected: the three new tests FAIL with `ModuleNotFoundError: No module named 'formscout.session'`
-- [ ] **Step 3: Create the module**
-Create `formscout/session.py`:
-```python
-"""
-Screening-session accumulator.
-Accumulates one SessionEntry per analyzed clip, persists each to a temp session
-dir (session.json + analysis.md + key-frame PNGs), and on finish builds a
-ReportResult (via ReportAgent) + a PDF (via PdfReportAgent).
-Pure orchestration — no Gradio imports. Disk writes tolerate failure with a
-logged warning and never block scoring.
-"""
-from __future__ import annotations
-import json
-import logging
-import os
-import tempfile
-import uuid
-from dataclasses import dataclass, replace
-from formscout.rubric import score_test
-from formscout.types import MovementResult, ReportResult, SessionEntry
-logger = logging.getLogger(__name__)
-# Maps each test to the BiomechFeatures.timing key holding its governing frame.
-TIMING_KEY = {
-    "deep_squat": "deepest_frame",
-    "hurdle_step": "peak_step_frame",
-    "inline_lunge": "deepest_lunge_frame",
-    "shoulder_mobility": "measure_frame",
-    "active_slr": "peak_raise_frame",
-    "trunk_stability_pushup": "max_sag_frame",
-    "rotary_stability": "peak_extension_frame",
-}
-@dataclass
-class Session:
-    """Mutable session: an id, its temp dir, and accumulated entries."""
-    session_id: str
-    session_dir: str
-    entries: list  # list[SessionEntry]
-def new_session() -> Session:
-    sid = uuid.uuid4().hex[:12]
-    base = os.path.join(tempfile.gettempdir(), "formscout_sessions", sid)
-    try:
-        os.makedirs(os.path.join(base, "keyframes"), exist_ok=True)
-    except Exception as e:
-        logger.warning("session dir create failed: %s", e)
-    return Session(session_id=sid, session_dir=base, entries=[])
-def governing_frame_index(features) -> int | None:
-    """Return the governing frame index for this test, or None."""
-    key = TIMING_KEY.get(features.test_name)
-    if key is None:
-        return None
-    idx = features.timing.get(key)
-    return int(idx) if isinstance(idx, (int, float)) else None
-def worst_compensation_caption(judge, features) -> str:
-    """Short caption naming the worst compensation for the key-frame still."""
-    if judge and getattr(judge, "compensation_tags", None):
-        return ", ".join(judge.compensation_tags)
-    failed = [k.replace("_", " ") for k, v in features.alignments.items() if v is False]
-    return ("compensation: " + ", ".join(failed)) if failed else "key position"
-def add_analysis(session, *, ingest, pose2d, features, judge, test_name, side,
-                 draw_trails: bool = False) -> SessionEntry:
-    """Build a SessionEntry from a completed analysis, render its key-frame,
-    persist the session, append, and return the entry."""
-    movement = MovementResult(test_name=test_name, side=side, confidence=1.0)
-    rubric = score_test(features)
-    needs_human = bool((judge and judge.needs_human) or rubric.needs_human)
-    if needs_human:
-        score = None
-    elif judge and judge.score is not None:
-        score = judge.score
-    else:
-        score = rubric.score
-    keyframe_path = None
-    idx = governing_frame_index(features)
-    if idx is not None and 0 <= idx < len(pose2d.keypoints):
-        from formscout.agents.visualizer import PoseVisualizer
-        caption = (f"{test_name.replace('_', ' ').title()} "
-                   f"({side}) — {worst_compensation_caption(judge, features)}")
-        layers = {"skeleton", "trails"} if draw_trails else {"skeleton"}
-        out_png = os.path.join(session.session_dir, "keyframes", f"{test_name}_{side}.png")
-        try:
-            keyframe_path = PoseVisualizer().render_frame(ingest, pose2d, idx, layers, caption, out_png)
-        except Exception as e:
-            logger.warning("keyframe render failed: %s", e)
-    measurements = {}
-    measurements.update(features.angles)
-    measurements.update(features.alignments)
-    entry = SessionEntry(
-        test_name=test_name, side=side, score=score, needs_human=needs_human,
-        rationale=(judge.rationale if judge else rubric.rationale),
-        compensation_tags=list(judge.compensation_tags) if judge else [],
-        corrective_hint=(judge.corrective_hint if judge else ""),
-        measurements=measurements,
-        confidence=(judge.confidence if judge else rubric.confidence),
-        view=features.view,
-        keyframe_path=keyframe_path,
-        movement=movement, features=features, rubric_score=rubric, judge=judge,
-    )
-    session.entries.append(entry)
-    _persist(session)
-    return entry
-def finish_session(session) -> tuple[ReportResult | None, str | None]:
-    """Build the composite report + PDF. Returns (report, pdf_path).
-    Returns (None, None) for an empty session."""
-    if not session.entries:
-        return None, None
-    from formscout.agents.report import ReportAgent
-    report_inputs = [{
-        "movement": e.movement, "features": e.features,
-        "rubric_score": e.rubric_score, "judge": e.judge, "side": e.side,
-    } for e in session.entries]
-    report = ReportAgent().run(report_inputs)
-    pdf_path = None
-    try:
-        from formscout.agents.pdf_report import PdfReportAgent
-        pdf_path = PdfReportAgent().run(report, session.entries, session.session_dir)
-    except Exception as e:
-        logger.warning("pdf generation failed: %s", e)
-    report = replace(report, pdf_path=pdf_path)
-    return report, pdf_path
-# ── Persistence ───────────────────────────────────────────────────────────────
-def _jsonable(d: dict) -> dict:
-    out = {}
-    for k, v in d.items():
-        if isinstance(v, float):
-            out[k] = round(v, 2)
-        elif isinstance(v, (int, str, bool)) or v is None:
-            out[k] = v
-        else:
-            out[k] = str(v)
-    return out
-def _entry_display(e: SessionEntry) -> dict:
-    return {
-        "test_name": e.test_name, "side": e.side, "score": e.score,
-        "needs_human": e.needs_human, "rationale": e.rationale,
-        "compensation_tags": list(e.compensation_tags), "corrective_hint": e.corrective_hint,
-        "measurements": _jsonable(e.measurements), "confidence": round(e.confidence, 2),
-        "view": e.view, "keyframe_path": e.keyframe_path,
-    }
-def _render_markdown(session: Session) -> str:
-    lines = ["# FormScout — Session Log", ""]
-    for e in session.entries:
-        title = e.test_name.replace("_", " ").title()
-        if e.side in ("left", "right"):
-            title += f" ({e.side})"
-        score = "Clinician review required" if e.needs_human else f"{e.score}/3"
-        lines.append(f"## {title} — {score}")
-        lines.append(e.rationale or "")
-        if e.compensation_tags:
-            lines.append(f"- Compensations: {', '.join(e.compensation_tags)}")
-        if e.corrective_hint:
-            lines.append(f"- Corrective: {e.corrective_hint}")
-        if e.keyframe_path:
-            lines.append(f"- Key frame: `{e.keyframe_path}`")
-        lines.append("")
-    return "\n".join(lines)
-def _persist(session: Session) -> None:
-    try:
-        with open(os.path.join(session.session_dir, "session.json"), "w") as f:
-            json.dump([_entry_display(e) for e in session.entries], f, indent=2)
-        with open(os.path.join(session.session_dir, "analysis.md"), "w") as f:
-            f.write(_render_markdown(session))
-    except Exception as e:
-        logger.warning("session persist failed: %s", e)
-```
-- [ ] **Step 4: Run tests to verify they pass**
-Run: `pytest tests/test_session.py -v`
-Expected: all session tests PASS (Task 6 provides `PdfReportAgent`; `finish_session` tolerates its
-absence via the try/except, so these pass now — `pdf_path` may be `None` until Task 6).
-- [ ] **Step 5: Commit**
-```bash
-git add formscout/session.py tests/test_session.py
-git commit -m "feat: add screening-session accumulator with key-frame capture and persistence"
-```
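The score precedence inside `add_analysis` is easy to misread, so here is a small sketch of it on its own. `Verdict` and `resolve_score` are hypothetical names standing in for `JudgeResult`/`ScoreResult` and the inline logic. The assumed order is the one the plan's code follows: a review flag from either source wins, then the judge's score, then the rubric's.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Verdict:
    """Hypothetical stand-in for JudgeResult / ScoreResult (score plus review flag)."""
    score: int | None
    needs_human: bool = False


def resolve_score(judge: Verdict | None, rubric: Verdict) -> tuple[int | None, bool]:
    """Return (score, needs_human) using the precedence described above."""
    needs_human = bool((judge and judge.needs_human) or rubric.needs_human)
    if needs_human:
        return None, True          # never report a number when review is required
    if judge and judge.score is not None:
        return judge.score, False  # the judge's score takes priority
    return rubric.score, False     # fall back to the deterministic rubric


judge_wins = resolve_score(Verdict(3), Verdict(2))
rubric_fallback = resolve_score(None, Verdict(2))
flagged = resolve_score(Verdict(3), Verdict(2, needs_human=True))
```

Returning `None` rather than a placeholder number for flagged entries is what allows the composite to come out as `None` when any test needs clinician review.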
----
-## Task 6: Create `PdfReportAgent`
-**Files:**
-- Create: `formscout/agents/pdf_report.py`
-- Test: `tests/test_pdf_report.py`
-- [ ] **Step 1: Write the failing test**
-Create `tests/test_pdf_report.py`:
-```python
-"""Tests for PdfReportAgent — no GPU, no model downloads."""
-import os
-from formscout.types import (
-    ReportResult, SessionEntry, MovementResult, BiomechFeatures, ScoreResult, JudgeResult,
-)
-def _entry(test_name="deep_squat", score=2, needs_human=False):
-    movement = MovementResult(test_name=test_name, side="na", confidence=1.0)
-    features = BiomechFeatures(
-        test_name=test_name, view="2d", side="na",
-        angles={"left_knee_flexion_deg": 95.0}, alignments={"knees_tracking_over_feet": False},
-        symmetry_delta=None, timing={"deepest_frame": 1}, confidence=0.9,
-    )
-    rubric = ScoreResult(score=2, rationale="rubric ok", confidence=0.8)
-    judge = JudgeResult(score=None if needs_human else score, rationale="judge rationale",
-                        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
-                        confidence=0.85, needs_human=needs_human)
-    return SessionEntry(
-        test_name=test_name, side="na", score=None if needs_human else score,
-        needs_human=needs_human, rationale="judge rationale",
-        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
-        measurements={"left_knee_flexion_deg": 95.0, "knees_tracking_over_feet": False},
-        confidence=0.85, view="2d", keyframe_path=None,
-        movement=movement, features=features, rubric_score=rubric, judge=judge,
-    )
-def _report(composite=2):
-    return ReportResult(
-        per_test=[], composite=composite, asymmetries=[],
-        overlay_video_path=None, pdf_path=None,
-        low_confidence_flags=[], disagreement_flags=[],
-    )
-def test_pdf_is_created(tmp_path):
-    from formscout.agents.pdf_report import PdfReportAgent
-    path = PdfReportAgent().run(_report(2), [_entry()], str(tmp_path))
-    assert path is not None
-    assert os.path.exists(path)
-    assert os.path.getsize(path) > 1000  # a real PDF, not an empty file
-    with open(path, "rb") as f:
-        assert f.read(5) == b"%PDF-"
-def test_pdf_handles_incomplete_composite(tmp_path):
-    from formscout.agents.pdf_report import PdfReportAgent
-    path = PdfReportAgent().run(_report(None), [_entry(needs_human=True)], str(tmp_path))
-    assert path is not None and os.path.exists(path)
-```
-- [ ] **Step 2: Run test to verify it fails**
-Run: `pytest tests/test_pdf_report.py -v`
-Expected: FAIL with `ModuleNotFoundError: No module named 'formscout.agents.pdf_report'`
-- [ ] **Step 3: Create the agent**
-Create `formscout/agents/pdf_report.py`:
-```python
-"""
-PdfReportAgent — renders a ReportResult + session entries to a branded PDF.
-Input:  ReportResult, list[SessionEntry], session_dir (str)
-Output: path to the written PDF (str), or None on failure.
-Failure: returns None, never raises.
-Params: 0 (pure rendering — no model).
-License: n/a.
-Gated: no.
-"""
-from __future__ import annotations
-import logging
-import os
-from formscout.types import ReportResult
-logger = logging.getLogger(__name__)
-DISCLAIMER = "Screening aid — not a diagnosis. Pain or clearing tests require a clinician."
-class PdfReportAgent:
-    """Assembles the screening-session PDF via ReportLab."""
-    def run(self, report: ReportResult, entries: list, session_dir: str) -> str | None:
-        try:
-            from reportlab.lib import colors
-            from reportlab.lib.pagesizes import LETTER
-            from reportlab.lib.styles import ParagraphStyle, getSampleStyleSheet
-            from reportlab.lib.units import inch
-            from reportlab.platypus import (
-                Image, Paragraph, SimpleDocTemplate, Spacer, Table, TableStyle,
-            )
-        except Exception as e:
-            logger.warning("reportlab unavailable: %s", e)
-            return None
-        out_path = os.path.join(session_dir, "formscout_report.pdf")
-        try:
-            styles = getSampleStyleSheet()
-            banner = ParagraphStyle(
-                "banner", parent=styles["Normal"], fontSize=9, textColor=colors.white,
-                backColor=colors.HexColor("#b45309"), alignment=1, borderPadding=6, spaceAfter=12,
-            )
-            story = []
-            story.append(Paragraph(f"<b>&#9888; {DISCLAIMER}</b>", banner))
-            story.append(Paragraph("FormScout — FMS Screening Report", styles["Title"]))
-            if report.composite is not None:
-                comp = f"Composite: <b>{report.composite} / 21</b>"
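-            else:
-                # Count only entries that carry a score; clinician-review entries are not "scored"
-                scored = sum(1 for e in entries if not e.needs_human and e.score is not None)
-                comp = f"Composite: <b>Incomplete</b> — {scored}/7 tests scored"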
-            story.append(Paragraph(comp, styles["Heading2"]))
-            story.append(Spacer(1, 0.2 * inch))
-            for e in entries:
-                title = e.test_name.replace("_", " ").title()
-                if e.side in ("left", "right"):
-                    title += f" ({e.side})"
-                score_txt = "Clinician review required" if e.needs_human else f"Score: {e.score}/3"
-                story.append(Paragraph(f"<b>{title}</b> — {score_txt}", styles["Heading3"]))
-                if e.rationale:
-                    story.append(Paragraph(e.rationale, styles["Normal"]))
-                if e.compensation_tags:
-                    story.append(Paragraph("Compensations: " + ", ".join(e.compensation_tags),
-                                           styles["Normal"]))
-                if e.corrective_hint:
-                    story.append(Paragraph("Corrective: " + e.corrective_hint, styles["Normal"]))
-                items = list(e.measurements.items())[:6]
-                if items:
-                    rows = [[k.replace("_", " "),
-                             (f"{v:.1f}" if isinstance(v, float) else str(v))] for k, v in items]
-                    tbl = Table(rows, colWidths=[3 * inch, 1.5 * inch])
-                    tbl.setStyle(TableStyle([
-                        ("FONTSIZE", (0, 0), (-1, -1), 8),
-                        ("TEXTCOLOR", (0, 0), (-1, -1), colors.HexColor("#334155")),
-                    ]))
-                    story.append(tbl)
-                if e.keyframe_path and os.path.exists(e.keyframe_path):
-                    try:
-                        story.append(Image(e.keyframe_path, width=3.0 * inch, height=2.25 * inch))
-                    except Exception:
-                        story.append(Paragraph("<i>(key-frame image unavailable)</i>", styles["Normal"]))
-                else:
-                    story.append(Paragraph("<i>(key-frame image unavailable)</i>", styles["Normal"]))
-                story.append(Spacer(1, 0.2 * inch))
-            if report.asymmetries:
-                story.append(Paragraph("Asymmetries", styles["Heading2"]))
-                for a in report.asymmetries:
-                    story.append(Paragraph(
-                        f"{a['test'].replace('_', ' ').title()}: "
-                        f"L={a['left_score']} R={a['right_score']} (&#916; {a['delta']})",
-                        styles["Normal"]))
-            flags = list(report.low_confidence_flags) + list(report.disagreement_flags)
-            if flags:
-                story.append(Paragraph("Flags", styles["Heading2"]))
-                for fl in flags:
-                    story.append(Paragraph(fl, styles["Normal"]))
-            story.append(Spacer(1, 0.3 * inch))
-            story.append(Paragraph(f"<b>&#9888; {DISCLAIMER}</b>", banner))
-            doc = SimpleDocTemplate(out_path, pagesize=LETTER,
-                                    topMargin=0.6 * inch, bottomMargin=0.6 * inch)
-            doc.build(story)
-            return out_path
-        except Exception as e:
-            logger.warning("pdf build failed: %s", e)
-            return None
-```
-- [ ] **Step 4: Run test to verify it passes**
-Run: `pytest tests/test_pdf_report.py -v`
-Expected: both tests PASS
-- [ ] **Step 5: Re-run the session suite (pdf_path now populated)**
-Run: `pytest tests/test_session.py -v`
-Expected: all PASS (now `finish_session` returns a real `pdf_path`).
-- [ ] **Step 6: Commit**
-```bash
-git add formscout/agents/pdf_report.py tests/test_pdf_report.py
-git commit -m "feat: add PdfReportAgent — branded ReportLab session PDF"
-```
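The measurement table in `PdfReportAgent` formats values before handing them to ReportLab. The sketch below isolates that formatting step so it can be checked without ReportLab installed; `measurement_rows` is a hypothetical helper name, and the six-row limit mirrors the `[:6]` slice in the plan.

```python
def measurement_rows(measurements, limit=6):
    """Turn a measurements dict into [label, text] rows for a table."""
    rows = []
    for key, value in list(measurements.items())[:limit]:
        label = key.replace("_", " ")
        # Floats get one decimal place; booleans and other values are stringified as-is.
        text = f"{value:.1f}" if isinstance(value, float) else str(value)
        rows.append([label, text])
    return rows


rows = measurement_rows({
    "left_knee_flexion_deg": 95.0,
    "knees_tracking_over_feet": False,
})
```

Because `bool` is not a `float`, alignment flags print as `True`/`False` while angles print as rounded numbers, which matches how the merged `angles` and `alignments` dicts appear in the report.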
----
-## Task 7: Wire the session UI in `app.py`
-**Files:**
-- Modify: `app.py` (`process_video`, `build_app`, event wiring)
-This task is verified by running the app (Gradio event wiring is not unit-tested; the
-orchestration it calls is already covered by `tests/test_session.py`).
-- [ ] **Step 1: Import the session module**
-In `app.py`, add to the imports block (after `from formscout.startup import ensure_checkpoints`):
-```python
-from formscout import session as session_mod
-```
-- [ ] **Step 2: Refactor `process_video` to accumulate into a session**
-Replace the entire `process_video` function (lines ~51-105) with a version that takes and
-returns the session, appends an entry on success, and builds the "Session so far" table.
-Replace from `def process_video(` through its final `return ...` with:
-```python
-def process_video(video_path: str, test_name: str, side: str, model_key: str,
-                  layers: list[str], session_state):
-    """Analyse one clip and accumulate it into the screening session."""
-    if not video_path:
-        return (
-            session_state, _render_empty_state(), "Upload a video to begin analysis.",
-            "", "", None, "", _render_session_table(session_state),
-            gr.update(visible=False), gr.update(visible=False),
-        )
-    if session_state is None:
-        session_state = session_mod.new_session()
-    director = Director()
-    state = director.run(video_path, test_name=test_name, side=side, model_key=model_key)
-    score_html = _render_empty_state()
-    score_details = ""
-    if state.features:
-        result = score_test(state.features)
-        judge = state.judge
-        if judge and judge.score is not None:
-            score_html = _render_score_card(judge.score, judge.confidence, judge.needs_human)
-            score_details = _render_score_details_judge(judge, result, state.features)
-        elif judge and judge.needs_human:
-            score_html = _render_score_card(0, 0, True)
-            score_details = f"### Needs Clinician Review\n{judge.rationale}"
-        else:
-            score_html = _render_score_card(result.score, result.confidence, result.needs_human)
-            score_details = _render_score_details(result, state.features)
-        # Accumulate into the session (only when we have a real analysis)
-        if state.ingest and state.pose2d and state.judge:
-            draw_trails = "trails" in {lbl.lower().replace(" ", "_") for lbl in (layers or [])}
-            try:
-                session_mod.add_analysis(
-                    session_state, ingest=state.ingest, pose2d=state.pose2d,
-                    features=state.features, judge=state.judge,
-                    test_name=test_name, side=side, draw_trails=draw_trails,
-                )
-            except Exception as e:
-                state.warnings.append(f"session accumulation failed: {e}")
-    pipeline_md = _render_pipeline_status(state)
-    alerts = _render_alerts(state)
-    overlay_path = None
-    vel_summary = ""
-    layer_set = {lbl.lower().replace(" ", "_") for lbl in (layers or [])}
-    if layer_set and state.ingest and state.pose2d:
-        try:
-            from formscout.agents.visualizer import PoseVisualizer, build_velocity_summary
-            vis = PoseVisualizer()
-            with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as f:
-                out_path = f.name
-            overlay_path = vis.render_video(state.ingest, state.pose2d, layer_set, out_path)
-            if overlay_path:
-                vel_summary = build_velocity_summary(state.pose2d.keypoints, vis.last_velocities)
-        except Exception as e:
-            alerts = (alerts or "") + f"\n⚠️ Visualizer error: {e}"
-    has_entries = bool(session_state and session_state.entries)
-    return (
-        session_state, score_html, pipeline_md, score_details, alerts,
-        overlay_path, vel_summary, _render_session_table(session_state),
-        gr.update(visible=has_entries), gr.update(visible=has_entries),
-    )
-```
-- [ ] **Step 3: Add the session-table renderer and finish handler**
-In `app.py`, add these two functions just before `def build_app()`:
-```python
-def _render_session_table(session_state) -> str:
-    """Render the accumulated 'Session so far' table as markdown."""
-    if not session_state or not session_state.entries:
-        return "*No clips analysed yet.*"
-    lines = ["| Test | Side | Score | Status |", "|---|---|---|---|"]
-    for e in session_state.entries:
-        test = e.test_name.replace("_", " ").title()
-        side = e.side if e.side in ("left", "right") else "—"
-        if e.needs_human:
-            score, status = "—", "⚠️ Clinician review"
-        else:
-            score, status = f"{e.score}/3", "✓ scored"
-        lines.append(f"| {test} | {side} | {score} | {status} |")
-    return "\n".join(lines)
-def _finish_session(session_state):
-    """Build the composite report + PDF for the whole session."""
-    if not session_state or not session_state.entries:
-        return ("⚠️ No clips analysed yet — analyse at least one clip first.",
-                None, None)
-    report, pdf_path = session_mod.finish_session(session_state)
-    if report is None:
-        return ("⚠️ Nothing to report.", None, None)
-    if report.composite is not None:
-        summary = [f"## Composite: {report.composite} / 21"]
-    else:
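-        # Count distinct tests that actually received a score (not every clip).
-        n = len({e.test_name for e in session_state.entries if not e.needs_human})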
-        summary = [f"## Composite: Incomplete — {n}/7 tests scored",
-                   "*(One or more tests need clinician review or were unscored.)*"]
-    if report.asymmetries:
-        summary.append("\n### Asymmetries")
-        for a in report.asymmetries:
-            test = a["test"].replace("_", " ").title()
-            summary.append(f"- **{test}:** L={a['left_score']} R={a['right_score']} (Δ {a['delta']})")
-    flags = list(report.low_confidence_flags) + list(report.disagreement_flags)
-    if flags:
-        summary.append("\n### Flags")
-        for fl in flags:
-            summary.append(f"- {fl}")
-    md_path = os.path.join(session_state.session_dir, "analysis.md")
-    md_out = md_path if os.path.exists(md_path) else None
-    return "\n".join(summary), pdf_path, md_out
-```
-Also add `import os` to the top of `app.py` if not already present (it currently imports only
-`tempfile` and `gradio`). Add after `import tempfile`:
-```python
-import os
-```
-- [ ] **Step 4: Add the session state, buttons, and outputs to `build_app`**
-In `build_app`, immediately after the line
-`with gr.Blocks(title="FormScout — FMS Screening Aid") as app:`, add:
-```python
-        session_state = gr.State(None)
-```
-Then, in the left input column, replace the single submit button block:
-```python
-                submit_btn = gr.Button(
-                    "🎯 Score Movement",
-                    variant="primary",
-                    size="lg",
-                )
-```
-with:
-```python
-                submit_btn = gr.Button(
-                    "🎯 Score Movement",
-                    variant="primary",
-                    size="lg",
-                )
-                with gr.Row():
-                    new_clip_btn = gr.Button("➕ Analyse new clip", visible=False)
-                    finish_btn = gr.Button("✅ Finish & generate PDF",
-                                           variant="primary", visible=False)
-```
-In the right results column, add a "Session" tab and a finish-output area. Inside `with gr.Tabs():`
-add a new tab after the "🎬 Overlay Video" tab:
-```python
-                    with gr.TabItem("🗂️ Session"):
-                        session_table = gr.Markdown("*No clips analysed yet.*")
-                        finish_summary = gr.Markdown("")
-                        pdf_file = gr.File(label="Screening Report (PDF)", visible=True)
-                        md_file = gr.File(label="Analysis Log (Markdown)", visible=True)
-```
-- [ ] **Step 5: Update event wiring**
-Replace the `_map_inputs` function and `submit_btn.click(...)` block at the bottom of `build_app`
-with:
-```python
-        def _map_inputs(video, test_display_name, side_display, pose_model_key, overlay_layers, sess):
-            """Map UI display values to internal values and accumulate into the session."""
-            test_map = {name: val for name, val in FMS_TESTS}
-            test_name = test_map.get(test_display_name, "deep_squat")
-            side = {"N/A": "na", "Left": "left", "Right": "right"}.get(side_display, "na")
-            return process_video(video, test_name, side, pose_model_key, overlay_layers, sess)
-        submit_btn.click(
-            fn=_map_inputs,
-            inputs=[video_input, test_dropdown, side_dropdown, pose_model_dropdown,
-                    overlay_layers, session_state],
-            outputs=[session_state, score_html, pipeline_md, score_details, alerts_md,
-                     overlay_video, velocity_md, session_table, new_clip_btn, finish_btn],
-        )
-        def _new_clip():
-            """Clear inputs for the next clip; keep the session intact."""
-            return None, _render_empty_state(), ""
-        new_clip_btn.click(
-            fn=_new_clip,
-            inputs=[],
-            outputs=[video_input, score_html, score_details],
-        )
-        finish_btn.click(
-            fn=_finish_session,
-            inputs=[session_state],
-            outputs=[finish_summary, pdf_file, md_file],
-        )
-```
-- [ ] **Step 6: Verify the full test suite still passes**
-Run: `pytest tests/ -q`
-Expected: all tests pass except the single pre-existing known failure documented in CLAUDE.md
-(`test_unimplemented_test_returns_low_confidence`). No new failures.
-- [ ] **Step 7: Manually verify the app**
-Run: `python3 app.py`
-Then in the browser:
-1. Upload a clip, pick a test, click **Score Movement** → score card appears; the **Session** tab
-   shows one row; the two new buttons appear.
-2. Click **➕ Analyse new clip** → the video input clears, the session row persists.
-3. Analyse a second test → a second row appears.
-4. Click **✅ Finish & generate PDF** → the Session tab shows the composite summary and a
-   downloadable PDF (open it: disclaimer top + bottom, per-test blocks with key-frame images,
-   composite or "Incomplete"). The Markdown log is also downloadable.
-Expected: all four steps work; PDF opens and contains the disclaimer, composite, and per-test sections.
-- [ ] **Step 8: Commit**
-```bash
-git add app.py
-git commit -m "feat: accumulate FMS clips into a session with composite report + PDF export"
-```
----
-## Task 8: Update docs
-**Files:**
-- Modify: `CLAUDE.md` (Build phases / status)
-- Modify: `MODEL_BUDGET.md` (no param change — note PDF agent adds 0 params, for completeness)
-- [ ] **Step 1: Update the Phase 4 line in CLAUDE.md**
-In `CLAUDE.md`, in the "Build phases" section, update the Phase 4 line from:
-```
-4. **Phase 4 — Polish + ship:** Custom Svelte UI components, PDF export, agent trace to Hub, blog post. (Overlay video already done via `PoseVisualizer`.)
-```
-to:
-```
-4. **Phase 4 — Polish + ship:** Custom Svelte UI components, agent trace to Hub, blog post. (Overlay video done via `PoseVisualizer`; full 7-test session + PDF export done via `formscout/session.py` + `PdfReportAgent`.)
-```
-- [ ] **Step 2: Note the PDF agent in the architecture section**
-Optionally, mention `PdfReportAgent` in `CLAUDE.md` under "### Rubric scorers" or near the
-ReportAgent description. No change is required; skip this step if there is no natural home.
-- [ ] **Step 3: Commit**
-```bash
-git add CLAUDE.md
-git commit -m "docs: mark full FMS session + PDF export complete in build phases"
-```
----
-## Self-Review Notes (already applied)
-- **Spec coverage:** session accumulation (Task 5), two-button UX (Task 7), on-disk MD/JSON/keyframes (Task 5), key-frame from `features.timing` (Tasks 3–5), ReportLab PDF top/bottom disclaimer + composite + per-test + asymmetry + flags (Task 6), `SessionEntry` type (Task 2), `ReportAgent` reuse (Task 5 `finish_session`), composite-null-on-needs-human (Task 5 test), error tolerance / never-raise (Tasks 4–6). All covered.
-- **Type consistency:** `SessionEntry` field names are identical across Tasks 2, 5, 6, 7. `finish_session` returns `(ReportResult | None, str | None)` and is consumed that way in Task 7. `render_frame(ingest, pose2d, frame_idx, layers, caption, out_png)` signature matches its callers.
-- **No placeholders:** every code step shows complete code; every run step states the exact command + expected outcome.

+# Full FMS Session + PDF Report — Implementation Plan
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+**Goal:** Turn FormScout's one-clip scorer into a screening session that accumulates analyzed clips into a composite 0–21 report and exports a branded PDF with annotated worst-moment key-frame stills.
+**Architecture:** A new `formscout/session.py` accumulates typed `SessionEntry` objects (one per analyzed clip), persisting each to a temp session dir. `PoseVisualizer.render_frame()` captures the governing frame (already computed by `BiomechanicsAgent` and stored in `features.timing`) as an annotated PNG. On "Finish", the existing `ReportAgent` computes composite + asymmetries, and a new `PdfReportAgent` renders a ReportLab PDF. The UI (`app.py`) gains `gr.State` session accumulation with "Analyse new clip" / "Finish & generate PDF" buttons.
+**Tech Stack:** Python 3.13, ReportLab (new dep), OpenCV (existing), Gradio 5, pytest. No model downloads in tests.
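The composite rule this plan relies on (a 0-21 total that becomes null when any clip needs clinician review) can be sketched in a few lines. This is a minimal illustration, not the actual `ReportAgent` logic: `composite_score` and the `(test_name, side, score)` tuples are hypothetical, and taking the lower of the left/right scores for a bilateral test is the usual FMS convention, assumed here rather than stated by the plan.

```python
from typing import Optional

N_TESTS = 7  # seven FMS tests, each scored 0-3, for a 0-21 composite


def composite_score(entries: list) -> Optional[int]:
    """Return the 0-21 composite, or None if the session is incomplete.

    Each entry is a (test_name, side, score) tuple; score is None when the
    clip needs clinician review. Hypothetical helper, for illustration only.
    """
    per_test: dict = {}
    for test_name, _side, score in entries:
        if score is None:
            return None  # any unscored clip makes the composite null
        # Bilateral tests contribute their lower side (assumed convention).
        per_test[test_name] = min(score, per_test.get(test_name, score))
    if len(per_test) < N_TESTS:
        return None  # not all seven tests were analysed
    return sum(per_test.values())
```

Returning `None` instead of a partial sum keeps an incomplete session from being read as a low score.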
+---
+## File Structure
+- `requirements.txt` — add `reportlab`.
+- `formscout/types.py` — add `SessionEntry` frozen dataclass.
+- `formscout/agents/biomechanics.py` — add `max_sag_frame` to `trunk_stability_pushup` timing (rotary already has `peak_extension_frame`).
+- `formscout/agents/visualizer.py` — add `PoseVisualizer.render_frame()`.
+- `formscout/session.py` — **new**: session accumulator (new/add/finish + persistence + key-frame helpers).
+- `formscout/agents/pdf_report.py` — **new**: `PdfReportAgent` (ReportLab).
+- `app.py` — wire `gr.State`, two buttons, "Session so far" table, finish handler.
+- `tests/test_session.py`, `tests/test_keyframe.py`, `tests/test_pdf_report.py` — **new**.
+---
+## Task 1: Add ReportLab dependency
+**Files:**
+- Modify: `requirements.txt`
+- [ ] **Step 1: Add the dependency**
+Add this line to `requirements.txt` (after `pillow>=10.3`):
+```
+reportlab>=4.0
+```
+- [ ] **Step 2: Install it**
+Run: `pip install 'reportlab>=4.0'`
+Expected: `Successfully installed reportlab-4.x.x`
+- [ ] **Step 3: Verify import**
+Run: `python3 -c "import reportlab; print(reportlab.Version)"`
+Expected: prints a version like `4.x.x`
+- [ ] **Step 4: Commit**
+```bash
+git add requirements.txt
+git commit -m "build: add reportlab for PDF report generation"
+```
+---
+## Task 2: Add `SessionEntry` dataclass
+**Files:**
+- Modify: `formscout/types.py` (after `ReportResult`, before `PipelineState`)
+- Test: `tests/test_session.py`
+- [ ] **Step 1: Write the failing test**
+Create `tests/test_session.py` with:
+```python
+"""Tests for the FMS session accumulator — no GPU, no model downloads."""
+import numpy as np
+from formscout.types import (
+    IngestResult, Pose2DResult, BiomechFeatures, ScoreResult, JudgeResult,
+    MovementResult, SessionEntry,
+)
+def test_session_entry_holds_typed_objects():
+    movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
+    features = BiomechFeatures(
+        test_name="deep_squat", view="2d", side="na",
+        angles={"left_knee_flexion_deg": 95.0}, alignments={"knees_tracking_over_feet": True},
+        symmetry_delta=None, timing={"deepest_frame": 2}, confidence=0.9,
+    )
+    rubric = ScoreResult(score=2, rationale="ok", confidence=0.8)
+    judge = JudgeResult(score=2, rationale="ok", compensation_tags=["heels elevated"],
+                        corrective_hint="ankle mobility", confidence=0.85)
+    entry = SessionEntry(
+        test_name="deep_squat", side="na", score=2, needs_human=False,
+        rationale="ok", compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
+        measurements={"left_knee_flexion_deg": 95.0}, confidence=0.85, view="2d",
+        keyframe_path=None, movement=movement, features=features,
+        rubric_score=rubric, judge=judge,
+    )
+    assert entry.score == 2
+    assert entry.movement.test_name == "deep_squat"
+    assert entry.rubric_score.score == 2
+    assert entry.judge.compensation_tags == ["heels elevated"]
+```
+- [ ] **Step 2: Run test to verify it fails**
+Run: `pytest tests/test_session.py::test_session_entry_holds_typed_objects -v`
+Expected: FAIL with `ImportError: cannot import name 'SessionEntry'`
+- [ ] **Step 3: Add the dataclass**
+In `formscout/types.py`, insert after the `ReportResult` class (line ~142) and before `PipelineState`:
+```python
+@dataclass(frozen=True)
+class SessionEntry:
+    """One accumulated analysis in a screening session.
+    Display fields (test_name…keyframe_path) feed the PDF/JSON/MD artifacts;
+    the trailing typed objects (movement…judge) feed ReportAgent.run().
+    """
+    test_name: str
+    side: str
+    score: int | None
+    needs_human: bool
+    rationale: str
+    compensation_tags: list
+    corrective_hint: str
+    measurements: dict
+    confidence: float
+    view: str
+    keyframe_path: str | None
+    movement: MovementResult
+    features: BiomechFeatures
+    rubric_score: ScoreResult
+    judge: JudgeResult | None
+```
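Because `SessionEntry` is declared with `frozen=True`, a field cannot be reassigned after construction; a changed copy has to be built with `dataclasses.replace`. The sketch below shows that behaviour on a hypothetical two-field stand-in (`Entry` is not part of the codebase).

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """Hypothetical stand-in for SessionEntry with two of its fields."""
    test_name: str
    keyframe_path: Optional[str] = None


entry = Entry(test_name="deep_squat")

try:
    entry.keyframe_path = "/tmp/key.png"  # rejected: the dataclass is frozen
    mutated = True
except FrozenInstanceError:
    mutated = False

# replace() returns a new instance with the changed field; the original is untouched.
updated = replace(entry, keyframe_path="/tmp/key.png")
```

Note that `frozen=True` only blocks attribute rebinding. Container fields such as `compensation_tags` and `measurements` can still be mutated in place, so callers should treat them as read-only by convention.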
+- [ ] **Step 4: Run test to verify it passes**
+Run: `pytest tests/test_session.py::test_session_entry_holds_typed_objects -v`
+Expected: PASS
+- [ ] **Step 5: Commit**
+```bash
+git add formscout/types.py tests/test_session.py
+git commit -m "feat: add SessionEntry typed contract for screening sessions"
+```
+---
+## Task 3: Add governing-frame index to push-up biomechanics
+**Files:**
+- Modify: `formscout/agents/biomechanics.py:468-529` (`_trunk_stability_pushup`)
+- Test: `tests/test_biomechanics.py` (append a test)
+The other six tests already store a governing frame index in `features.timing`
+(`deepest_frame`, `peak_step_frame`, `deepest_lunge_frame`, `measure_frame`,
+`peak_raise_frame`, `peak_extension_frame`). Only `trunk_stability_pushup` is missing one.
+- [ ] **Step 1: Write the failing test**
+Append to `tests/test_biomechanics.py`:
+```python
+def test_pushup_timing_has_max_sag_frame():
+    from formscout.agents.biomechanics import BiomechanicsAgent
+    from formscout.types import Pose2DResult, Body3DResult, MovementResult
+    # 4 frames; frame 2 has the largest hip sag (hip far below shoulder/ankle midline)
+    def kps(hip_y):
+        base = {
+            5: {"x": 200, "y": 200, "conf": 0.9},   # L shoulder
+            6: {"x": 220, "y": 200, "conf": 0.9},   # R shoulder
+            11: {"x": 300, "y": hip_y, "conf": 0.9}, # L hip
+            12: {"x": 320, "y": hip_y, "conf": 0.9}, # R hip
+            15: {"x": 400, "y": 200, "conf": 0.9},   # L ankle
+            16: {"x": 420, "y": 200, "conf": 0.9},   # R ankle
+        }
+        return base
+    frames = [kps(200), kps(210), kps(260), kps(205)]
+    pose = Pose2DResult(keypoints=frames, fps=30.0, confidence=0.9)
+    body3d = Body3DResult(used=False, joints_3d=[])
+    movement = MovementResult(test_name="trunk_stability_pushup", side="na", confidence=1.0)
+    feats = BiomechanicsAgent().run(pose, body3d, movement)
+    assert "max_sag_frame" in feats.timing
+    assert feats.timing["max_sag_frame"] == 2
+```
+- [ ] **Step 2: Run test to verify it fails**
+Run: `pytest tests/test_biomechanics.py::test_pushup_timing_has_max_sag_frame -v`
+Expected: FAIL with `assert 'max_sag_frame' in {...}` (an `AssertionError`, since `timing` does not contain the key yet)
+- [ ] **Step 3: Track the max-sag frame index**
+In `formscout/agents/biomechanics.py`, replace the body of `_trunk_stability_pushup` from the
+`trunk_angles_over_time = []` loop through the `if trunk_angles_over_time:` block. Replace:
+```python
+        # Analyze multiple frames to detect sag/lag
+        trunk_angles_over_time = []
+        for i, kps in enumerate(pose2d.keypoints):
+```
+…down to and including the `alignments["no_sag"] = max_sag < 30` line, with:
+```python
+        # Analyze multiple frames to detect sag/lag
+        trunk_sags: list[tuple[int, float]] = []  # (frame_idx, sag_px)
+        for i, kps in enumerate(pose2d.keypoints):
+            l_sh = _get_joint(kps, L_SHOULDER)
+            r_sh = _get_joint(kps, R_SHOULDER)
+            l_hip = _get_joint(kps, L_HIP)
+            r_hip = _get_joint(kps, R_HIP)
+            l_ankle = _get_joint(kps, L_ANKLE)
+            r_ankle = _get_joint(kps, R_ANKLE)
+            if l_sh and r_sh and l_hip and r_hip and l_ankle and r_ankle:
+                sh_y = (l_sh[1] + r_sh[1]) / 2
+                hip_y = (l_hip[1] + r_hip[1]) / 2
+                ankle_y = (l_ankle[1] + r_ankle[1]) / 2
+                expected_hip_y = (sh_y + ankle_y) / 2
+                sag_px = hip_y - expected_hip_y
+                trunk_sags.append((i, sag_px))
+        max_sag_frame = 0
+        if trunk_sags:
+            sags = [s for _, s in trunk_sags]
+            max_sag_frame = max(trunk_sags, key=lambda t: t[1])[0]
+            mean = sum(sags) / len(sags)
+            # Population standard deviation of the sag (stored as "trunk_variance_px").
+            variance = (sum((x - mean) ** 2 for x in sags) / len(sags)) ** 0.5
+            max_sag = max(sags)
+            angles["max_sag_px"] = max_sag
+            angles["trunk_variance_px"] = variance
+            alignments["body_rigid"] = max_sag < 30 and variance < 15
+            alignments["no_sag"] = max_sag < 30
+        else:
+            notes_parts.append("insufficient landmarks for trunk analysis")
+```
+Then update the `return BiomechFeatures(...)` `timing=` argument at the end of the method from:
+```python
+            timing={"n_frames_analyzed": len(trunk_angles_over_time)},
+```
+to:
+```python
+            timing={"n_frames_analyzed": len(trunk_sags), "max_sag_frame": max_sag_frame},
+```
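The sag measure used in this task can be tried in isolation. The sketch below is a simplified version under stated assumptions: `max_sag_frame` here is a hypothetical free function, and each frame is reduced to already-averaged `(shoulder_y, hip_y, ankle_y)` values in image pixels, where y grows downward. The sample data mirrors the hip heights used in the test from Step 1.

```python
def max_sag_frame(frames):
    """Return (frame_idx, sag_px) for the frame with the largest hip sag.

    Each frame is a (shoulder_y, hip_y, ankle_y) tuple in image pixels.
    Hypothetical helper, for illustration only.
    """
    sags = []
    for i, (sh_y, hip_y, ankle_y) in enumerate(frames):
        expected_hip_y = (sh_y + ankle_y) / 2  # midpoint of shoulder and ankle heights
        sags.append((i, hip_y - expected_hip_y))  # positive = hips below that midpoint
    # Fall back to frame 0 when no frame had usable landmarks.
    return max(sags, key=lambda t: t[1]) if sags else (0, 0.0)


frames = [(200, 200, 200), (200, 210, 200), (200, 260, 200), (200, 205, 200)]
idx, sag = max_sag_frame(frames)
```

Carrying the frame index alongside each sag value is what lets the key-frame capture later find the worst moment without recomputing anything.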
+- [ ] **Step 4: Run test to verify it passes**
+Run: `pytest tests/test_biomechanics.py::test_pushup_timing_has_max_sag_frame -v`
+Expected: PASS
+- [ ] **Step 5: Run the full biomechanics suite (no regressions)**
+Run: `pytest tests/test_biomechanics.py -v`
+Expected: all previously-passing tests still pass (the pre-existing `test_unimplemented_test_returns_low_confidence` known-failure may remain failing — that is unrelated and documented in CLAUDE.md).
+- [ ] **Step 6: Commit**
+```bash
+git add formscout/agents/biomechanics.py tests/test_biomechanics.py
+git commit -m "feat: track max-sag frame index in push-up biomechanics for key-frame capture"
+```
+---
+## Task 4: Add `PoseVisualizer.render_frame()`
+**Files:**
+- Modify: `formscout/agents/visualizer.py` (add method to `PoseVisualizer`, after `render_video`)
+- Test: `tests/test_keyframe.py`
+- [ ] **Step 1: Write the failing test**
+Create `tests/test_keyframe.py`:
+```python
+"""Tests for PoseVisualizer.render_frame — single annotated still."""
+import os
+import numpy as np
+from formscout.types import IngestResult, Pose2DResult
+def _ingest(n=5, h=480, w=640):
+    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
+    return IngestResult(frames=frames, fps=30.0, duration=n / 30.0, n_people=1, width=w, height=h)
+def _pose(n=5):
+    kps = []
+    for i in range(n):
+        kps.append({j: {"x": float(50 + j * 25), "y": float(80 + j * 18), "conf": 0.9}
+                    for j in range(17)})
+    return Pose2DResult(keypoints=kps, fps=30.0, confidence=0.9)
+def test_render_frame_writes_png(tmp_path):
+    from formscout.agents.visualizer import PoseVisualizer
+    out = str(tmp_path / "key.png")
+    path = PoseVisualizer().render_frame(_ingest(), _pose(), frame_idx=2,
+                                         layers={"skeleton"}, caption="Deep Squat — heels elevated",
+                                         out_png=out)
+    assert path == out
+    assert os.path.exists(out)
+    assert os.path.getsize(out) > 0
+def test_render_frame_bad_index_returns_none(tmp_path):
+    from formscout.agents.visualizer import PoseVisualizer
+    out = str(tmp_path / "key.png")
+    path = PoseVisualizer().render_frame(_ingest(n=3), _pose(n=3), frame_idx=99,
+                                         layers={"skeleton"}, caption="", out_png=out)
+    assert path is None
+```
+- [ ] **Step 2: Run test to verify it fails**
+Run: `pytest tests/test_keyframe.py -v`
+Expected: FAIL with `AttributeError: 'PoseVisualizer' object has no attribute 'render_frame'`
+- [ ] **Step 3: Add the method**
+In `formscout/agents/visualizer.py`, inside the `PoseVisualizer` class, add this method
+immediately after `render_video` (before the closing of the class / the module-level
+`build_velocity_summary`):
+```python
+    def render_frame(
+        self,
+        ingest,
+        pose2d,
+        frame_idx: int,
+        layers: set[str],
+        caption: str = "",
+        out_png: str | None = None,
+    ) -> str | None:
+        """Render a single annotated still (skeleton + optional trails + caption).
+        frame_idx is typically the governing frame from BiomechFeatures.timing.
+        Returns the PNG path on success, None on any failure. Never raises.
+        """
+        try:
+            if not (0 <= frame_idx < len(ingest.frames)) or frame_idx >= len(pose2d.keypoints):
+                return None
+            frame = ingest.frames[frame_idx].copy()
+            kps = pose2d.keypoints[frame_idx]
+            if "trails" in layers:
+                trail: dict[int, deque] = {j: deque(maxlen=TRAIL_LENGTH) for j in range(17)}
+                start = max(0, frame_idx - TRAIL_LENGTH)
+                for fi in range(start, frame_idx + 1):
+                    for j, kp in pose2d.keypoints[fi].items():
+                        if kp.get("conf", 0.0) >= CONF_THRESHOLD:
+                            trail[j].append((kp["x"], kp["y"]))
+                frame = self._draw_trails(frame, trail)
+            if "skeleton" in layers:
+                frame = self._draw_skeleton(frame, kps)
+            if caption:
+                cv2.rectangle(frame, (0, 0), (frame.shape[1], 28), (0, 0, 0), -1)
+                cv2.putText(frame, caption[:80], (8, 20), cv2.FONT_HERSHEY_SIMPLEX,
+                            0.55, (255, 255, 255), 1, cv2.LINE_AA)
+            if out_png is None:
+                # Close the temp file handle before OpenCV writes to the path.
+                with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
+                    out_png = tmp.name
+            ok = cv2.imwrite(out_png, frame)
+            return out_png if ok else None
+        except Exception as e:
+            logger.warning("render_frame failed: %s", e)
+            return None
+```
+(`deque`, `cv2`, `tempfile`, `logger`, `TRAIL_LENGTH`, `CONF_THRESHOLD` are all already imported at the top of this file.)
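The trail window in `render_frame` visits up to `TRAIL_LENGTH + 1` frames and relies on the deque's `maxlen` to keep only the newest `TRAIL_LENGTH` points. A minimal sketch of that behaviour for a single joint follows; `trail_points` is a hypothetical helper and the value `4` is an assumed stand-in for the real constant in `visualizer.py`.

```python
from collections import deque

TRAIL_LENGTH = 4  # assumed value; the real constant lives in visualizer.py


def trail_points(xs, frame_idx, trail_length=TRAIL_LENGTH):
    """Collect the most recent positions of one joint up to frame_idx."""
    trail = deque(maxlen=trail_length)
    start = max(0, frame_idx - trail_length)  # clamp at the first frame
    for fi in range(start, frame_idx + 1):
        trail.append(xs[fi])  # once the deque is full, the oldest point is dropped
    return list(trail)


xs = [10, 20, 30, 40, 50, 60, 70]
late = trail_points(xs, 5)   # full window ending at frame 5
early = trail_points(xs, 2)  # clamped window near the start of the clip
```

Near the start of a clip the trail is simply shorter, which is why the still can be rendered for any valid `frame_idx` without special-casing early frames.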
+- [ ] **Step 4: Run test to verify it passes**
+Run: `pytest tests/test_keyframe.py -v`
+Expected: both tests PASS
+- [ ] **Step 5: Commit**
+```bash
+git add formscout/agents/visualizer.py tests/test_keyframe.py
+git commit -m "feat: add PoseVisualizer.render_frame for annotated key-frame stills"
+```
+---
+## Task 5: Create the session accumulator
+**Files:**
+- Create: `formscout/session.py`
+- Test: `tests/test_session.py` (append tests)
+- [ ] **Step 1: Write the failing tests**
+Append to `tests/test_session.py`:
+```python
+def _ingest(n=5, h=480, w=640):
+    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
+    return IngestResult(frames=frames, fps=30.0, duration=n / 30.0, n_people=1, width=w, height=h)
+def _pose(n=5):
+    kps = []
+    for i in range(n):
+        kps.append({j: {"x": float(50 + j * 25), "y": float(80 + j * 18), "conf": 0.9}
+                    for j in range(17)})
+    return Pose2DResult(keypoints=kps, fps=30.0, confidence=0.9)
+def _features(test_name="deep_squat", side="na", frame_key="deepest_frame"):
+    return BiomechFeatures(
+        test_name=test_name, view="2d", side=side,
+        angles={"left_knee_flexion_deg": 95.0},
+        alignments={"knees_tracking_over_feet": False},
+        symmetry_delta=None, timing={frame_key: 2}, confidence=0.9,
+    )
+def _judge(score=2, needs_human=False):
+    return JudgeResult(
+        score=None if needs_human else score, rationale="r",
+        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
+        confidence=0.85, needs_human=needs_human,
+    )
+def test_add_analysis_appends_entry_and_writes_files():
+    import os
+    from formscout import session as S
+    sess = S.new_session()
+    entry = S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(),
+                           features=_features(), judge=_judge(), test_name="deep_squat", side="na")
+    assert len(sess.entries) == 1
+    assert entry.score == 2
+    assert os.path.exists(os.path.join(sess.session_dir, "session.json"))
+    assert os.path.exists(os.path.join(sess.session_dir, "analysis.md"))
+    # key-frame still written (deepest_frame=2 is valid)
+    assert entry.keyframe_path and os.path.exists(entry.keyframe_path)
+def test_finish_composite_null_when_needs_human():
+    from formscout import session as S
+    sess = S.new_session()
+    S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(), features=_features(),
+                   judge=_judge(score=3), test_name="deep_squat", side="na")
+    S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(),
+                   features=_features("trunk_stability_pushup", frame_key="max_sag_frame"),
+                   judge=_judge(needs_human=True), test_name="trunk_stability_pushup", side="na")
+    report, pdf_path = S.finish_session(sess)
+    assert report is not None
+    assert report.composite is None  # one test needs_human
+def test_finish_empty_session_returns_none():
+    from formscout import session as S
+    sess = S.new_session()
+    report, pdf_path = S.finish_session(sess)
+    assert report is None and pdf_path is None
+```
+- [ ] **Step 2: Run tests to verify they fail**
+Run: `pytest tests/test_session.py -v`
+Expected: the three new tests FAIL with `ImportError: cannot import name 'session' from 'formscout'`
+- [ ] **Step 3: Create the module**
+Create `formscout/session.py`:
+```python
+"""
+Screening-session accumulator.
+Accumulates one SessionEntry per analyzed clip, persists each to a temp session
+dir (session.json + analysis.md + key-frame PNGs), and on finish builds a
+ReportResult (via ReportAgent) + a PDF (via PdfReportAgent).
+Pure orchestration — no Gradio imports. Disk writes tolerate failure with a
+logged warning and never block scoring.
+"""
+from __future__ import annotations
+import json
+import logging
+import os
+import tempfile
+import uuid
+from dataclasses import dataclass, replace
+from formscout.rubric import score_test
+from formscout.types import MovementResult, ReportResult, SessionEntry
+logger = logging.getLogger(__name__)
+# Maps each test to the BiomechFeatures.timing key holding its governing frame.
+TIMING_KEY = {
+    "deep_squat": "deepest_frame",
+    "hurdle_step": "peak_step_frame",
+    "inline_lunge": "deepest_lunge_frame",
+    "shoulder_mobility": "measure_frame",
+    "active_slr": "peak_raise_frame",
+    "trunk_stability_pushup": "max_sag_frame",
+    "rotary_stability": "peak_extension_frame",
+}
+@dataclass
+class Session:
+    """Mutable session: an id, its temp dir, and accumulated entries."""
+    session_id: str
+    session_dir: str
+    entries: list  # list[SessionEntry]
+def new_session() -> Session:
+    sid = uuid.uuid4().hex[:12]
+    base = os.path.join(tempfile.gettempdir(), "formscout_sessions", sid)
+    try:
+        os.makedirs(os.path.join(base, "keyframes"), exist_ok=True)
+    except Exception as e:
+        logger.warning("session dir create failed: %s", e)
+    return Session(session_id=sid, session_dir=base, entries=[])
+def governing_frame_index(features) -> int | None:
+    """Return the governing frame index for this test, or None."""
+    key = TIMING_KEY.get(features.test_name)
+    if key is None:
+        return None
+    idx = features.timing.get(key)
+    return int(idx) if isinstance(idx, (int, float)) else None
+def worst_compensation_caption(judge, features) -> str:
+    """Short caption naming the worst compensation for the key-frame still."""
+    if judge and getattr(judge, "compensation_tags", None):
+        return ", ".join(judge.compensation_tags)
+    failed = [k.replace("_", " ") for k, v in features.alignments.items() if v is False]
+    return ("compensation: " + ", ".join(failed)) if failed else "key position"
+def add_analysis(session, *, ingest, pose2d, features, judge, test_name, side,
+                 draw_trails: bool = False) -> SessionEntry:
+    """Build a SessionEntry from a completed analysis, render its key-frame,
+    persist the session, append, and return the entry."""
+    movement = MovementResult(test_name=test_name, side=side, confidence=1.0)
+    rubric = score_test(features)
+    needs_human = bool((judge and judge.needs_human) or rubric.needs_human)
+    if needs_human:
+        score = None
+    elif judge and judge.score is not None:
+        score = judge.score
+    else:
+        score = rubric.score
+    keyframe_path = None
+    idx = governing_frame_index(features)
+    if idx is not None and 0 <= idx < len(pose2d.keypoints):
+        from formscout.agents.visualizer import PoseVisualizer
+        caption = (f"{test_name.replace('_', ' ').title()} "
+                   f"({side}) — {worst_compensation_caption(judge, features)}")
+        layers = {"skeleton", "trails"} if draw_trails else {"skeleton"}
+        out_png = os.path.join(session.session_dir, "keyframes", f"{test_name}_{side}.png")
+        try:
+            keyframe_path = PoseVisualizer().render_frame(ingest, pose2d, idx, layers, caption, out_png)
+        except Exception as e:
+            logger.warning("keyframe render failed: %s", e)
+    measurements = {}
+    measurements.update(features.angles)
+    measurements.update(features.alignments)
+    entry = SessionEntry(
+        test_name=test_name, side=side, score=score, needs_human=needs_human,
+        rationale=(judge.rationale if judge else rubric.rationale),
+        compensation_tags=list(judge.compensation_tags) if judge else [],
+        corrective_hint=(judge.corrective_hint if judge else ""),
+        measurements=measurements,
+        confidence=(judge.confidence if judge else rubric.confidence),
+        view=features.view,
+        keyframe_path=keyframe_path,
+        movement=movement, features=features, rubric_score=rubric, judge=judge,
+    )
+    session.entries.append(entry)
+    _persist(session)
+    return entry
+def finish_session(session) -> tuple[ReportResult | None, str | None]:
+    """Build the composite report + PDF. Returns (report, pdf_path).
+    Returns (None, None) for an empty session."""
+    if not session.entries:
+        return None, None
+    from formscout.agents.report import ReportAgent
+    report_inputs = [{
+        "movement": e.movement, "features": e.features,
+        "rubric_score": e.rubric_score, "judge": e.judge, "side": e.side,
+    } for e in session.entries]
+    report = ReportAgent().run(report_inputs)
+    pdf_path = None
+    try:
+        from formscout.agents.pdf_report import PdfReportAgent
+        pdf_path = PdfReportAgent().run(report, session.entries, session.session_dir)
+    except Exception as e:
+        logger.warning("pdf generation failed: %s", e)
+    report = replace(report, pdf_path=pdf_path)
+    return report, pdf_path
+# ── Persistence ───────────────────────────────────────────────────────────────
+def _jsonable(d: dict) -> dict:
+    out = {}
+    for k, v in d.items():
+        if isinstance(v, float):
+            out[k] = round(v, 2)
+        elif isinstance(v, (int, str, bool)) or v is None:
+            out[k] = v
+        else:
+            out[k] = str(v)
+    return out
+def _entry_display(e: SessionEntry) -> dict:
+    return {
+        "test_name": e.test_name, "side": e.side, "score": e.score,
+        "needs_human": e.needs_human, "rationale": e.rationale,
+        "compensation_tags": list(e.compensation_tags), "corrective_hint": e.corrective_hint,
+        "measurements": _jsonable(e.measurements), "confidence": round(e.confidence, 2),
+        "view": e.view, "keyframe_path": e.keyframe_path,
+    }
+def _render_markdown(session: Session) -> str:
+    lines = ["# FormScout — Session Log", ""]
+    for e in session.entries:
+        title = e.test_name.replace("_", " ").title()
+        if e.side in ("left", "right"):
+            title += f" ({e.side})"
+        score = "Clinician review required" if e.needs_human else f"{e.score}/3"
+        lines.append(f"## {title} — {score}")
+        lines.append(e.rationale or "")
+        if e.compensation_tags:
+            lines.append(f"- Compensations: {', '.join(e.compensation_tags)}")
+        if e.corrective_hint:
+            lines.append(f"- Corrective: {e.corrective_hint}")
+        if e.keyframe_path:
+            lines.append(f"- Key frame: `{e.keyframe_path}`")
+        lines.append("")
+    return "\n".join(lines)
+def _persist(session: Session) -> None:
+    try:
+        with open(os.path.join(session.session_dir, "session.json"), "w") as f:
+            json.dump([_entry_display(e) for e in session.entries], f, indent=2)
+        with open(os.path.join(session.session_dir, "analysis.md"), "w") as f:
+            f.write(_render_markdown(session))
+    except Exception as e:
+        logger.warning("session persist failed: %s", e)
+```
+- [ ] **Step 4: Run tests to verify they pass**
+Run: `pytest tests/test_session.py -v`
+Expected: all session tests PASS (Task 6 provides `PdfReportAgent`; `finish_session` tolerates its
+absence via the try/except, so these pass now — `pdf_path` may be `None` until Task 6).
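The tolerance described in Step 4 rests on the guarded import inside `finish_session`. A minimal stand-alone sketch of that pattern follows, assuming nothing beyond the standard library; `build_optional_pdf` is a hypothetical helper used only for illustration and is not part of `formscout`.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def build_optional_pdf(module_name: str, class_name: str, *args):
    """Return the agent's result, or None if the module is missing or the agent fails.

    Hypothetical helper: it mirrors the try/except around PdfReportAgent in
    finish_session, but it is not part of the formscout codebase.
    """
    try:
        module = importlib.import_module(module_name)
        agent_cls = getattr(module, class_name)
        return agent_cls().run(*args)
    except Exception as e:  # ModuleNotFoundError and AttributeError both land here
        logger.warning("pdf generation failed: %s", e)
        return None
```

Because every failure collapses to `None`, the session tests can assert on the report alone before Task 6 exists, and on a real path afterwards.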
+- [ ] **Step 5: Commit**
+```bash
+git add formscout/session.py tests/test_session.py
+git commit -m "feat: add screening-session accumulator with key-frame capture and persistence"
+```
+---
+## Task 6: Create `PdfReportAgent`
+**Files:**
+- Create: `formscout/agents/pdf_report.py`
+- Test: `tests/test_pdf_report.py`
+- [ ] **Step 1: Write the failing test**
+Create `tests/test_pdf_report.py`:
+```python
+"""Tests for PdfReportAgent — no GPU, no model downloads."""
+import os
+from formscout.types import (
+    ReportResult, SessionEntry, MovementResult, BiomechFeatures, ScoreResult, JudgeResult,
+)
+def _entry(test_name="deep_squat", score=2, needs_human=False):
+    movement = MovementResult(test_name=test_name, side="na", confidence=1.0)
+    features = BiomechFeatures(
+        test_name=test_name, view="2d", side="na",
+        angles={"left_knee_flexion_deg": 95.0}, alignments={"knees_tracking_over_feet": False},
+        symmetry_delta=None, timing={"deepest_frame": 1}, confidence=0.9,
+    )
+    rubric = ScoreResult(score=2, rationale="rubric ok", confidence=0.8)
+    judge = JudgeResult(score=None if needs_human else score, rationale="judge rationale",
+                        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
+                        confidence=0.85, needs_human=needs_human)
+    return SessionEntry(
+        test_name=test_name, side="na", score=None if needs_human else score,
+        needs_human=needs_human, rationale="judge rationale",
+        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
+        measurements={"left_knee_flexion_deg": 95.0, "knees_tracking_over_feet": False},
+        confidence=0.85, view="2d", keyframe_path=None,
+        movement=movement, features=features, rubric_score=rubric, judge=judge,
+    )
+def _report(composite=2):
+    return ReportResult(
+        per_test=[], composite=composite, asymmetries=[],
+        overlay_video_path=None, pdf_path=None,
+        low_confidence_flags=[], disagreement_flags=[],
+    )
+def test_pdf_is_created(tmp_path):
+    from formscout.agents.pdf_report import PdfReportAgent
+    path = PdfReportAgent().run(_report(2), [_entry()], str(tmp_path))
+    assert path is not None
+    assert os.path.exists(path)
+    assert os.path.getsize(path) > 1000  # a real PDF, not an empty file
+    with open(path, "rb") as f:
+        assert f.read(5) == b"%PDF-"
+def test_pdf_handles_incomplete_composite(tmp_path):
+    from formscout.agents.pdf_report import PdfReportAgent
+    path = PdfReportAgent().run(_report(None), [_entry(needs_human=True)], str(tmp_path))
+    assert path is not None and os.path.exists(path)
+```
+- [ ] **Step 2: Run test to verify it fails**
+Run: `pytest tests/test_pdf_report.py -v`
+Expected: FAIL with `ModuleNotFoundError: No module named 'formscout.agents.pdf_report'`
+- [ ] **Step 3: Create the agent**
+Create `formscout/agents/pdf_report.py`:
+```python
+"""
+PdfReportAgent — renders a ReportResult + session entries to a branded PDF.
+Input:  ReportResult, list[SessionEntry], session_dir (str)
+Output: path to the written PDF (str), or None on failure.
+Failure: returns None, never raises.
+Params: 0 (pure rendering — no model).
+License: n/a.
+Gated: no.
+"""
+from __future__ import annotations
+import logging
+import os
+from formscout.types import ReportResult
+logger = logging.getLogger(__name__)
+DISCLAIMER = "Screening aid — not a diagnosis. Pain or clearing tests require a clinician."
+class PdfReportAgent:
+    """Assembles the screening-session PDF via ReportLab."""
+    def run(self, report: ReportResult, entries: list, session_dir: str) -> str | None:
+        try:
+            from reportlab.lib import colors
+            from reportlab.lib.pagesizes import LETTER
+            from reportlab.lib.styles import ParagraphStyle, getSampleStyleSheet
+            from reportlab.lib.units import inch
+            from reportlab.platypus import (
+                Image, Paragraph, SimpleDocTemplate, Spacer, Table, TableStyle,
+            )
+        except Exception as e:
+            logger.warning("reportlab unavailable: %s", e)
+            return None
+        out_path = os.path.join(session_dir, "formscout_report.pdf")
+        try:
+            styles = getSampleStyleSheet()
+            banner = ParagraphStyle(
+                "banner", parent=styles["Normal"], fontSize=9, textColor=colors.white,
+                backColor=colors.HexColor("#b45309"), alignment=1, borderPadding=6, spaceAfter=12,
+            )
+            story = []
+            story.append(Paragraph(f"<b>&#9888; {DISCLAIMER}</b>", banner))
+            story.append(Paragraph("FormScout — FMS Screening Report", styles["Title"]))
+            if report.composite is not None:
+                comp = f"Composite: <b>{report.composite} / 21</b>"
+            else:
+                comp = f"Composite: <b>Incomplete</b> — {len(entries)}/7 tests scored"
+            story.append(Paragraph(comp, styles["Heading2"]))
+            story.append(Spacer(1, 0.2 * inch))
+            for e in entries:
+                title = e.test_name.replace("_", " ").title()
+                if e.side in ("left", "right"):
+                    title += f" ({e.side})"
+                score_txt = "Clinician review required" if e.needs_human else f"Score: {e.score}/3"
+                story.append(Paragraph(f"<b>{title}</b> — {score_txt}", styles["Heading3"]))
+                if e.rationale:
+                    story.append(Paragraph(e.rationale, styles["Normal"]))
+                if e.compensation_tags:
+                    story.append(Paragraph("Compensations: " + ", ".join(e.compensation_tags),
+                                           styles["Normal"]))
+                if e.corrective_hint:
+                    story.append(Paragraph("Corrective: " + e.corrective_hint, styles["Normal"]))
+                items = list(e.measurements.items())[:6]
+                if items:
+                    rows = [[k.replace("_", " "),
+                             (f"{v:.1f}" if isinstance(v, float) else str(v))] for k, v in items]
+                    tbl = Table(rows, colWidths=[3 * inch, 1.5 * inch])
+                    tbl.setStyle(TableStyle([
+                        ("FONTSIZE", (0, 0), (-1, -1), 8),
+                        ("TEXTCOLOR", (0, 0), (-1, -1), colors.HexColor("#334155")),
+                    ]))
+                    story.append(tbl)
+                if e.keyframe_path and os.path.exists(e.keyframe_path):
+                    try:
+                        story.append(Image(e.keyframe_path, width=3.0 * inch, height=2.25 * inch))
+                    except Exception:
+                        story.append(Paragraph("<i>(key-frame image unavailable)</i>", styles["Normal"]))
+                else:
+                    story.append(Paragraph("<i>(key-frame image unavailable)</i>", styles["Normal"]))
+                story.append(Spacer(1, 0.2 * inch))
+            if report.asymmetries:
+                story.append(Paragraph("Asymmetries", styles["Heading2"]))
+                for a in report.asymmetries:
+                    story.append(Paragraph(
+                        f"{a['test'].replace('_', ' ').title()}: "
+                        f"L={a['left_score']} R={a['right_score']} (&#916; {a['delta']})",
+                        styles["Normal"]))
+            flags = list(report.low_confidence_flags) + list(report.disagreement_flags)
+            if flags:
+                story.append(Paragraph("Flags", styles["Heading2"]))
+                for fl in flags:
+                    story.append(Paragraph(fl, styles["Normal"]))
+            story.append(Spacer(1, 0.3 * inch))
+            story.append(Paragraph(f"<b>&#9888; {DISCLAIMER}</b>", banner))
+            doc = SimpleDocTemplate(out_path, pagesize=LETTER,
+                                    topMargin=0.6 * inch, bottomMargin=0.6 * inch)
+            doc.build(story)
+            return out_path
+        except Exception as e:
+            logger.warning("pdf build failed: %s", e)
+            return None
+```
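ReportLab's `Paragraph` interprets its text as XML-like inline markup, so a rationale or corrective hint containing a raw `<` or `&` may fail to parse, in which case the outer `try/except` above would return `None` for the whole PDF. One possible hardening, sketched here with the standard library only, is to escape free text before wrapping it in markup. `safe_markup` and `bold_title` are hypothetical names and are not part of the plan above.

```python
from xml.sax.saxutils import escape


def safe_markup(text: str) -> str:
    """Escape &, < and > so free text is treated as literal characters.

    Hypothetical helper, not part of the plan above.
    """
    return escape(text)


def bold_title(title: str, score_txt: str) -> str:
    """Compose a heading string: only the <b> tags are markup, the rest is escaped."""
    return f"<b>{safe_markup(title)}</b>: {safe_markup(score_txt)}"
```

Under this assumption, calls such as `Paragraph(e.rationale, styles["Normal"])` would pass `safe_markup(e.rationale)` instead, while strings that intentionally carry tags (the banner, the headings) would escape only their variable parts.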
+- [ ] **Step 4: Run test to verify it passes**
+Run: `pytest tests/test_pdf_report.py -v`
+Expected: both tests PASS
+- [ ] **Step 5: Re-run the session suite (pdf_path now populated)**
+Run: `pytest tests/test_session.py -v`
+Expected: all PASS (now `finish_session` returns a real `pdf_path`).
+- [ ] **Step 6: Commit**
+```bash
+git add formscout/agents/pdf_report.py tests/test_pdf_report.py
+git commit -m "feat: add PdfReportAgent — branded ReportLab session PDF"
+```
+---
+## Task 7: Wire the session UI in `app.py`
+**Files:**
+- Modify: `app.py` (`process_video`, `build_app`, event wiring)
+This task is verified by running the app (Gradio event wiring is not unit-tested; the
+orchestration it calls is already covered by `tests/test_session.py`).
+- [ ] **Step 1: Import the session module**
+In `app.py`, add to the imports block (after `from formscout.startup import ensure_checkpoints`):
+```python
+from formscout import session as session_mod
+```
+- [ ] **Step 2: Refactor `process_video` to accumulate into a session**
+Replace the entire `process_video` function (lines ~51-105) with a version that takes and
+returns the session, appends an entry on success, and builds the "Session so far" table.
+Replace from `def process_video(` through its final `return ...` with:
+```python
+def process_video(video_path: str, test_name: str, side: str, model_key: str,
+                  layers: list[str], session_state):
+    """Analyse one clip and accumulate it into the screening session."""
+    if not video_path:
+        return (
+            session_state, _render_empty_state(), "Upload a video to begin analysis.",
+            "", "", None, "", _render_session_table(session_state),
+            gr.update(visible=False), gr.update(visible=False),
+        )
+    if session_state is None:
+        session_state = session_mod.new_session()
+    director = Director()
+    state = director.run(video_path, test_name=test_name, side=side, model_key=model_key)
+    score_html = _render_empty_state()
+    score_details = ""
+    if state.features:
+        result = score_test(state.features)
+        judge = state.judge
+        if judge and judge.score is not None:
+            score_html = _render_score_card(judge.score, judge.confidence, judge.needs_human)
+            score_details = _render_score_details_judge(judge, result, state.features)
+        elif judge and judge.needs_human:
+            score_html = _render_score_card(0, 0, True)
+            score_details = f"### Needs Clinician Review\n{judge.rationale}"
+        else:
+            score_html = _render_score_card(result.score, result.confidence, result.needs_human)
+            score_details = _render_score_details(result, state.features)
+        # Accumulate into the session (only when we have a real analysis)
+        if state.ingest and state.pose2d and state.judge:
+            draw_trails = "trails" in {lbl.lower().replace(" ", "_") for lbl in (layers or [])}
+            try:
+                session_mod.add_analysis(
+                    session_state, ingest=state.ingest, pose2d=state.pose2d,
+                    features=state.features, judge=state.judge,
+                    test_name=test_name, side=side, draw_trails=draw_trails,
+                )
+            except Exception as e:
+                state.warnings.append(f"session accumulation failed: {e}")
+    pipeline_md = _render_pipeline_status(state)
+    alerts = _render_alerts(state)
+    overlay_path = None
+    vel_summary = ""
+    layer_set = {lbl.lower().replace(" ", "_") for lbl in (layers or [])}
+    if layer_set and state.ingest and state.pose2d:
+        try:
+            from formscout.agents.visualizer import PoseVisualizer, build_velocity_summary
+            vis = PoseVisualizer()
+            with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as f:
+                out_path = f.name
+            overlay_path = vis.render_video(state.ingest, state.pose2d, layer_set, out_path)
+            if overlay_path:
+                vel_summary = build_velocity_summary(state.pose2d.keypoints, vis.last_velocities)
+        except Exception as e:
+            alerts = (alerts or "") + f"\n⚠️ Visualizer error: {e}"
+    has_entries = bool(session_state and session_state.entries)
+    return (
+        session_state, score_html, pipeline_md, score_details, alerts,
+        overlay_path, vel_summary, _render_session_table(session_state),
+        gr.update(visible=has_entries), gr.update(visible=has_entries),
+    )
+```
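`process_video` normalises the overlay layer labels twice with the same expression, once for `draw_trails` and once for `layer_set`. A small sketch of that normalisation as a single helper is shown below; `normalize_layers` is a hypothetical name, and the example labels in the usage note are assumptions about what the UI checkbox group might contain.

```python
def normalize_layers(labels):
    """Map UI layer labels to internal snake_case keys.

    None or an empty list yields an empty set, matching the
    `(layers or [])` guard used in process_video.
    """
    return {lbl.lower().replace(" ", "_") for lbl in (labels or [])}
```

With this helper, `draw_trails` would become `"trails" in normalize_layers(layers)`, and a label such as `"Velocity Arrows"` would map to `velocity_arrows`.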
+- [ ] **Step 3: Add the session-table renderer and finish handler**
+In `app.py`, add these two functions just before `def build_app()`:
+```python
+def _render_session_table(session_state) -> str:
+    """Render the accumulated 'Session so far' table as markdown."""
+    if not session_state or not session_state.entries:
+        return "*No clips analysed yet.*"
+    lines = ["| Test | Side | Score | Status |", "|---|---|---|---|"]
+    for e in session_state.entries:
+        test = e.test_name.replace("_", " ").title()
+        side = e.side if e.side in ("left", "right") else "—"
+        if e.needs_human:
+            score, status = "—", "⚠️ Clinician review"
+        else:
+            score, status = f"{e.score}/3", "✓ scored"
+        lines.append(f"| {test} | {side} | {score} | {status} |")
+    return "\n".join(lines)
+def _finish_session(session_state):
+    """Build the composite report + PDF for the whole session."""
+    if not session_state or not session_state.entries:
+        return ("⚠️ No clips analysed yet — analyse at least one clip first.",
+                None, None)
+    report, pdf_path = session_mod.finish_session(session_state)
+    if report is None:
+        return ("⚠️ Nothing to report.", None, None)
+    if report.composite is not None:
+        summary = [f"## Composite: {report.composite} / 21"]
+    else:
+        n = len(session_state.entries)
+        summary = [f"## Composite: Incomplete — {n}/7 tests scored",
+                   "*(One or more tests need clinician review or were unscored.)*"]
+    if report.asymmetries:
+        summary.append("\n### Asymmetries")
+        for a in report.asymmetries:
+            test = a["test"].replace("_", " ").title()
+            summary.append(f"- **{test}:** L={a['left_score']} R={a['right_score']} (Δ {a['delta']})")
+    flags = list(report.low_confidence_flags) + list(report.disagreement_flags)
+    if flags:
+        summary.append("\n### Flags")
+        for fl in flags:
+            summary.append(f"- {fl}")
+    md_path = os.path.join(session_state.session_dir, "analysis.md")
+    md_out = md_path if os.path.exists(md_path) else None
+    return "\n".join(summary), pdf_path, md_out
+```
+Also add `import os` to the top of `app.py` if not already present (it currently imports only
+`tempfile` and `gradio`). Add after `import tempfile`:
+```python
+import os
+```
+- [ ] **Step 4: Add the session state, buttons, and outputs to `build_app`**
+In `build_app`, add the following as the first statement inside the
+`with gr.Blocks(...) as app:` block:
+```python
+        session_state = gr.State(None)
+```
+Then, in the left input column, replace the single submit button block:
+```python
+                submit_btn = gr.Button(
+                    "🎯 Score Movement",
+                    variant="primary",
+                    size="lg",
+                )
+```
+with:
+```python
+                submit_btn = gr.Button(
+                    "🎯 Score Movement",
+                    variant="primary",
+                    size="lg",
+                )
+                with gr.Row():
+                    new_clip_btn = gr.Button("➕ Analyse new clip", visible=False)
+                    finish_btn = gr.Button("✅ Finish & generate PDF",
+                                           variant="primary", visible=False)
+```
+In the right results column, add a "Session" tab and a finish-output area. Inside `with gr.Tabs():`
+add a new tab after the "🎬 Overlay Video" tab:
+```python
+                    with gr.TabItem("🗂️ Session"):
+                        session_table = gr.Markdown("*No clips analysed yet.*")
+                        finish_summary = gr.Markdown("")
+                        pdf_file = gr.File(label="Screening Report (PDF)", visible=True)
+                        md_file = gr.File(label="Analysis Log (Markdown)", visible=True)
+```
+- [ ] **Step 5: Update event wiring**
+Replace the `_map_inputs` function and `submit_btn.click(...)` block at the bottom of `build_app`
+with:
+```python
+        def _map_inputs(video, test_display_name, side_display, pose_model_key, overlay_layers, sess):
+            """Map UI display values to internal values and accumulate into the session."""
+            test_map = {name: val for name, val in FMS_TESTS}
+            test_name = test_map.get(test_display_name, "deep_squat")
+            side = {"N/A": "na", "Left": "left", "Right": "right"}.get(side_display, "na")
+            return process_video(video, test_name, side, pose_model_key, overlay_layers, sess)
+        submit_btn.click(
+            fn=_map_inputs,
+            inputs=[video_input, test_dropdown, side_dropdown, pose_model_dropdown,
+                    overlay_layers, session_state],
+            outputs=[session_state, score_html, pipeline_md, score_details, alerts_md,
+                     overlay_video, velocity_md, session_table, new_clip_btn, finish_btn],
+        )
+        def _new_clip():
+            """Clear inputs for the next clip; keep the session intact."""
+            return None, _render_empty_state(), ""
+        new_clip_btn.click(
+            fn=_new_clip,
+            inputs=[],
+            outputs=[video_input, score_html, score_details],
+        )
+        finish_btn.click(
+            fn=_finish_session,
+            inputs=[session_state],
+            outputs=[finish_summary, pdf_file, md_file],
+        )
+```
+- [ ] **Step 6: Verify the full test suite still passes**
+Run: `pytest tests/ -q`
+Expected: all tests pass except the single pre-existing known failure documented in CLAUDE.md
+(`test_unimplemented_test_returns_low_confidence`). No new failures.
+- [ ] **Step 7: Manually verify the app**
+Run: `python3 app.py`
+Then in the browser:
+1. Upload a clip, pick a test, click **Score Movement** → score card appears; the **Session** tab
+   shows one row; the two new buttons appear.
+2. Click **➕ Analyse new clip** → the video input clears, the session row persists.
+3. Analyse a second test → a second row appears.
+4. Click **✅ Finish & generate PDF** → the Session tab shows the composite summary and a
+   downloadable PDF (open it: disclaimer top + bottom, per-test blocks with key-frame images,
+   composite or "Incomplete"). The Markdown log is also downloadable.
+Expected: all four steps work; PDF opens and contains the disclaimer, composite, and per-test sections.
+- [ ] **Step 8: Commit**
+```bash
+git add app.py
+git commit -m "feat: accumulate FMS clips into a session with composite report + PDF export"
+```
+---
+## Task 8: Update docs
+**Files:**
+- Modify: `CLAUDE.md` (Build phases / status)
+- Modify: `MODEL_BUDGET.md` (no param change — note PDF agent adds 0 params, for completeness)
+- [ ] **Step 1: Update the Phase 4 line in CLAUDE.md**
+In `CLAUDE.md`, in the "Build phases" section, update the Phase 4 line from:
+```
+4. **Phase 4 — Polish + ship:** Custom Svelte UI components, PDF export, agent trace to Hub, blog post. (Overlay video already done via `PoseVisualizer`.)
+```
+to:
+```
+4. **Phase 4 — Polish + ship:** Custom Svelte UI components, agent trace to Hub, blog post. (Overlay video done via `PoseVisualizer`; full 7-test session + PDF export done via `formscout/session.py` + `PdfReportAgent`.)
+```
+- [ ] **Step 2: Note the PDF agent in the architecture section**
+In `CLAUDE.md`, a short note on `PdfReportAgent` may be added under "### Rubric scorers" or near the
+ReportAgent description. This is optional context and no change is required; skip this step if there is no natural home.
+- [ ] **Step 3: Commit**
+```bash
+git add CLAUDE.md
+git commit -m "docs: mark full FMS session + PDF export complete in build phases"
+```
+---
+## Self-Review Notes (already applied)
+- **Spec coverage:** session accumulation (Task 5), two-button UX (Task 7), on-disk MD/JSON/keyframes (Task 5), key-frame from `features.timing` (Tasks 3–5), ReportLab PDF top/bottom disclaimer + composite + per-test + asymmetry + flags (Task 6), `SessionEntry` type (Task 2), `ReportAgent` reuse (Task 5 `finish_session`), composite-null-on-needs-human (Task 5 test), error tolerance / never-raise (Tasks 4–6). All covered.
+- **Type consistency:** `SessionEntry` field names are identical across Tasks 2, 5, 6, 7. `finish_session` returns `(ReportResult | None, str | None)` and is consumed that way in Task 7. `render_frame(ingest, pose2d, frame_idx, layers, caption, out_png)` signature matches its callers.
+- **No placeholders:** every code step shows complete code; every run step states the exact command + expected outcome.

docs/superpowers/specs/2026-06-09-pose-model-selector-design.md CHANGED

@@ -1,171 +1,171 @@
-# Pose Model Selector — Design Spec
-**Date:** 2026-06-09
-**Status:** Approved
-## Goal
-Expose all available pose estimation models as a selectable dropdown in the Gradio UI, replacing the hard-coded YOLO26l default. Supported families: MediaPipe (Qualcomm HF/ONNX), YOLO26 n→x (local), Sapiens2 0.4B→5B (HF/transformers).
----
-## Architecture
-### Unified model registry (`config.py`)
-Replace `YOLO_POSE_MODELS` with a single `POSE_MODELS` dict. Each entry:
-```python
-{
-    "backend": "yolo" | "mediapipe" | "sapiens2",
-    "path": str,        # yolo only — absolute path to local .pt
-    "hf_id": str,       # mediapipe + sapiens2 — HuggingFace repo id
-    "params_m": float,  # millions of parameters
-}
-```
-Ordered as displayed in the UI:
-| Label | backend | source |
-|---|---|---|
-| `MediaPipe-Pose ⬇ ~16 MB, CPU-friendly` | mediapipe | `qualcomm/MediaPipe-Pose-Estimation` |
-| `YOLO26n — nano (0.7M, fastest)` ★ default | yolo | local checkpoint |
-| `YOLO26s — small (3.5M)` | yolo | local checkpoint |
-| `YOLO26m — medium (9M)` | yolo | local checkpoint |
-| `YOLO26l — large (25.9M)` | yolo | local checkpoint |
-| `YOLO26x — extra-large (57.6M)` | yolo | local checkpoint |
-| `Sapiens2-0.4B ⬇ ~1.6 GB` | sapiens2 | `facebook/sapiens2-pose-0.4b` |
-| `Sapiens2-0.8B ⬇ ~3.2 GB` | sapiens2 | `facebook/sapiens2-pose-0.8b` |
-| `Sapiens2-1B ⬇ ~4 GB` | sapiens2 | `facebook/sapiens2-pose-1b` |
-| `Sapiens2-5B ⬇ ~20 GB, large GPU` | sapiens2 | `facebook/sapiens2-pose-5b` |
-```python
-DEFAULT_POSE_MODEL = "YOLO26n — nano (0.7M, fastest)"
-```
-Keep `YOLO_POSE_MODEL` and `YOLO_POSE_MODEL_HQ` as string aliases for backward compat with any direct references outside the agent.
----
-### Pose2DAgent (`formscout/agents/pose2d.py`)
-Three private sub-runners, all returning `list[dict[int, dict]]` (COCO 17 keypoints per frame, same format as today):
-#### `_run_yolo(frames, path) -> list[dict]`
-Existing logic, lifted into a named function. Model cached in `_model_cache[path]`.
-#### `_run_mediapipe(frames, hf_id) -> list[dict]`
-- Download repo snapshot via `huggingface_hub.snapshot_download(hf_id)`
-- Locate the pose landmark `.onnx` file in the snapshot
-- Load with `onnxruntime.InferenceSession`
-- Preprocess each frame: resize to 256×256, normalize
-- Run inference → 33 BlazePose landmarks
-- Map BlazePose 33 → COCO 17 via fixed index table:
-  ```
-  COCO  0=nose        → BlazePose 0
-  COCO  1=left_eye    → BlazePose 2
-  COCO  2=right_eye   → BlazePose 5
-  COCO  3=left_ear    → BlazePose 7
-  COCO  4=right_ear   → BlazePose 8
-  COCO  5=left_shld   → BlazePose 11
-  COCO  6=right_shld  → BlazePose 12
-  COCO  7=left_elbow  → BlazePose 13
-  COCO  8=right_elbow → BlazePose 14
-  COCO  9=left_wrist  → BlazePose 15
-  COCO 10=right_wrist → BlazePose 16
-  COCO 11=left_hip    → BlazePose 23
-  COCO 12=right_hip   → BlazePose 24
-  COCO 13=left_knee   → BlazePose 25
-  COCO 14=right_knee  → BlazePose 26
-  COCO 15=left_ankle  → BlazePose 27
-  COCO 16=right_ankle → BlazePose 28
-  ```
-- Session cached in `_model_cache[hf_id]`
-#### `_run_sapiens2(frames, hf_id) -> list[dict]`
-- Load via `transformers.pipeline("pose-estimation", model=hf_id)`
-- Sapiens2 outputs 308 whole-body keypoints; map first 17 (indices 0–16) to COCO 17 — Sapiens2 preserves COCO ordering for the body subset
-- Pipeline cached in `_model_cache[hf_id]`
-#### `Pose2DAgent.run(ingest, model_key)`
-- `model_key: str` replaces `model_path: str` (old param)
-- Looks up `config.POSE_MODELS[model_key]` (falls back to `DEFAULT_POSE_MODEL` if key missing)
-- Dispatches to the appropriate sub-runner
-- Returns `Pose2DResult` — identical contract as today
----
-### UI (`app.py`)
-Add `gr.Dropdown` for pose model in the input column, below the test/side row:
-```python
-pose_model_dropdown = gr.Dropdown(
-    choices=list(config.POSE_MODELS.keys()),
-    value=config.DEFAULT_POSE_MODEL,
-    label="Pose Model",
-)
-```
-Update `_map_inputs` to accept and forward `pose_model_key`:
-```python
-def _map_inputs(video, test_display_name, side_display, pose_model_key):
-    ...
-    return process_video(video, test_name, side, pose_model_key)
-```
-Update `submit_btn.click` inputs to include `pose_model_dropdown`.
-`process_video(video_path, test_name, side, pose_model_key)` passes `pose_model_key` through to `director.run()`, which passes it to `Pose2DAgent.run()`. Remove the old `YOLO_POSE_MODELS.get()` lookup from `process_video`.
----
-## Data flow
-```
-UI dropdown (pose_model_key: str)
-  → process_video()
-  → Director.run(pose_model_key=...)
-  → Pose2DAgent.run(ingest, model_key=pose_model_key)
-  → config.POSE_MODELS[model_key] → {backend, path|hf_id}
-  → _run_yolo / _run_mediapipe / _run_sapiens2
-  → list[dict[int, {x, y, conf}]]  (COCO 17, same contract)
-  → Pose2DResult
-```
----
-## Error handling
-- Unknown `model_key`: log warning, fall back to `DEFAULT_POSE_MODEL`
-- ONNX file not found in MediaPipe snapshot: `Pose2DResult(confidence=0.0, notes="mediapipe onnx not found")`
-- Sapiens2 / MediaPipe download failure: `Pose2DResult(confidence=0.0, notes=str(e))`
-- All failures are non-fatal; pipeline continues with 0-confidence result and surfaces alert in UI
----
-## Dependencies to add (`requirements.txt`)
-- `onnxruntime` — MediaPipe ONNX inference
-- `huggingface_hub` — snapshot download for MediaPipe (already likely present via transformers)
-Sapiens2 uses `transformers`, already a dependency.
----
-## Testing
-Each new backend gets a pytest in `tests/test_pose2d.py` that:
-- Mocks the model load (no actual HF download in CI)
-- Passes a 3-frame synthetic IngestResult
-- Asserts `Pose2DResult.keypoints` has 3 entries, each a dict with at most 17 int keys
-- Asserts `confidence` is a float in [0, 1]
----
-## Out of scope
-- Sapiens2 / MediaPipe accuracy benchmarking
-- Automatic backend selection based on hardware
-- Downloading Sapiens2/MediaPipe checkpoints to local `checkpoints/` directory

+# Pose Model Selector — Design Spec
+**Date:** 2026-06-09
+**Status:** Approved
+## Goal
+Expose all available pose estimation models as a selectable dropdown in the Gradio UI, replacing the hard-coded YOLO26l default. Supported families: MediaPipe (Qualcomm HF/ONNX), YOLO26 n→x (local), Sapiens2 0.4B→5B (HF/transformers).
+---
+## Architecture
+### Unified model registry (`config.py`)
+Replace `YOLO_POSE_MODELS` with a single `POSE_MODELS` dict. Each entry:
+```python
+{
+    "backend": "yolo" | "mediapipe" | "sapiens2",
+    "path": str,        # yolo only — absolute path to local .pt
+    "hf_id": str,       # mediapipe + sapiens2 — HuggingFace repo id
+    "params_m": float,  # millions of parameters
+}
+```
+Ordered as displayed in the UI:
+| Label | backend | source |
+|---|---|---|
+| `MediaPipe-Pose ⬇ ~16 MB, CPU-friendly` | mediapipe | `qualcomm/MediaPipe-Pose-Estimation` |
+| `YOLO26n — nano (0.7M, fastest)` ★ default | yolo | local checkpoint |
+| `YOLO26s — small (3.5M)` | yolo | local checkpoint |
+| `YOLO26m — medium (9M)` | yolo | local checkpoint |
+| `YOLO26l — large (25.9M)` | yolo | local checkpoint |
+| `YOLO26x — extra-large (57.6M)` | yolo | local checkpoint |
+| `Sapiens2-0.4B ⬇ ~1.6 GB` | sapiens2 | `facebook/sapiens2-pose-0.4b` |
+| `Sapiens2-0.8B ⬇ ~3.2 GB` | sapiens2 | `facebook/sapiens2-pose-0.8b` |
+| `Sapiens2-1B ⬇ ~4 GB` | sapiens2 | `facebook/sapiens2-pose-1b` |
+| `Sapiens2-5B ⬇ ~20 GB, large GPU` | sapiens2 | `facebook/sapiens2-pose-5b` |
+```python
+DEFAULT_POSE_MODEL = "YOLO26n — nano (0.7M, fastest)"
+```
+Keep `YOLO_POSE_MODEL` and `YOLO_POSE_MODEL_HQ` as string aliases for backward compat with any direct references outside the agent.
+---
+### Pose2DAgent (`formscout/agents/pose2d.py`)
+Three private sub-runners, all returning `list[dict[int, dict]]` (COCO 17 keypoints per frame, same format as today):
+#### `_run_yolo(frames, path) -> list[dict]`
+Existing logic, lifted into a named function. Model cached in `_model_cache[path]`.
+#### `_run_mediapipe(frames, hf_id) -> list[dict]`
+- Download repo snapshot via `huggingface_hub.snapshot_download(hf_id)`
+- Locate the pose landmark `.onnx` file in the snapshot
+- Load with `onnxruntime.InferenceSession`
+- Preprocess each frame: resize to 256×256, normalize
+- Run inference → 33 BlazePose landmarks
+- Map BlazePose 33 → COCO 17 via fixed index table:
+  ```
+  COCO  0=nose        → BlazePose 0
+  COCO  1=left_eye    → BlazePose 2
+  COCO  2=right_eye   → BlazePose 5
+  COCO  3=left_ear    → BlazePose 7
+  COCO  4=right_ear   → BlazePose 8
+  COCO  5=left_shld   → BlazePose 11
+  COCO  6=right_shld  → BlazePose 12
+  COCO  7=left_elbow  → BlazePose 13
+  COCO  8=right_elbow → BlazePose 14
+  COCO  9=left_wrist  → BlazePose 15
+  COCO 10=right_wrist → BlazePose 16
+  COCO 11=left_hip    → BlazePose 23
+  COCO 12=right_hip   → BlazePose 24
+  COCO 13=left_knee   → BlazePose 25
+  COCO 14=right_knee  → BlazePose 26
+  COCO 15=left_ankle  → BlazePose 27
+  COCO 16=right_ankle → BlazePose 28
+  ```
+- Session cached in `_model_cache[hf_id]`
+#### `_run_sapiens2(frames, hf_id) -> list[dict]`
+- Load via `transformers.pipeline("pose-estimation", model=hf_id)`
+- Sapiens2 outputs 308 whole-body keypoints; map first 17 (indices 0–16) to COCO 17 — Sapiens2 preserves COCO ordering for the body subset
+- Pipeline cached in `_model_cache[hf_id]`
+#### `Pose2DAgent.run(ingest, model_key)`
+- `model_key: str` replaces `model_path: str` (old param)
+- Looks up `config.POSE_MODELS[model_key]` (falls back to `DEFAULT_POSE_MODEL` if key missing)
+- Dispatches to the appropriate sub-runner
+- Returns `Pose2DResult` — same contract as today
+---
+### UI (`app.py`)
+Add `gr.Dropdown` for pose model in the input column, below the test/side row:
+```python
+pose_model_dropdown = gr.Dropdown(
+    choices=list(config.POSE_MODELS.keys()),
+    value=config.DEFAULT_POSE_MODEL,
+    label="Pose Model",
+)
+```
+Update `_map_inputs` to accept and forward `pose_model_key`:
+```python
+def _map_inputs(video, test_display_name, side_display, pose_model_key):
+    ...
+    return process_video(video, test_name, side, pose_model_key)
+```
+Update `submit_btn.click` inputs to include `pose_model_dropdown`.
+`process_video(video_path, test_name, side, pose_model_key)` passes `pose_model_key` through to `director.run()`, which passes it to `Pose2DAgent.run()`. Remove the old `YOLO_POSE_MODELS.get()` lookup from `process_video`.
+---
+## Data flow
+```
+UI dropdown (pose_model_key: str)
+  → process_video()
+  → Director.run(pose_model_key=...)
+  → Pose2DAgent.run(ingest, model_key=pose_model_key)
+  → config.POSE_MODELS[model_key] → {backend, path|hf_id}
+  → _run_yolo / _run_mediapipe / _run_sapiens2
+  → list[dict[int, {x, y, conf}]]  (COCO 17, same contract)
+  → Pose2DResult
+```
+---
+## Error handling
+- Unknown `model_key`: log warning, fall back to `DEFAULT_POSE_MODEL`
+- ONNX file not found in MediaPipe snapshot: `Pose2DResult(confidence=0.0, notes="mediapipe onnx not found")`
+- Sapiens2 / MediaPipe download failure: `Pose2DResult(confidence=0.0, notes=str(e))`
+- All failures are non-fatal; pipeline continues with 0-confidence result and surfaces alert in UI
+---
+## Dependencies to add (`requirements.txt`)
+- `onnxruntime` — MediaPipe ONNX inference
+- `huggingface_hub` — snapshot download for MediaPipe (likely already present via transformers)
+Sapiens2 uses `transformers`, already a dependency.
+---
+## Testing
+Each new backend gets a pytest in `tests/test_pose2d.py` that:
+- Mocks the model load (no actual HF download in CI)
+- Passes a 3-frame synthetic IngestResult
+- Asserts `Pose2DResult.keypoints` has 3 entries, each a dict with at most 17 int keys
+- Asserts `confidence` is a float in [0, 1]
+---
+## Out of scope
+- Sapiens2 / MediaPipe accuracy benchmarking
+- Automatic backend selection based on hardware
+- Downloading Sapiens2/MediaPipe checkpoints to local `checkpoints/` directory
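
Reviewer sketch (not part of the diff): the registry fallback and the BlazePose 33 → COCO 17 remap described in the spec above could look roughly like the following. The index table is copied from the spec. The function names `resolve_model` and `blazepose_to_coco`, and the `(x, y, conf)` tuple layout of the landmarks, are assumptions made for illustration and may not match the actual ONNX output or the code in `pose2d.py`.

```python
import logging

logger = logging.getLogger(__name__)

# COCO-17 index -> BlazePose-33 index, as listed in the spec.
COCO_TO_BLAZEPOSE = {
    0: 0, 1: 2, 2: 5, 3: 7, 4: 8,
    5: 11, 6: 12, 7: 13, 8: 14, 9: 15, 10: 16,
    11: 23, 12: 24, 13: 25, 14: 26, 15: 27, 16: 28,
}


def resolve_model(model_key, registry, default_key):
    """Return (key, entry); unknown keys log a warning and use the default."""
    if model_key not in registry:
        logger.warning("unknown pose model %r, using %r", model_key, default_key)
        model_key = default_key
    return model_key, registry[model_key]


def blazepose_to_coco(landmarks):
    """Remap 33 BlazePose landmarks to the COCO-17 keypoint dict.

    Assumes each landmark is an (x, y, conf) tuple (hypothetical layout).
    """
    if len(landmarks) != 33:
        raise ValueError(f"expected 33 landmarks, got {len(landmarks)}")
    keypoints = {}
    for coco_idx, blaze_idx in COCO_TO_BLAZEPOSE.items():
        x, y, conf = landmarks[blaze_idx]
        keypoints[coco_idx] = {"x": float(x), "y": float(y), "conf": float(conf)}
    return keypoints
```

Keeping the table as a plain dict means each sub-runner can return the same `dict[int, {x, y, conf}]` shape per frame, which is what the spec requires of all three backends.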

docs/superpowers/specs/2026-06-09-pose-visualizer-design.md CHANGED

@@ -1,197 +1,197 @@
-# Pose Overlay Visualizer — Design Spec
-**Date:** 2026-06-09
-**Status:** Approved
-## Goal
-Add an annotated overlay video output to the FormScout UI showing skeleton, motion trails, and velocity arrows on top of the original footage, alongside a per-joint velocity summary table. Overlay layers are user-selectable via checkboxes. Adapted from the Laban Movement Analysis project.
----
-## Architecture
-Three files change or are created. No changes to `pipeline.py`, `types.py`, or any existing agent.
-```
-formscout/agents/visualizer.py    ← new
-tests/test_visualizer.py          ← new
-app.py                            ← overlay_layers checkbox, new tab, wiring
-```
-The visualizer runs **after** `director.run()` returns in `process_video()` — it is a pure post-processing step, never on the critical scoring path.
----
-## Module: `formscout/agents/visualizer.py`
-### `compute_joint_velocity(keypoints_per_frame, fps) → dict[int, list[float]]`
-- Input: `list[dict[int, {x, y, conf}]]` (COCO-17 pixel coords per frame), `fps: float`
-- Output: `dict[int, list[float]]` — per-joint per-frame speed in **px/s**
-- Method: for each joint index, run a `SimpleKalmanFilter` (1D per axis, constant-velocity model, same structure as Laban's engine) over the (x, y) series. Speed = `sqrt(vx² + vy²)` from the filter's velocity state.
-- Missing keypoints (conf < 0.3 or absent) → speed = 0.0 for that frame, filter state held.
-### `SimpleKalmanFilter`
-Minimal 4-state Kalman (x, y, vx, vy), identical in structure to the Laban `SimpleKalmanFilter`:
-- Transition: constant-velocity model
-- Measurement: position only (x, y)
-- One instance per joint per video run
-### `PoseVisualizer`
-#### Constants
-```python
-COCO_SKELETON = [
-    (0,1),(0,2),(1,3),(2,4),          # face
-    (5,6),(5,7),(7,9),(6,8),(8,10),   # arms
-    (5,11),(6,12),(11,12),             # torso
-    (11,13),(13,15),(12,14),(14,16),  # legs
-]
-TRAIL_LENGTH = 10       # frames of trail history
-MAX_ARROW_PX = 40       # arrow scaled so peak velocity → 40px length
-CONF_THRESHOLD = 0.3    # min confidence to draw a keypoint
-```
-#### Private methods
-**`_draw_skeleton(frame, kps)`**
-- Draw each COCO bone as a line if both endpoints have conf > CONF_THRESHOLD
-- Joint dots: color green→red by confidence using HSV (same as Laban `_confidence_to_color`)
-- Bone color: white
-**`_draw_trails(frame, trail_history, frame_idx)`**
-- `trail_history: dict[int, deque(maxlen=TRAIL_LENGTH)]` keyed by joint index
-- Each deque holds `(x, y)` pixel positions from previous frames
-- Draw fading line segments: alpha = segment_position / TRAIL_LENGTH, color white
-**`_draw_velocity_arrows(frame, kps, velocities, frame_idx)`**
-- `velocities: dict[int, list[float]]` — speeds per joint per frame
-- Direction vector from consecutive keypoint positions (x[t] - x[t-1], y[t] - y[t-1])
-- Arrow length = `speed / peak_speed * MAX_ARROW_PX` (clamped)
-- Drawn only for joints with conf > CONF_THRESHOLD and speed > 0
-- Color: green=slow, orange=medium, red=fast (same thresholds as Laban intensity)
-#### Public method
-**`render_video(ingest, pose2d, layers: set[str], output_path: str) → str | None`**
-- `layers`: subset of `{"skeleton", "trails", "velocity_arrows"}`
-- If `layers` is empty → return `None` immediately
-- Pre-computes `compute_joint_velocity(pose2d.keypoints, ingest.fps)`
-- Iterates frames, updates `trail_history`, calls selected `_draw_*` methods
-- Writes output via `cv2.VideoWriter` (codec: `mp4v`, same fps as ingest)
-- Returns output path on success; `None` on any exception (logs warning)
-#### Velocity summary
-**`build_velocity_summary(keypoints_per_frame, velocities) → str`**
-- For each joint with conf > 0.3 in >50% of frames:
-  - Compute avg and peak speed (px/s)
-- Return markdown table sorted by peak speed descending:
-  ```
-  | Joint         | Avg (px/s) | Peak (px/s) |
-  |---------------|-----------|-------------|
-  | left_knee     | 42.3      | 118.7       |
-  ```
-- Returns empty string if no valid joints
----
-## UI changes: `app.py`
-### Input column — overlay layer checkboxes
-Below `pose_model_dropdown`, add:
-```python
-overlay_layers = gr.CheckboxGroup(
-    choices=["Skeleton", "Trails", "Velocity arrows"],
-    value=["Skeleton", "Trails"],
-    label="Overlay Layers",
-)
-```
-### Results panel — new tab
-Inside the existing `gr.Tabs()` block, add a fourth tab:
-```python
-with gr.TabItem("🎬 Overlay Video"):
-    overlay_video = gr.Video(label="Annotated Movement")
-    velocity_md = gr.Markdown("")
-```
-### `process_video()` signature
-```python
-def process_video(video_path, test_name, side, model_key, layers: list[str]):
-```
-After `director.run()`:
-```python
-from formscout.agents.visualizer import PoseVisualizer, build_velocity_summary
-layer_set = {l.lower().replace(" ", "_") for l in layers}
-# map UI labels to internal names:
-# "Skeleton" → "skeleton", "Trails" → "trails", "Velocity arrows" → "velocity_arrows"
-overlay_path = None
-vel_summary = ""
-if layer_set and state.ingest and state.pose2d:
-    try:
-        vis = PoseVisualizer()
-        with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as f:
-            out_path = f.name
-        overlay_path = vis.render_video(state.ingest, state.pose2d, layer_set, out_path)
-        if overlay_path:
-            vel_summary = build_velocity_summary(state.pose2d.keypoints, vis.last_velocities)
-    except Exception as e:
-        alerts += f"\n⚠️ Visualizer error: {e}"
-return score_html, pipeline_md, score_details, alerts, overlay_path, vel_summary
-```
-`vis.last_velocities` is stored on the instance after `render_video()` to avoid recomputing.
-### Event wiring
-```python
-submit_btn.click(
-    fn=_map_inputs,
-    inputs=[video_input, test_dropdown, side_dropdown, pose_model_dropdown, overlay_layers],
-    outputs=[score_html, pipeline_md, score_details, alerts_md, overlay_video, velocity_md],
-)
-```
-`_map_inputs` gains `overlay_layers` as fifth parameter.
----
-## Error handling
-| Failure | Behaviour |
-|---|---|
-| All frames have no detections | `render_video()` returns `None`, tab empty, no crash |
-| `cv2.VideoWriter` fails | logs warning, returns `None` |
-| Any exception in visualizer | caught in `process_video()`, appended to alerts, `overlay_path = None` |
-| `layers` is empty | returns `None` immediately, no processing |
-The score is always returned regardless of visualizer outcome.
----
-## Testing: `tests/test_visualizer.py`
-- Synthetic `IngestResult`: 5 blank 480×640 BGR frames, fps=30
-- Synthetic `Pose2DResult`: 17 keypoints per frame at fixed positions with conf=0.9
-- `test_render_video_creates_file`: assert output `.mp4` exists and size > 0
-- `test_compute_joint_velocity_shape`: assert 17-key dict, each list length == 5
-- `test_empty_layers_returns_none`: assert `render_video(..., layers=set())` returns `None`
-- `test_no_detections_returns_none`: all-empty keypoints → `None`
-- `test_velocity_summary_markdown`: assert output contains `|` (table) and at least one joint name
----
-## Out of scope
-- Frame-by-frame metrics synced to video playback (Phase 4 / custom Svelte)
-- Multi-person tracking
-- Saving overlay video to Hugging Face Hub (tracing feature, Phase 4)

+# Pose Overlay Visualizer — Design Spec
+**Date:** 2026-06-09
+**Status:** Approved
+## Goal
+Add an annotated overlay video output to the FormScout UI showing skeleton, motion trails, and velocity arrows on top of the original footage, alongside a per-joint velocity summary table. Overlay layers are user-selectable via checkboxes. Adapted from the Laban Movement Analysis project.
+---
+## Architecture
+Three files change or are created. No changes to `pipeline.py`, `types.py`, or any existing agent.
+```
+formscout/agents/visualizer.py    ← new
+tests/test_visualizer.py          ← new
+app.py                            ← overlay_layers checkbox, new tab, wiring
+```
+The visualizer runs **after** `director.run()` returns in `process_video()` — it is a pure post-processing step, never on the critical scoring path.
+---
+## Module: `formscout/agents/visualizer.py`
+### `compute_joint_velocity(keypoints_per_frame, fps) → dict[int, list[float]]`
+- Input: `list[dict[int, {x, y, conf}]]` (COCO-17 pixel coords per frame), `fps: float`
+- Output: `dict[int, list[float]]` — per-joint per-frame speed in **px/s**
+- Method: for each joint index, run a `SimpleKalmanFilter` (1D per axis, constant-velocity model, same structure as Laban's engine) over the (x, y) series. Speed = `sqrt(vx² + vy²)` from the filter's velocity state.
+- Missing keypoints (conf < 0.3 or absent) → speed = 0.0 for that frame, filter state held.
+### `SimpleKalmanFilter`
+Minimal 4-state Kalman (x, y, vx, vy), identical in structure to the Laban `SimpleKalmanFilter`:
+- Transition: constant-velocity model
+- Measurement: position only (x, y)
+- One instance per joint per video run
+### `PoseVisualizer`
+#### Constants
+```python
+COCO_SKELETON = [
+    (0,1),(0,2),(1,3),(2,4),          # face
+    (5,6),(5,7),(7,9),(6,8),(8,10),   # arms
+    (5,11),(6,12),(11,12),             # torso
+    (11,13),(13,15),(12,14),(14,16),  # legs
+]
+TRAIL_LENGTH = 10       # frames of trail history
+MAX_ARROW_PX = 40       # arrow scaled so peak velocity → 40px length
+CONF_THRESHOLD = 0.3    # min confidence to draw a keypoint
+```
+#### Private methods
+**`_draw_skeleton(frame, kps)`**
+- Draw each COCO bone as a line if both endpoints have conf > CONF_THRESHOLD
+- Joint dots: color green→red by confidence using HSV (same as Laban `_confidence_to_color`)
+- Bone color: white
+**`_draw_trails(frame, trail_history, frame_idx)`**
+- `trail_history: dict[int, deque(maxlen=TRAIL_LENGTH)]` keyed by joint index
+- Each deque holds `(x, y)` pixel positions from previous frames
+- Draw fading line segments: alpha = segment_position / TRAIL_LENGTH, color white
+**`_draw_velocity_arrows(frame, kps, velocities, frame_idx)`**
+- `velocities: dict[int, list[float]]` — speeds per joint per frame
+- Direction vector from consecutive keypoint positions (x[t] - x[t-1], y[t] - y[t-1])
+- Arrow length = `speed / peak_speed * MAX_ARROW_PX` (clamped)
+- Drawn only for joints with conf > CONF_THRESHOLD and speed > 0
+- Color: green=slow, orange=medium, red=fast (same thresholds as Laban intensity)
+#### Public method
+**`render_video(ingest, pose2d, layers: set[str], output_path: str) → str | None`**
+- `layers`: subset of `{"skeleton", "trails", "velocity_arrows"}`
+- If `layers` is empty → return `None` immediately
+- Pre-computes `compute_joint_velocity(pose2d.keypoints, ingest.fps)`
+- Iterates frames, updates `trail_history`, calls selected `_draw_*` methods
+- Writes output via `cv2.VideoWriter` (codec: `mp4v`, same fps as ingest)
+- Returns output path on success; `None` on any exception (logs warning)
+#### Velocity summary
+**`build_velocity_summary(keypoints_per_frame, velocities) → str`**
+- For each joint with conf > 0.3 in >50% of frames:
+  - Compute avg and peak speed (px/s)
+- Return markdown table sorted by peak speed descending:
+  ```
+  | Joint         | Avg (px/s) | Peak (px/s) |
+  |---------------|-----------|-------------|
+  | left_knee     | 42.3      | 118.7       |
+  ```
+- Returns empty string if no valid joints
+---
+## UI changes: `app.py`
+### Input column — overlay layer checkboxes
+Below `pose_model_dropdown`, add:
+```python
+overlay_layers = gr.CheckboxGroup(
+    choices=["Skeleton", "Trails", "Velocity arrows"],
+    value=["Skeleton", "Trails"],
+    label="Overlay Layers",
+)
+```
+### Results panel — new tab
+Inside the existing `gr.Tabs()` block, add a fourth tab:
+```python
+with gr.TabItem("🎬 Overlay Video"):
+    overlay_video = gr.Video(label="Annotated Movement")
+    velocity_md = gr.Markdown("")
+```
+### `process_video()` signature
+```python
+def process_video(video_path, test_name, side, model_key, layers: list[str]):
+```
+After `director.run()`:
+```python
+from formscout.agents.visualizer import PoseVisualizer, build_velocity_summary
+layer_set = {l.lower().replace(" ", "_") for l in layers}
+# map UI labels to internal names:
+# "Skeleton" → "skeleton", "Trails" → "trails", "Velocity arrows" → "velocity_arrows"
+overlay_path = None
+vel_summary = ""
+if layer_set and state.ingest and state.pose2d:
+    try:
+        vis = PoseVisualizer()
+        with tempfile.NamedTemporaryFile(suffix=".mp4", delete=False) as f:
+            out_path = f.name
+        overlay_path = vis.render_video(state.ingest, state.pose2d, layer_set, out_path)
+        if overlay_path:
+            vel_summary = build_velocity_summary(state.pose2d.keypoints, vis.last_velocities)
+    except Exception as e:
+        alerts += f"\n⚠️ Visualizer error: {e}"
+return score_html, pipeline_md, score_details, alerts, overlay_path, vel_summary
+```
+`vis.last_velocities` is stored on the instance after `render_video()` to avoid recomputing.
+### Event wiring
+```python
+submit_btn.click(
+    fn=_map_inputs,
+    inputs=[video_input, test_dropdown, side_dropdown, pose_model_dropdown, overlay_layers],
+    outputs=[score_html, pipeline_md, score_details, alerts_md, overlay_video, velocity_md],
+)
+```
+`_map_inputs` gains `overlay_layers` as fifth parameter.
+---
+## Error handling
+| Failure | Behaviour |
+|---|---|
+| All frames have no detections | `render_video()` returns `None`, tab empty, no crash |
+| `cv2.VideoWriter` fails | logs warning, returns `None` |
+| Any exception in visualizer | caught in `process_video()`, appended to alerts, `overlay_path = None` |
+| `layers` is empty | returns `None` immediately, no processing |
+The score is always returned regardless of visualizer outcome.
+---
+## Testing: `tests/test_visualizer.py`
+- Synthetic `IngestResult`: 5 blank 480×640 BGR frames, fps=30
+- Synthetic `Pose2DResult`: 17 keypoints per frame at fixed positions with conf=0.9
+- `test_render_video_creates_file`: assert output `.mp4` exists and size > 0
+- `test_compute_joint_velocity_shape`: assert 17-key dict, each list length == 5
+- `test_empty_layers_returns_none`: assert `render_video(..., layers=set())` returns `None`
+- `test_no_detections_returns_none`: all-empty keypoints → `None`
+- `test_velocity_summary_markdown`: assert output contains `|` (table) and at least one joint name
+---
+## Out of scope
+- Frame-by-frame metrics synced to video playback (Phase 4 / custom Svelte)
+- Multi-person tracking
+- Saving overlay video to Hugging Face Hub (tracing feature, Phase 4)
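
Reviewer sketch (not part of the diff): one way `compute_joint_velocity` could be realised in pure Python. The spec describes the filter both as "1D per axis" and as a 4-state `(x, y, vx, vy)` filter. With uncoupled axes these have the same structure, so this sketch runs two independent 2-state constant-velocity filters per joint. The class name `AxisKalman`, the noise parameters `q` and `r`, and the initial covariance are assumptions chosen for illustration, not values taken from the Laban engine.

```python
import math


class AxisKalman:
    """2-state (position, velocity) constant-velocity Kalman filter for one axis."""

    def __init__(self, q=1.0, r=4.0):
        self.pos = None          # set from the first measurement
        self.vel = 0.0
        self.P = [[1.0, 0.0], [0.0, 1000.0]]  # assumed initial covariance
        self.q, self.r = q, r    # assumed process / measurement noise

    def step(self, z, dt):
        """Feed one position measurement; return the velocity estimate."""
        if self.pos is None:
            self.pos = z
            return 0.0
        # Predict: x' = F x, P' = F P F^T + Q with F = [[1, dt], [0, 1]].
        self.pos += self.vel * dt
        P = self.P
        a = P[0][0] + dt * P[1][0]
        b = P[0][1] + dt * P[1][1]
        p00 = a + b * dt + self.q
        p01 = b
        p10 = P[1][0] + P[1][1] * dt
        p11 = P[1][1] + self.q
        # Update with a position-only measurement (H = [1, 0]).
        innovation = z - self.pos
        s = p00 + self.r
        k0, k1 = p00 / s, p10 / s
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        self.P = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]
        return self.vel


def compute_joint_velocity(keypoints_per_frame, fps, conf_threshold=0.3, num_joints=17):
    """Per-joint, per-frame speed in px/s; low-confidence frames yield 0.0."""
    dt = 1.0 / fps
    filters = {j: (AxisKalman(), AxisKalman()) for j in range(num_joints)}
    speeds = {j: [] for j in range(num_joints)}
    for kps in keypoints_per_frame:
        for j in range(num_joints):
            kp = kps.get(j)
            if kp is None or kp["conf"] < conf_threshold:
                speeds[j].append(0.0)  # filter state is held, not advanced
                continue
            fx, fy = filters[j]
            vx = fx.step(kp["x"], dt)
            vy = fy.step(kp["y"], dt)
            speeds[j].append(math.hypot(vx, vy))
    return speeds
```

Note that holding the filter state across a gap means the next measurement is treated as if only one frame had elapsed. That matches the "filter state held" wording in the spec, but it can overestimate speed right after a dropout.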

docs/superpowers/specs/2026-06-13-full-fms-session-pdf-design.md CHANGED

@@ -1,154 +1,154 @@
-# Full FMS Session + PDF Report — Design
-**Date:** 2026-06-13
-**Status:** Approved (brainstorming) — pending implementation plan
-**Owner:** FormScout
-**Related:** `formscout/agents/report.py`, `formscout/agents/visualizer.py`, `formscout/agents/biomechanics.py`, `app.py`, `formscout/types.py`
-## Problem
-FormScout today scores **one** FMS test per upload. A real Functional Movement Screen is **all 7 tests** producing a single **composite 0–21** with asymmetry flags. `ReportAgent` and `ReportResult.composite` already support a multi-test report, but the UI never accumulates more than one test, and `ReportResult.pdf_path` is a hardcoded `None` stub.
-This feature turns the one-clip scorer into a **screening session**: each analyzed clip accumulates; a "Finish" action produces the composite report plus a downloadable, brand-consistent **PDF**. Each clip's worst-moment frame is captured as an annotated still embedded in both an on-disk log and the PDF.
-## Goals
-- Accumulate multiple analyzed clips into one session, then emit a composite 0–21 report.
-- Generate a clinician/client-facing **PDF** handout (ReportLab) with scores, rationale, asymmetries, key-frame images, and the safety disclaimer.
-- Capture and annotate the **worst-moment frame** per test (the governing/peak frame already computed by `BiomechanicsAgent`).
-- Persist each analysis incrementally to disk (`session.json`, `analysis.md`, key-frame PNGs) until "Finish" is clicked.
-## Non-goals (YAGNI)
-- No cross-restart session reload — the session lives in `gr.State` + a temp dir for the browser session.
-- No PDF styling beyond a clean branded layout (no HTML/CSS engine; ReportLab only).
-- No RAG / exemplar-clip citations (separate future spec).
-- No changes to the scoring pipeline, rubric functions, or Director flow.
-## UX
-The current one-clip-at-a-time flow is preserved. Two new buttons appear after an analysis completes:
-- **➕ Analyse new clip** — clears the video/test inputs for the next upload; **keeps** the session.
-- **✅ Finish & generate PDF** — runs the report + PDF over everything accumulated so far.
-After each analysis, a **"Session so far"** table updates: `test · side · score · status`. Finish renders an on-screen composite scorecard + asymmetry summary and exposes the PDF (and `analysis.md`) via `gr.File` for download.
-Guard: Finish with zero analyses → warning, no PDF.
-## Components
-### 1. Session state + on-disk store
-A per-session temp directory `<tmpdir>/formscout_sessions/<session_id>/`:
-- `session.json` — structured list of entries; **source of truth** for the PDF.
-- `analysis.md` — human-readable log, appended after each clip.
-- `keyframes/<test_name>_<side>.png` — annotated worst-frame stills.
-Session identity lives in a `gr.State`. Each entry carries:
-- `test_name`, `side`, `score` (judge score, else rubric), `needs_human`
-- `rationale`, `compensation_tags`, `corrective_hint`
-- key measurements (selected `angles` / `alignments`)
-- `confidence`, `view` (`"2d"`/`"3d"`)
-- `keyframe_path`
-- the `movement` / `features` / `rubric_score` / `judge` objects that `ReportAgent.run()` consumes
-Persistence lasts until Finish; files are kept afterward for download. Cross-restart cleanup is best-effort and out of scope.
-### 2. Key-frame capture
-New method on `PoseVisualizer`:
-```python
-def render_frame(self, ingest, pose2d, frame_idx: int,
-                 layers: set[str], caption: str, out_png: str) -> str | None
-```
-- `frame_idx` comes from `features.timing`, which already stores the governing frame per test:
-  `deep_squat → deepest_frame`, `hurdle_step → peak_step_frame`,
-  `inline_lunge → deepest_lunge_frame`, `shoulder_mobility → measure_frame`,
-  `active_slr → peak_raise_frame`.
-- `trunk_stability_pushup` and `rotary_stability` currently store only counts in `timing`. Add the worst-sag-frame and peak-extension-frame index to their `timing` dicts (one-line change in each `BiomechanicsAgent` method).
-- Reuses `_draw_skeleton` (+ optional `_draw_trails`) on the single frame, overlays a caption naming the worst compensation, writes a PNG.
-- Returns `None` on any failure — never raises, never blocks the entry.
-The "worst compensation" caption is derived from `judge.compensation_tags` (preferred) or the failed `alignments` (fallback).
-### 3. PDF generator
-New module `formscout/agents/pdf_report.py`:
-```python
-class PdfReportAgent:
-    def run(self, report_result: ReportResult,
-            entries: list[SessionEntry], session_dir: str) -> str | None
-```
-Uses **ReportLab** (pure-Python, no system deps — safe on HF Spaces/ZeroGPU). Layout:
-- Safety disclaimer banner at **top and bottom** (mirrors the UI invariant).
-- Title/brand header + date.
-- Composite **0–21** badge, or "Incomplete — N/7 tests scored" when `composite is None`.
-- Per-test block: score, rationale, key measurements, compensation tags, corrective hint, the annotated key-frame image, asymmetry delta (bilateral).
-- Flags section: low-confidence, rubric↔judge disagreement, needs-human.
-- Populates `ReportResult.pdf_path`.
-Returns the PDF path, or `None` on failure (UI surfaces the error and keeps the session for retry). Image embedding tolerates a missing/`None` `keyframe_path` with a placeholder line.
-### 4. ReportAgent reuse
-At Finish, build the entry list and call the existing `ReportAgent.run()` for composite + asymmetries + flags. The bilateral lower-score + asymmetry-delta logic and the null-composite rule already exist and are not rewritten. A small adapter converts `SessionEntry` objects to the dict schema `ReportAgent.run()` expects (or `ReportAgent` gains overload tolerance — implementer's choice, keep it minimal).
-### 5. Types
-Add a `SessionEntry` frozen dataclass to `formscout/types.py` (consistent with the "every agent I/O is a typed dataclass" standard), including `keyframe_path: str | None`. Populate the existing `ReportResult.pdf_path` (and optionally `overlay_video_path`). No other type changes.
-### 6. UI (`app.py`)
-- Add a `gr.State` holding the session (id + entries).
-- After each analysis: render the scorecard as today, append the entry, write `session.json`/`analysis.md`/keyframe PNG, refresh the "Session so far" table, and reveal the two buttons.
-- **Analyse new clip**: reset the video/test/side inputs; keep session state.
-- **Finish & generate PDF**: `ReportAgent.run` → `PdfReportAgent.run` → display composite + asymmetry summary + `gr.File` downloads (PDF + `analysis.md`).
-- Guard: Finish with zero analyses → warning.
-## Data flow
-```
-upload → Director.run → score
-       → build SessionEntry (+ render_frame keyframe png)
-       → append to gr.State + write session.json / analysis.md / keyframe png
-       → refresh "Session so far" table
-Finish → ReportAgent.run(entries) → composite / asymmetries / flags
-       → PdfReportAgent.run(...) → pdf_path
-       → on-screen composite + gr.File (PDF, analysis.md)
-```
-## Error handling
-- Key-frame render fails → entry still saved; PDF shows an image placeholder.
-- PDF generation fails → surface the error, keep the session intact for retry.
-- `needs_human` entry → no numeric score; PDF shows "Clinician review required"; composite null.
-- Composite is `None` whenever any test is unscored or needs human review (existing rule — never show a partial 0–21 as complete).
-- All disk writes tolerate failure with a logged warning; a write failure degrades the artifact but never blocks scoring.
-## Testing (must run without model downloads)
-- `tests/test_pdf_report.py` — synthetic `ReportResult` + entries → PDF file created, non-zero size, contains the disclaimer text and composite line.
-- `tests/test_session.py` — accumulation; composite math; bilateral lower-score + asymmetry delta; null composite when one entry `needs_human`.
-- `tests/test_keyframe.py` — `render_frame` returns a real PNG path (file exists) for a synthetic frame; returns `None` gracefully on bad input.
-## Invariants preserved
-- Pipeline stays headless — no Gradio imports in agent files (`PdfReportAgent` is a pure agent; key-frame capture stays in `visualizer.py`, the existing UI-layer component).
-- Safety disclaimer present top and bottom of the PDF, mirroring the UI.
-- Pain / clearing / needs-human is never auto-scored; composite null when any test unscored.
-- New code follows the engineering standards: one public entrypoint per agent, typed dataclass I/O, `confidence`/`notes` where applicable, module docstring stating purpose/inputs/outputs/failure/params/license/gated.
-## Open implementation choices (left to the plan)
-- Exact `SessionEntry` → `ReportAgent` dict adapter shape.
-- Which measurements to surface per test in the PDF (a curated subset, not the full `angles` dump).
-- PDF assertion strategy in tests (text extraction vs. size/smoke).

+# Full FMS Session + PDF Report — Design
+**Date:** 2026-06-13
+**Status:** Approved (brainstorming) — pending implementation plan
+**Owner:** FormScout
+**Related:** `formscout/agents/report.py`, `formscout/agents/visualizer.py`, `formscout/agents/biomechanics.py`, `app.py`, `formscout/types.py`
+## Problem
+FormScout today scores **one** FMS test per upload. A real Functional Movement Screen is **all 7 tests** producing a single **composite 0–21** with asymmetry flags. `ReportAgent` and `ReportResult.composite` already support a multi-test report, but the UI never accumulates more than one test, and `ReportResult.pdf_path` is a hardcoded `None` stub.
+This feature turns the one-clip scorer into a **screening session**: each analyzed clip accumulates; a "Finish" action produces the composite report plus a downloadable, brand-consistent **PDF**. Each clip's worst-moment frame is captured as an annotated still embedded in both an on-disk log and the PDF.
+## Goals
+- Accumulate multiple analyzed clips into one session, then emit a composite 0–21 report.
+- Generate a clinician/client-facing **PDF** handout (ReportLab) with scores, rationale, asymmetries, key-frame images, and the safety disclaimer.
+- Capture and annotate the **worst-moment frame** per test (the governing/peak frame already computed by `BiomechanicsAgent`).
+- Persist each analysis incrementally to disk (`session.json`, `analysis.md`, key-frame PNGs) until "Finish" is clicked.
+## Non-goals (YAGNI)
+- No cross-restart session reload — the session lives in `gr.State` + a temp dir for the browser session.
+- No PDF styling beyond a clean branded layout (no HTML/CSS engine; ReportLab only).
+- No RAG / exemplar-clip citations (separate future spec).
+- No changes to the scoring pipeline, rubric functions, or Director flow.
+## UX
+The current one-clip-at-a-time flow is preserved. Two new buttons appear after an analysis completes:
+- **➕ Analyse new clip** — clears the video/test inputs for the next upload; **keeps** the session.
+- **✅ Finish & generate PDF** — runs the report + PDF over everything accumulated so far.
+After each analysis, a **"Session so far"** table updates: `test · side · score · status`. Finish renders an on-screen composite scorecard + asymmetry summary and exposes the PDF (and `analysis.md`) via `gr.File` for download.
+Guard: Finish with zero analyses → warning, no PDF.
+## Components
+### 1. Session state + on-disk store
+A per-session temp directory `<tmpdir>/formscout_sessions/<session_id>/`:
+- `session.json` — structured list of entries; **source of truth** for the PDF.
+- `analysis.md` — human-readable log, appended after each clip.
+- `keyframes/<test_name>_<side>.png` — annotated worst-frame stills.
+Session identity lives in a `gr.State`. Each entry carries:
+- `test_name`, `side`, `score` (judge score, else rubric), `needs_human`
+- `rationale`, `compensation_tags`, `corrective_hint`
+- key measurements (selected `angles` / `alignments`)
+- `confidence`, `view` (`"2d"`/`"3d"`)
+- `keyframe_path`
+- the `movement` / `features` / `rubric_score` / `judge` objects that `ReportAgent.run()` consumes
+Persistence lasts until Finish; files are kept afterward for download. Cross-restart cleanup is best-effort and out of scope.
+### 2. Key-frame capture
+New method on `PoseVisualizer`:
+```python
+def render_frame(self, ingest, pose2d, frame_idx: int,
+                 layers: set[str], caption: str, out_png: str) -> str | None
+```
+- `frame_idx` comes from `features.timing`, which already stores the governing frame per test:
+  `deep_squat → deepest_frame`, `hurdle_step → peak_step_frame`,
+  `inline_lunge → deepest_lunge_frame`, `shoulder_mobility → measure_frame`,
+  `active_slr → peak_raise_frame`.
+- `trunk_stability_pushup` and `rotary_stability` currently store only counts in `timing`. Add the worst-sag-frame and peak-extension-frame index to their `timing` dicts (one-line change in each `BiomechanicsAgent` method).
+- Reuses `_draw_skeleton` (+ optional `_draw_trails`) on the single frame, overlays a caption naming the worst compensation, writes a PNG.
+- Returns `None` on any failure — never raises, never blocks the entry.
+The "worst compensation" caption is derived from `judge.compensation_tags` (preferred) or the failed `alignments` (fallback).
+### 3. PDF generator
+New module `formscout/agents/pdf_report.py`:
+```python
+class PdfReportAgent:
+    def run(self, report_result: ReportResult,
+            entries: list[SessionEntry], session_dir: str) -> str | None
+```
+Uses **ReportLab** (pure-Python, no system deps — safe on HF Spaces/ZeroGPU). Layout:
+- Safety disclaimer banner at **top and bottom** (mirrors the UI invariant).
+- Title/brand header + date.
+- Composite **0–21** badge, or "Incomplete — N/7 tests scored" when `composite is None`.
+- Per-test block: score, rationale, key measurements, compensation tags, corrective hint, the annotated key-frame image, asymmetry delta (bilateral).
+- Flags section: low-confidence, rubric↔judge disagreement, needs-human.
+- Populates `ReportResult.pdf_path`.
+
+Returns the PDF path, or `None` on failure (UI surfaces the error and keeps the session for retry). Image embedding tolerates a missing/`None` `keyframe_path` with a placeholder line.
+### 4. ReportAgent reuse
+At Finish, build the entry list and call the existing `ReportAgent.run()` for composite + asymmetries + flags. The bilateral lower-score + asymmetry-delta logic and the null-composite rule already exist and are not rewritten. A small adapter converts `SessionEntry` objects to the dict schema `ReportAgent.run()` expects (or `ReportAgent` gains overload tolerance — implementer's choice, keep it minimal).
+### 5. Types
+Add a `SessionEntry` frozen dataclass to `formscout/types.py` (consistent with the "every agent I/O is a typed dataclass" standard), including `keyframe_path: str | None`. Populate the existing `ReportResult.pdf_path` (and optionally `overlay_video_path`). No other type changes.
+### 6. UI (`app.py`)
+- Add a `gr.State` holding the session (id + entries).
+- After each analysis: render the scorecard as today, append the entry, write `session.json`/`analysis.md`/keyframe PNG, refresh the "Session so far" table, and reveal the two buttons.
+- **Analyse new clip**: reset the video/test/side inputs; keep session state.
+- **Finish & generate PDF**: `ReportAgent.run` → `PdfReportAgent.run` → display composite + asymmetry summary + `gr.File` downloads (PDF + `analysis.md`).
+- Guard: Finish with zero analyses → warning.
+## Data flow
+```
+upload → Director.run → score
+       → build SessionEntry (+ render_frame keyframe png)
+       → append to gr.State + write session.json / analysis.md / keyframe png
+       → refresh "Session so far" table
+Finish → ReportAgent.run(entries) → composite / asymmetries / flags
+       → PdfReportAgent.run(...) → pdf_path
+       → on-screen composite + gr.File (PDF, analysis.md)
+```
+## Error handling
+- Key-frame render fails → entry still saved; PDF shows an image placeholder.
+- PDF generation fails → surface the error, keep the session intact for retry.
+- `needs_human` entry → no numeric score; PDF shows "Clinician review required"; composite null.
+- Composite is `None` whenever any test is unscored or needs human review (existing rule — never show a partial 0–21 as complete).
+- All disk writes tolerate failure with a logged warning; a write failure degrades the artifact but never blocks scoring.
+## Testing (must run without model downloads)
+- `tests/test_pdf_report.py` — synthetic `ReportResult` + entries → PDF file created, non-zero size, contains the disclaimer text and composite line.
+- `tests/test_session.py` — accumulation; composite math; bilateral lower-score + asymmetry delta; null composite when one entry `needs_human`.
+- `tests/test_keyframe.py` — `render_frame` returns a real PNG path (file exists) for a synthetic frame; returns `None` gracefully on bad input.
+## Invariants preserved
+- Pipeline stays headless — no Gradio imports in agent files (`PdfReportAgent` is a pure agent; key-frame capture stays in `visualizer.py`, the existing UI-layer component).
+- Safety disclaimer present top and bottom of the PDF, mirroring the UI.
+- Pain / clearing / needs-human is never auto-scored; composite null when any test unscored.
+- New code follows the engineering standards: one public entrypoint per agent, typed dataclass I/O, `confidence`/`notes` where applicable, module docstring stating purpose/inputs/outputs/failure/params/license/gated.
+## Open implementation choices (left to the plan)
+- Exact `SessionEntry` → `ReportAgent` dict adapter shape.
+- Which measurements to surface per test in the PDF (a curated subset, not the full `angles` dump).
+- PDF assertion strategy in tests (text extraction vs. size/smoke).
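
A minimal sketch of the session types and the composite rule described in the spec above, for illustration only. The field types, the `ALL_TESTS` tuple, and the `composite` helper are assumptions; the spec says the real bilateral and null-composite logic already lives in `ReportAgent.run()` and is not rewritten.

```python
from __future__ import annotations

from dataclasses import dataclass

# The seven test names as they appear in the classifier's valid_tests set.
ALL_TESTS = (
    "deep_squat", "hurdle_step", "inline_lunge", "shoulder_mobility",
    "active_slr", "trunk_stability_pushup", "rotary_stability",
)


@dataclass(frozen=True)
class SessionEntry:
    """Illustrative subset of the fields listed in the spec (types assumed)."""
    test_name: str
    side: str                       # "left" / "right" / "na"
    score: int | None               # None when clinician review is required
    needs_human: bool = False
    rationale: str = ""
    compensation_tags: tuple[str, ...] = ()
    corrective_hint: str = ""
    confidence: float = 0.0
    view: str = "2d"                # "2d" or "3d"
    keyframe_path: str | None = None


def composite(entries: list[SessionEntry]) -> int | None:
    """Return the 0-21 composite, or None if any test is unscored.

    Bilateral tests contribute the lower of their left/right scores.
    """
    per_test: dict[str, int] = {}
    for e in entries:
        if e.needs_human or e.score is None:
            return None             # never show a partial composite as complete
        prev = per_test.get(e.test_name)
        per_test[e.test_name] = e.score if prev is None else min(prev, e.score)
    if set(per_test) != set(ALL_TESTS):
        return None                 # fewer than 7 tests scored
    return sum(per_test.values())
```

In this sketch a session with all seven tests scored 2 yields 14, and a single `needs_human` entry makes the composite `None`, matching the "Incomplete" badge the PDF layout calls for.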

formscout/agents/classifier.py CHANGED

@@ -1,102 +1,102 @@
-"""
-MovementClassifierAgent — identifies which FMS test is in the clip.
-Input:  IngestResult (keyframes), Pose2DResult (skeleton context)
-Output: MovementResult(test_name, side, confidence)
-Failure: returns MovementResult(test_name="unknown") — pipeline stops and asks for manual override.
-Model:  Qwen3-VL-8B-Instruct via llama.cpp (8B params, Apache-2.0).
-Gated:  No.
-"""
-from __future__ import annotations
-import logging
-from pathlib import Path
-from formscout import config
-from formscout.types import IngestResult, Pose2DResult, MovementResult
-from formscout.serving.llama_cpp import LlamaCppClient
-logger = logging.getLogger(__name__)
-_PROMPT_PATH = Path(__file__).parent / "prompts" / "c1_classifier.md"
-class MovementClassifierAgent:
-    """Classifies which FMS test is being performed via VLM or manual override."""
-    def __init__(self):
-        self._client = LlamaCppClient(port=config.LLAMA_CPP_PORT_VLM)
-        self._system_prompt = _PROMPT_PATH.read_text(encoding="utf-8")
-    def run(
-        self,
-        ingest: IngestResult,
-        pose2d: Pose2DResult | None = None,
-        manual_override: str | None = None,
-    ) -> MovementResult:
-        """
-        Classify the movement. If manual_override is provided, use it directly.
-        Otherwise, use VLM inference on keyframes.
-        """
-        if manual_override and manual_override != "unknown":
-            return MovementResult(
-                test_name=manual_override, side="na",
-                confidence=1.0, notes="manual override",
-            )
-        if not self._client.available:
-            return MovementResult(
-                test_name="unknown", side="na", confidence=0.0,
-                notes="VLM server unavailable — use manual override",
-            )
-        # Select keyframes for classification (3 evenly spaced)
-        n = len(ingest.frames)
-        indices = [0, n // 2, n - 1] if n >= 3 else list(range(n))
-        images = self._encode_frames(ingest.frames, indices)
-        prompt = f"{self._system_prompt}\n\nClassify this movement from the keyframes shown."
-        result = self._client.complete(prompt, images=images, max_tokens=256, temperature=0.1)
-        return self._parse_response(result)
-    def _encode_frames(self, frames: list, indices: list[int]) -> list[str]:
-        """Encode selected frames as base64 JPEG for the VLM."""
-        import cv2
-        import base64
-        encoded = []
-        for idx in indices:
-            if idx < len(frames):
-                _, buf = cv2.imencode(".jpg", frames[idx], [cv2.IMWRITE_JPEG_QUALITY, 80])
-                encoded.append(base64.b64encode(buf.tobytes()).decode())
-        return encoded
-    def _parse_response(self, result: dict) -> MovementResult:
-        """Parse VLM JSON response into MovementResult."""
-        if "error" in result:
-            return MovementResult(
-                test_name="unknown", side="na", confidence=0.0,
-                notes=f"VLM error: {result['error']}",
-            )
-        test = result.get("test", "unknown")
-        side = result.get("side", "na")
-        confidence = float(result.get("confidence", 0.0))
-        reason = result.get("reason", "")
-        valid_tests = {
-            "deep_squat", "hurdle_step", "inline_lunge",
-            "shoulder_mobility", "active_slr",
-            "trunk_stability_pushup", "rotary_stability", "unknown",
-        }
-        if test not in valid_tests:
-            test = "unknown"
-        if side not in ("left", "right", "na"):
-            side = "na"
-        return MovementResult(
-            test_name=test, side=side,
-            confidence=confidence, notes=reason,
-        )

+"""
+MovementClassifierAgent — identifies which FMS test is in the clip.
+Input:  IngestResult (keyframes), Pose2DResult (skeleton context)
+Output: MovementResult(test_name, side, confidence)
+Failure: returns MovementResult(test_name="unknown") — pipeline stops and asks for manual override.
+Model:  Qwen3-VL-8B-Instruct via llama.cpp (8B params, Apache-2.0).
+Gated:  No.
+"""
+from __future__ import annotations
+import logging
+from pathlib import Path
+from formscout import config
+from formscout.types import IngestResult, Pose2DResult, MovementResult
+from formscout.serving.llama_cpp import LlamaCppClient
+logger = logging.getLogger(__name__)
+_PROMPT_PATH = Path(__file__).parent / "prompts" / "c1_classifier.md"
+class MovementClassifierAgent:
+    """Classifies which FMS test is being performed via VLM or manual override."""
+    def __init__(self):
+        self._client = LlamaCppClient(port=config.LLAMA_CPP_PORT_VLM)
+        self._system_prompt = _PROMPT_PATH.read_text(encoding="utf-8")
+    def run(
+        self,
+        ingest: IngestResult,
+        pose2d: Pose2DResult | None = None,
+        manual_override: str | None = None,
+    ) -> MovementResult:
+        """
+        Classify the movement. If manual_override is provided, use it directly.
+        Otherwise, use VLM inference on keyframes.
+        """
+        if manual_override and manual_override != "unknown":
+            return MovementResult(
+                test_name=manual_override, side="na",
+                confidence=1.0, notes="manual override",
+            )
+        if not self._client.available:
+            return MovementResult(
+                test_name="unknown", side="na", confidence=0.0,
+                notes="VLM server unavailable — use manual override",
+            )
+        # Select keyframes for classification (3 evenly spaced)
+        n = len(ingest.frames)
+        indices = [0, n // 2, n - 1] if n >= 3 else list(range(n))
+        images = self._encode_frames(ingest.frames, indices)
+        prompt = f"{self._system_prompt}\n\nClassify this movement from the keyframes shown."
+        result = self._client.complete(prompt, images=images, max_tokens=256, temperature=0.1)
+        return self._parse_response(result)
+    def _encode_frames(self, frames: list, indices: list[int]) -> list[str]:
+        """Encode selected frames as base64 JPEG for the VLM."""
+        import cv2
+        import base64
+        encoded = []
+        for idx in indices:
+            if idx < len(frames):
+                _, buf = cv2.imencode(".jpg", frames[idx], [cv2.IMWRITE_JPEG_QUALITY, 80])
+                encoded.append(base64.b64encode(buf.tobytes()).decode())
+        return encoded
+    def _parse_response(self, result: dict) -> MovementResult:
+        """Parse VLM JSON response into MovementResult."""
+        if "error" in result:
+            return MovementResult(
+                test_name="unknown", side="na", confidence=0.0,
+                notes=f"VLM error: {result['error']}",
+            )
+        test = result.get("test", "unknown")
+        side = result.get("side", "na")
+        confidence = float(result.get("confidence", 0.0))
+        reason = result.get("reason", "")
+        valid_tests = {
+            "deep_squat", "hurdle_step", "inline_lunge",
+            "shoulder_mobility", "active_slr",
+            "trunk_stability_pushup", "rotary_stability", "unknown",
+        }
+        if test not in valid_tests:
+            test = "unknown"
+        if side not in ("left", "right", "na"):
+            side = "na"
+        return MovementResult(
+            test_name=test, side=side,
+            confidence=confidence, notes=reason,
+        )
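
The validation done in `_parse_response` can be exercised without a VLM server. The sketch below mirrors that logic on plain dicts; `normalise_classification` is a hypothetical helper, and the tolerant confidence parsing and clamping are assumptions that go slightly beyond what the diff does (the diff calls `float(...)` directly).

```python
from __future__ import annotations

VALID_TESTS = {
    "deep_squat", "hurdle_step", "inline_lunge",
    "shoulder_mobility", "active_slr",
    "trunk_stability_pushup", "rotary_stability", "unknown",
}
VALID_SIDES = ("left", "right", "na")


def normalise_classification(result: dict) -> dict:
    """Map a raw VLM JSON dict onto the fields MovementResult carries."""
    if "error" in result:
        return {"test_name": "unknown", "side": "na", "confidence": 0.0,
                "notes": f"VLM error: {result['error']}"}

    test = result.get("test", "unknown")
    side = result.get("side", "na")

    # Assumption: an unparsable confidence is treated as 0.0 instead of raising.
    try:
        confidence = float(result.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0
    confidence = min(1.0, max(0.0, confidence))  # clamp to [0, 1]

    # Anything outside the known vocabularies degrades to the safe default.
    if not isinstance(test, str) or test not in VALID_TESTS:
        test = "unknown"
    if side not in VALID_SIDES:
        side = "na"

    return {"test_name": test, "side": side, "confidence": confidence,
            "notes": result.get("reason", "")}
```

Returning `"unknown"` for anything unexpected keeps the documented failure mode intact: the pipeline stops and asks for a manual override.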

formscout/agents/ingest.py CHANGED

@@ -49,32 +49,22 @@ class IngestAgent:
         total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
         w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
         h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
         notes_parts: list[str] = []
-        # Frame count is unreliable for webm/mkv (browser recordings often
-        # report 0 or a wrong value). Detect this and fall back to reading
-        # every frame up to the cap, subsampling only when the count is trusted.
-        count_reliable = total > 0
-        if count_reliable:
-            step = max(1, total // config.MAX_FRAMES)
-        else:
-            step = 1  # unknown length — keep frames, cap at MAX_FRAMES
-        # Read frames; tolerate a truncated/premature end gracefully.
         frames: list = []
         idx = 0
-        read_failures = 0
         while True:
             ret, frame = cap.read()
             if not ret:
-                # A truncated final chunk yields ret=False; stop cleanly.
                 break
-            if frame is None:
-                read_failures += 1
-                if read_failures > 5:
-                    break
-                continue
             if idx % step == 0:
                 frames.append(frame)
             idx += 1
@@ -82,17 +72,6 @@ class IngestAgent:
                 break
         cap.release()
-        # Compute duration from what we actually decoded, not the (possibly
-        # bogus) header frame count.
-        decoded_total = idx if idx > 0 else len(frames)
-        duration = decoded_total / fps if fps > 0 else 0.0
-        if duration > config.MAX_DURATION_SEC:
-            notes_parts.append(
-                f"video is {duration:.1f}s (>{config.MAX_DURATION_SEC}s) — capping frames"
-            )
-        if not count_reliable and frames:
-            notes_parts.append("frame count unreliable (webm/mkv) — read sequentially")
         if not frames:
             return IngestResult(
                 frames=[], fps=fps, duration=duration, n_people=0,

         total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
         w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
         h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
+        duration = total / fps if fps > 0 else 0.0
         notes_parts: list[str] = []
+        if duration > config.MAX_DURATION_SEC:
+            notes_parts.append(
+                f"video is {duration:.1f}s (>{config.MAX_DURATION_SEC}s) — capping frames"
+            )
+        # Sample frames evenly, capped at MAX_FRAMES
+        step = max(1, total // config.MAX_FRAMES)
         frames: list = []
         idx = 0
         while True:
             ret, frame = cap.read()
             if not ret:
                 break
             if idx % step == 0:
                 frames.append(frame)
             idx += 1
                 break
         cap.release()
         if not frames:
             return IngestResult(
                 frames=[], fps=fps, duration=duration, n_people=0,
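
The even-sampling rule kept by this change can be sketched without OpenCV by feeding a plain iterable in place of `cv2.VideoCapture`. `sample_frames` and the exact placement of its `max_frames` cap are illustrative assumptions, since the hunk elides the actual cap check.

```python
from __future__ import annotations

from typing import Iterable


def sample_frames(frames: Iterable, total: int, max_frames: int) -> list:
    """Keep every `step`-th frame, stopping once `max_frames` are collected.

    `total` stands in for the header frame count (CAP_PROP_FRAME_COUNT).
    """
    step = max(1, total // max_frames)   # same formula as the diff
    kept: list = []
    for idx, frame in enumerate(frames):
        if idx % step == 0:
            kept.append(frame)
        if len(kept) >= max_frames:      # assumed cap; elided in the hunk
            break
    return kept


# A trusted count spreads samples across the clip.
print(sample_frames(range(100), total=100, max_frames=10))
# A header count of 0 gives step == 1, so only the leading frames are kept.
print(sample_frames(range(100), total=0, max_frames=10))
```

When the header reports 0 frames, as the removed comment says browser webm/mkv recordings often do, `step` falls back to 1 and the new `duration = total / fps` evaluates to 0.0, so the over-length note is never added for such clips.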

formscout/agents/judge.py CHANGED

@@ -1,136 +1,125 @@
-"""
-JudgeAgent — VLM-based final scorer with rationale, compensation tags, pain detection.
-Input:  BiomechFeatures, ScoreResult (rubric candidate), MovementResult, keyframes
-Output: JudgeResult(score, rationale, compensation_tags, corrective_hint, needs_human)
-Failure: returns JudgeResult(needs_human=True, score=None) when uncertain.
-Model:  Qwen3-VL-8B-Instruct via llama.cpp (8B params, Apache-2.0).
-Gated:  No.
-Safety: NEVER auto-scores pain. If any indication of pain/clearing test,
-        sets needs_human=True and score=None.
-"""
-from __future__ import annotations
-import json
-import logging
-from pathlib import Path
-from formscout import config
-from formscout.types import (
-    BiomechFeatures, ScoreResult, MovementResult,
-    IngestResult, JudgeResult,
-)
-from formscout.serving.llama_cpp import LlamaCppClient
-logger = logging.getLogger(__name__)
-_PROMPT_PATH = Path(__file__).parent / "prompts" / "c2_judge.md"
-class JudgeAgent:
-    """VLM judge that produces the final FMS score with rationale."""
-    def __init__(self):
-        self._client = LlamaCppClient(port=config.LLAMA_CPP_PORT_VLM)
-        self._system_prompt = _PROMPT_PATH.read_text(encoding="utf-8")
-    def run(
-        self,
-        features: BiomechFeatures,
-        rubric_score: ScoreResult,
-        movement: MovementResult,
-        ingest: IngestResult | None = None,
-    ) -> JudgeResult:
-        """
-        Produce final score. Falls back to rubric score if VLM unavailable.
-        """
-        if not config.ENABLE_JUDGE:
-            return self._fallback_from_rubric(rubric_score, features)
-        if not self._client.available:
-            logger.warning("JudgeAgent: VLM unavailable, using rubric score as final")
-            return self._fallback_from_rubric(rubric_score, features)
-        # Build context for the judge
-        context = {
-            "test": features.test_name,
-            "side": features.side,
-            "view": features.view,
-            "features": {"angles": features.angles, "alignments": features.alignments},
-            "candidate_score": rubric_score.score,
-            "candidate_confidence": rubric_score.confidence,
-            "exemplars": [],  # Phase 3: populated by RetrievalAgent
-        }
-        prompt = f"{self._system_prompt}\n\n{json.dumps(context, indent=2)}"
-        # Optionally include keyframes
-        images = None
-        if ingest and ingest.frames:
-            images = self._encode_keyframes(ingest.frames)
-        result = self._client.complete(prompt, images=images, max_tokens=512, temperature=0.1)
-        return self._parse_response(result)
-    def _encode_keyframes(self, frames: list) -> list[str]:
-        """Encode 3 keyframes for VLM context (downscaled to max 768px)."""
-        import cv2
-        import base64
-        n = len(frames)
-        indices = [0, n // 2, n - 1] if n >= 3 else list(range(n))
-        encoded = []
-        for idx in indices:
-            frame = self._downscale(frames[idx], max_dim=768)
-            _, buf = cv2.imencode(".jpg", frame, [cv2.IMWRITE_JPEG_QUALITY, 70])
-            encoded.append(base64.b64encode(buf.tobytes()).decode())
-        return encoded
-    @staticmethod
-    def _downscale(frame, max_dim: int = 768):
-        """Resize frame so its largest dimension is at most max_dim."""
-        import cv2
-        h, w = frame.shape[:2]
-        longest = max(h, w)
-        if longest <= max_dim:
-            return frame
-        scale = max_dim / longest
-        new_size = (int(w * scale), int(h * scale))
-        return cv2.resize(frame, new_size, interpolation=cv2.INTER_AREA)
-    def _parse_response(self, result: dict) -> JudgeResult:
-        """Parse VLM JSON response into JudgeResult."""
-        if "error" in result:
-            return JudgeResult(
-                score=None, rationale=f"VLM error: {result['error']}",
-                compensation_tags=[], corrective_hint="",
-                confidence=0.0, needs_human=True,
-            )
-        needs_human = result.get("needs_human", False)
-        score = result.get("score") if not needs_human else None
-        if score is not None:
-            score = max(0, min(3, int(score)))
-        return JudgeResult(
-            score=score,
-            rationale=result.get("rationale", ""),
-            compensation_tags=result.get("compensation_tags", []),
-            corrective_hint=result.get("corrective_hint", ""),
-            confidence=float(result.get("confidence", 0.5)),
-            needs_human=needs_human,
-        )
-    def _fallback_from_rubric(self, rubric: ScoreResult, features: BiomechFeatures) -> JudgeResult:
-        """When VLM is unavailable, promote the rubric score as the final score."""
-        return JudgeResult(
-            score=rubric.score,
-            rationale=f"[rubric-only] {rubric.rationale}",
-            compensation_tags=[],
-            corrective_hint="",
-            confidence=rubric.confidence * 0.8,
-            needs_human=rubric.needs_human,
-            notes="VLM unavailable — rubric score used as final",
-        )

+"""
+JudgeAgent — VLM-based final scorer with rationale, compensation tags, pain detection.
+Input:  BiomechFeatures, ScoreResult (rubric candidate), MovementResult, keyframes
+Output: JudgeResult(score, rationale, compensation_tags, corrective_hint, needs_human)
+Failure: returns JudgeResult(needs_human=True, score=None) when uncertain.
+Model:  Qwen3-VL-8B-Instruct via llama.cpp (8B params, Apache-2.0).
+Gated:  No.
+Safety: NEVER auto-scores pain. If any indication of pain/clearing test,
+        sets needs_human=True and score=None.
+"""
+from __future__ import annotations
+import json
+import logging
+from pathlib import Path
+from formscout import config
+from formscout.types import (
+    BiomechFeatures, ScoreResult, MovementResult,
+    IngestResult, JudgeResult,
+)
+from formscout.serving import get_vlm_client
+logger = logging.getLogger(__name__)
+_PROMPT_PATH = Path(__file__).parent / "prompts" / "c2_judge.md"
+class JudgeAgent:
+    """VLM judge that produces the final FMS score with rationale."""
+    def __init__(self):
+        self._client = get_vlm_client()
+        self._system_prompt = _PROMPT_PATH.read_text(encoding="utf-8")
+    def run(
+        self,
+        features: BiomechFeatures,
+        rubric_score: ScoreResult,
+        movement: MovementResult,
+        ingest: IngestResult | None = None,
+    ) -> JudgeResult:
+        """
+        Produce final score. Falls back to rubric score if VLM unavailable.
+        """
+        if not config.ENABLE_JUDGE:
+            return self._fallback_from_rubric(rubric_score, features)
+        if not self._client.available:
+            logger.warning("JudgeAgent: VLM unavailable, using rubric score as final")
+            return self._fallback_from_rubric(rubric_score, features)
+        # Build context for the judge
+        context = {
+            "test": features.test_name,
+            "side": features.side,
+            "view": features.view,
+            "features": {"angles": features.angles, "alignments": features.alignments},
+            "candidate_score": rubric_score.score,
+            "candidate_confidence": rubric_score.confidence,
+            "exemplars": [],  # Phase 3: populated by RetrievalAgent
+        }
+        prompt = f"{self._system_prompt}\n\n{json.dumps(context, indent=2)}"
+        # Optionally include keyframes
+        images = None
+        if ingest and ingest.frames:
+            images = self._encode_keyframes(ingest.frames)
+        result = self._client.complete(prompt, images=images, max_tokens=512, temperature=0.1)
+        if result.get("fallback"):
+            # transformers backend couldn't load/run — use the deterministic rubric
+            return self._fallback_from_rubric(rubric_score, features)
+        return self._parse_response(result)
+    def _encode_keyframes(self, frames: list) -> list[str]:
+        """Encode 3 keyframes for VLM context."""
+        import cv2
+        import base64
+        n = len(frames)
+        indices = [0, n // 2, n - 1] if n >= 3 else list(range(n))
+        encoded = []
+        for idx in indices:
+            _, buf = cv2.imencode(".jpg", frames[idx], [cv2.IMWRITE_JPEG_QUALITY, 70])
+            encoded.append(base64.b64encode(buf.tobytes()).decode())
+        return encoded
+    def _parse_response(self, result: dict) -> JudgeResult:
+        """Parse VLM JSON response into JudgeResult."""
+        if "error" in result:
+            return JudgeResult(
+                score=None, rationale=f"VLM error: {result['error']}",
+                compensation_tags=[], corrective_hint="",
+                confidence=0.0, needs_human=True,
+            )
+        needs_human = result.get("needs_human", False)
+        score = result.get("score") if not needs_human else None
+        if score is not None:
+            score = max(0, min(3, int(score)))
+        return JudgeResult(
+            score=score,
+            rationale=result.get("rationale", ""),
+            compensation_tags=result.get("compensation_tags", []),
+            corrective_hint=result.get("corrective_hint", ""),
+            confidence=float(result.get("confidence", 0.5)),
+            needs_human=needs_human,
+        )
+    def _fallback_from_rubric(self, rubric: ScoreResult, features: BiomechFeatures) -> JudgeResult:
+        """When VLM is unavailable, promote the rubric score as the final score."""
+        return JudgeResult(
+            score=rubric.score,
+            rationale=f"[rubric-only] {rubric.rationale}",
+            compensation_tags=[],
+            corrective_hint="",
+            confidence=rubric.confidence * 0.8,
+            needs_human=rubric.needs_human,
+            notes="VLM unavailable — rubric score used as final",
+        )
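
With the new `fallback` check, a backend response can take one of three paths in `JudgeAgent.run`. The sketch below traces them on plain dicts; `resolve_judge_result` and its `source` key are hypothetical and only illustrate the branching, not the real `JudgeResult` construction.

```python
from __future__ import annotations


def resolve_judge_result(result: dict, rubric: dict) -> dict:
    """Pick the final score from a backend response.

    `rubric` is assumed to look like {"score": int | None, "needs_human": bool}.
    """
    # 1. Backend could not load or run: use the deterministic rubric score.
    if result.get("fallback"):
        return {"score": rubric["score"],
                "needs_human": rubric["needs_human"],
                "source": "rubric"}

    # 2. Backend ran but reported an error: no score, clinician review.
    if "error" in result:
        return {"score": None, "needs_human": True, "source": "vlm-error"}

    # 3. Normal response: needs_human always wins over any numeric score.
    needs_human = bool(result.get("needs_human", False))
    score = None if needs_human else result.get("score")
    if score is not None:
        score = max(0, min(3, int(score)))   # clamp to the 0-3 FMS range
    return {"score": score, "needs_human": needs_human, "source": "vlm"}
```

The ordering matters: checking `fallback` before `error` is what lets an unavailable backend degrade to the rubric score instead of forcing `needs_human=True`.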

formscout/agents/pdf_report.py CHANGED

@@ -1,115 +1,175 @@
-"""
-PdfReportAgent — renders a ReportResult + session entries to a branded PDF.
-Input:  ReportResult, list[SessionEntry], session_dir (str)
-Output: path to the written PDF (str), or None on failure.
-Failure: returns None, never raises.
-Params: 0 (pure rendering — no model).
-License: n/a.
-Gated: no.
-"""
-from __future__ import annotations
-import logging
-import os
-from formscout.types import ReportResult
-logger = logging.getLogger(__name__)
-DISCLAIMER = "Screening aid — not a diagnosis. Pain or clearing tests require a clinician."
-class PdfReportAgent:
-    """Assembles the screening-session PDF via ReportLab."""
-    def run(self, report: ReportResult, entries: list, session_dir: str) -> str | None:
-        try:
-            from reportlab.lib import colors
-            from reportlab.lib.pagesizes import LETTER
-            from reportlab.lib.styles import ParagraphStyle, getSampleStyleSheet
-            from reportlab.lib.units import inch
-            from reportlab.platypus import (
-                Image, Paragraph, SimpleDocTemplate, Spacer, Table, TableStyle,
-            )
-        except Exception as e:
-            logger.warning("reportlab unavailable: %s", e)
-            return None
-        out_path = os.path.join(session_dir, "formscout_report.pdf")
-        try:
-            styles = getSampleStyleSheet()
-            banner = ParagraphStyle(
-                "banner", parent=styles["Normal"], fontSize=9, textColor=colors.white,
-                backColor=colors.HexColor("#b45309"), alignment=1, borderPadding=6, spaceAfter=12,
-            )
-            story = []
-            story.append(Paragraph(f"<b>&#9888; {DISCLAIMER}</b>", banner))
-            story.append(Paragraph("FormScout — FMS Screening Report", styles["Title"]))
-            if report.composite is not None:
-                comp = f"Composite: <b>{report.composite} / 21</b>"
-            else:
-                comp = f"Composite: <b>Incomplete</b> — {len(entries)}/7 tests scored"
-            story.append(Paragraph(comp, styles["Heading2"]))
-            story.append(Spacer(1, 0.2 * inch))
-            for e in entries:
-                title = e.test_name.replace("_", " ").title()
-                if e.side in ("left", "right"):
-                    title += f" ({e.side})"
-                score_txt = "Clinician review required" if e.needs_human else f"Score: {e.score}/3"
-                story.append(Paragraph(f"<b>{title}</b> — {score_txt}", styles["Heading3"]))
-                if e.rationale:
-                    story.append(Paragraph(e.rationale, styles["Normal"]))
-                if e.compensation_tags:
-                    story.append(Paragraph("Compensations: " + ", ".join(e.compensation_tags),
-                                           styles["Normal"]))
-                if e.corrective_hint:
-                    story.append(Paragraph("Corrective: " + e.corrective_hint, styles["Normal"]))
-                items = list(e.measurements.items())[:6]
-                if items:
-                    rows = [[k.replace("_", " "),
-                             (f"{v:.1f}" if isinstance(v, float) else str(v))] for k, v in items]
-                    tbl = Table(rows, colWidths=[3 * inch, 1.5 * inch])
-                    tbl.setStyle(TableStyle([
-                        ("FONTSIZE", (0, 0), (-1, -1), 8),
-                        ("TEXTCOLOR", (0, 0), (-1, -1), colors.HexColor("#334155")),
-                    ]))
-                    story.append(tbl)
-                if e.keyframe_path and os.path.exists(e.keyframe_path):
-                    try:
-                        story.append(Image(e.keyframe_path, width=3.0 * inch, height=2.25 * inch))
-                    except Exception:
-                        story.append(Paragraph("<i>(key-frame image unavailable)</i>", styles["Normal"]))
-                else:
-                    story.append(Paragraph("<i>(key-frame image unavailable)</i>", styles["Normal"]))
-                story.append(Spacer(1, 0.2 * inch))
-            if report.asymmetries:
-                story.append(Paragraph("Asymmetries", styles["Heading2"]))
-                for a in report.asymmetries:
-                    story.append(Paragraph(
-                        f"{a['test'].replace('_', ' ').title()}: "
-                        f"L={a['left_score']} R={a['right_score']} (&#916; {a['delta']})",
-                        styles["Normal"]))
-            flags = list(report.low_confidence_flags) + list(report.disagreement_flags)
-            if flags:
-                story.append(Paragraph("Flags", styles["Heading2"]))
-                for fl in flags:
-                    story.append(Paragraph(fl, styles["Normal"]))
-            story.append(Spacer(1, 0.3 * inch))
-            story.append(Paragraph(f"<b>&#9888; {DISCLAIMER}</b>", banner))
-            doc = SimpleDocTemplate(out_path, pagesize=LETTER,
-                                    topMargin=0.6 * inch, bottomMargin=0.6 * inch)
-            doc.build(story)
-            return out_path
-        except Exception as e:
-            logger.warning("pdf build failed: %s", e)
-            return None

+"""
+PdfReportAgent — renders a ReportResult + session entries to a branded PDF.
+Input:  ReportResult, list[SessionEntry], session_dir (str)
+Output: path to the written PDF (str), or None on failure.
+Failure: returns None, never raises.
+Params: 0 (pure rendering — no model).
+License: n/a.
+Gated: no.
+"""
+from __future__ import annotations
+import logging
+import os
+from formscout.types import ReportResult
+logger = logging.getLogger(__name__)
+DISCLAIMER = "Screening aid — not a diagnosis. Pain or clearing tests require a clinician."
+class PdfReportAgent:
+    """Assembles the screening-session PDF via ReportLab."""
+    def run(self, report: ReportResult, entries: list, session_dir: str) -> str | None:
+        try:
+            from reportlab.lib import colors
+            from reportlab.lib.pagesizes import LETTER
+            from reportlab.lib.styles import ParagraphStyle, getSampleStyleSheet
+            from reportlab.lib.units import inch
+            from reportlab.platypus import (
+                Image, PageBreak, Paragraph, SimpleDocTemplate, Spacer, Table, TableStyle,
+            )
+        except Exception as e:
+            logger.warning("reportlab unavailable: %s", e)
+            return None
+        out_path = os.path.join(session_dir, "formscout_report.pdf")
+        try:
+            styles = getSampleStyleSheet()
+            banner = ParagraphStyle(
+                "banner", parent=styles["Normal"], fontSize=9, textColor=colors.white,
+                backColor=colors.HexColor("#cf922a"), alignment=1, borderPadding=6, spaceAfter=12,
+            )
+            ink = colors.HexColor("#243a34")
+            def _meas_table(pairs, col0=3.0, col1=1.6):
+                rows = [[str(k).replace("_", " "),
+                         (f"{v:.2f}" if isinstance(v, float) else str(v))] for k, v in pairs]
+                tbl = Table(rows, colWidths=[col0 * inch, col1 * inch])
+                tbl.setStyle(TableStyle([
+                    ("FONTSIZE", (0, 0), (-1, -1), 8),
+                    ("TEXTCOLOR", (0, 0), (-1, -1), ink),
+                    ("ROWBACKGROUNDS", (0, 0), (-1, -1),
+                     [colors.HexColor("#f7eedd"), colors.white]),
+                ]))
+                return tbl
+            def _img(path, w=3.0, h=2.25):
+                if path and os.path.exists(path):
+                    try:
+                        return Image(path, width=w * inch, height=h * inch)
+                    except Exception:
+                        return None
+                return None
+            story = []
+            story.append(Paragraph(f"<b>&#9888; {DISCLAIMER}</b>", banner))
+            story.append(Paragraph("FormScout — FMS Screening Report", styles["Title"]))
+            if report.composite is not None:
+                comp = f"Composite: <b>{report.composite} / 21</b>"
+            else:
+                comp = f"Composite: <b>Incomplete</b> — {len(entries)}/7 tests scored"
+            story.append(Paragraph(comp, styles["Heading2"]))
+            story.append(Spacer(1, 0.2 * inch))
+            for ei, e in enumerate(entries):
+                if ei > 0:
+                    story.append(PageBreak())
+                title = e.test_name.replace("_", " ").title()
+                if e.side in ("left", "right"):
+                    title += f" ({e.side})"
+                score_txt = "Clinician review required" if e.needs_human else f"Score: {e.score}/3"
+                story.append(Paragraph(f"<b>{title}</b> — {score_txt}", styles["Heading3"]))
+                story.append(Paragraph(f"<font size=8>view: {e.view} · confidence: "
+                                       f"{e.confidence:.0%}</font>", styles["Normal"]))
+                if e.rationale:
+                    story.append(Paragraph(e.rationale, styles["Normal"]))
+                if e.compensation_tags:
+                    story.append(Paragraph("<b>Compensations:</b> " + ", ".join(e.compensation_tags),
+                                           styles["Normal"]))
+                if e.corrective_hint:
+                    story.append(Paragraph("<b>Corrective:</b> " + e.corrective_hint, styles["Normal"]))
+                # Key frame + flexion chart side by side
+                kf, fb = _img(e.keyframe_path), _img((e.chart_paths or {}).get("flexion"), w=3.2, h=2.0)
+                if kf or fb:
+                    cells = [c for c in (kf, fb) if c] or [Paragraph("<i>(images unavailable)</i>",
+                                                                     styles["Normal"])]
+                    story.append(Table([cells], hAlign="LEFT"))
+                # Relevant-joint flexion table
+                if e.flexion:
+                    story.append(Paragraph("<b>Relevant joint flexion (key frame)</b>", styles["Normal"]))
+                    story.append(_meas_table(
+                        [(n, f"{v['deg']:.1f}° — {v['openness']}") for n, v in e.flexion.items()],
+                        col0=2.6, col1=2.6))
+                # Laban Effort + radar
+                if e.laban:
+                    eff, lab = e.laban.get("effort", {}), e.laban.get("labels", {})
+                    story.append(Spacer(1, 0.08 * inch))
+                    story.append(Paragraph("<b>Laban Effort (kinematic estimate)</b>", styles["Normal"]))
+                    laban_tbl = _meas_table(
+                        [(k.title(), f"{eff.get(k, 0):.2f} — {lab.get(k, '')}")
+                         for k in ("space", "weight", "time", "flow")], col0=2.6, col1=2.6)
+                    radar = _img((e.chart_paths or {}).get("radar"), w=2.6, h=2.6)
+                    if radar:
+                        story.append(Table([[laban_tbl, radar]], hAlign="LEFT"))
+                    else:
+                        story.append(laban_tbl)
+                    if e.laban.get("body_emphasis"):
+                        emph = ", ".join(f"{n}" for n, _ in e.laban["body_emphasis"])
+                        story.append(Paragraph(f"<font size=8>Body emphasis: {emph} · "
+                                               f"{e.laban.get('notes', '')}</font>", styles["Normal"]))
+                # Angle + velocity charts
+                for kind in ("angle", "velocity"):
+                    chart = _img((e.chart_paths or {}).get(kind), w=5.0, h=2.5)
+                    if chart:
+                        story.append(chart)
+                # Full measurement dump
+                if e.measurements:
+                    story.append(Paragraph("<b>All measurements</b>", styles["Normal"]))
+                    story.append(_meas_table(list(e.measurements.items())))
+                story.append(Spacer(1, 0.15 * inch))
+            if report.asymmetries:
+                story.append(PageBreak())
+                story.append(Paragraph("Asymmetries", styles["Heading2"]))
+                for a in report.asymmetries:
+                    story.append(Paragraph(
+                        f"{a['test'].replace('_', ' ').title()}: "
+                        f"L={a['left_score']} R={a['right_score']} (&#916; {a['delta']})",
+                        styles["Normal"]))
+                try:
+                    from formscout.analysis.charts import symmetry_bars
+                    os.makedirs(os.path.join(session_dir, "charts"), exist_ok=True)
+                    sym_png = symmetry_bars(report.asymmetries,
+                                            os.path.join(session_dir, "charts", "symmetry.png"))
+                    sym_img = _img(sym_png, w=5.5, h=2.75)
+                    if sym_img:
+                        story.append(sym_img)
+                except Exception:
+                    pass
+            flags = list(report.low_confidence_flags) + list(report.disagreement_flags)
+            if flags:
+                story.append(Paragraph("Flags", styles["Heading2"]))
+                for fl in flags:
+                    story.append(Paragraph(fl, styles["Normal"]))
+            story.append(Spacer(1, 0.3 * inch))
+            story.append(Paragraph(f"<b>&#9888; {DISCLAIMER}</b>", banner))
+            doc = SimpleDocTemplate(out_path, pagesize=LETTER,
+                                    topMargin=0.6 * inch, bottomMargin=0.6 * inch)
+            doc.build(story)
+            return out_path
+        except Exception as e:
+            logger.warning("pdf build failed: %s", e)
+            return None
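
Review note on `pdf_report.py`: the row formatting inside `_meas_table` and the composite headline are the two pieces of pure logic in this file, so they can be checked without ReportLab installed. The sketch below is a plain-text analogue under that assumption. The helper names `format_rows` and `composite_line` are hypothetical and are not part of the PR, and the "Incomplete" wording is simplified from the markup used in the diff.

```python
def format_rows(pairs):
    """Plain-text analogue of the rows built in _meas_table (hypothetical helper).

    Underscores in keys become spaces; float values are rendered with two
    decimals; every other value goes through str().
    """
    return [
        [str(k).replace("_", " "), f"{v:.2f}" if isinstance(v, float) else str(v)]
        for k, v in pairs
    ]


def composite_line(composite, n_scored, n_tests=7):
    """Plain-text analogue of the composite headline (hypothetical helper).

    A total is shown out of 21 (7 tests, max 3 each); if the composite is
    None the line reports how many tests were scored instead.
    """
    if composite is not None:
        return f"Composite: {composite} / 21"
    return f"Composite: Incomplete ({n_scored}/{n_tests} tests scored)"


if __name__ == "__main__":
    print(format_rows([("knee_angle", 12.3456), ("reps", 3)]))
    print(composite_line(None, 5))
```

Pulling these two pieces out as module-level functions would also let `tests/test_pdf_report.py` cover them directly, though whether that refactor is wanted is up to the author.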

formscout/agents/pose2d.py CHANGED

@@ -1,232 +1,232 @@
-"""
-Pose2DAgent — 2D per-frame keypoint extraction.
-Backends: yolo (local checkpoints, ultralytics), mediapipe (official Tasks API,
-          local .task checkpoint), sapiens2 (Meta HF/transformers).
-All backends output COCO-17 keypoints: dict[int, {x, y, conf}] per frame.
-Input:  IngestResult
-Output: Pose2DResult(keypoints per frame, fps, confidence)
-Failure: Pose2DResult(confidence=0.0, notes=<reason>) — never raises.
-Gated: yolo=no; mediapipe=no (local checkpoint); sapiens2=yes (access accepted).
-"""
-from __future__ import annotations
-import logging
-import numpy as np
-from formscout import config
-from formscout.types import IngestResult, Pose2DResult
-logger = logging.getLogger(__name__)
-COCO_KEYPOINTS = [
-    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
-    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
-    "left_wrist", "right_wrist", "left_hip", "right_hip",
-    "left_knee", "right_knee", "left_ankle", "right_ankle",
-]
-# BlazePose-33 source indices → COCO-17 target indices
-# BlazePose: 0=nose, 2=left_eye, 5=right_eye, 7=left_ear, 8=right_ear,
-#            11=left_shoulder, 12=right_shoulder, 13=left_elbow, 14=right_elbow,
-#            15=left_wrist, 16=right_wrist, 23=left_hip, 24=right_hip,
-#            25=left_knee, 26=right_knee, 27=left_ankle, 28=right_ankle
-_BP_SRC = [0, 2, 5, 7, 8, 11, 12, 13, 14, 15, 16, 23, 24, 25, 26, 27, 28]
-_BP_DST = list(range(17))  # COCO indices 0..16
-_model_cache: dict[str, object] = {}
-# ── YOLO backend ──────────────────────────────────────────────────────────────
-def _get_yolo(path: str) -> object:
-    if path not in _model_cache:
-        from ultralytics import YOLO
-        _model_cache[path] = YOLO(path)
-    return _model_cache[path]
-def _run_yolo(frames: list, path: str) -> list[dict]:
-    model = _get_yolo(path)
-    out = []
-    for frame in frames:
-        try:
-            results = model(frame, verbose=False)
-            kps: dict[int, dict] = {}
-            if results and results[0].keypoints is not None:
-                kp = results[0].keypoints
-                if kp.xy is not None and len(kp.xy) > 0:
-                    xy = kp.xy[0].cpu().numpy()
-                    conf = kp.conf[0].cpu().numpy()
-                    for j in range(min(len(xy), 17)):
-                        kps[j] = {"x": float(xy[j, 0]), "y": float(xy[j, 1]), "conf": float(conf[j])}
-            out.append(kps)
-        except Exception:
-            out.append({})
-    return out
-# ── MediaPipe backend (official Tasks API, local .task checkpoint) ────────────
-def _get_mediapipe_landmarker(path: str) -> object:
-    """Return PoseLandmarker cached by model path."""
-    cache_key = f"mp:{path}"
-    if cache_key not in _model_cache:
-        from mediapipe.tasks import python as mp_tasks
-        from mediapipe.tasks.python import vision
-        options = vision.PoseLandmarkerOptions(
-            base_options=mp_tasks.BaseOptions(model_asset_path=path),
-            running_mode=vision.RunningMode.IMAGE,
-            num_poses=1,
-            min_pose_detection_confidence=0.4,
-            min_pose_presence_confidence=0.4,
-            min_tracking_confidence=0.4,
-        )
-        _model_cache[cache_key] = vision.PoseLandmarker.create_from_options(options)
-    return _model_cache[cache_key]
-def _run_mediapipe(frames: list, path: str) -> list[dict]:
-    import cv2
-    import mediapipe as mp
-    try:
-        landmarker = _get_mediapipe_landmarker(path)
-    except Exception as e:
-        logger.warning("mediapipe load failed: %s", e)
-        return [{} for _ in frames]
-    out = []
-    for frame in frames:
-        try:
-            h, w = frame.shape[:2]
-            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
-            mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb)
-            detection = landmarker.detect(mp_image)
-            kps: dict[int, dict] = {}
-            if detection.pose_landmarks:
-                lms = detection.pose_landmarks[0]
-                for coco_idx, bp_idx in zip(_BP_DST, _BP_SRC):
-                    if bp_idx < len(lms):
-                        lm = lms[bp_idx]
-                        kps[coco_idx] = {
-                            "x": float(lm.x * w),
-                            "y": float(lm.y * h),
-                            "conf": float(lm.visibility),
-                        }
-            out.append(kps)
-        except Exception:
-            out.append({})
-    return out
-# ── Sapiens2 backend (Meta HF, transformers) ──────────────────────────────────
-def _get_sapiens2(hf_id: str) -> object:
-    if hf_id not in _model_cache:
-        from transformers import pipeline as hf_pipeline
-        _model_cache[hf_id] = hf_pipeline("pose-estimation", model=hf_id)
-    return _model_cache[hf_id]
-def _run_sapiens2(frames: list, hf_id: str) -> list[dict]:
-    try:
-        pipe = _get_sapiens2(hf_id)
-    except Exception as e:
-        logger.warning("sapiens2 load failed: %s", e)
-        return [{} for _ in frames]
-    from PIL import Image
-    out = []
-    for frame in frames:
-        try:
-            pil_img = Image.fromarray(frame)
-            result = pipe(pil_img)
-            if not result:
-                out.append({})
-                continue
-            # Take highest-confidence person (first result)
-            person = result[0]
-            keypoints = person.get("keypoints", [])
-            scores = person.get("keypoint_scores", [])
-            # Build name→(x, y, score) lookup from pipeline output
-            kp_lookup: dict[str, tuple] = {}
-            for i, kp in enumerate(keypoints):
-                if isinstance(kp, dict):
-                    name = kp.get("label", "")
-                    x, y = kp.get("x", 0.0), kp.get("y", 0.0)
-                else:
-                    name = ""
-                    x, y = float(kp[0]), float(kp[1])
-                score = float(scores[i]) if i < len(scores) else 0.0
-                if name:
-                    kp_lookup[name] = (x, y, score)
-            kps: dict[int, dict] = {}
-            for coco_idx, name in enumerate(COCO_KEYPOINTS):
-                if name in kp_lookup:
-                    x, y, s = kp_lookup[name]
-                    kps[coco_idx] = {"x": x, "y": y, "conf": s}
-            out.append(kps)
-        except Exception:
-            out.append({})
-    return out
-# ── Agent ─────────────────────────────────────────────────────────────────────
-class Pose2DAgent:
-    """Extracts COCO-17 keypoints per frame; dispatches to YOLO, MediaPipe, or Sapiens2."""
-    def run(self, ingest: IngestResult, model_key: str | None = None) -> Pose2DResult:
-        if not ingest.frames:
-            return Pose2DResult(keypoints=[], fps=ingest.fps, confidence=0.0, notes="no frames in ingest")
-        key = model_key or config.DEFAULT_POSE_MODEL
-        spec = config.POSE_MODELS.get(key)
-        if spec is None:
-            logger.warning("Unknown model_key %r — falling back to %s", key, config.DEFAULT_POSE_MODEL)
-            spec = config.POSE_MODELS[config.DEFAULT_POSE_MODEL]
-        backend = spec["backend"]
-        try:
-            if backend == "yolo":
-                kps_per_frame = _run_yolo(ingest.frames, spec["path"])
-            elif backend == "mediapipe":
-                kps_per_frame = _run_mediapipe(ingest.frames, spec["path"])
-            elif backend == "sapiens2":
-                kps_per_frame = _run_sapiens2(ingest.frames, spec["hf_id"])
-            else:
-                return Pose2DResult(
-                    keypoints=[{} for _ in ingest.frames],
-                    fps=ingest.fps, confidence=0.0,
-                    notes=f"unknown backend: {backend}",
-                )
-        except Exception as e:
-            return Pose2DResult(
-                keypoints=[{} for _ in ingest.frames],
-                fps=ingest.fps, confidence=0.0,
-                notes=str(e),
-            )
-        n_detected = sum(1 for f in kps_per_frame if f)
-        total_conf = sum(
-            sum(kp["conf"] for kp in f.values()) / len(f)
-            for f in kps_per_frame if f
-        )
-        overall_conf = (total_conf / n_detected) if n_detected > 0 else 0.0
-        notes = "" if n_detected > 0 else "no person detected in any frame"
-        return Pose2DResult(
-            keypoints=kps_per_frame,
-            fps=ingest.fps,
-            confidence=overall_conf,
-            notes=notes,
-        )

+"""
+Pose2DAgent — 2D per-frame keypoint extraction.
+Backends: yolo (local checkpoints, ultralytics), mediapipe (official Tasks API,
+          local .task checkpoint), sapiens2 (Meta HF/transformers).
+All backends output COCO-17 keypoints: dict[int, {x, y, conf}] per frame.
+Input:  IngestResult
+Output: Pose2DResult(keypoints per frame, fps, confidence)
+Failure: Pose2DResult(confidence=0.0, notes=<reason>) — never raises.
+Gated: yolo=no; mediapipe=no (local checkpoint); sapiens2=yes (access accepted).
+"""
+from __future__ import annotations
+import logging
+import numpy as np
+from formscout import config
+from formscout.types import IngestResult, Pose2DResult
+logger = logging.getLogger(__name__)
+COCO_KEYPOINTS = [
+    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
+    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
+    "left_wrist", "right_wrist", "left_hip", "right_hip",
+    "left_knee", "right_knee", "left_ankle", "right_ankle",
+]
+# BlazePose-33 source indices → COCO-17 target indices
+# BlazePose: 0=nose, 2=left_eye, 5=right_eye, 7=left_ear, 8=right_ear,
+#            11=left_shoulder, 12=right_shoulder, 13=left_elbow, 14=right_elbow,
+#            15=left_wrist, 16=right_wrist, 23=left_hip, 24=right_hip,
+#            25=left_knee, 26=right_knee, 27=left_ankle, 28=right_ankle
+_BP_SRC = [0, 2, 5, 7, 8, 11, 12, 13, 14, 15, 16, 23, 24, 25, 26, 27, 28]
+_BP_DST = list(range(17))  # COCO indices 0..16
+_model_cache: dict[str, object] = {}
+# ── YOLO backend ──────────────────────────────────────────────────────────────
+def _get_yolo(path: str) -> object:
+    if path not in _model_cache:
+        from ultralytics import YOLO
+        _model_cache[path] = YOLO(path)
+    return _model_cache[path]
+def _run_yolo(frames: list, path: str) -> list[dict]:
+    model = _get_yolo(path)
+    out = []
+    for frame in frames:
+        try:
+            results = model(frame, verbose=False)
+            kps: dict[int, dict] = {}
+            if results and results[0].keypoints is not None:
+                kp = results[0].keypoints
+                if kp.xy is not None and len(kp.xy) > 0:
+                    xy = kp.xy[0].cpu().numpy()
+                    conf = kp.conf[0].cpu().numpy()
+                    for j in range(min(len(xy), 17)):
+                        kps[j] = {"x": float(xy[j, 0]), "y": float(xy[j, 1]), "conf": float(conf[j])}
+            out.append(kps)
+        except Exception:
+            out.append({})
+    return out
+# ── MediaPipe backend (official Tasks API, local .task checkpoint) ────────────
+def _get_mediapipe_landmarker(path: str) -> object:
+    """Return PoseLandmarker cached by model path."""
+    cache_key = f"mp:{path}"
+    if cache_key not in _model_cache:
+        from mediapipe.tasks import python as mp_tasks
+        from mediapipe.tasks.python import vision
+        options = vision.PoseLandmarkerOptions(
+            base_options=mp_tasks.BaseOptions(model_asset_path=path),
+            running_mode=vision.RunningMode.IMAGE,
+            num_poses=1,
+            min_pose_detection_confidence=0.4,
+            min_pose_presence_confidence=0.4,
+            min_tracking_confidence=0.4,
+        )
+        _model_cache[cache_key] = vision.PoseLandmarker.create_from_options(options)
+    return _model_cache[cache_key]
+def _run_mediapipe(frames: list, path: str) -> list[dict]:
+    import cv2
+    import mediapipe as mp
+    try:
+        landmarker = _get_mediapipe_landmarker(path)
+    except Exception as e:
+        logger.warning("mediapipe load failed: %s", e)
+        return [{} for _ in frames]
+    out = []
+    for frame in frames:
+        try:
+            h, w = frame.shape[:2]
+            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
+            mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb)
+            detection = landmarker.detect(mp_image)
+            kps: dict[int, dict] = {}
+            if detection.pose_landmarks:
+                lms = detection.pose_landmarks[0]
+                for coco_idx, bp_idx in zip(_BP_DST, _BP_SRC):
+                    if bp_idx < len(lms):
+                        lm = lms[bp_idx]
+                        kps[coco_idx] = {
+                            "x": float(lm.x * w),
+                            "y": float(lm.y * h),
+                            "conf": float(lm.visibility),
+                        }
+            out.append(kps)
+        except Exception:
+            out.append({})
+    return out
+# ── Sapiens2 backend (Meta HF, transformers) ──────────────────────────────────
+def _get_sapiens2(hf_id: str) -> object:
+    if hf_id not in _model_cache:
+        from transformers import pipeline as hf_pipeline
+        _model_cache[hf_id] = hf_pipeline("pose-estimation", model=hf_id)
+    return _model_cache[hf_id]
+def _run_sapiens2(frames: list, hf_id: str) -> list[dict]:
+    try:
+        pipe = _get_sapiens2(hf_id)
+    except Exception as e:
+        logger.warning("sapiens2 load failed: %s", e)
+        return [{} for _ in frames]
+    from PIL import Image
+    out = []
+    for frame in frames:
+        try:
+            pil_img = Image.fromarray(frame)
+            result = pipe(pil_img)
+            if not result:
+                out.append({})
+                continue
+            # Take highest-confidence person (first result)
+            person = result[0]
+            keypoints = person.get("keypoints", [])
+            scores = person.get("keypoint_scores", [])
+            # Build name→(x, y, score) lookup from pipeline output
+            kp_lookup: dict[str, tuple] = {}
+            for i, kp in enumerate(keypoints):
+                if isinstance(kp, dict):
+                    name = kp.get("label", "")
+                    x, y = kp.get("x", 0.0), kp.get("y", 0.0)
+                else:
+                    name = ""
+                    x, y = float(kp[0]), float(kp[1])
+                score = float(scores[i]) if i < len(scores) else 0.0
+                if name:
+                    kp_lookup[name] = (x, y, score)
+            kps: dict[int, dict] = {}
+            for coco_idx, name in enumerate(COCO_KEYPOINTS):
+                if name in kp_lookup:
+                    x, y, s = kp_lookup[name]
+                    kps[coco_idx] = {"x": x, "y": y, "conf": s}
+            out.append(kps)
+        except Exception:
+            out.append({})
+    return out
+# ── Agent ─────────────────────────────────────────────────────────────────────
+class Pose2DAgent:
+    """Extracts COCO-17 keypoints per frame; dispatches to YOLO, MediaPipe, or Sapiens2."""
+    def run(self, ingest: IngestResult, model_key: str | None = None) -> Pose2DResult:
+        if not ingest.frames:
+            return Pose2DResult(keypoints=[], fps=ingest.fps, confidence=0.0, notes="no frames in ingest")
+        key = model_key or config.DEFAULT_POSE_MODEL
+        spec = config.POSE_MODELS.get(key)
+        if spec is None:
+            logger.warning("Unknown model_key %r — falling back to %s", key, config.DEFAULT_POSE_MODEL)
+            spec = config.POSE_MODELS[config.DEFAULT_POSE_MODEL]
+        backend = spec["backend"]
+        try:
+            if backend == "yolo":
+                kps_per_frame = _run_yolo(ingest.frames, spec["path"])
+            elif backend == "mediapipe":
+                kps_per_frame = _run_mediapipe(ingest.frames, spec["path"])
+            elif backend == "sapiens2":
+                kps_per_frame = _run_sapiens2(ingest.frames, spec["hf_id"])
+            else:
+                return Pose2DResult(
+                    keypoints=[{} for _ in ingest.frames],
+                    fps=ingest.fps, confidence=0.0,
+                    notes=f"unknown backend: {backend}",
+                )
+        except Exception as e:
+            return Pose2DResult(
+                keypoints=[{} for _ in ingest.frames],
+                fps=ingest.fps, confidence=0.0,
+                notes=str(e),
+            )
+        n_detected = sum(1 for f in kps_per_frame if f)
+        total_conf = sum(
+            sum(kp["conf"] for kp in f.values()) / len(f)
+            for f in kps_per_frame if f
+        )
+        overall_conf = (total_conf / n_detected) if n_detected > 0 else 0.0
+        notes = "" if n_detected > 0 else "no person detected in any frame"
+        return Pose2DResult(
+            keypoints=kps_per_frame,
+            fps=ingest.fps,
+            confidence=overall_conf,
+            notes=notes,
+        )
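
Review note on `pose2d.py`: the BlazePose-33 to COCO-17 remap and the confidence aggregation at the end of `Pose2DAgent.run` can be illustrated without MediaPipe. This is a minimal sketch, assuming landmarks arrive as `(x_norm, y_norm, visibility)` tuples. The function names `blazepose_to_coco` and `overall_confidence` are hypothetical stand-ins, not code from the PR.

```python
# Source indices into the 33 BlazePose landmarks, listed in COCO-17 order
# (same table as _BP_SRC in the diff).
BP_SRC = [0, 2, 5, 7, 8, 11, 12, 13, 14, 15, 16, 23, 24, 25, 26, 27, 28]


def blazepose_to_coco(landmarks, width, height):
    """Map normalized BlazePose landmarks to COCO-17 pixel keypoints.

    landmarks: sequence of (x_norm, y_norm, visibility) tuples (assumed shape).
    Returns dict[int, {"x", "y", "conf"}] keyed by COCO index 0..16.
    """
    kps = {}
    for coco_idx, bp_idx in enumerate(BP_SRC):
        if bp_idx < len(landmarks):
            x, y, vis = landmarks[bp_idx]
            kps[coco_idx] = {"x": x * width, "y": y * height, "conf": vis}
    return kps


def overall_confidence(kps_per_frame):
    """Mean of per-frame mean keypoint confidence, over frames with a detection.

    Empty frames (no person detected) are skipped; if every frame is empty
    the result is 0.0, matching the failure convention in the docstring.
    """
    detected = [f for f in kps_per_frame if f]
    if not detected:
        return 0.0
    per_frame = [sum(kp["conf"] for kp in f.values()) / len(f) for f in detected]
    return sum(per_frame) / len(per_frame)


if __name__ == "__main__":
    frame = blazepose_to_coco([(0.5, 0.25, 1.0)] * 33, width=200, height=100)
    print(len(frame), frame[0])
    print(overall_confidence([{}, frame]))
```

One consequence of this aggregation is that frames with no detection do not lower the overall confidence; a clip where only one frame in a hundred has a person can still report a high value.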

formscout/agents/report.py CHANGED

@@ -1,139 +1,139 @@
-"""
-ReportAgent — assembles per-test scorecard, composite, asymmetries.
-Input:  List of (MovementResult, BiomechFeatures, ScoreResult, JudgeResult) per test
-Output: ReportResult(per_test, composite, asymmetries, overlay_video_path, pdf_path)
-Failure: returns ReportResult with composite=None if any test unscored.
-Params: 0 (pure assembly — no model).
-License: n/a.
-Gated: no.
-"""
-from __future__ import annotations
-from formscout.types import (
-    MovementResult, BiomechFeatures, ScoreResult, JudgeResult, ReportResult,
-)
-from formscout import config
-# Bilateral tests that need L/R scoring
-BILATERAL_TESTS = {"hurdle_step", "inline_lunge", "shoulder_mobility", "active_slr"}
-class ReportAgent:
-    """Assembles the final screening report from all test results."""
-    def run(self, test_results: list[dict]) -> ReportResult:
-        """
-        Assemble the report.
-        Args:
-            test_results: list of dicts with keys:
-                - movement: MovementResult
-                - features: BiomechFeatures
-                - rubric_score: ScoreResult
-                - judge: JudgeResult
-                - side: str (for bilateral: "left" or "right")
-        """
-        per_test = []
-        asymmetries = []
-        low_confidence_flags = []
-        disagreement_flags = []
-        # Group bilateral tests by test_name
-        bilateral_groups: dict[str, list[dict]] = {}
-        unilateral: list[dict] = []
-        for entry in test_results:
-            test_name = entry["movement"].test_name
-            if test_name in BILATERAL_TESTS:
-                bilateral_groups.setdefault(test_name, []).append(entry)
-            else:
-                unilateral.append(entry)
-        # Process bilateral tests — take the lower score, emit asymmetry
-        for test_name, entries in bilateral_groups.items():
-            scores = []
-            for entry in entries:
-                judge = entry["judge"]
-                side = entry.get("side", entry["movement"].side)
-                score = judge.score if judge.score is not None else None
-                scores.append({"side": side, "score": score, "entry": entry})
-            # Find best entry per side
-            left = next((s for s in scores if s["side"] == "left"), None)
-            right = next((s for s in scores if s["side"] == "right"), None)
-            left_score = left["score"] if left else None
-            right_score = right["score"] if right else None
-            # Report lower
-            if left_score is not None and right_score is not None:
-                final_score = min(left_score, right_score)
-                delta = abs(left_score - right_score)
-                asymmetries.append({
-                    "test": test_name,
-                    "left_score": left_score,
-                    "right_score": right_score,
-                    "delta": delta,
-                })
-            elif left_score is not None:
-                final_score = left_score
-            elif right_score is not None:
-                final_score = right_score
-            else:
-                final_score = None
-            # Use the entry with the lower score for details
-            primary = (left["entry"] if left and (right is None or (left_score or 4) <= (right_score or 4))
-                      else right["entry"] if right else entries[0])
-            per_test.append({
-                "test_name": test_name,
-                "score": final_score,
-                "judge": primary["judge"],
-                "features": primary["features"],
-                "needs_human": primary["judge"].needs_human,
-            })
-            self._check_flags(primary, low_confidence_flags, disagreement_flags)
-        # Process unilateral tests
-        for entry in unilateral:
-            judge = entry["judge"]
-            per_test.append({
-                "test_name": entry["movement"].test_name,
-                "score": judge.score,
-                "judge": judge,
-                "features": entry["features"],
-                "needs_human": judge.needs_human,
-            })
-            self._check_flags(entry, low_confidence_flags, disagreement_flags)
-        # Composite — null if any test unscored
-        all_scores = [t["score"] for t in per_test]
-        composite = sum(all_scores) if all(s is not None for s in all_scores) else None
-        return ReportResult(
-            per_test=per_test,
-            composite=composite,
-            asymmetries=asymmetries,
-            overlay_video_path=None,  # Phase 4
-            pdf_path=None,  # Phase 4
-            low_confidence_flags=low_confidence_flags,
-            disagreement_flags=disagreement_flags,
-        )
-    def _check_flags(self, entry: dict, low_conf: list, disagree: list):
-        """Check quality gates and populate flag lists."""
-        judge = entry["judge"]
-        rubric = entry["rubric_score"]
-        test_name = entry["movement"].test_name
-        if judge.confidence < config.MIN_CONFIDENCE:
-            low_conf.append(f"{test_name}: judge confidence {judge.confidence:.2f}")
-        if (judge.score is not None and rubric.score is not None
-                and abs(judge.score - rubric.score) >= config.SCORE_DISAGREE_THRESH):
-            disagree.append(
-                f"{test_name}: rubric={rubric.score} vs judge={judge.score}"
-            )

+"""
+ReportAgent — assembles per-test scorecard, composite, asymmetries.
+Input:  List of (MovementResult, BiomechFeatures, ScoreResult, JudgeResult) per test
+Output: ReportResult(per_test, composite, asymmetries, overlay_video_path, pdf_path)
+Failure: returns ReportResult with composite=None if any test unscored.
+Params: 0 (pure assembly — no model).
+License: n/a.
+Gated: no.
+"""
+from __future__ import annotations
+from formscout.types import (
+    MovementResult, BiomechFeatures, ScoreResult, JudgeResult, ReportResult,
+)
+from formscout import config
+# Bilateral tests that need L/R scoring
+BILATERAL_TESTS = {"hurdle_step", "inline_lunge", "shoulder_mobility", "active_slr"}
+class ReportAgent:
+    """Assembles the final screening report from all test results."""
+    def run(self, test_results: list[dict]) -> ReportResult:
+        """
+        Assemble the report.
+        Args:
+            test_results: list of dicts with keys:
+                - movement: MovementResult
+                - features: BiomechFeatures
+                - rubric_score: ScoreResult
+                - judge: JudgeResult
+                - side: str (for bilateral: "left" or "right")
+        """
+        per_test = []
+        asymmetries = []
+        low_confidence_flags = []
+        disagreement_flags = []
+        # Group bilateral tests by test_name
+        bilateral_groups: dict[str, list[dict]] = {}
+        unilateral: list[dict] = []
+        for entry in test_results:
+            test_name = entry["movement"].test_name
+            if test_name in BILATERAL_TESTS:
+                bilateral_groups.setdefault(test_name, []).append(entry)
+            else:
+                unilateral.append(entry)
+        # Process bilateral tests — take the lower score, emit asymmetry
+        for test_name, entries in bilateral_groups.items():
+            scores = []
+            for entry in entries:
+                judge = entry["judge"]
+                side = entry.get("side", entry["movement"].side)
+                score = judge.score
+                scores.append({"side": side, "score": score, "entry": entry})
+            # Take the first recorded entry for each side
+            left = next((s for s in scores if s["side"] == "left"), None)
+            right = next((s for s in scores if s["side"] == "right"), None)
+            left_score = left["score"] if left else None
+            right_score = right["score"] if right else None
+            # Report lower
+            if left_score is not None and right_score is not None:
+                final_score = min(left_score, right_score)
+                delta = abs(left_score - right_score)
+                asymmetries.append({
+                    "test": test_name,
+                    "left_score": left_score,
+                    "right_score": right_score,
+                    "delta": delta,
+                })
+            elif left_score is not None:
+                final_score = left_score
+            elif right_score is not None:
+                final_score = right_score
+            else:
+                final_score = None
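+            # Use the entry with the lower score for details. None checks are
+            # explicit because a score of 0 is falsy and `score or 4` would rank it as 4.
+            left_key = 4 if left_score is None else left_score
+            right_key = 4 if right_score is None else right_score
+            primary = (left["entry"] if left and (right is None or left_key <= right_key)
+                      else right["entry"] if right else entries[0])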
+            per_test.append({
+                "test_name": test_name,
+                "score": final_score,
+                "judge": primary["judge"],
+                "features": primary["features"],
+                "needs_human": primary["judge"].needs_human,
+            })
+            self._check_flags(primary, low_confidence_flags, disagreement_flags)
+        # Process unilateral tests
+        for entry in unilateral:
+            judge = entry["judge"]
+            per_test.append({
+                "test_name": entry["movement"].test_name,
+                "score": judge.score,
+                "judge": judge,
+                "features": entry["features"],
+                "needs_human": judge.needs_human,
+            })
+            self._check_flags(entry, low_confidence_flags, disagreement_flags)
+        # Composite — null if any test unscored
+        all_scores = [t["score"] for t in per_test]
+        composite = sum(all_scores) if all(s is not None for s in all_scores) else None
+        return ReportResult(
+            per_test=per_test,
+            composite=composite,
+            asymmetries=asymmetries,
+            overlay_video_path=None,  # Phase 4
+            pdf_path=None,  # Phase 4
+            low_confidence_flags=low_confidence_flags,
+            disagreement_flags=disagreement_flags,
+        )
+    def _check_flags(self, entry: dict, low_conf: list, disagree: list):
+        """Check quality gates and populate flag lists."""
+        judge = entry["judge"]
+        rubric = entry["rubric_score"]
+        test_name = entry["movement"].test_name
+        if judge.confidence < config.MIN_CONFIDENCE:
+            low_conf.append(f"{test_name}: judge confidence {judge.confidence:.2f}")
+        if (judge.score is not None and rubric.score is not None
+                and abs(judge.score - rubric.score) >= config.SCORE_DISAGREE_THRESH):
+            disagree.append(
+                f"{test_name}: rubric={rubric.score} vs judge={judge.score}"
+            )
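
Review note on the bilateral aggregation above: the rule is "report the lower side, record the left/right delta, and leave the composite null if any test is unscored". The sketch below restates that rule in isolation. `combine_bilateral` and `composite` are hypothetical names used only for illustration and are not part of this PR. It assumes scores are small integers and that 0 is a legitimate score, which is why every missing-value check uses `is None` rather than truthiness.

```python
from __future__ import annotations


def combine_bilateral(left: int | None, right: int | None) -> tuple[int | None, int | None]:
    """Return (final_score, asymmetry_delta) for one bilateral test.

    Hypothetical helper, not part of the PR. A score of 0 is treated as a
    valid result, so missing sides are detected with `is None`.
    """
    if left is not None and right is not None:
        return min(left, right), abs(left - right)
    # Only one side (or neither) was scored: no asymmetry can be reported.
    return (left if left is not None else right), None


def composite(scores: list[int | None]) -> int | None:
    """Sum the per-test scores, or return None if any test is unscored."""
    if any(s is None for s in scores):
        return None
    return sum(scores)
```

With these helpers, a left score of 0 and a right score of 2 yields a final score of 0 with a delta of 2, so the reported score and the entry chosen for details stay on the same side.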

formscout/agents/visualizer.py CHANGED

@@ -1,435 +1,418 @@
-"""
-PoseVisualizer — annotated overlay video with skeleton, trails, velocity arrows.
-Input:  IngestResult + Pose2DResult
-Output: .mp4 path (or None on failure/empty layers)
-Failure: returns None, never raises.
-"""
-from __future__ import annotations
-import colorsys
-import logging
-import math
-import tempfile
-from collections import deque
-import cv2
-import numpy as np
-logger = logging.getLogger(__name__)
-# ── COCO constants ────────────────────────────────────────────────────────────
-COCO_KEYPOINTS = [
-    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
-    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
-    "left_wrist", "right_wrist", "left_hip", "right_hip",
-    "left_knee", "right_knee", "left_ankle", "right_ankle",
-]
-COCO_SKELETON = [
-    (0, 1), (0, 2), (1, 3), (2, 4),          # face
-    (5, 6), (5, 7), (7, 9), (6, 8), (8, 10), # arms
-    (5, 11), (6, 12), (11, 12),               # torso
-    (11, 13), (13, 15), (12, 14), (14, 16),  # legs
-]
-TRAIL_LENGTH = 10
-MAX_ARROW_PX = 40
-CONF_THRESHOLD = 0.3
-# ── Kalman filter ─────────────────────────────────────────────────────────────
-class SimpleKalmanFilter:
-    """4-state Kalman filter (x, y, vx, vy) for joint tracking."""
-    def __init__(self, process_noise: float = 0.01, measurement_noise: float = 0.1):
-        self.is_initialized = False
-        self.state = np.zeros(4)
-        self.cov = np.eye(4) * 0.1
-        self.Q = np.eye(4) * process_noise
-        self.R = np.eye(2) * measurement_noise
-        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
-    def predict(self, dt: float = 1.0):
-        F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
-        self.state = F @ self.state
-        self.cov = F @ self.cov @ F.T + self.Q
-    def update(self, x: float, y: float):
-        z = np.array([x, y])
-        if not self.is_initialized:
-            self.state[:2] = z
-            self.is_initialized = True
-            return
-        S = self.H @ self.cov @ self.H.T + self.R
-        K = self.cov @ self.H.T @ np.linalg.inv(S)
-        self.state = self.state + K @ (z - self.H @ self.state)
-        self.cov = (np.eye(4) - K @ self.H) @ self.cov
-    def velocity_magnitude(self) -> float:
-        vx, vy = self.state[2], self.state[3]
-        return math.sqrt(vx * vx + vy * vy)
-    def velocity_vector(self) -> tuple[float, float]:
-        return float(self.state[2]), float(self.state[3])
-# ── Velocity computation ──────────────────────────────────────────────────────
-def compute_joint_velocity(
-    keypoints_per_frame: list[dict],
-    fps: float,
-) -> dict[int, list[float]]:
-    """
-    Compute Kalman-filtered per-joint speed (px/s) for each frame.
-    Returns dict[joint_idx, [speed_frame0, ...]] for all 17 COCO joints.
-    Missing/low-confidence keypoints yield speed=0.0 for that frame.
-    """
-    dt = 1.0 / fps if fps > 0 else 1.0
-    filters: dict[int, SimpleKalmanFilter] = {j: SimpleKalmanFilter() for j in range(17)}
-    result: dict[int, list[float]] = {j: [] for j in range(17)}
-    for frame_kps in keypoints_per_frame:
-        for j in range(17):
-            kf = filters[j]
-            kp = frame_kps.get(j)
-            kf.predict(dt)
-            if kp and kp.get("conf", 0.0) >= CONF_THRESHOLD:
-                kf.update(kp["x"], kp["y"])
-                speed = kf.velocity_magnitude()
-            else:
-                speed = 0.0
-            result[j].append(speed)
-    return result
-# ── Helpers ───────────────────────────────────────────────────────────────────
-def _conf_to_bgr(conf: float) -> tuple[int, int, int]:
-    """Map confidence 0→1 to BGR color red→green via HSV."""
-    hue = conf * 120.0 / 360.0
-    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
-    return (int(b * 255), int(g * 255), int(r * 255))
-def _resolve_ffmpeg() -> str | None:
-    """Return a usable ffmpeg binary path: imageio-ffmpeg's bundle, then system PATH."""
-    try:
-        import imageio_ffmpeg
-        return imageio_ffmpeg.get_ffmpeg_exe()
-    except Exception:
-        pass
-    import shutil
-    return shutil.which("ffmpeg")
-# ── PoseVisualizer ────────────────────────────────────────────────────────────
-class PoseVisualizer:
-    """Renders skeleton, trails, and velocity arrows onto video frames."""
-    def __init__(self):
-        self.last_velocities: dict[int, list[float]] = {}
-    # ── Skeleton ──────────────────────────────────────────────────────────────
-    def _draw_skeleton(self, frame: np.ndarray, kps: dict) -> np.ndarray:
-        """Draw COCO-17 bones (white) and joints (confidence-colored) onto frame."""
-        visible = {j: kp for j, kp in kps.items() if kp.get("conf", 0.0) >= CONF_THRESHOLD}
-        # Bones
-        for j1, j2 in COCO_SKELETON:
-            if j1 in visible and j2 in visible:
-                p1 = (int(visible[j1]["x"]), int(visible[j1]["y"]))
-                p2 = (int(visible[j2]["x"]), int(visible[j2]["y"]))
-                cv2.line(frame, p1, p2, (255, 255, 255), 2)
-        # Joints
-        for j, kp in visible.items():
-            pt = (int(kp["x"]), int(kp["y"]))
-            color = _conf_to_bgr(kp["conf"])
-            cv2.circle(frame, pt, 4, color, -1)
-            cv2.circle(frame, pt, 5, (255, 255, 255), 1)
-        return frame
-    # ── Trails ───────────────────────────────────────────────────────────────
-    def _draw_trails(self, frame: np.ndarray, trail_history: dict) -> np.ndarray:
-        """Draw fading motion trails for each joint."""
-        for joint_idx, trail in trail_history.items():
-            pts = list(trail)
-            if len(pts) < 2:
-                continue
-            for i in range(1, len(pts)):
-                alpha = i / len(pts)
-                brightness = int(255 * alpha)
-                color = (brightness, brightness, brightness)
-                thickness = max(1, int(3 * alpha))
-                p1 = (int(pts[i - 1][0]), int(pts[i - 1][1]))
-                p2 = (int(pts[i][0]), int(pts[i][1]))
-                cv2.line(frame, p1, p2, color, thickness)
-        return frame
-    # ── Velocity arrows ───────────────────────────────────────────────────────
-    def _draw_velocity_arrows(
-        self,
-        frame: np.ndarray,
-        kps: dict,
-        prev_kps: dict | None,
-        velocities: dict[int, list[float]],
-        frame_idx: int,
-    ) -> np.ndarray:
-        """Draw per-joint velocity arrows scaled by speed."""
-        if prev_kps is None:
-            return frame
-        all_speeds = [velocities[j][frame_idx] for j in range(17) if frame_idx < len(velocities.get(j, []))]
-        peak = max(all_speeds) if all_speeds else 1.0
-        if peak == 0.0:
-            return frame
-        for j in range(17):
-            kp = kps.get(j)
-            pk = prev_kps.get(j)
-            if not kp or not pk:
-                continue
-            if kp.get("conf", 0.0) < CONF_THRESHOLD:
-                continue
-            speeds = velocities.get(j, [])
-            if frame_idx >= len(speeds):
-                continue
-            speed = speeds[frame_idx]
-            if speed == 0.0:
-                continue
-            dx = kp["x"] - pk["x"]
-            dy = kp["y"] - pk["y"]
-            mag = math.sqrt(dx * dx + dy * dy)
-            if mag < 1e-6:
-                continue
-            length = min(speed / peak * MAX_ARROW_PX, MAX_ARROW_PX)
-            nx, ny = dx / mag, dy / mag
-            start = (int(kp["x"]), int(kp["y"]))
-            end = (int(kp["x"] + nx * length), int(kp["y"] + ny * length))
-            ratio = speed / peak
-            if ratio < 0.33:
-                color = (0, 200, 0)     # green
-            elif ratio < 0.66:
-                color = (0, 140, 255)   # orange
-            else:
-                color = (0, 0, 255)     # red
-            cv2.arrowedLine(frame, start, end, color, 2, tipLength=0.35)
-        return frame
-    # ── Public ────────────────────────────────────────────────────────────────
-    def render_video(
-        self,
-        ingest,
-        pose2d,
-        layers: set[str],
-        output_path: str,
-    ) -> str | None:
-        """
-        Render annotated video. Returns output_path on success, None otherwise.
-        layers: subset of {"skeleton", "trails", "velocity_arrows"}
-        """
-        if not layers:
-            return None
-        if not any(pose2d.keypoints):
-            return None
-        try:
-            velocities = compute_joint_velocity(pose2d.keypoints, ingest.fps)
-            self.last_velocities = velocities
-            frames = ingest.frames
-            orig_h, orig_w = frames[0].shape[:2]
-            fps = ingest.fps or 30.0
-            # Cap at 1280px wide — big frames are slow and don't need to be HQ
-            max_w = 1280
-            if orig_w > max_w:
-                scale = max_w / orig_w
-                out_w = max_w
-                out_h = int(orig_h * scale)
-            else:
-                scale = 1.0
-                out_w, out_h = orig_w, orig_h
-            # Scale keypoint coordinates to match resized frames
-            def _scale_kps(kps: dict) -> dict:
-                if scale == 1.0:
-                    return kps
-                return {
-                    j: {**kp, "x": kp["x"] * scale, "y": kp["y"] * scale}
-                    for j, kp in kps.items()
-                }
-            scaled_keypoints = [_scale_kps(k) for k in pose2d.keypoints]
-            # Write raw mp4v to a temp file, then remux with ffmpeg faststart
-            import subprocess
-            import tempfile as _tf
-            tmp = _tf.NamedTemporaryFile(suffix="_raw.mp4", delete=False)
-            tmp_path = tmp.name
-            tmp.close()
-            fourcc = cv2.VideoWriter_fourcc(*"mp4v")
-            writer = cv2.VideoWriter(tmp_path, fourcc, fps, (out_w, out_h))
-            if not writer.isOpened():
-                logger.warning("VideoWriter failed to open: %s", tmp_path)
-                return None
-            trail_history: dict[int, deque] = {j: deque(maxlen=TRAIL_LENGTH) for j in range(17)}
-            prev_kps: dict | None = None
-            for frame_idx, (frame, kps) in enumerate(zip(frames, scaled_keypoints)):
-                if scale != 1.0:
-                    out_frame = cv2.resize(frame, (out_w, out_h), interpolation=cv2.INTER_AREA)
-                else:
-                    out_frame = frame.copy()
-                if "trails" in layers:
-                    for j, kp in kps.items():
-                        if kp.get("conf", 0.0) >= CONF_THRESHOLD:
-                            trail_history[j].append((kp["x"], kp["y"]))
-                    out_frame = self._draw_trails(out_frame, trail_history)
-                if "skeleton" in layers:
-                    out_frame = self._draw_skeleton(out_frame, kps)
-                if "velocity_arrows" in layers:
-                    out_frame = self._draw_velocity_arrows(
-                        out_frame, kps, prev_kps, velocities, frame_idx
-                    )
-                writer.write(out_frame)
-                prev_kps = kps
-            writer.release()
-            # Re-encode to H.264 (browsers/Gradio cannot play raw mp4v).
-            # Prefer imageio-ffmpeg's bundled binary so no system install is needed.
-            ffmpeg_bin = _resolve_ffmpeg()
-            if ffmpeg_bin:
-                try:
-                    subprocess.run(
-                        [ffmpeg_bin, "-y", "-i", tmp_path,
-                         "-c:v", "libx264", "-pix_fmt", "yuv420p",
-                         "-movflags", "+faststart", output_path],
-                        check=True, capture_output=True,
-                    )
-                    import os
-                    os.unlink(tmp_path)
-                    return output_path
-                except Exception as ffmpeg_err:
-                    logger.warning("ffmpeg H.264 re-encode failed (%s) — using raw mp4v", ffmpeg_err)
-            # No ffmpeg available — fall back to raw mp4v (may not play in browser)
-            import shutil
-            shutil.move(tmp_path, output_path)
-            return output_path
-        except Exception as e:
-            logger.warning("render_video failed: %s", e)
-            return None
-    def render_frame(
-        self,
-        ingest,
-        pose2d,
-        frame_idx: int,
-        layers: set[str],
-        caption: str = "",
-        out_png: str | None = None,
-    ) -> str | None:
-        """Render a single annotated still (skeleton + optional trails + caption).
-        frame_idx is typically the governing frame from BiomechFeatures.timing.
-        Returns the PNG path on success, None on any failure. Never raises.
-        """
-        try:
-            if not (0 <= frame_idx < len(ingest.frames)) or frame_idx >= len(pose2d.keypoints):
-                return None
-            frame = ingest.frames[frame_idx].copy()
-            kps = pose2d.keypoints[frame_idx]
-            if "trails" in layers:
-                trail: dict[int, deque] = {j: deque(maxlen=TRAIL_LENGTH) for j in range(17)}
-                start = max(0, frame_idx - TRAIL_LENGTH)
-                for fi in range(start, frame_idx + 1):
-                    for j, kp in pose2d.keypoints[fi].items():
-                        if kp.get("conf", 0.0) >= CONF_THRESHOLD:
-                            trail[j].append((kp["x"], kp["y"]))
-                frame = self._draw_trails(frame, trail)
-            if "skeleton" in layers:
-                frame = self._draw_skeleton(frame, kps)
-            if caption:
-                cv2.rectangle(frame, (0, 0), (frame.shape[1], 28), (0, 0, 0), -1)
-                cv2.putText(frame, caption[:80], (8, 20), cv2.FONT_HERSHEY_SIMPLEX,
-                            0.55, (255, 255, 255), 1, cv2.LINE_AA)
-            if out_png is None:
-                out_png = tempfile.NamedTemporaryFile(suffix=".png", delete=False).name
-            ok = cv2.imwrite(out_png, frame)
-            return out_png if ok else None
-        except Exception as e:
-            logger.warning("render_frame failed: %s", e)
-            return None
-# ── Velocity summary ──────────────────────────────────────────────────────────
-def build_velocity_summary(
-    keypoints_per_frame: list[dict],
-    velocities: dict[int, list[float]],
-) -> str:
-    """Return markdown table of per-joint avg/peak velocity. Empty string if no valid joints."""
-    n_frames = len(keypoints_per_frame)
-    if n_frames == 0:
-        return ""
-    rows = []
-    for j in range(17):
-        detected = sum(
-            1 for kps in keypoints_per_frame
-            if kps.get(j, {}).get("conf", 0.0) >= CONF_THRESHOLD
-        )
-        if detected < n_frames * 0.5:
-            continue
-        speeds = velocities.get(j, [])
-        if not speeds:
-            continue
-        avg_speed = sum(speeds) / len(speeds)
-        peak_speed = max(speeds)
-        rows.append((COCO_KEYPOINTS[j], avg_speed, peak_speed))
-    if not rows:
-        return ""
-    rows.sort(key=lambda r: r[2], reverse=True)
-    lines = [
-        "| Joint | Avg (px/s) | Peak (px/s) |",
-        "|---|---|---|",
-    ]
-    for name, avg, peak in rows:
-        lines.append(f"| {name} | {avg:.1f} | {peak:.1f} |")
-    return "\n".join(lines)

+"""
+PoseVisualizer — annotated overlay video with skeleton, trails, velocity arrows.
+Input:  IngestResult + Pose2DResult
+Output: .mp4 path (or None on failure/empty layers)
+Failure: returns None, never raises.
+"""
+from __future__ import annotations
+import colorsys
+import logging
+import math
+import tempfile
+from collections import deque
+import cv2
+import numpy as np
+logger = logging.getLogger(__name__)
+# ── COCO constants ────────────────────────────────────────────────────────────
+COCO_KEYPOINTS = [
+    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
+    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
+    "left_wrist", "right_wrist", "left_hip", "right_hip",
+    "left_knee", "right_knee", "left_ankle", "right_ankle",
+]
+COCO_SKELETON = [
+    (0, 1), (0, 2), (1, 3), (2, 4),          # face
+    (5, 6), (5, 7), (7, 9), (6, 8), (8, 10), # arms
+    (5, 11), (6, 12), (11, 12),               # torso
+    (11, 13), (13, 15), (12, 14), (14, 16),  # legs
+]
+TRAIL_LENGTH = 10
+MAX_ARROW_PX = 40
+CONF_THRESHOLD = 0.3
+# ── Kalman filter ─────────────────────────────────────────────────────────────
+class SimpleKalmanFilter:
+    """4-state Kalman filter (x, y, vx, vy) for joint tracking."""
+    def __init__(self, process_noise: float = 0.01, measurement_noise: float = 0.1):
+        self.is_initialized = False
+        self.state = np.zeros(4)
+        self.cov = np.eye(4) * 0.1
+        self.Q = np.eye(4) * process_noise
+        self.R = np.eye(2) * measurement_noise
+        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
+    def predict(self, dt: float = 1.0):
+        F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
+        self.state = F @ self.state
+        self.cov = F @ self.cov @ F.T + self.Q
+    def update(self, x: float, y: float):
+        z = np.array([x, y])
+        if not self.is_initialized:
+            self.state[:2] = z
+            self.is_initialized = True
+            return
+        S = self.H @ self.cov @ self.H.T + self.R
+        K = self.cov @ self.H.T @ np.linalg.inv(S)
+        self.state = self.state + K @ (z - self.H @ self.state)
+        self.cov = (np.eye(4) - K @ self.H) @ self.cov
+    def velocity_magnitude(self) -> float:
+        vx, vy = self.state[2], self.state[3]
+        return math.sqrt(vx * vx + vy * vy)
+    def velocity_vector(self) -> tuple[float, float]:
+        return float(self.state[2]), float(self.state[3])
+# ── Velocity computation ──────────────────────────────────────────────────────
+def compute_joint_velocity(
+    keypoints_per_frame: list[dict],
+    fps: float,
+) -> dict[int, list[float]]:
+    """
+    Compute Kalman-filtered per-joint speed (px/s) for each frame.
+    Returns dict[joint_idx, [speed_frame0, ...]] for all 17 COCO joints.
+    Missing/low-confidence keypoints yield speed=0.0 for that frame.
+    """
+    dt = 1.0 / fps if fps > 0 else 1.0
+    filters: dict[int, SimpleKalmanFilter] = {j: SimpleKalmanFilter() for j in range(17)}
+    result: dict[int, list[float]] = {j: [] for j in range(17)}
+    for frame_kps in keypoints_per_frame:
+        for j in range(17):
+            kf = filters[j]
+            kp = frame_kps.get(j)
+            kf.predict(dt)
+            if kp and kp.get("conf", 0.0) >= CONF_THRESHOLD:
+                kf.update(kp["x"], kp["y"])
+                speed = kf.velocity_magnitude()
+            else:
+                speed = 0.0
+            result[j].append(speed)
+    return result
+# ── Helpers ───────────────────────────────────────────────────────────────────
+def _conf_to_bgr(conf: float) -> tuple[int, int, int]:
+    """Map confidence 0→1 to BGR color red→green via HSV."""
+    hue = conf * 120.0 / 360.0
+    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
+    return (int(b * 255), int(g * 255), int(r * 255))
+# ── PoseVisualizer ────────────────────────────────────────────────────────────
+class PoseVisualizer:
+    """Renders skeleton, trails, and velocity arrows onto video frames."""
+    def __init__(self):
+        self.last_velocities: dict[int, list[float]] = {}
+    # ── Skeleton ──────────────────────────────────────────────────────────────
+    def _draw_skeleton(self, frame: np.ndarray, kps: dict) -> np.ndarray:
+        """Draw COCO-17 bones (white) and joints (confidence-colored) onto frame."""
+        visible = {j: kp for j, kp in kps.items() if kp.get("conf", 0.0) >= CONF_THRESHOLD}
+        # Bones
+        for j1, j2 in COCO_SKELETON:
+            if j1 in visible and j2 in visible:
+                p1 = (int(visible[j1]["x"]), int(visible[j1]["y"]))
+                p2 = (int(visible[j2]["x"]), int(visible[j2]["y"]))
+                cv2.line(frame, p1, p2, (255, 255, 255), 2)
+        # Joints
+        for j, kp in visible.items():
+            pt = (int(kp["x"]), int(kp["y"]))
+            color = _conf_to_bgr(kp["conf"])
+            cv2.circle(frame, pt, 4, color, -1)
+            cv2.circle(frame, pt, 5, (255, 255, 255), 1)
+        return frame
+    # ── Trails ───────────────────────────────────────────────────────────────
+    def _draw_trails(self, frame: np.ndarray, trail_history: dict) -> np.ndarray:
+        """Draw fading motion trails for each joint."""
+        for joint_idx, trail in trail_history.items():
+            pts = list(trail)
+            if len(pts) < 2:
+                continue
+            for i in range(1, len(pts)):
+                alpha = i / len(pts)
+                brightness = int(255 * alpha)
+                color = (brightness, brightness, brightness)
+                thickness = max(1, int(3 * alpha))
+                p1 = (int(pts[i - 1][0]), int(pts[i - 1][1]))
+                p2 = (int(pts[i][0]), int(pts[i][1]))
+                cv2.line(frame, p1, p2, color, thickness)
+        return frame
+    # ── Velocity arrows ───────────────────────────────────────────────────────
+    def _draw_velocity_arrows(
+        self,
+        frame: np.ndarray,
+        kps: dict,
+        prev_kps: dict | None,
+        velocities: dict[int, list[float]],
+        frame_idx: int,
+    ) -> np.ndarray:
+        """Draw per-joint velocity arrows scaled by speed."""
+        if prev_kps is None:
+            return frame
+        all_speeds = [velocities[j][frame_idx] for j in range(17) if frame_idx < len(velocities.get(j, []))]
+        peak = max(all_speeds) if all_speeds else 1.0
+        if peak == 0.0:
+            return frame
+        for j in range(17):
+            kp = kps.get(j)
+            pk = prev_kps.get(j)
+            if not kp or not pk:
+                continue
+            if kp.get("conf", 0.0) < CONF_THRESHOLD:
+                continue
+            speeds = velocities.get(j, [])
+            if frame_idx >= len(speeds):
+                continue
+            speed = speeds[frame_idx]
+            if speed == 0.0:
+                continue
+            dx = kp["x"] - pk["x"]
+            dy = kp["y"] - pk["y"]
+            mag = math.sqrt(dx * dx + dy * dy)
+            if mag < 1e-6:
+                continue
+            length = min(speed / peak * MAX_ARROW_PX, MAX_ARROW_PX)
+            nx, ny = dx / mag, dy / mag
+            start = (int(kp["x"]), int(kp["y"]))
+            end = (int(kp["x"] + nx * length), int(kp["y"] + ny * length))
+            ratio = speed / peak
+            if ratio < 0.33:
+                color = (0, 200, 0)     # green
+            elif ratio < 0.66:
+                color = (0, 140, 255)   # orange
+            else:
+                color = (0, 0, 255)     # red
+            cv2.arrowedLine(frame, start, end, color, 2, tipLength=0.35)
+        return frame
+    # ── Public ────────────────────────────────────────────────────────────────
+    def render_video(
+        self,
+        ingest,
+        pose2d,
+        layers: set[str],
+        output_path: str,
+    ) -> str | None:
+        """
+        Render annotated video. Returns output_path on success, None otherwise.
+        layers: subset of {"skeleton", "trails", "velocity_arrows"}
+        """
+        if not layers:
+            return None
+        if not any(pose2d.keypoints):
+            return None
+        try:
+            velocities = compute_joint_velocity(pose2d.keypoints, ingest.fps)
+            self.last_velocities = velocities
+            frames = ingest.frames
+            orig_h, orig_w = frames[0].shape[:2]
+            fps = ingest.fps or 30.0
+            # Cap at 1280px wide — big frames are slow and don't need to be HQ
+            max_w = 1280
+            if orig_w > max_w:
+                scale = max_w / orig_w
+                out_w = max_w
+                out_h = int(orig_h * scale)
+            else:
+                scale = 1.0
+                out_w, out_h = orig_w, orig_h
+            # Scale keypoint coordinates to match resized frames
+            def _scale_kps(kps: dict) -> dict:
+                if scale == 1.0:
+                    return kps
+                return {
+                    j: {**kp, "x": kp["x"] * scale, "y": kp["y"] * scale}
+                    for j, kp in kps.items()
+                }
+            scaled_keypoints = [_scale_kps(k) for k in pose2d.keypoints]
+            # Write raw mp4v to a temp file, then remux with ffmpeg faststart
+            import subprocess
+            import tempfile as _tf
+            tmp = _tf.NamedTemporaryFile(suffix="_raw.mp4", delete=False)
+            tmp_path = tmp.name
+            tmp.close()
+            fourcc = cv2.VideoWriter_fourcc(*"mp4v")
+            writer = cv2.VideoWriter(tmp_path, fourcc, fps, (out_w, out_h))
+            if not writer.isOpened():
+                logger.warning("VideoWriter failed to open: %s", tmp_path)
+                return None
+            trail_history: dict[int, deque] = {j: deque(maxlen=TRAIL_LENGTH) for j in range(17)}
+            prev_kps: dict | None = None
+            for frame_idx, (frame, kps) in enumerate(zip(frames, scaled_keypoints)):
+                if scale != 1.0:
+                    out_frame = cv2.resize(frame, (out_w, out_h), interpolation=cv2.INTER_AREA)
+                else:
+                    out_frame = frame.copy()
+                if "trails" in layers:
+                    for j, kp in kps.items():
+                        if kp.get("conf", 0.0) >= CONF_THRESHOLD:
+                            trail_history[j].append((kp["x"], kp["y"]))
+                    out_frame = self._draw_trails(out_frame, trail_history)
+                if "skeleton" in layers:
+                    out_frame = self._draw_skeleton(out_frame, kps)
+                if "velocity_arrows" in layers:
+                    out_frame = self._draw_velocity_arrows(
+                        out_frame, kps, prev_kps, velocities, frame_idx
+                    )
+                writer.write(out_frame)
+                prev_kps = kps
+            writer.release()
+            # Remux with faststart so browsers can seek without downloading the whole file
+            try:
+                subprocess.run(
+                    ["ffmpeg", "-y", "-i", tmp_path, "-c", "copy",
+                     "-movflags", "+faststart", output_path],
+                    check=True, capture_output=True,
+                )
+                import os
+                os.unlink(tmp_path)
+            except Exception as ffmpeg_err:
+                logger.warning("ffmpeg remux failed (%s) — using raw mp4v", ffmpeg_err)
+                import shutil
+                shutil.move(tmp_path, output_path)
+            return output_path
+        except Exception as e:
+            logger.warning("render_video failed: %s", e)
+            return None
+    def render_frame(
+        self,
+        ingest,
+        pose2d,
+        frame_idx: int,
+        layers: set[str],
+        caption: str = "",
+        out_png: str | None = None,
+    ) -> str | None:
+        """Render a single annotated still (skeleton + optional trails + caption).
+        frame_idx is typically the governing frame from BiomechFeatures.timing.
+        Returns the PNG path on success, None on any failure. Never raises.
+        """
+        try:
+            if not (0 <= frame_idx < len(ingest.frames)) or frame_idx >= len(pose2d.keypoints):
+                return None
+            frame = ingest.frames[frame_idx].copy()
+            kps = pose2d.keypoints[frame_idx]
+            if "trails" in layers:
+                trail: dict[int, deque] = {j: deque(maxlen=TRAIL_LENGTH) for j in range(17)}
+                start = max(0, frame_idx - TRAIL_LENGTH)
+                for fi in range(start, frame_idx + 1):
+                    for j, kp in pose2d.keypoints[fi].items():
+                        if kp.get("conf", 0.0) >= CONF_THRESHOLD:
+                            trail[j].append((kp["x"], kp["y"]))
+                frame = self._draw_trails(frame, trail)
+            if "skeleton" in layers:
+                frame = self._draw_skeleton(frame, kps)
+            if caption:
+                cv2.rectangle(frame, (0, 0), (frame.shape[1], 28), (0, 0, 0), -1)
+                cv2.putText(frame, caption[:80], (8, 20), cv2.FONT_HERSHEY_SIMPLEX,
+                            0.55, (255, 255, 255), 1, cv2.LINE_AA)
+            if out_png is None:
+                out_png = tempfile.NamedTemporaryFile(suffix=".png", delete=False).name
+            ok = cv2.imwrite(out_png, frame)
+            return out_png if ok else None
+        except Exception as e:
+            logger.warning("render_frame failed: %s", e)
+            return None
+# ── Velocity summary ──────────────────────────────────────────────────────────
+def build_velocity_summary(
+    keypoints_per_frame: list[dict],
+    velocities: dict[int, list[float]],
+) -> str:
+    """Return markdown table of per-joint avg/peak velocity. Empty string if no valid joints."""
+    n_frames = len(keypoints_per_frame)
+    if n_frames == 0:
+        return ""
+    rows = []
+    for j in range(17):
+        detected = sum(
+            1 for kps in keypoints_per_frame
+            if kps.get(j, {}).get("conf", 0.0) >= CONF_THRESHOLD
+        )
+        if detected < n_frames * 0.5:
+            continue
+        speeds = velocities.get(j, [])
+        if not speeds:
+            continue
+        avg_speed = sum(speeds) / len(speeds)
+        peak_speed = max(speeds)
+        rows.append((COCO_KEYPOINTS[j], avg_speed, peak_speed))
+    if not rows:
+        return ""
+    rows.sort(key=lambda r: r[2], reverse=True)
+    lines = [
+        "| Joint | Avg (px/s) | Peak (px/s) |",
+        "|---|---|---|",
+    ]
+    for name, avg, peak in rows:
+        lines.append(f"| {name} | {avg:.1f} | {peak:.1f} |")
+    return "\n".join(lines)
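Review note: the table-building tail of `build_velocity_summary` can be exercised on its own. The sketch below is a minimal stand-in, assuming rows are already filtered to `(name, avg, peak)` tuples; `format_velocity_table` is a hypothetical helper name, not part of this PR.

```python
def format_velocity_table(rows):
    """Render (joint_name, avg_px_per_s, peak_px_per_s) rows as a markdown table.

    Hypothetical helper mirroring the formatting step of build_velocity_summary:
    rows are sorted by peak speed, highest first; no rows yields an empty string.
    """
    if not rows:
        return ""
    ordered = sorted(rows, key=lambda r: r[2], reverse=True)
    lines = ["| Joint | Avg (px/s) | Peak (px/s) |", "|---|---|---|"]
    lines += [f"| {name} | {avg:.1f} | {peak:.1f} |" for name, avg, peak in ordered]
    return "\n".join(lines)


table = format_velocity_table([
    ("left_knee", 12.34, 40.0),
    ("right_wrist", 30.0, 95.5),
])
print(table)
```

The joint with the highest peak speed lands in the first data row, which is the ordering the diff's `rows.sort(key=lambda r: r[2], reverse=True)` produces.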

formscout/analysis/__init__.py ADDED

@@ -0,0 +1 @@
+"""FormScout movement-analysis engine: relevant joints, time series, Laban, charts."""

formscout/analysis/charts.py ADDED

	@@ -0,0 +1,171 @@

+"""
+Matplotlib chart generators for the screening report.
+Every function returns a PNG path on success or None on failure (never raises),
+so a chart problem degrades the report but never blocks scoring. Charts use the
+Silas palette and an Agg backend so they render headless on the Space.
+"""
+from __future__ import annotations
+import logging
+import matplotlib
+matplotlib.use("Agg")
+import matplotlib.pyplot as plt  # noqa: E402
+import numpy as np  # noqa: E402
+from formscout.analysis.relevant_joints import COCO_NAMES  # noqa: E402
+logger = logging.getLogger(__name__)
+TEAL = "#2b8a8a"
+GOLD = "#e0a43b"
+SAGE = "#9cbcad"
+INK = "#243a34"
+RED = "#d9534f"
+_PALETTE = [TEAL, GOLD, SAGE, "#7a5ca0", "#c2683c", "#3c8dbc"]
+def _save(fig, out_png: str) -> str | None:
+    try:
+        fig.savefig(out_png, dpi=110, bbox_inches="tight", facecolor="white")
+        return out_png
+    except Exception as e:
+        logger.warning("chart save failed: %s", e)
+        return None
+    finally:
+        plt.close(fig)
+def angle_over_time(series: dict, primary: str | None, governing_idx: int | None,
+                    out_png: str, title: str = "Joint angle over time") -> str | None:
+    """Angle-vs-frame for the relevant angles; primary emphasised, key-frame marked."""
+    try:
+        if not series:
+            return None
+        fig, ax = plt.subplots(figsize=(6.4, 3.2))
+        for i, (name, vals) in enumerate(series.items()):
+            arr = np.array(vals, dtype=float)
+            is_primary = name == primary
+            ax.plot(np.arange(len(arr)), arr,
+                    color=(TEAL if is_primary else _PALETTE[i % len(_PALETTE)]),
+                    lw=2.4 if is_primary else 1.3,
+                    alpha=1.0 if is_primary else 0.6,
+                    label=name.replace("_", " ") + (" ★" if is_primary else ""))
+        if governing_idx is not None:
+            ax.axvline(governing_idx, color=GOLD, ls="--", lw=1.5, label="key frame")
+        ax.set_xlabel("frame")
+        ax.set_ylabel("degrees")
+        ax.set_title(title, color=INK)
+        ax.legend(fontsize=7, loc="best")
+        ax.grid(True, alpha=0.2)
+        return _save(fig, out_png)
+    except Exception as e:
+        logger.warning("angle_over_time failed: %s", e)
+        return None
+def velocity_profile(keypoints: list, fps: float, joints: list[int],
+                     out_png: str, title: str = "Joint speed over time") -> str | None:
+    """Per-frame speed (px/s) of the relevant joints across the clip."""
+    try:
+        from formscout.agents.visualizer import compute_joint_velocity
+        vel = compute_joint_velocity(keypoints, fps or 30.0)
+        plot_joints = [j for j in joints if j in vel] or list(vel.keys())[:4]
+        if not plot_joints:
+            return None
+        fig, ax = plt.subplots(figsize=(6.4, 3.2))
+        for i, j in enumerate(plot_joints):
+            ax.plot(vel[j], color=_PALETTE[i % len(_PALETTE)], lw=1.6,
+                    label=COCO_NAMES.get(j, str(j)).replace("_", " "))
+        ax.set_xlabel("frame")
+        ax.set_ylabel("speed (px/s)")
+        ax.set_title(title, color=INK)
+        ax.legend(fontsize=7, loc="best")
+        ax.grid(True, alpha=0.2)
+        return _save(fig, out_png)
+    except Exception as e:
+        logger.warning("velocity_profile failed: %s", e)
+        return None
+def laban_radar(effort: dict, out_png: str, title: str = "Laban Effort") -> str | None:
+    """4-axis radar of the Effort factors (Space, Weight, Time, Flow)."""
+    try:
+        axes_order = ["space", "weight", "time", "flow"]
+        labels = ["Space\n(direct)", "Weight\n(strong)", "Time\n(sudden)", "Flow\n(free)"]
+        vals = [float(effort.get(k, 0.0)) for k in axes_order]
+        angles = np.linspace(0, 2 * np.pi, len(axes_order), endpoint=False).tolist()
+        vals_loop = vals + vals[:1]
+        angles_loop = angles + angles[:1]
+        fig, ax = plt.subplots(figsize=(4.2, 4.2), subplot_kw={"polar": True})
+        ax.plot(angles_loop, vals_loop, color=TEAL, lw=2)
+        ax.fill(angles_loop, vals_loop, color=TEAL, alpha=0.25)
+        ax.set_xticks(angles)
+        ax.set_xticklabels(labels, fontsize=8, color=INK)
+        ax.set_ylim(0, 1)
+        ax.set_yticks([0.25, 0.5, 0.75, 1.0])
+        ax.set_yticklabels(["", "0.5", "", "1.0"], fontsize=7)
+        ax.set_title(title, color=INK, pad=18)
+        return _save(fig, out_png)
+    except Exception as e:
+        logger.warning("laban_radar failed: %s", e)
+        return None
+def flexion_bars(flexion: dict, out_png: str,
+                 title: str = "Relevant joint flexion") -> str | None:
+    """Horizontal bars of relevant joint angles (deg) at the key frame."""
+    try:
+        if not flexion:
+            return None
+        names = [n.replace("_", " ") for n in flexion]
+        degs = [flexion[n]["deg"] for n in flexion]
+        colors = [TEAL if d >= 160 else GOLD if d >= 110 else RED for d in degs]
+        fig, ax = plt.subplots(figsize=(6.0, max(1.6, 0.5 * len(names) + 0.8)))
+        y = np.arange(len(names))
+        ax.barh(y, degs, color=colors)
+        ax.set_yticks(y)
+        ax.set_yticklabels(names, fontsize=8)
+        ax.set_xlim(0, 200)
+        ax.axvline(160, color=SAGE, ls=":", lw=1)
+        for yi, d in zip(y, degs):
+            ax.text(d + 3, yi, f"{d:.0f}°", va="center", fontsize=8, color=INK)
+        ax.set_xlabel("interior angle (°)  ·  higher = more open")
+        ax.set_title(title, color=INK)
+        ax.invert_yaxis()
+        return _save(fig, out_png)
+    except Exception as e:
+        logger.warning("flexion_bars failed: %s", e)
+        return None
+def symmetry_bars(asymmetries: list, out_png: str,
+                  title: str = "Left / right symmetry") -> str | None:
+    """Grouped L vs R score bars for bilateral tests."""
+    try:
+        rows = [a for a in asymmetries
+                if a.get("left_score") is not None and a.get("right_score") is not None]
+        if not rows:
+            return None
+        names = [a["test"].replace("_", " ") for a in rows]
+        left = [a["left_score"] for a in rows]
+        right = [a["right_score"] for a in rows]
+        x = np.arange(len(names))
+        w = 0.36
+        fig, ax = plt.subplots(figsize=(6.4, 3.2))
+        ax.bar(x - w / 2, left, w, color=TEAL, label="left")
+        ax.bar(x + w / 2, right, w, color=GOLD, label="right")
+        ax.set_xticks(x)
+        ax.set_xticklabels(names, fontsize=8, rotation=15, ha="right")
+        ax.set_ylim(0, 3.4)
+        ax.set_ylabel("score (0–3)")
+        ax.set_title(title, color=INK)
+        ax.legend(fontsize=8)
+        ax.grid(True, axis="y", alpha=0.2)
+        return _save(fig, out_png)
+    except Exception as e:
+        logger.warning("symmetry_bars failed: %s", e)
+        return None
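Review note: `laban_radar` closes the radar outline by repeating the first angle and value at the end of each list. The sketch below shows that step in isolation, using only the standard library in place of `np.linspace(..., endpoint=False)`; `closed_radar_polygon` is a hypothetical name used for illustration.

```python
import math


def closed_radar_polygon(effort, axes_order=("space", "weight", "time", "flow")):
    """Return (angles, values) for a closed radar outline.

    Angles are evenly spaced over the full circle, excluding 2*pi itself.
    The first point is appended again so the plotted line returns to its start.
    Missing factors default to 0.0, as in the diff.
    """
    n = len(axes_order)
    angles = [2 * math.pi * i / n for i in range(n)]
    vals = [float(effort.get(k, 0.0)) for k in axes_order]
    return angles + angles[:1], vals + vals[:1]


angles_loop, vals_loop = closed_radar_polygon({"space": 0.8, "weight": 0.4, "time": 0.1})
print(len(angles_loop), vals_loop)
```

Without the repeated first point, the line drawn by `ax.plot` would stop at the last axis and leave the polygon open.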

formscout/analysis/laban.py ADDED

	@@ -0,0 +1,127 @@

+"""
+Laban Movement Analysis — Effort factors from pose kinematics.
+Computes the four Effort factors over the joints relevant to a screening test:
+  - Space   (indirect 0 … direct 1)    — path directness of the leading joint
+  - Weight  (light 0 … strong 1)       — motion-energy of the relevant joints
+  - Time    (sustained 0 … sudden 1)   — impulsivity (peak vs mean speed)
+  - Flow    (bound 0 … free 1)         — smoothness (inverse normalised jerk)
+These are reproducible kinematic heuristics, not clinical LMA notation — the
+report labels them as such. Distances are normalised by torso length so the
+factors are scale-invariant across cameras. Pure function — no model, no I/O.
+"""
+from __future__ import annotations
+import math
+from formscout.agents.biomechanics import _get_joint
+from formscout.analysis.relevant_joints import (
+    COCO_NAMES, L_HIP, L_SHOULDER, R_HIP, R_SHOULDER, relevant_joints,
+)
+# Heuristic calibration references (movement is normalised to torso-lengths/sec).
+_WEIGHT_REF = 1.0      # energy (bl/s)^2 giving ~0.63 weight
+_TIME_LO, _TIME_HI = 1.5, 4.0   # peak/mean speed ratio mapped to [0, 1]
+_FLOW_JERK_REF = 6.0   # normalised jerk (bl/s^3) giving ~0.37 flow
+_LABELS = {
+    "space": ("indirect", "direct"),
+    "weight": ("light", "strong"),
+    "time": ("sustained", "sudden"),
+    "flow": ("bound", "free"),
+}
+def _torso_scale(frames) -> float:
+    """Median shoulder-hip distance across frames; 1.0 if unmeasurable."""
+    lengths = []
+    for kps in frames:
+        for sh, hip in ((L_SHOULDER, L_HIP), (R_SHOULDER, R_HIP)):
+            a, b = _get_joint(kps, sh), _get_joint(kps, hip)
+            if a and b:
+                lengths.append(math.hypot(a[0] - b[0], a[1] - b[1]))
+    if not lengths:
+        return 1.0
+    lengths.sort()
+    med = lengths[len(lengths) // 2]
+    return med if med > 1e-6 else 1.0
+def _joint_kinematics(frames, joint_id: int, dt: float, scale: float) -> dict | None:
+    """Speed/accel/jerk/directness for one joint trajectory (torso-length units)."""
+    pts = [_get_joint(kps, joint_id) for kps in frames]
+    valid = [(i, p) for i, p in enumerate(pts) if p is not None]
+    if len(valid) < 3:
+        return None
+    speeds, path_len = [], 0.0
+    for (i0, p0), (i1, p1) in zip(valid, valid[1:]):
+        d = math.hypot(p1[0] - p0[0], p1[1] - p0[1]) / scale
+        path_len += d
+        gap = max(1, i1 - i0)
+        speeds.append(d / (gap * dt))
+    if not speeds:
+        return None
+    net = math.hypot(valid[-1][1][0] - valid[0][1][0],
+                     valid[-1][1][1] - valid[0][1][1]) / scale
+    directness = net / path_len if path_len > 1e-6 else 0.0
+    accels = [abs(speeds[i + 1] - speeds[i]) / dt for i in range(len(speeds) - 1)]
+    jerks = [abs(accels[i + 1] - accels[i]) / dt for i in range(len(accels) - 1)]
+    mean_speed = sum(speeds) / len(speeds)
+    peak_speed = max(speeds)
+    return {
+        "mean_speed": mean_speed,
+        "peak_speed": peak_speed,
+        "energy": sum(s * s for s in speeds) / len(speeds),
+        "ratio": peak_speed / (mean_speed + 1e-6),
+        "mean_jerk": (sum(jerks) / len(jerks)) if jerks else 0.0,
+        "directness": min(1.0, max(0.0, directness)),
+    }
+def _clip01(x: float) -> float:
+    return min(1.0, max(0.0, x))
+def compute_laban(pose2d, test_name: str, fps: float) -> dict:
+    """Return the four Effort factors, their labels, and body emphasis."""
+    frames = pose2d.keypoints
+    dt = 1.0 / fps if fps and fps > 0 else 1.0 / 30.0
+    scale = _torso_scale(frames)
+    joints = relevant_joints(test_name) or list(range(17))
+    kin = {j: k for j in joints if (k := _joint_kinematics(frames, j, dt, scale))}
+    if not kin:
+        return {
+            "effort": {"space": 0.0, "weight": 0.0, "time": 0.0, "flow": 0.0},
+            "labels": {k: v[0] for k, v in _LABELS.items()},
+            "body_emphasis": [],
+            "notes": "insufficient motion to estimate Effort",
+        }
+    leader = max(kin, key=lambda j: kin[j]["mean_speed"])
+    lead = kin[leader]
+    weight = _clip01(1.0 - math.exp(-(sum(k["energy"] for k in kin.values()) / len(kin)) / _WEIGHT_REF))
+    time = _clip01((lead["ratio"] - _TIME_LO) / (_TIME_HI - _TIME_LO))
+    flow = _clip01(math.exp(-lead["mean_jerk"] / _FLOW_JERK_REF))
+    space = _clip01(lead["directness"])
+    effort = {"space": space, "weight": weight, "time": time, "flow": flow}
+    labels = {k: _LABELS[k][1] if v >= 0.5 else _LABELS[k][0] for k, v in effort.items()}
+    emphasis = sorted(kin.items(), key=lambda kv: kv[1]["mean_speed"], reverse=True)[:3]
+    body_emphasis = [(COCO_NAMES.get(j, str(j)), round(k["mean_speed"], 3)) for j, k in emphasis]
+    return {
+        "effort": {k: round(v, 3) for k, v in effort.items()},
+        "labels": labels,
+        "body_emphasis": body_emphasis,
+        "leading_joint": COCO_NAMES.get(leader, str(leader)),
+        "notes": "kinematic Effort estimate (heuristic, not clinical LMA notation)",
+    }
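Review note: the three calibration comments (`~0.63 weight`, `~0.37 flow`, and the `[0, 1]` ratio mapping) can be checked by plugging the reference values into the same formulas. The sketch below copies the constants and mappings from the diff into a standalone function; `effort_from_kinematics` is a hypothetical name, and the inputs are made-up numbers chosen to hit the reference points.

```python
import math

# Calibration constants copied from the diff.
WEIGHT_REF = 1.0
TIME_LO, TIME_HI = 1.5, 4.0
FLOW_JERK_REF = 6.0


def clip01(x):
    """Clamp x into [0, 1]."""
    return min(1.0, max(0.0, x))


def effort_from_kinematics(mean_energy, peak_mean_ratio, mean_jerk, directness):
    """Map kinematic summaries to the four Effort factors, as compute_laban does."""
    return {
        "space": clip01(directness),
        "weight": clip01(1.0 - math.exp(-mean_energy / WEIGHT_REF)),
        "time": clip01((peak_mean_ratio - TIME_LO) / (TIME_HI - TIME_LO)),
        "flow": clip01(math.exp(-mean_jerk / FLOW_JERK_REF)),
    }


# Inputs set to the reference values: energy == WEIGHT_REF, jerk == FLOW_JERK_REF,
# and a peak/mean ratio halfway between TIME_LO and TIME_HI.
effort = effort_from_kinematics(
    mean_energy=1.0, peak_mean_ratio=2.75, mean_jerk=6.0, directness=0.9
)
print({k: round(v, 3) for k, v in effort.items()})
```

At the reference energy the weight is 1 - e^-1 and at the reference jerk the flow is e^-1, which matches the rounded figures in the source comments.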

formscout/analysis/relevant_joints.py ADDED

	@@ -0,0 +1,122 @@

+"""
+Per-test relevant-joint and relevant-angle definitions.
+Drives the "always describe just the joint relevant to the screening action"
+requirement: each FMS test names the joints and the joint-angles that matter for
+its rubric, plus the single primary angle used for the headline angle-over-time
+graph. Angles are COCO (a, b, c) triplets — the angle is measured at joint b.
+"""
+from __future__ import annotations
+# COCO-17 joint indices
+NOSE = 0
+L_SHOULDER, R_SHOULDER = 5, 6
+L_ELBOW, R_ELBOW = 7, 8
+L_WRIST, R_WRIST = 9, 10
+L_HIP, R_HIP = 11, 12
+L_KNEE, R_KNEE = 13, 14
+L_ANKLE, R_ANKLE = 15, 16
+COCO_NAMES = {
+    0: "nose", 1: "left_eye", 2: "right_eye", 3: "left_ear", 4: "right_ear",
+    5: "left_shoulder", 6: "right_shoulder", 7: "left_elbow", 8: "right_elbow",
+    9: "left_wrist", 10: "right_wrist", 11: "left_hip", 12: "right_hip",
+    13: "left_knee", 14: "right_knee", 15: "left_ankle", 16: "right_ankle",
+}
+# Each test: the joints that matter, the named angles (a, b, c) measured at b,
+# and the primary angle for the headline graph.
+RELEVANT: dict[str, dict] = {
+    "deep_squat": {
+        "joints": [L_SHOULDER, R_SHOULDER, L_HIP, R_HIP, L_KNEE, R_KNEE, L_ANKLE, R_ANKLE],
+        "angles": {
+            "left_knee_flexion": (L_HIP, L_KNEE, L_ANKLE),
+            "right_knee_flexion": (R_HIP, R_KNEE, R_ANKLE),
+            "left_hip_flexion": (L_SHOULDER, L_HIP, L_KNEE),
+            "right_hip_flexion": (R_SHOULDER, R_HIP, R_KNEE),
+        },
+        "primary_angle": "left_knee_flexion",
+    },
+    "hurdle_step": {
+        "joints": [L_HIP, R_HIP, L_KNEE, R_KNEE, L_ANKLE, R_ANKLE],
+        "angles": {
+            "left_knee_flexion": (L_HIP, L_KNEE, L_ANKLE),
+            "right_knee_flexion": (R_HIP, R_KNEE, R_ANKLE),
+            "left_hip_flexion": (L_SHOULDER, L_HIP, L_KNEE),
+            "right_hip_flexion": (R_SHOULDER, R_HIP, R_KNEE),
+        },
+        "primary_angle": "left_hip_flexion",
+    },
+    "inline_lunge": {
+        "joints": [L_HIP, R_HIP, L_KNEE, R_KNEE, L_ANKLE, R_ANKLE],
+        "angles": {
+            "left_knee_flexion": (L_HIP, L_KNEE, L_ANKLE),
+            "right_knee_flexion": (R_HIP, R_KNEE, R_ANKLE),
+        },
+        "primary_angle": "left_knee_flexion",
+    },
+    "shoulder_mobility": {
+        "joints": [L_SHOULDER, R_SHOULDER, L_ELBOW, R_ELBOW, L_WRIST, R_WRIST],
+        "angles": {
+            "left_shoulder_angle": (L_ELBOW, L_SHOULDER, L_HIP),
+            "right_shoulder_angle": (R_ELBOW, R_SHOULDER, R_HIP),
+            "left_elbow_flexion": (L_SHOULDER, L_ELBOW, L_WRIST),
+            "right_elbow_flexion": (R_SHOULDER, R_ELBOW, R_WRIST),
+        },
+        "primary_angle": "left_shoulder_angle",
+    },
+    "active_slr": {
+        "joints": [L_HIP, R_HIP, L_KNEE, R_KNEE, L_ANKLE, R_ANKLE],
+        "angles": {
+            "left_hip_flexion": (L_SHOULDER, L_HIP, L_KNEE),
+            "right_hip_flexion": (R_SHOULDER, R_HIP, R_KNEE),
+            "left_knee_flexion": (L_HIP, L_KNEE, L_ANKLE),
+            "right_knee_flexion": (R_HIP, R_KNEE, R_ANKLE),
+        },
+        "primary_angle": "left_hip_flexion",
+    },
+    "trunk_stability_pushup": {
+        "joints": [L_SHOULDER, R_SHOULDER, L_ELBOW, R_ELBOW, L_HIP, R_HIP, L_ANKLE, R_ANKLE],
+        "angles": {
+            "left_elbow_flexion": (L_SHOULDER, L_ELBOW, L_WRIST),
+            "right_elbow_flexion": (R_SHOULDER, R_ELBOW, R_WRIST),
+            "left_hip_line": (L_SHOULDER, L_HIP, L_ANKLE),
+            "right_hip_line": (R_SHOULDER, R_HIP, R_ANKLE),
+        },
+        "primary_angle": "left_hip_line",
+    },
+    "rotary_stability": {
+        "joints": [L_SHOULDER, R_SHOULDER, L_ELBOW, R_ELBOW, L_HIP, R_HIP, L_KNEE, R_KNEE],
+        "angles": {
+            "left_hip_line": (L_SHOULDER, L_HIP, L_KNEE),
+            "right_hip_line": (R_SHOULDER, R_HIP, R_KNEE),
+        },
+        "primary_angle": "left_hip_line",
+    },
+}
+def relevant_joints(test_name: str) -> list[int]:
+    """COCO joint indices relevant to this test (empty for unknown tests)."""
+    return list(RELEVANT.get(test_name, {}).get("joints", []))
+def relevant_angles(test_name: str) -> dict[str, tuple]:
+    """Named (a, b, c) angle triplets relevant to this test."""
+    return dict(RELEVANT.get(test_name, {}).get("angles", {}))
+def primary_angle(test_name: str) -> str | None:
+    """Name of the headline angle for the angle-over-time graph, or None."""
+    return RELEVANT.get(test_name, {}).get("primary_angle")
+def openness_label(angle_deg: float) -> str:
+    """Describe how open/closed a joint is from its interior angle in degrees."""
+    if angle_deg >= 160:
+        return "open / extended"
+    if angle_deg >= 110:
+        return "mid-range"
+    if angle_deg >= 60:
+        return "flexed"
+    return "deeply flexed / closed"
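Review note: the `(a, b, c)` triplets only make sense together with an angle routine that measures at the middle joint. The PR reuses `_angle_between_points` from `formscout.agents.biomechanics`, which is not shown in this diff, so the sketch below is an assumption about its behaviour: a plain 2D interior angle at `b`. `interior_angle_deg` is a hypothetical name; `openness_label` is copied from the diff.

```python
import math


def interior_angle_deg(a, b, c):
    """Interior angle at point b (degrees) between rays b->a and b->c.

    Assumed stand-in for _angle_between_points; returns NaN for degenerate input.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 < 1e-9 or n2 < 1e-9:
        return float("nan")
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def openness_label(angle_deg):
    """Thresholds copied from the diff."""
    if angle_deg >= 160:
        return "open / extended"
    if angle_deg >= 110:
        return "mid-range"
    if angle_deg >= 60:
        return "flexed"
    return "deeply flexed / closed"


# Hip, knee, ankle in image coordinates: a straight leg, then a right-angle bend.
straight = interior_angle_deg((0, 0), (0, 1), (0, 2))
bent = interior_angle_deg((0, 0), (0, 1), (1, 1))
print(straight, openness_label(straight))
print(bent, openness_label(bent))
```

Under this definition a fully extended knee reads close to 180° and a deeper squat drives the value down, which is consistent with the "higher = more open" axis label in `flexion_bars`.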

formscout/analysis/timeseries.py ADDED

	@@ -0,0 +1,49 @@

+"""
+Per-frame time series for the joints relevant to a screening test.
+Builds angle-over-time series (used by the angle graph) and a per-joint flexion
+summary at a chosen frame (degrees + open/closed label). Reuses the biomechanics
+geometry helpers so angle definitions never diverge.
+"""
+from __future__ import annotations
+import math
+from formscout.agents.biomechanics import _angle_between_points, _get_joint
+from formscout.analysis.relevant_joints import openness_label, relevant_angles
+def angle_series(pose2d, test_name: str) -> dict[str, list[float]]:
+    """Return {angle_name: [deg per frame]} for this test's relevant angles.
+    Frames where the angle cannot be measured hold NaN.
+    """
+    angles = relevant_angles(test_name)
+    series: dict[str, list[float]] = {name: [] for name in angles}
+    for kps in pose2d.keypoints:
+        for name, (a, b, c) in angles.items():
+            pa, pb, pc = _get_joint(kps, a), _get_joint(kps, b), _get_joint(kps, c)
+            if pa and pb and pc:
+                series[name].append(_angle_between_points(pa, pb, pc))
+            else:
+                series[name].append(float("nan"))
+    return series
+def relevant_flexion_at(pose2d, test_name: str, frame_idx: int) -> dict[str, dict]:
+    """At one frame, the relevant joint angles with degree value + openness label.
+    Returns {angle_name: {"deg": float, "openness": str}}; angles that cannot be
+    measured at that frame are omitted.
+    """
+    out: dict[str, dict] = {}
+    if not (0 <= frame_idx < len(pose2d.keypoints)):
+        return out
+    kps = pose2d.keypoints[frame_idx]
+    for name, (a, b, c) in relevant_angles(test_name).items():
+        pa, pb, pc = _get_joint(kps, a), _get_joint(kps, b), _get_joint(kps, c)
+        if pa and pb and pc:
+            deg = _angle_between_points(pa, pb, pc)
+            if not math.isnan(deg):
+                out[name] = {"deg": deg, "openness": openness_label(deg)}
+    return out
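Review note: because `angle_series` pads unmeasurable frames with NaN, any consumer that reduces the series has to skip those entries explicitly, since `min()` over a list containing NaN is order-dependent. A minimal sketch of one such consumer follows; `deepest_frame` is a hypothetical helper, not something this PR defines.

```python
import math


def deepest_frame(series):
    """Return (frame_index, angle_deg) of the smallest measurable angle, or None.

    NaN entries (frames where the angle could not be measured) are ignored.
    """
    valid = [(v, i) for i, v in enumerate(series) if not math.isnan(v)]
    if not valid:
        return None
    value, index = min(valid)
    return index, value


nan = float("nan")
result = deepest_frame([170.0, nan, 95.0, 120.0])
print(result)
```

Keeping NaN in the series (rather than dropping frames) preserves frame alignment, so the returned index can be used directly as a frame number.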

formscout/config.py CHANGED

@@ -143,6 +143,18 @@ DEEP_SQUAT_KNEE_TRACKING_MARGIN_PX = 20
 LLAMA_CPP_HOST = "127.0.0.1"
 LLAMA_CPP_PORT_VLM = 8080
 LLAMA_CPP_PORT_EMBED = 8081
-# Model id sent in the OpenAI-compatible request. LM Studio uses this for
-# JIT auto-loading; native llama-server ignores it. Override with env var.
-LLAMA_CPP_MODEL = os.environ.get("FORMSCOUT_VLM_MODEL", "qwen/qwen3-vl-8b")

+# ─── Judge backend selection ────────────────────────────────────────────────
+# "llama_cpp"    : local llama-server (default for local dev)
+# "transformers" : in-process Qwen3-VL via transformers, GPU on HF Spaces (ZeroGPU)
+# "auto"         : transformers on a Space (SPACE_ID set), llama_cpp locally
+JUDGE_BACKEND = os.environ.get("FORMSCOUT_JUDGE_BACKEND", "auto")
+JUDGE_HF_MODEL = os.environ.get("FORMSCOUT_JUDGE_HF_MODEL", "Qwen/Qwen3-VL-8B-Instruct")
+ON_HF_SPACE = bool(os.environ.get("SPACE_ID"))
+def resolve_judge_backend() -> str:
+    """Resolve the effective judge backend from JUDGE_BACKEND + environment."""
+    if JUDGE_BACKEND in ("llama_cpp", "transformers"):
+        return JUDGE_BACKEND
+    return "transformers" if ON_HF_SPACE else "llama_cpp"
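Review note: `resolve_judge_backend` reads module-level globals, which makes it awkward to test without patching the environment. The sketch below restates the same decision as a pure function of its two inputs; `resolve_backend` and its parameters are hypothetical names used only to illustrate the behaviour.

```python
def resolve_backend(requested, on_space):
    """Pure restatement of the resolution rule in the diff.

    An explicit "llama_cpp" or "transformers" wins; anything else
    (including "auto") is decided by whether the app runs on a Space.
    """
    if requested in ("llama_cpp", "transformers"):
        return requested
    return "transformers" if on_space else "llama_cpp"


for requested in ("auto", "llama_cpp", "transformers", "typo"):
    print(requested, resolve_backend(requested, on_space=True),
          resolve_backend(requested, on_space=False))
```

One consequence worth noting: an unrecognised value such as a misspelt backend name is treated exactly like `"auto"` rather than raising, so a typo in `FORMSCOUT_JUDGE_BACKEND` would go unnoticed.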

formscout/pipeline.py CHANGED

@@ -1,111 +1,111 @@
-"""
-Director — deterministic state machine orchestrating the FormScout pipeline.
-NOT an LLM. Runs each agent in sequence, applies quality gates, and assembles
-the final PipelineState. Exposes run(video_path, config) -> PipelineState.
-"""
-from __future__ import annotations
-from pathlib import Path
-from formscout import config
-from formscout.types import (
-    PipelineState, Body3DResult, MovementResult,
-)
-from formscout.agents.ingest import IngestAgent
-from formscout.agents.pose2d import Pose2DAgent
-from formscout.agents.body3d import Body3DAgent
-from formscout.agents.biomechanics import BiomechanicsAgent
-from formscout.agents.classifier import MovementClassifierAgent
-from formscout.agents.judge import JudgeAgent
-from formscout.agents.report import ReportAgent
-from formscout.rubric import score_test
-class Director:
-    """
-    Orchestrates the FormScout agent pipeline as a deterministic state machine.
-    Quality gates are applied after each agent — never silently passes bad data.
-    """
-    def __init__(self):
-        self._ingest = IngestAgent()
-        self._pose2d = Pose2DAgent()
-        self._body3d = Body3DAgent()
-        self._biomechanics = BiomechanicsAgent()
-        self._classifier = MovementClassifierAgent()
-        self._judge = JudgeAgent()
-        self._report = ReportAgent()
-    def run(self, video_path: str, test_name: str = "deep_squat", side: str = "na", model_key: str | None = None) -> PipelineState:
-        """
-        Run the full pipeline on a single video.
-        test_name/side serve as manual override when provided (skips classifier).
-        model_key selects the pose backend (see config.POSE_MODELS).
-        """
-        state = PipelineState(video_path=video_path)
-        # ─── Ingest ───
-        state.ingest = self._ingest.run(video_path)
-        if state.ingest.confidence < config.MIN_CONFIDENCE:
-            state.errors.append("ingest: low confidence — video may be corrupt")
-            return state
-        # ─── Pose 2D ───
-        state.pose2d = self._pose2d.run(state.ingest, model_key=model_key)
-        if state.pose2d.confidence < config.MIN_CONFIDENCE:
-            state.warnings.append("pose2d: low confidence — no clear person detected")
-        # ─── Body 3D (optional) ───
-        masks = state.segment.masks if state.segment else []
-        frames = state.ingest.frames if state.ingest else []
-        state.body3d = self._body3d.run(state.pose2d, masks, frames=frames)
-        # ─── Movement classification ───
-        if test_name and test_name != "unknown":
-            # Manual override
-            state.movement = MovementResult(
-                test_name=test_name, side=side,
-                confidence=1.0, notes="manually specified",
-            )
-        else:
-            state.movement = self._classifier.run(state.ingest, state.pose2d)
-        # Gate: unknown test → stop
-        if state.movement.test_name == "unknown":
-            state.errors.append("movement classifier returned 'unknown' — manual override required")
-            return state
-        # ─── Biomechanics ───
-        state.features = self._biomechanics.run(
-            state.pose2d,
-            state.body3d or Body3DResult(used=False, joints_3d=[]),
-            state.movement,
-        )
-        if state.features.confidence < config.MIN_CONFIDENCE:
-            state.warnings.append(
-                f"biomechanics: low confidence ({state.features.confidence:.2f}) — physio review recommended"
-            )
-        # ─── Rubric Score ───
-        rubric_result = score_test(state.features)
-        state.stgcn_score = rubric_result  # Reusing field for rubric until ST-GCN is built
-        # ─── Judge ───
-        state.judge = self._judge.run(
-            state.features, rubric_result, state.movement, state.ingest,
-        )
-        # ─── Quality gates ───
-        # Gate: score disagreement
-        if (state.judge.score is not None and rubric_result.score is not None
-                and abs(state.judge.score - rubric_result.score) >= config.SCORE_DISAGREE_THRESH):
-            state.warnings.append(
-                f"score disagreement: rubric={rubric_result.score} vs judge={state.judge.score} — review recommended"
-            )
-        # Gate: needs_human
-        if state.judge.needs_human:
-            state.warnings.append("judge flagged needs_human — no auto-score emitted")
-        return state

+"""
+Director — deterministic state machine orchestrating the FormScout pipeline.
+NOT an LLM. Runs each agent in sequence, applies quality gates, and assembles
+the final PipelineState. Exposes run(video_path, test_name, side, model_key) -> PipelineState.
+"""
+from __future__ import annotations
+from pathlib import Path
+from formscout import config
+from formscout.types import (
+    PipelineState, Body3DResult, MovementResult,
+)
+from formscout.agents.ingest import IngestAgent
+from formscout.agents.pose2d import Pose2DAgent
+from formscout.agents.body3d import Body3DAgent
+from formscout.agents.biomechanics import BiomechanicsAgent
+from formscout.agents.classifier import MovementClassifierAgent
+from formscout.agents.judge import JudgeAgent
+from formscout.agents.report import ReportAgent
+from formscout.rubric import score_test
+class Director:
+    """
+    Orchestrates the FormScout agent pipeline as a deterministic state machine.
+    Quality gates are applied after each agent — never silently passes bad data.
+    """
+    def __init__(self):
+        self._ingest = IngestAgent()
+        self._pose2d = Pose2DAgent()
+        self._body3d = Body3DAgent()
+        self._biomechanics = BiomechanicsAgent()
+        self._classifier = MovementClassifierAgent()
+        self._judge = JudgeAgent()
+        self._report = ReportAgent()
+    def run(self, video_path: str, test_name: str = "deep_squat", side: str = "na", model_key: str | None = None) -> PipelineState:
+        """
+        Run the full pipeline on a single video.
+        test_name/side serve as manual override when provided (skips classifier).
+        model_key selects the pose backend (see config.POSE_MODELS).
+        """
+        state = PipelineState(video_path=video_path)
+        # ─── Ingest ───
+        state.ingest = self._ingest.run(video_path)
+        if state.ingest.confidence < config.MIN_CONFIDENCE:
+            state.errors.append("ingest: low confidence — video may be corrupt")
+            return state
+        # ─── Pose 2D ───
+        state.pose2d = self._pose2d.run(state.ingest, model_key=model_key)
+        if state.pose2d.confidence < config.MIN_CONFIDENCE:
+            state.warnings.append("pose2d: low confidence — no clear person detected")
+        # ─── Body 3D (optional) ───
+        masks = state.segment.masks if state.segment else []
+        frames = state.ingest.frames if state.ingest else []
+        state.body3d = self._body3d.run(state.pose2d, masks, frames=frames)
+        # ─── Movement classification ───
+        if test_name and test_name != "unknown":
+            # Manual override
+            state.movement = MovementResult(
+                test_name=test_name, side=side,
+                confidence=1.0, notes="manually specified",
+            )
+        else:
+            state.movement = self._classifier.run(state.ingest, state.pose2d)
+        # Gate: unknown test → stop
+        if state.movement.test_name == "unknown":
+            state.errors.append("movement classifier returned 'unknown' — manual override required")
+            return state
+        # ─── Biomechanics ───
+        state.features = self._biomechanics.run(
+            state.pose2d,
+            state.body3d or Body3DResult(used=False, joints_3d=[]),
+            state.movement,
+        )
+        if state.features.confidence < config.MIN_CONFIDENCE:
+            state.warnings.append(
+                f"biomechanics: low confidence ({state.features.confidence:.2f}) — physio review recommended"
+            )
+        # ─── Rubric Score ───
+        rubric_result = score_test(state.features)
+        state.stgcn_score = rubric_result  # Reusing field for rubric until ST-GCN is built
+        # ─── Judge ───
+        state.judge = self._judge.run(
+            state.features, rubric_result, state.movement, state.ingest,
+        )
+        # ─── Quality gates ───
+        # Gate: score disagreement
+        if (state.judge.score is not None and rubric_result.score is not None
+                and abs(state.judge.score - rubric_result.score) >= config.SCORE_DISAGREE_THRESH):
+            state.warnings.append(
+                f"score disagreement: rubric={rubric_result.score} vs judge={state.judge.score} — review recommended"
+            )
+        # Gate: needs_human
+        if state.judge.needs_human:
+            state.warnings.append("judge flagged needs_human — no auto-score emitted")
+        return state
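Review note: the score-disagreement gate at the end of `Director.run` is a small rule that can be stated and tested separately from the agents. The sketch below is a minimal version under stated assumptions: the threshold value `2` is a placeholder (the real `config.SCORE_DISAGREE_THRESH` is not shown in this diff), and `disagreement_warning` is a hypothetical helper name.

```python
def disagreement_warning(rubric_score, judge_score, threshold=2):
    """Return a warning string when the two scores differ by >= threshold.

    Returns None when either score is missing or the scores are close enough.
    The default threshold is a placeholder, not the project's configured value.
    """
    if rubric_score is None or judge_score is None:
        return None
    if abs(judge_score - rubric_score) >= threshold:
        return (f"score disagreement: rubric={rubric_score} "
                f"vs judge={judge_score}; review recommended")
    return None


warnings = [w for w in (
    disagreement_warning(3, 1),     # differs by 2: flagged
    disagreement_warning(2, 3),     # differs by 1: not flagged
    disagreement_warning(None, 2),  # missing score: not flagged
) if w is not None]
print(warnings)
```

As in the diff, a missing score on either side suppresses the comparison, so the `needs_human` gate remains the only signal in that case.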

formscout/rubric/__init__.py CHANGED

@@ -1,32 +1,32 @@
-"""
-FormScout rubric scorers — one pure-function scorer per FMS test.
-"""
-from formscout.rubric.deep_squat import score_deep_squat
-from formscout.rubric.hurdle_step import score_hurdle_step
-from formscout.rubric.inline_lunge import score_inline_lunge
-from formscout.rubric.shoulder_mobility import score_shoulder_mobility
-from formscout.rubric.active_slr import score_active_slr
-from formscout.rubric.trunk_stability_pushup import score_trunk_stability_pushup
-from formscout.rubric.rotary_stability import score_rotary_stability
-from formscout.types import BiomechFeatures, ScoreResult
-SCORERS = {
-    "deep_squat": score_deep_squat,
-    "hurdle_step": score_hurdle_step,
-    "inline_lunge": score_inline_lunge,
-    "shoulder_mobility": score_shoulder_mobility,
-    "active_slr": score_active_slr,
-    "trunk_stability_pushup": score_trunk_stability_pushup,
-    "rotary_stability": score_rotary_stability,
-}
-def score_test(features: BiomechFeatures) -> ScoreResult:
-    """Dispatch to the appropriate rubric scorer by test name."""
-    fn = SCORERS.get(features.test_name)
-    if fn is None:
-        return ScoreResult(
-            score=1, rationale=f"No rubric for test '{features.test_name}'",
-            confidence=0.0, notes="unknown test",
-        )
-    return fn(features)

+"""
+FormScout rubric scorers — one pure-function scorer per FMS test.
+"""
+from formscout.rubric.deep_squat import score_deep_squat
+from formscout.rubric.hurdle_step import score_hurdle_step
+from formscout.rubric.inline_lunge import score_inline_lunge
+from formscout.rubric.shoulder_mobility import score_shoulder_mobility
+from formscout.rubric.active_slr import score_active_slr
+from formscout.rubric.trunk_stability_pushup import score_trunk_stability_pushup
+from formscout.rubric.rotary_stability import score_rotary_stability
+from formscout.types import BiomechFeatures, ScoreResult
+SCORERS = {
+    "deep_squat": score_deep_squat,
+    "hurdle_step": score_hurdle_step,
+    "inline_lunge": score_inline_lunge,
+    "shoulder_mobility": score_shoulder_mobility,
+    "active_slr": score_active_slr,
+    "trunk_stability_pushup": score_trunk_stability_pushup,
+    "rotary_stability": score_rotary_stability,
+}
+def score_test(features: BiomechFeatures) -> ScoreResult:
+    """Dispatch to the appropriate rubric scorer by test name."""
+    fn = SCORERS.get(features.test_name)
+    if fn is None:
+        return ScoreResult(
+            score=1, rationale=f"No rubric for test '{features.test_name}'",
+            confidence=0.0, notes="unknown test",
+        )
+    return fn(features)
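
Reviewer note: the dispatch in `score_test` can be exercised in isolation. The sketch below is a minimal illustration, assuming stand-in dataclasses whose fields are inferred from how the scorers in this diff use them (`test_name`, `angles`, `alignments`, `confidence`; `score`, `rationale`, `confidence`, `notes`). The real definitions live in `formscout/types.py` and may differ; `score_demo` is a hypothetical placeholder scorer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BiomechFeatures:
    """Hypothetical stand-in for formscout.types.BiomechFeatures."""
    test_name: str
    angles: Dict[str, float] = field(default_factory=dict)
    alignments: Dict[str, bool] = field(default_factory=dict)
    confidence: float = 1.0


@dataclass
class ScoreResult:
    """Hypothetical stand-in for formscout.types.ScoreResult."""
    score: int
    rationale: str
    confidence: float
    notes: str = ""


def score_demo(features: BiomechFeatures) -> ScoreResult:
    """Placeholder scorer; the real scorers are the per-test modules."""
    return ScoreResult(score=2, rationale="demo", confidence=features.confidence)


# Registry keyed by test name, mirroring the SCORERS dict in the diff.
SCORERS: Dict[str, Callable[[BiomechFeatures], ScoreResult]] = {"demo": score_demo}


def score_test(features: BiomechFeatures) -> ScoreResult:
    """Look up the scorer by test name; unknown names get a zero-confidence 1."""
    fn = SCORERS.get(features.test_name)
    if fn is None:
        return ScoreResult(
            score=1,
            rationale=f"No rubric for test '{features.test_name}'",
            confidence=0.0,
            notes="unknown test",
        )
    return fn(features)


known = score_test(BiomechFeatures(test_name="demo", confidence=0.8))
unknown = score_test(BiomechFeatures(test_name="overhead_press"))
```

An unknown test name does not raise. It returns score 1 with confidence 0.0, so callers that care about the difference would need to check `confidence` or `notes`.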

formscout/rubric/active_slr.py CHANGED

@@ -1,51 +1,51 @@
-"""
-Active Straight-Leg Raise rubric scorer — pure function, no model calls.
-FMS ASLR Criteria (bilateral):
-- Score 3: raised leg malleolus past contralateral knee (>70°), down leg flat.
-- Score 2: malleolus between mid-thigh and knee (45-70°).
-- Score 1: malleolus below mid-thigh (<45°).
-- Score 0: PAIN — never auto-scored.
-"""
-from __future__ import annotations
-from formscout.types import BiomechFeatures, ScoreResult
-def score_active_slr(features: BiomechFeatures) -> ScoreResult:
-    """Pure rubric scorer for active straight-leg raise."""
-    angles = features.angles
-    alignments = features.alignments
-    has_angle = "raised_leg_angle_deg" in angles
-    if not has_angle:
-        return ScoreResult(
-            score=1, rationale="Insufficient data: leg raise angle not measurable",
-            confidence=0.3, notes="missing key measurements",
-        )
-    angle = angles["raised_leg_angle_deg"]
-    past_knee = alignments.get("past_contralateral_knee", False)
-    past_mid = alignments.get("past_mid_thigh", False)
-    down_flat = alignments.get("down_leg_flat", True)
-    rationale_parts = []
-    if past_knee and down_flat:
-        score = 3
-        rationale_parts.append(f"Raised leg at {angle:.0f}° (past contralateral knee)")
-    elif past_mid:
-        score = 2
-        rationale_parts.append(f"Raised leg at {angle:.0f}° (between mid-thigh and knee)")
-        if not down_flat:
-            rationale_parts.append("down leg lifted off surface")
-    else:
-        score = 1
-        rationale_parts.append(f"Raised leg only {angle:.0f}° (below mid-thigh)")
-    confidence = features.confidence * 0.9
-    return ScoreResult(
-        score=score, rationale="; ".join(rationale_parts),
-        confidence=confidence, notes="",
-    )

+"""
+Active Straight-Leg Raise rubric scorer — pure function, no model calls.
+FMS ASLR Criteria (bilateral):
+- Score 3: raised leg malleolus past contralateral knee (>70°), down leg flat.
+- Score 2: malleolus between mid-thigh and knee (45-70°).
+- Score 1: malleolus below mid-thigh (<45°).
+- Score 0: PAIN — never auto-scored.
+"""
+from __future__ import annotations
+from formscout.types import BiomechFeatures, ScoreResult
+def score_active_slr(features: BiomechFeatures) -> ScoreResult:
+    """Pure rubric scorer for active straight-leg raise."""
+    angles = features.angles
+    alignments = features.alignments
+    has_angle = "raised_leg_angle_deg" in angles
+    if not has_angle:
+        return ScoreResult(
+            score=1, rationale="Insufficient data: leg raise angle not measurable",
+            confidence=0.3, notes="missing key measurements",
+        )
+    angle = angles["raised_leg_angle_deg"]
+    past_knee = alignments.get("past_contralateral_knee", False)
+    past_mid = alignments.get("past_mid_thigh", False)
+    down_flat = alignments.get("down_leg_flat", True)
+    rationale_parts = []
+    if past_knee and down_flat:
+        score = 3
+        rationale_parts.append(f"Raised leg at {angle:.0f}° (past contralateral knee)")
+    elif past_mid:
+        score = 2
+        rationale_parts.append(f"Raised leg at {angle:.0f}° (between mid-thigh and knee)")
+        if not down_flat:
+            rationale_parts.append("down leg lifted off surface")
+    else:
+        score = 1
+        rationale_parts.append(f"Raised leg only {angle:.0f}° (below mid-thigh)")
+    confidence = features.confidence * 0.9
+    return ScoreResult(
+        score=score, rationale="; ".join(rationale_parts),
+        confidence=confidence, notes="",
+    )
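
Reviewer note: the docstring states angle thresholds (>70°, 45-70°, <45°), but the scorer branches on the `past_contralateral_knee` and `past_mid_thigh` flags, which are set elsewhere. The sketch below shows one way those flags could be derived from the angle so that the two stay consistent. The helper names and the exact boundary handling (45° inclusive, 70° exclusive) are assumptions, not the feature extractor's actual behaviour.

```python
from typing import Dict


def aslr_alignments_from_angle(angle_deg: float) -> Dict[str, bool]:
    """Hypothetical helper: map the raised-leg angle to the two alignment flags.

    Thresholds are taken from the docstring; boundary handling is an assumption.
    """
    return {
        "past_contralateral_knee": angle_deg > 70.0,
        "past_mid_thigh": angle_deg >= 45.0,
    }


def aslr_score(angle_deg: float, down_leg_flat: bool = True) -> int:
    """Same branch order as score_active_slr, reduced to the integer score."""
    flags = aslr_alignments_from_angle(angle_deg)
    if flags["past_contralateral_knee"] and down_leg_flat:
        return 3
    if flags["past_mid_thigh"]:
        return 2
    return 1
```

With this mapping, a leg raised past the knee while the down leg lifts drops to 2 rather than 1, because `past_mid_thigh` is also true for any angle above 70°.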

formscout/rubric/hurdle_step.py CHANGED

@@ -1,60 +1,60 @@
-"""
-Hurdle Step rubric scorer — pure function, no model calls.
-FMS Hurdle Step Criteria (bilateral — score each side, report lower):
-- Score 3: hips/knees/ankles aligned, minimal trunk movement, dowel/posture stable,
-           no contact with hurdle.
-- Score 2: movement completed with compensation (trunk lean, loss of alignment).
-- Score 1: contact with hurdle, loss of balance, or inability to maintain alignment.
-- Score 0: PAIN — never auto-scored.
-"""
-from __future__ import annotations
-from formscout.types import BiomechFeatures, ScoreResult
-def score_hurdle_step(features: BiomechFeatures) -> ScoreResult:
-    """Pure rubric scorer for hurdle step."""
-    angles = features.angles
-    alignments = features.alignments
-    has_hip_flex = "step_hip_flexion_deg" in angles
-    if not has_hip_flex:
-        return ScoreResult(
-            score=1, rationale="Insufficient data: hip flexion not measurable",
-            confidence=0.3, notes="missing key measurements",
-        )
-    trunk_stable = alignments.get("trunk_stable", False)
-    stance_extended = alignments.get("stance_knee_extended", False)
-    hip_flex = angles.get("step_hip_flexion_deg", 0)
-    rationale_parts = []
-    # Score 3: good hip flexion, trunk stable, stance solid
-    if hip_flex > 90 and trunk_stable and stance_extended:
-        score = 3
-        rationale_parts.append("Hip flexion adequate, trunk stable, stance knee extended")
-    elif hip_flex > 70 or (trunk_stable and stance_extended):
-        score = 2
-        if not trunk_stable:
-            rationale_parts.append("trunk lean detected")
-        if not stance_extended:
-            rationale_parts.append("stance knee flexion")
-        if hip_flex <= 90:
-            rationale_parts.append(f"hip flexion {hip_flex:.0f}° (borderline)")
-        rationale_parts.insert(0, "Movement completed with compensation")
-    else:
-        score = 1
-        rationale_parts.append("Unable to maintain alignment")
-        if not trunk_stable:
-            rationale_parts.append("significant trunk lean")
-        if not stance_extended:
-            rationale_parts.append("stance knee collapse")
-    confidence = features.confidence * 0.85
-    return ScoreResult(
-        score=score, rationale="; ".join(rationale_parts),
-        confidence=confidence, notes="",
-    )

+"""
+Hurdle Step rubric scorer — pure function, no model calls.
+FMS Hurdle Step Criteria (bilateral — score each side, report lower):
+- Score 3: hips/knees/ankles aligned, minimal trunk movement, dowel/posture stable,
+           no contact with hurdle.
+- Score 2: movement completed with compensation (trunk lean, loss of alignment).
+- Score 1: contact with hurdle, loss of balance, or inability to maintain alignment.
+- Score 0: PAIN — never auto-scored.
+"""
+from __future__ import annotations
+from formscout.types import BiomechFeatures, ScoreResult
+def score_hurdle_step(features: BiomechFeatures) -> ScoreResult:
+    """Pure rubric scorer for hurdle step."""
+    angles = features.angles
+    alignments = features.alignments
+    has_hip_flex = "step_hip_flexion_deg" in angles
+    if not has_hip_flex:
+        return ScoreResult(
+            score=1, rationale="Insufficient data: hip flexion not measurable",
+            confidence=0.3, notes="missing key measurements",
+        )
+    trunk_stable = alignments.get("trunk_stable", False)
+    stance_extended = alignments.get("stance_knee_extended", False)
+    hip_flex = angles.get("step_hip_flexion_deg", 0)
+    rationale_parts = []
+    # Score 3: good hip flexion, trunk stable, stance solid
+    if hip_flex > 90 and trunk_stable and stance_extended:
+        score = 3
+        rationale_parts.append("Hip flexion adequate, trunk stable, stance knee extended")
+    elif hip_flex > 70 or (trunk_stable and stance_extended):
+        score = 2
+        if not trunk_stable:
+            rationale_parts.append("trunk lean detected")
+        if not stance_extended:
+            rationale_parts.append("stance knee flexion")
+        if hip_flex <= 90:
+            rationale_parts.append(f"hip flexion {hip_flex:.0f}° (borderline)")
+        rationale_parts.insert(0, "Movement completed with compensation")
+    else:
+        score = 1
+        rationale_parts.append("Unable to maintain alignment")
+        if not trunk_stable:
+            rationale_parts.append("significant trunk lean")
+        if not stance_extended:
+            rationale_parts.append("stance knee collapse")
+    confidence = features.confidence * 0.85
+    return ScoreResult(
+        score=score, rationale="; ".join(rationale_parts),
+        confidence=confidence, notes="",
+    )
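
Reviewer note: the docstring says bilateral tests are scored per side and the lower score is reported, but `score_hurdle_step` takes a single `BiomechFeatures`. The sketch below illustrates how a caller might combine two per-side results. `combine_bilateral` and the stand-in `ScoreResult` are hypothetical; whether the pipeline or session code already does this is not visible in this hunk.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ScoreResult:
    """Hypothetical stand-in for formscout.types.ScoreResult."""
    score: int
    rationale: str
    confidence: float
    notes: str = ""


def combine_bilateral(left: ScoreResult, right: ScoreResult) -> ScoreResult:
    """Report the lower-scoring side; ties resolve to the left side."""
    if left.score <= right.score:
        lower, side = left, "left"
    else:
        lower, side = right, "right"
    notes = f"reported side: {side}"
    if left.score != right.score:
        notes += "; asymmetry between sides"
    return replace(lower, notes=notes)


left = ScoreResult(score=3, rationale="aligned", confidence=0.8)
right = ScoreResult(score=2, rationale="trunk lean detected", confidence=0.7)
combined = combine_bilateral(left, right)
tied = combine_bilateral(left, replace(right, score=3))
```

Recording which side was reported keeps the asymmetry visible even though only one number is shown.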

formscout/rubric/inline_lunge.py CHANGED

@@ -1,58 +1,58 @@
-"""
-In-Line Lunge rubric scorer — pure function, no model calls.
-FMS In-Line Lunge Criteria (bilateral):
-- Score 3: dowel contacts maintained, no torso movement, knee touches behind heel.
-- Score 2: movement completed with compensation (trunk lean, loss of balance).
-- Score 1: loss of balance, inability to maintain foot contact or posture.
-- Score 0: PAIN — never auto-scored.
-"""
-from __future__ import annotations
-from formscout.types import BiomechFeatures, ScoreResult
-def score_inline_lunge(features: BiomechFeatures) -> ScoreResult:
-    """Pure rubric scorer for in-line lunge."""
-    angles = features.angles
-    alignments = features.alignments
-    has_knee = "front_knee_flexion_deg" in angles
-    if not has_knee:
-        return ScoreResult(
-            score=1, rationale="Insufficient data: knee flexion not measurable",
-            confidence=0.3, notes="missing key measurements",
-        )
-    knee_flex = angles.get("front_knee_flexion_deg", 180)
-    trunk_upright = alignments.get("trunk_upright", False)
-    knee_over_ankle = alignments.get("knee_over_ankle", False)
-    rationale_parts = []
-    # Good lunge: knee flexion < 90° (deep), trunk upright, knee aligned
-    deep_enough = knee_flex < 100
-    if deep_enough and trunk_upright and knee_over_ankle:
-        score = 3
-        rationale_parts.append("Deep lunge with trunk upright and knee aligned")
-    elif deep_enough or (trunk_upright and knee_over_ankle):
-        score = 2
-        if not trunk_upright:
-            rationale_parts.append(f"trunk lean {angles.get('trunk_lean_from_vertical_deg', '?')}°")
-        if not knee_over_ankle:
-            rationale_parts.append("knee drifts past ankle")
-        if not deep_enough:
-            rationale_parts.append(f"knee flexion {knee_flex:.0f}° (insufficient depth)")
-        rationale_parts.insert(0, "Completed with compensation")
-    else:
-        score = 1
-        rationale_parts.append("Unable to complete lunge pattern")
-        if not deep_enough:
-            rationale_parts.append(f"knee flexion only {knee_flex:.0f}°")
-    confidence = features.confidence * 0.85
-    return ScoreResult(
-        score=score, rationale="; ".join(rationale_parts),
-        confidence=confidence, notes="",
-    )

+"""
+In-Line Lunge rubric scorer — pure function, no model calls.
+FMS In-Line Lunge Criteria (bilateral):
+- Score 3: dowel contacts maintained, no torso movement, knee touches behind heel.
+- Score 2: movement completed with compensation (trunk lean, loss of balance).
+- Score 1: loss of balance, inability to maintain foot contact or posture.
+- Score 0: PAIN — never auto-scored.
+"""
+from __future__ import annotations
+from formscout.types import BiomechFeatures, ScoreResult
+def score_inline_lunge(features: BiomechFeatures) -> ScoreResult:
+    """Pure rubric scorer for in-line lunge."""
+    angles = features.angles
+    alignments = features.alignments
+    has_knee = "front_knee_flexion_deg" in angles
+    if not has_knee:
+        return ScoreResult(
+            score=1, rationale="Insufficient data: knee flexion not measurable",
+            confidence=0.3, notes="missing key measurements",
+        )
+    knee_flex = angles.get("front_knee_flexion_deg", 180)
+    trunk_upright = alignments.get("trunk_upright", False)
+    knee_over_ankle = alignments.get("knee_over_ankle", False)
+    rationale_parts = []
+    # Good lunge: knee flexion < 100° (deep), trunk upright, knee aligned
+    deep_enough = knee_flex < 100
+    if deep_enough and trunk_upright and knee_over_ankle:
+        score = 3
+        rationale_parts.append("Deep lunge with trunk upright and knee aligned")
+    elif deep_enough or (trunk_upright and knee_over_ankle):
+        score = 2
+        if not trunk_upright:
+            rationale_parts.append(f"trunk lean {angles.get('trunk_lean_from_vertical_deg', '?')}°")
+        if not knee_over_ankle:
+            rationale_parts.append("knee drifts past ankle")
+        if not deep_enough:
+            rationale_parts.append(f"knee flexion {knee_flex:.0f}° (insufficient depth)")
+        rationale_parts.insert(0, "Completed with compensation")
+    else:
+        score = 1
+        rationale_parts.append("Unable to complete lunge pattern")
+        if not deep_enough:
+            rationale_parts.append(f"knee flexion only {knee_flex:.0f}°")
+    confidence = features.confidence * 0.85
+    return ScoreResult(
+        score=score, rationale="; ".join(rationale_parts),
+        confidence=confidence, notes="",
+    )

formscout/rubric/rotary_stability.py CHANGED

@@ -1,56 +1,56 @@
-"""
-Rotary Stability rubric scorer — pure function, no model calls.
-FMS Rotary Stability Criteria:
-- Score 3: unilateral (same-side) arm/leg extension with trunk stable,
-           elbow/knee touch performed smoothly.
-- Score 2: contralateral (opposite) arm/leg extension performed with trunk stable.
-- Score 1: inability to maintain trunk stability during contralateral pattern.
-- Score 0: PAIN (spinal flexion clearing test) — never auto-scored.
-"""
-from __future__ import annotations
-from formscout.types import BiomechFeatures, ScoreResult
-def score_rotary_stability(features: BiomechFeatures) -> ScoreResult:
-    """Pure rubric scorer for rotary stability."""
-    angles = features.angles
-    alignments = features.alignments
-    has_data = "trunk_stability_std_px" in angles or "shoulder_level_diff_px" in angles
-    if not has_data:
-        return ScoreResult(
-            score=1, rationale="Insufficient data: trunk stability not measurable",
-            confidence=0.3, notes="missing key measurements",
-        )
-    trunk_stable = alignments.get("trunk_stable", False)
-    shoulders_level = alignments.get("shoulders_level", False)
-    hips_level = alignments.get("hips_level", False)
-    rationale_parts = []
-    # Without video classification of ipsi vs contra, assume contralateral (safer)
-    if trunk_stable and shoulders_level and hips_level:
-        score = 2  # Assume contralateral unless classifier says ipsilateral
-        rationale_parts.append("Trunk stable during extension, shoulders and hips level")
-        rationale_parts.append("scored as contralateral pattern (default)")
-    elif trunk_stable or (shoulders_level and hips_level):
-        score = 2
-        if not trunk_stable:
-            rationale_parts.append("minor trunk instability")
-        rationale_parts.insert(0, "Contralateral pattern with minor compensation")
-    else:
-        score = 1
-        std = angles.get("trunk_stability_std_px", 0)
-        rationale_parts.append(f"Trunk instability detected (std {std:.1f}px)")
-        if not shoulders_level:
-            rationale_parts.append("shoulder asymmetry during extension")
-    confidence = features.confidence * 0.75  # Lower confidence — hard to assess from 2D
-    return ScoreResult(
-        score=score, rationale="; ".join(rationale_parts),
-        confidence=confidence, notes="ipsi/contra distinction requires VLM classifier",
-    )

+"""
+Rotary Stability rubric scorer — pure function, no model calls.
+FMS Rotary Stability Criteria:
+- Score 3: unilateral (same-side) arm/leg extension with trunk stable,
+           elbow/knee touch performed smoothly.
+- Score 2: contralateral (opposite) arm/leg extension performed with trunk stable.
+- Score 1: inability to maintain trunk stability during contralateral pattern.
+- Score 0: PAIN (spinal flexion clearing test) — never auto-scored.
+"""
+from __future__ import annotations
+from formscout.types import BiomechFeatures, ScoreResult
+def score_rotary_stability(features: BiomechFeatures) -> ScoreResult:
+    """Pure rubric scorer for rotary stability."""
+    angles = features.angles
+    alignments = features.alignments
+    has_data = "trunk_stability_std_px" in angles or "shoulder_level_diff_px" in angles
+    if not has_data:
+        return ScoreResult(
+            score=1, rationale="Insufficient data: trunk stability not measurable",
+            confidence=0.3, notes="missing key measurements",
+        )
+    trunk_stable = alignments.get("trunk_stable", False)
+    shoulders_level = alignments.get("shoulders_level", False)
+    hips_level = alignments.get("hips_level", False)
+    rationale_parts = []
+    # Without video classification of ipsi vs contra, assume contralateral (safer)
+    if trunk_stable and shoulders_level and hips_level:
+        score = 2  # Assume contralateral unless classifier says ipsilateral
+        rationale_parts.append("Trunk stable during extension, shoulders and hips level")
+        rationale_parts.append("scored as contralateral pattern (default)")
+    elif trunk_stable or (shoulders_level and hips_level):
+        score = 2
+        if not trunk_stable:
+            rationale_parts.append("minor trunk instability")
+        rationale_parts.insert(0, "Contralateral pattern with minor compensation")
+    else:
+        score = 1
+        std = angles.get("trunk_stability_std_px", 0)
+        rationale_parts.append(f"Trunk instability detected (std {std:.1f}px)")
+        if not shoulders_level:
+            rationale_parts.append("shoulder asymmetry during extension")
+    confidence = features.confidence * 0.75  # Lower confidence — hard to assess from 2D
+    return ScoreResult(
+        score=score, rationale="; ".join(rationale_parts),
+        confidence=confidence, notes="ipsi/contra distinction requires VLM classifier",
+    )

formscout/rubric/shoulder_mobility.py CHANGED

@@ -1,46 +1,46 @@
-"""
-Shoulder Mobility rubric scorer — pure function, no model calls.
-FMS Shoulder Mobility Criteria (bilateral):
-- Score 3: fists within one hand-length of each other.
-- Score 2: fists within 1.5 hand-lengths.
-- Score 1: fists more than 1.5 hand-lengths apart.
-- Score 0: PAIN (clearing test) — never auto-scored.
-"""
-from __future__ import annotations
-from formscout.types import BiomechFeatures, ScoreResult
-def score_shoulder_mobility(features: BiomechFeatures) -> ScoreResult:
-    """Pure rubric scorer for shoulder mobility."""
-    alignments = features.alignments
-    angles = features.angles
-    has_measure = "inter_fist_normalized" in angles
-    if not has_measure:
-        return ScoreResult(
-            score=1, rationale="Insufficient data: inter-fist distance not measurable",
-            confidence=0.3, notes="missing key measurements",
-        )
-    norm_dist = angles["inter_fist_normalized"]
-    within_one = alignments.get("fists_within_one_hand", False)
-    within_1_5 = alignments.get("fists_within_1_5_hand", False)
-    if within_one:
-        score = 3
-        rationale = f"Fists within one hand-length (normalized distance {norm_dist:.2f})"
-    elif within_1_5:
-        score = 2
-        rationale = f"Fists within 1.5 hand-lengths (normalized distance {norm_dist:.2f})"
-    else:
-        score = 1
-        rationale = f"Fists beyond 1.5 hand-lengths apart (normalized distance {norm_dist:.2f})"
-    confidence = features.confidence * 0.9
-    return ScoreResult(
-        score=score, rationale=rationale,
-        confidence=confidence, notes="",
-    )

+"""
+Shoulder Mobility rubric scorer — pure function, no model calls.
+FMS Shoulder Mobility Criteria (bilateral):
+- Score 3: fists within one hand-length of each other.
+- Score 2: fists within 1.5 hand-lengths.
+- Score 1: fists more than 1.5 hand-lengths apart.
+- Score 0: PAIN (clearing test) — never auto-scored.
+"""
+from __future__ import annotations
+from formscout.types import BiomechFeatures, ScoreResult
+def score_shoulder_mobility(features: BiomechFeatures) -> ScoreResult:
+    """Pure rubric scorer for shoulder mobility."""
+    alignments = features.alignments
+    angles = features.angles
+    has_measure = "inter_fist_normalized" in angles
+    if not has_measure:
+        return ScoreResult(
+            score=1, rationale="Insufficient data: inter-fist distance not measurable",
+            confidence=0.3, notes="missing key measurements",
+        )
+    norm_dist = angles["inter_fist_normalized"]
+    within_one = alignments.get("fists_within_one_hand", False)
+    within_1_5 = alignments.get("fists_within_1_5_hand", False)
+    if within_one:
+        score = 3
+        rationale = f"Fists within one hand-length (normalized distance {norm_dist:.2f})"
+    elif within_1_5:
+        score = 2
+        rationale = f"Fists within 1.5 hand-lengths (normalized distance {norm_dist:.2f})"
+    else:
+        score = 1
+        rationale = f"Fists beyond 1.5 hand-lengths apart (normalized distance {norm_dist:.2f})"
+    confidence = features.confidence * 0.9
+    return ScoreResult(
+        score=score, rationale=rationale,
+        confidence=confidence, notes="",
+    )

formscout/rubric/trunk_stability_pushup.py CHANGED

@@ -1,55 +1,55 @@
-"""
-Trunk Stability Push-Up rubric scorer — pure function, no model calls.
-FMS Trunk Stability Push-Up Criteria:
-- Score 3: body moves as one unit (rigid) with hands at forehead level (men)
-           or chin level (women). No sag or segment lag.
-- Score 2: body moves as one unit but with hands at chin (men) or clavicle (women).
-- Score 1: unable to perform with hands lowered; body sags or segments.
-- Score 0: PAIN (spinal extension clearing test) — never auto-scored.
-"""
-from __future__ import annotations
-from formscout.types import BiomechFeatures, ScoreResult
-def score_trunk_stability_pushup(features: BiomechFeatures) -> ScoreResult:
-    """Pure rubric scorer for trunk stability push-up."""
-    angles = features.angles
-    alignments = features.alignments
-    has_data = "max_sag_px" in angles
-    if not has_data:
-        return ScoreResult(
-            score=1, rationale="Insufficient data: trunk rigidity not measurable",
-            confidence=0.3, notes="missing key measurements",
-        )
-    body_rigid = alignments.get("body_rigid", False)
-    no_sag = alignments.get("no_sag", False)
-    hands_high = alignments.get("hands_at_forehead", False)
-    rationale_parts = []
-    if body_rigid and hands_high:
-        score = 3
-        rationale_parts.append("Body rigid as one unit, hands at forehead position")
-    elif body_rigid or no_sag:
-        score = 2
-        if not hands_high:
-            rationale_parts.append("rigid body but hands in lower position")
-        else:
-            rationale_parts.append("minor trunk variance detected")
-        rationale_parts.insert(0, "Completed with regression")
-    else:
-        score = 1
-        sag = angles.get("max_sag_px", 0)
-        variance = angles.get("trunk_variance_px", 0)
-        rationale_parts.append(f"Body sag detected ({sag:.0f}px), variance {variance:.1f}px")
-    confidence = features.confidence * 0.8
-    return ScoreResult(
-        score=score, rationale="; ".join(rationale_parts),
-        confidence=confidence, notes="",
-    )

+"""
+Trunk Stability Push-Up rubric scorer — pure function, no model calls.
+FMS Trunk Stability Push-Up Criteria:
+- Score 3: body moves as one unit (rigid) with hands at forehead level (men)
+           or chin level (women). No sag or segment lag.
+- Score 2: body moves as one unit but with hands at chin (men) or clavicle (women).
+- Score 1: unable to perform with hands lowered; body sags or segments.
+- Score 0: PAIN (spinal extension clearing test) — never auto-scored.
+"""
+from __future__ import annotations
+from formscout.types import BiomechFeatures, ScoreResult
+def score_trunk_stability_pushup(features: BiomechFeatures) -> ScoreResult:
+    """Pure rubric scorer for trunk stability push-up."""
+    angles = features.angles
+    alignments = features.alignments
+    has_data = "max_sag_px" in angles
+    if not has_data:
+        return ScoreResult(
+            score=1, rationale="Insufficient data: trunk rigidity not measurable",
+            confidence=0.3, notes="missing key measurements",
+        )
+    body_rigid = alignments.get("body_rigid", False)
+    no_sag = alignments.get("no_sag", False)
+    hands_high = alignments.get("hands_at_forehead", False)
+    rationale_parts = []
+    if body_rigid and hands_high:
+        score = 3
+        rationale_parts.append("Body rigid as one unit, hands at forehead position")
+    elif body_rigid or no_sag:
+        score = 2
+        if not hands_high:
+            rationale_parts.append("rigid body but hands in lower position")
+        else:
+            rationale_parts.append("minor trunk variance detected")
+        rationale_parts.insert(0, "Completed with regression")
+    else:
+        score = 1
+        sag = angles.get("max_sag_px", 0)
+        variance = angles.get("trunk_variance_px", 0)
+        rationale_parts.append(f"Body sag detected ({sag:.0f}px), variance {variance:.1f}px")
+    confidence = features.confidence * 0.8
+    return ScoreResult(
+        score=score, rationale="; ".join(rationale_parts),
+        confidence=confidence, notes="",
+    )

formscout/serving/__init__.py CHANGED

@@ -0,0 +1,20 @@

+"""Serving backends for the FormScout judge/classifier VLM."""
+from __future__ import annotations
+def get_vlm_client():
+    """Return the VLM client for the resolved judge backend.
+    transformers (in-process, ZeroGPU) on a Space; llama-server locally. Falls
+    back to the llama.cpp client if the transformers backend can't be imported.
+    """
+    from formscout import config
+    if config.resolve_judge_backend() == "transformers":
+        try:
+            from formscout.serving.transformers_vlm import TransformersVLMClient
+            return TransformersVLMClient()
+        except Exception:
+            pass
+    from formscout.serving.llama_cpp import LlamaCppClient
+    return LlamaCppClient(port=config.LLAMA_CPP_PORT_VLM)
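
Reviewer note: `get_vlm_client` swallows every exception from the transformers import with a bare `pass`, so a misconfigured Space would fall back to llama.cpp silently. The sketch below shows the same try-then-fall-back pattern with the failure logged. `first_available`, `broken_transformers`, and `FakeLlamaClient` are hypothetical names used only for illustration; they are not part of FormScout.

```python
from __future__ import annotations

import logging
from typing import Callable, Sequence

logger = logging.getLogger("formscout.serving.sketch")


def first_available(
    factories: Sequence[tuple[str, Callable[[], object]]],
) -> tuple[str, object]:
    """Try each (name, factory) in order and return the first client that constructs."""
    last_exc: Exception | None = None
    for name, factory in factories:
        try:
            return name, factory()
        except Exception as exc:  # broad on purpose: import errors, CUDA errors, ...
            logger.warning("backend %s unavailable (%s); trying next", name, exc)
            last_exc = exc
    raise RuntimeError("no VLM backend could be constructed") from last_exc


def broken_transformers() -> object:
    """Simulates the transformers backend failing to import."""
    raise ImportError("transformers not installed")


class FakeLlamaClient:
    """Stand-in for the llama.cpp HTTP client."""


name, client = first_available(
    [("transformers", broken_transformers), ("llama_cpp", FakeLlamaClient)]
)
```

The warning costs one log line per fallback and makes it possible to tell from the logs which backend actually served a request.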

formscout/serving/llama_cpp.py CHANGED

@@ -1,174 +1,148 @@
-"""
-llama.cpp HTTP client wrapper for FormScout.
-Wraps the llama.cpp server's /completion and /embedding endpoints.
-Falls back gracefully when the server is unavailable.
-Model: Qwen3-VL-8B-Instruct (Q4_K_M GGUF) for VLM inference.
-Model: Qwen3-VL-Embedding-8B (Q4_K_M GGUF) for embeddings.
-Params: 8B each (shared backbone).
-License: Apache-2.0.
-"""
-from __future__ import annotations
-import base64
-import json
-import logging
-from pathlib import Path
-from typing import Any
-import requests
-from formscout import config
-logger = logging.getLogger(__name__)
-_TIMEOUT = 120  # seconds — VLM can be slow
-class LlamaCppClient:
-    """HTTP client for a llama.cpp server instance."""
-    def __init__(self, host: str | None = None, port: int | None = None):
-        self.host = host or config.LLAMA_CPP_HOST
-        self.port = port or config.LLAMA_CPP_PORT_VLM
-        self.base_url = f"http://{self.host}:{self.port}"
-    @property
-    def available(self) -> bool:
-        """Check if the server is reachable."""
-        try:
-            r = requests.get(f"{self.base_url}/health", timeout=5)
-            return r.status_code == 200
-        except (requests.ConnectionError, requests.Timeout):
-            return False
-    def complete(
-        self,
-        prompt: str,
-        images: list[str] | None = None,
-        max_tokens: int = 512,
-        temperature: float = 0.1,
-        stop: list[str] | None = None,
-    ) -> dict[str, Any]:
-        """
-        Send a chat-completion request (OpenAI-compatible /v1/chat/completions —
-        required for multimodal: llama-server routes images through the mmproj
-        only on this endpoint). Returns parsed JSON if the response is JSON,
-        otherwise returns {"text": raw_text}.
-        Args:
-            prompt: The text prompt (system + user combined).
-            images: Optional list of base64-encoded JPEGs or file paths.
-            max_tokens: Max generation tokens.
-            temperature: Sampling temperature.
-            stop: Stop sequences (default: none — JSON output must not be truncated).
-        """
-        content: list[dict[str, Any]] = [{"type": "text", "text": prompt}]
-        for img in images or []:
-            if len(img) < 4096 and Path(img).exists():
-                with open(img, "rb") as f:
-                    b64 = base64.b64encode(f.read()).decode()
-            else:
-                b64 = img  # already base64
-            content.append({
-                "type": "image_url",
-                "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
-            })
-        payload: dict[str, Any] = {
-            "model": config.LLAMA_CPP_MODEL,
-            "messages": [{"role": "user", "content": content}],
-            "max_tokens": max_tokens,
-            "temperature": temperature,
-        }
-        if stop:
-            payload["stop"] = stop
-        result = self._post(payload)
-        if "error" in result and images:
-            # Multimodal failed — retry text-only so scoring still proceeds.
-            logger.warning("Multimodal request failed (%s), retrying text-only", result["error"])
-            text_payload = {
-                "model": config.LLAMA_CPP_MODEL,
-                "messages": [{"role": "user", "content": prompt}],
-                "max_tokens": max_tokens,
-                "temperature": temperature,
-            }
-            if stop:
-                text_payload["stop"] = stop
-            result = self._post(text_payload)
-        return result
-    def _post(self, payload: dict[str, Any]) -> dict[str, Any]:
-        """POST a chat-completion payload, surfacing the response body on errors."""
-        try:
-            r = requests.post(
-                f"{self.base_url}/v1/chat/completions",
-                json=payload,
-                timeout=_TIMEOUT,
-            )
-            if not r.ok:
-                # Capture the server's explanation (e.g. "Invalid image ...")
-                body = ""
-                try:
-                    body = r.text[:500]
-                except Exception:
-                    pass
-                logger.warning("llama-server %s: %s", r.status_code, body)
-                return {"error": f"HTTP {r.status_code}: {body}", "text": ""}
-            result = r.json()
-            text = result["choices"][0]["message"]["content"] or ""
-            return self._parse_json_reply(text)
-        except requests.ConnectionError:
-            return {"error": "llama.cpp server not available", "text": ""}
-        except requests.Timeout:
-            return {"error": "llama.cpp server timeout", "text": ""}
-        except Exception as e:
-            return {"error": str(e), "text": ""}
-    @staticmethod
-    def _parse_json_reply(text: str) -> dict[str, Any]:
-        """Parse model output as JSON, tolerating markdown fences."""
-        stripped = text.strip()
-        if stripped.startswith("```"):
-            stripped = stripped.split("\n", 1)[-1]
-            stripped = stripped.rsplit("```", 1)[0].strip()
-        try:
-            parsed = json.loads(stripped)
-            if isinstance(parsed, dict):
-                return parsed
-        except (json.JSONDecodeError, TypeError):
-            pass
-        return {"text": text}
-class EmbeddingClient:
-    """HTTP client for the llama.cpp embedding server."""
-    def __init__(self, host: str | None = None, port: int | None = None):
-        self.host = host or config.LLAMA_CPP_HOST
-        self.port = port or config.LLAMA_CPP_PORT_EMBED
-        self.base_url = f"http://{self.host}:{self.port}"
-    @property
-    def available(self) -> bool:
-        try:
-            r = requests.get(f"{self.base_url}/health", timeout=5)
-            return r.status_code == 200
-        except (requests.ConnectionError, requests.Timeout):
-            return False
-    def embed(self, text: str) -> list[float] | None:
-        """Get embedding vector for text. Returns None on failure."""
-        try:
-            r = requests.post(
-                f"{self.base_url}/embedding",
-                json={"content": text},
-                timeout=30,
-            )
-            r.raise_for_status()
-            data = r.json()
-            return data.get("embedding")
-        except Exception:
-            return None

+"""
+llama.cpp HTTP client wrapper for FormScout.
+Wraps the llama.cpp server's /v1/chat/completions and /embedding endpoints.
+Falls back gracefully when the server is unavailable.
+Model: Qwen3-VL-8B-Instruct (Q4_K_M GGUF) for VLM inference.
+Model: Qwen3-VL-Embedding-8B (Q4_K_M GGUF) for embeddings.
+Params: 8B each (shared backbone).
+License: Apache-2.0.
+"""
+from __future__ import annotations
+import base64
+import json
+import logging
+from pathlib import Path
+from typing import Any
+import requests
+from formscout import config
+logger = logging.getLogger(__name__)
+_TIMEOUT = 120  # seconds — VLM can be slow
+class LlamaCppClient:
+    """HTTP client for a llama.cpp server instance."""
+    def __init__(self, host: str | None = None, port: int | None = None):
+        self.host = host or config.LLAMA_CPP_HOST
+        self.port = port or config.LLAMA_CPP_PORT_VLM
+        self.base_url = f"http://{self.host}:{self.port}"
+    @property
+    def available(self) -> bool:
+        """Check if the server is reachable."""
+        try:
+            r = requests.get(f"{self.base_url}/health", timeout=5)
+            return r.status_code == 200
+        except (requests.ConnectionError, requests.Timeout):
+            return False
+    def complete(
+        self,
+        prompt: str,
+        images: list[str] | None = None,
+        max_tokens: int = 512,
+        temperature: float = 0.1,
+        stop: list[str] | None = None,
+    ) -> dict[str, Any]:
+        """
+        Send a chat-completion request (OpenAI-compatible /v1/chat/completions —
+        required for multimodal: llama-server routes images through the mmproj
+        only on this endpoint). Returns parsed JSON if the response is JSON,
+        otherwise returns {"text": raw_text}.
+        Args:
+            prompt: The text prompt (system + user combined).
+            images: Optional list of base64-encoded JPEGs or file paths.
+            max_tokens: Max generation tokens.
+            temperature: Sampling temperature.
+            stop: Stop sequences (default: none — JSON output must not be truncated).
+        """
+        content: list[dict[str, Any]] = [{"type": "text", "text": prompt}]
+        for img in images or []:
+            if len(img) < 4096 and Path(img).exists():
+                with open(img, "rb") as f:
+                    b64 = base64.b64encode(f.read()).decode()
+            else:
+                b64 = img  # already base64
+            content.append({
+                "type": "image_url",
+                "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
+            })
+        payload: dict[str, Any] = {
+            "messages": [{"role": "user", "content": content}],
+            "max_tokens": max_tokens,
+            "temperature": temperature,
+        }
+        if stop:
+            payload["stop"] = stop
+        try:
+            r = requests.post(
+                f"{self.base_url}/v1/chat/completions",
+                json=payload,
+                timeout=_TIMEOUT,
+            )
+            r.raise_for_status()
+            result = r.json()
+            text = result["choices"][0]["message"]["content"] or ""
+            return self._parse_json_reply(text)
+        except requests.ConnectionError:
+            return {"error": "llama.cpp server not available", "text": ""}
+        except requests.Timeout:
+            return {"error": "llama.cpp server timeout", "text": ""}
+        except Exception as e:
+            return {"error": str(e), "text": ""}
+    @staticmethod
+    def _parse_json_reply(text: str) -> dict[str, Any]:
+        """Parse model output as JSON, tolerating markdown fences."""
+        stripped = text.strip()
+        if stripped.startswith("```"):
+            stripped = stripped.split("\n", 1)[-1]
+            stripped = stripped.rsplit("```", 1)[0].strip()
+        try:
+            parsed = json.loads(stripped)
+            if isinstance(parsed, dict):
+                return parsed
+        except (json.JSONDecodeError, TypeError):
+            pass
+        return {"text": text}
+class EmbeddingClient:
+    """HTTP client for the llama.cpp embedding server."""
+    def __init__(self, host: str | None = None, port: int | None = None):
+        self.host = host or config.LLAMA_CPP_HOST
+        self.port = port or config.LLAMA_CPP_PORT_EMBED
+        self.base_url = f"http://{self.host}:{self.port}"
+    @property
+    def available(self) -> bool:
+        try:
+            r = requests.get(f"{self.base_url}/health", timeout=5)
+            return r.status_code == 200
+        except (requests.ConnectionError, requests.Timeout):
+            return False
+    def embed(self, text: str) -> list[float] | None:
+        """Get embedding vector for text. Returns None on failure."""
+        try:
+            r = requests.post(
+                f"{self.base_url}/embedding",
+                json={"content": text},
+                timeout=30,
+            )
+            r.raise_for_status()
+            data = r.json()
+            return data.get("embedding")
+        except Exception:
+            return None
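
Review note: the fence handling in `_parse_json_reply` can be checked in isolation. The following is a minimal standalone sketch that mirrors the logic in the diff above; the free-function name `parse_json_reply` and the sample replies are illustrative assumptions, not part of the PR.

```python
import json
from typing import Any

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable


def parse_json_reply(text: str) -> dict[str, Any]:
    """Standalone copy of the fence-tolerant parsing shown in the diff."""
    stripped = text.strip()
    if stripped.startswith(FENCE):
        # Drop the opening fence line (for example a fence tagged "json"),
        # then cut everything from the last closing fence onward.
        stripped = stripped.split("\n", 1)[-1]
        stripped = stripped.rsplit(FENCE, 1)[0].strip()
    try:
        parsed = json.loads(stripped)
        if isinstance(parsed, dict):
            return parsed
    except (json.JSONDecodeError, TypeError):
        pass
    # Non-JSON output, or JSON that is not an object, is passed through as raw text.
    return {"text": text}


# Hypothetical model replies, used only to exercise the three branches.
fenced = FENCE + 'json\n{"score": 2, "needs_human": false}\n' + FENCE
print(parse_json_reply(fenced))
print(parse_json_reply("[1, 2, 3]"))
print(parse_json_reply("not json"))
```

Note that a valid JSON array is returned as `{"text": ...}` rather than as parsed data, so callers only ever see a dict.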

formscout/serving/transformers_vlm.py ADDED

	@@ -0,0 +1,116 @@

+"""
+In-process Qwen3-VL backend via transformers — the HF Spaces (ZeroGPU) path.
+Mirrors LlamaCppClient's interface (`available` + `complete(...) -> dict`) so it
+is a drop-in alternative selected by config.resolve_judge_backend(). Inference is
+wrapped in spaces.GPU when the `spaces` package is present (ZeroGPU); otherwise it
+runs on whatever device transformers picks. The model is loaded lazily on first
+use and cached for the process.
+Any load/inference failure returns {"error": ..., "fallback": True} so JudgeAgent
+falls back to the deterministic rubric score instead of flagging needs_human.
+Model: Qwen3-VL-8B-Instruct (transformers weights, ~16 GB). License: Apache-2.0.
+NOTE: requires validation on actual ZeroGPU hardware — it cannot be exercised in
+the CPU test environment (would download 16 GB).
+"""
+from __future__ import annotations
+import base64
+import importlib.util
+import logging
+from formscout import config
+from formscout.serving.llama_cpp import LlamaCppClient
+logger = logging.getLogger(__name__)
+# Module-level model cache (loaded once per process).
+_CACHE: dict = {}
+# spaces.GPU decorator when on HF infra; a no-op decorator otherwise.
+try:  # pragma: no cover - depends on runtime env
+    import spaces
+    _gpu = spaces.GPU(duration=120)
+except Exception:  # pragma: no cover
+    def _gpu(fn):
+        return fn
+@_gpu
+def _generate(model_id: str, prompt: str, pil_images: list, max_tokens: int,
+              temperature: float) -> str:  # pragma: no cover - needs GPU + model
+    """Load (cached) and run Qwen3-VL; returns the raw decoded string."""
+    import torch
+    from transformers import AutoModelForImageTextToText, AutoProcessor
+    if "model" not in _CACHE:
+        _CACHE["processor"] = AutoProcessor.from_pretrained(model_id)
+        _CACHE["model"] = AutoModelForImageTextToText.from_pretrained(
+            model_id, torch_dtype="auto", device_map="auto",
+        )
+    processor, model = _CACHE["processor"], _CACHE["model"]
+    content = [{"type": "image", "image": im} for im in pil_images]
+    content.append({"type": "text", "text": prompt})
+    messages = [{"role": "user", "content": content}]
+    inputs = processor.apply_chat_template(
+        messages, tokenize=True, add_generation_prompt=True,
+        return_tensors="pt", return_dict=True,
+    ).to(model.device)
+    with torch.no_grad():
+        out = model.generate(
+            **inputs, max_new_tokens=max_tokens,
+            do_sample=temperature > 0, temperature=max(temperature, 1e-2),
+        )
+    gen = out[:, inputs["input_ids"].shape[1]:]
+    return processor.batch_decode(gen, skip_special_tokens=True)[0]
+class TransformersVLMClient:
+    """In-process Qwen3-VL client (ZeroGPU on Spaces)."""
+    def __init__(self, model_id: str | None = None):
+        self.model_id = model_id or config.JUDGE_HF_MODEL
+        self._failed = False
+    @property
+    def available(self) -> bool:
+        """Cheap check — does NOT load the model (so tests stay download-free)."""
+        if self._failed:
+            return False
+        return importlib.util.find_spec("transformers") is not None
+    def complete(self, prompt: str, images: list[str] | None = None,
+                 max_tokens: int = 512, temperature: float = 0.1,
+                 stop: list[str] | None = None) -> dict:
+        try:
+            pil_images = self._decode_images(images)
+            text = _generate(self.model_id, prompt, pil_images, max_tokens, temperature)
+            return LlamaCppClient._parse_json_reply(text)
+        except Exception as e:  # pragma: no cover - needs GPU + model
+            logger.warning("transformers VLM failed (%s) — falling back to rubric", e)
+            self._failed = True
+            return {"error": str(e), "fallback": True, "text": ""}
+    @staticmethod
+    def _decode_images(images: list[str] | None) -> list:
+        """Decode base64 JPEGs (as the JudgeAgent encodes them) into PIL images."""
+        if not images:
+            return []
+        import io
+        from PIL import Image
+        out = []
+        for img in images:
+            try:
+                raw = base64.b64decode(img)
+                out.append(Image.open(io.BytesIO(raw)).convert("RGB"))
+            except Exception:
+                continue
+        return out
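
Review note: the `try: import spaces ... except: def _gpu(fn)` block above is an optional-decorator fallback. The following is a minimal sketch of that pattern as a reusable helper. The helper name `optional_gpu_decorator` and the demo module name are hypothetical; the only thing assumed about the real `spaces` package is the `spaces.GPU(duration=120)` call that appears in the diff.

```python
import importlib
from typing import Callable


def optional_gpu_decorator(module_name: str = "spaces", duration: int = 120) -> Callable:
    """Return `<module>.GPU(duration=...)` if the module imports, else a no-op decorator."""
    try:
        module = importlib.import_module(module_name)
        return module.GPU(duration=duration)
    except Exception:
        # Any import or attribute failure degrades to an identity decorator,
        # so the decorated function still runs on whatever device is available.
        def identity(fn):
            return fn
        return identity


# A module name that should not exist, to exercise the fallback branch.
_gpu = optional_gpu_decorator("no_such_module_for_demo")


@_gpu
def generate(prompt: str) -> str:
    return prompt.upper()


print(generate("ok"))
```

Because the fallback returns the function unchanged, the CPU test environment exercises the same call path as the ZeroGPU one, minus the GPU allocation.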

formscout/session.py CHANGED

@@ -1,194 +1,283 @@
-"""
-Screening-session accumulator.
-Accumulates one SessionEntry per analyzed clip, persists each to a temp session
-dir (session.json + analysis.md + key-frame PNGs), and on finish builds a
-ReportResult (via ReportAgent) + a PDF (via PdfReportAgent).
-Pure orchestration — no Gradio imports. Disk writes tolerate failure with a
-logged warning and never block scoring.
-"""
-from __future__ import annotations
-import json
-import logging
-import os
-import tempfile
-import uuid
-from dataclasses import dataclass, replace
-from formscout.rubric import score_test
-from formscout.types import MovementResult, ReportResult, SessionEntry
-logger = logging.getLogger(__name__)
-# Maps each test to the BiomechFeatures.timing key holding its governing frame.
-TIMING_KEY = {
-    "deep_squat": "deepest_frame",
-    "hurdle_step": "peak_step_frame",
-    "inline_lunge": "deepest_lunge_frame",
-    "shoulder_mobility": "measure_frame",
-    "active_slr": "peak_raise_frame",
-    "trunk_stability_pushup": "max_sag_frame",
-    "rotary_stability": "peak_extension_frame",
-}
-@dataclass
-class Session:
-    """Mutable session: an id, its temp dir, and accumulated entries."""
-    session_id: str
-    session_dir: str
-    entries: list  # list[SessionEntry]
-def new_session() -> Session:
-    sid = uuid.uuid4().hex[:12]
-    base = os.path.join(tempfile.gettempdir(), "formscout_sessions", sid)
-    try:
-        os.makedirs(os.path.join(base, "keyframes"), exist_ok=True)
-    except Exception as e:
-        logger.warning("session dir create failed: %s", e)
-    return Session(session_id=sid, session_dir=base, entries=[])
-def governing_frame_index(features) -> int | None:
-    """Return the governing frame index for this test, or None."""
-    key = TIMING_KEY.get(features.test_name)
-    if key is None:
-        return None
-    idx = features.timing.get(key)
-    return int(idx) if isinstance(idx, (int, float)) else None
-def worst_compensation_caption(judge, features) -> str:
-    """Short caption naming the worst compensation for the key-frame still."""
-    if judge and getattr(judge, "compensation_tags", None):
-        return ", ".join(judge.compensation_tags)
-    failed = [k.replace("_", " ") for k, v in features.alignments.items() if v is False]
-    return ("compensation: " + ", ".join(failed)) if failed else "key position"
-def add_analysis(session, *, ingest, pose2d, features, judge, test_name, side,
-                 draw_trails: bool = False) -> SessionEntry:
-    """Build a SessionEntry from a completed analysis, render its key-frame,
-    persist the session, append, and return the entry."""
-    movement = MovementResult(test_name=test_name, side=side, confidence=1.0)
-    rubric = score_test(features)
-    needs_human = bool((judge and judge.needs_human) or rubric.needs_human)
-    if needs_human:
-        score = None
-    elif judge and judge.score is not None:
-        score = judge.score
-    else:
-        score = rubric.score
-    keyframe_path = None
-    idx = governing_frame_index(features)
-    if idx is not None and 0 <= idx < len(pose2d.keypoints):
-        from formscout.agents.visualizer import PoseVisualizer
-        caption = (f"{test_name.replace('_', ' ').title()} "
-                   f"({side}) — {worst_compensation_caption(judge, features)}")
-        layers = {"skeleton", "trails"} if draw_trails else {"skeleton"}
-        out_png = os.path.join(session.session_dir, "keyframes", f"{test_name}_{side}.png")
-        try:
-            keyframe_path = PoseVisualizer().render_frame(ingest, pose2d, idx, layers, caption, out_png)
-        except Exception as e:
-            logger.warning("keyframe render failed: %s", e)
-    measurements = {}
-    measurements.update(features.angles)
-    measurements.update(features.alignments)
-    entry = SessionEntry(
-        test_name=test_name, side=side, score=score, needs_human=needs_human,
-        rationale=(judge.rationale if judge else rubric.rationale),
-        compensation_tags=list(judge.compensation_tags) if judge else [],
-        corrective_hint=(judge.corrective_hint if judge else ""),
-        measurements=measurements,
-        confidence=(judge.confidence if judge else rubric.confidence),
-        view=features.view,
-        keyframe_path=keyframe_path,
-        movement=movement, features=features, rubric_score=rubric, judge=judge,
-    )
-    session.entries.append(entry)
-    _persist(session)
-    return entry
-def finish_session(session) -> tuple[ReportResult | None, str | None]:
-    """Build the composite report + PDF. Returns (report, pdf_path).
-    Returns (None, None) for an empty session."""
-    if not session.entries:
-        return None, None
-    from formscout.agents.report import ReportAgent
-    report_inputs = [{
-        "movement": e.movement, "features": e.features,
-        "rubric_score": e.rubric_score, "judge": e.judge, "side": e.side,
-    } for e in session.entries]
-    report = ReportAgent().run(report_inputs)
-    pdf_path = None
-    try:
-        from formscout.agents.pdf_report import PdfReportAgent
-        pdf_path = PdfReportAgent().run(report, session.entries, session.session_dir)
-    except Exception as e:
-        logger.warning("pdf generation failed: %s", e)
-    report = replace(report, pdf_path=pdf_path)
-    return report, pdf_path
-# ── Persistence ───────────────────────────────────────────────────────────────
-def _jsonable(d: dict) -> dict:
-    out = {}
-    for k, v in d.items():
-        if isinstance(v, float):
-            out[k] = round(v, 2)
-        elif isinstance(v, (int, str, bool)) or v is None:
-            out[k] = v
-        else:
-            out[k] = str(v)
-    return out
-def _entry_display(e: SessionEntry) -> dict:
-    return {
-        "test_name": e.test_name, "side": e.side, "score": e.score,
-        "needs_human": e.needs_human, "rationale": e.rationale,
-        "compensation_tags": list(e.compensation_tags), "corrective_hint": e.corrective_hint,
-        "measurements": _jsonable(e.measurements), "confidence": round(e.confidence, 2),
-        "view": e.view, "keyframe_path": e.keyframe_path,
-    }
-def _render_markdown(session: Session) -> str:
-    lines = ["# FormScout — Session Log", ""]
-    for e in session.entries:
-        title = e.test_name.replace("_", " ").title()
-        if e.side in ("left", "right"):
-            title += f" ({e.side})"
-        score = "Clinician review required" if e.needs_human else f"{e.score}/3"
-        lines.append(f"## {title} — {score}")
-        lines.append(e.rationale or "")
-        if e.compensation_tags:
-            lines.append(f"- Compensations: {', '.join(e.compensation_tags)}")
-        if e.corrective_hint:
-            lines.append(f"- Corrective: {e.corrective_hint}")
-        if e.keyframe_path:
-            lines.append(f"- Key frame: `{e.keyframe_path}`")
-        lines.append("")
-    return "\n".join(lines)
-def _persist(session: Session) -> None:
-    try:
-        with open(os.path.join(session.session_dir, "session.json"), "w") as f:
-            json.dump([_entry_display(e) for e in session.entries], f, indent=2)
-        with open(os.path.join(session.session_dir, "analysis.md"), "w") as f:
-            f.write(_render_markdown(session))
-    except Exception as e:
-        logger.warning("session persist failed: %s", e)

+"""
+Screening-session accumulator.
+Accumulates one SessionEntry per analyzed clip, persists each to a temp session
+dir (session.json + analysis.md + key-frame PNGs), and on finish builds a
+ReportResult (via ReportAgent) + a PDF (via PdfReportAgent).
+Pure orchestration — no Gradio imports. Disk writes tolerate failure with a
+logged warning and never block scoring.
+"""
+from __future__ import annotations
+import json
+import logging
+import os
+import tempfile
+import uuid
+from dataclasses import dataclass, replace
+from formscout.rubric import score_test
+from formscout.types import MovementResult, ReportResult, SessionEntry
+logger = logging.getLogger(__name__)
+# Maps each test to the BiomechFeatures.timing key holding its governing frame.
+TIMING_KEY = {
+    "deep_squat": "deepest_frame",
+    "hurdle_step": "peak_step_frame",
+    "inline_lunge": "deepest_lunge_frame",
+    "shoulder_mobility": "measure_frame",
+    "active_slr": "peak_raise_frame",
+    "trunk_stability_pushup": "max_sag_frame",
+    "rotary_stability": "peak_extension_frame",
+}
+@dataclass
+class Session:
+    """Mutable session: an id, its temp dir, and accumulated entries."""
+    session_id: str
+    session_dir: str
+    entries: list  # list[SessionEntry]
+def new_session() -> Session:
+    sid = uuid.uuid4().hex[:12]
+    base = os.path.join(tempfile.gettempdir(), "formscout_sessions", sid)
+    try:
+        os.makedirs(os.path.join(base, "keyframes"), exist_ok=True)
+    except Exception as e:
+        logger.warning("session dir create failed: %s", e)
+    return Session(session_id=sid, session_dir=base, entries=[])
+def governing_frame_index(features) -> int | None:
+    """Return the governing frame index for this test, or None."""
+    key = TIMING_KEY.get(features.test_name)
+    if key is None:
+        return None
+    idx = features.timing.get(key)
+    return int(idx) if isinstance(idx, (int, float)) else None
+def worst_compensation_caption(judge, features) -> str:
+    """Short caption naming the worst compensation for the key-frame still."""
+    if judge and getattr(judge, "compensation_tags", None):
+        return ", ".join(judge.compensation_tags)
+    failed = [k.replace("_", " ") for k, v in features.alignments.items() if v is False]
+    return ("compensation: " + ", ".join(failed)) if failed else "key position"
+def add_analysis(session, *, ingest, pose2d, features, judge, test_name, side,
+                 draw_trails: bool = False) -> SessionEntry:
+    """Build a SessionEntry from a completed analysis, render its key-frame,
+    persist the session, append, and return the entry."""
+    movement = MovementResult(test_name=test_name, side=side, confidence=1.0)
+    rubric = score_test(features)
+    needs_human = bool((judge and judge.needs_human) or rubric.needs_human)
+    if needs_human:
+        score = None
+    elif judge and judge.score is not None:
+        score = judge.score
+    else:
+        score = rubric.score
+    keyframe_path = None
+    idx = governing_frame_index(features)
+    if idx is not None and 0 <= idx < len(pose2d.keypoints):
+        from formscout.agents.visualizer import PoseVisualizer
+        caption = (f"{test_name.replace('_', ' ').title()} "
+                   f"({side}) — {worst_compensation_caption(judge, features)}")
+        layers = {"skeleton", "trails"} if draw_trails else {"skeleton"}
+        out_png = os.path.join(session.session_dir, "keyframes", f"{test_name}_{side}.png")
+        try:
+            keyframe_path = PoseVisualizer().render_frame(ingest, pose2d, idx, layers, caption, out_png)
+        except Exception as e:
+            logger.warning("keyframe render failed: %s", e)
+    measurements = {}
+    measurements.update(features.angles)
+    measurements.update(features.alignments)
+    laban, flexion, chart_paths = _run_analysis(session, pose2d, test_name, side, idx)
+    entry = SessionEntry(
+        test_name=test_name, side=side, score=score, needs_human=needs_human,
+        rationale=(judge.rationale if judge else rubric.rationale),
+        compensation_tags=list(judge.compensation_tags) if judge else [],
+        corrective_hint=(judge.corrective_hint if judge else ""),
+        measurements=measurements,
+        confidence=(judge.confidence if judge else rubric.confidence),
+        view=features.view,
+        keyframe_path=keyframe_path,
+        movement=movement, features=features, rubric_score=rubric, judge=judge,
+        laban=laban, flexion=flexion, chart_paths=chart_paths,
+    )
+    session.entries.append(entry)
+    _persist(session)
+    return entry
+def _run_analysis(session, pose2d, test_name, side, governing_idx):
+    """Compute Laban Effort, relevant-joint flexion, and per-test charts.
+    Returns (laban, flexion, chart_paths); any failure degrades to (None, None, {})
+    without blocking the entry."""
+    try:
+        from formscout.analysis import charts as C
+        from formscout.analysis.laban import compute_laban
+        from formscout.analysis.relevant_joints import primary_angle, relevant_joints
+        from formscout.analysis.timeseries import angle_series, relevant_flexion_at
+    except Exception as e:
+        logger.warning("analysis engine unavailable: %s", e)
+        return None, None, {}
+    try:
+        laban = compute_laban(pose2d, test_name, pose2d.fps)
+        gidx = governing_idx if governing_idx is not None else 0
+        flexion = relevant_flexion_at(pose2d, test_name, gidx)
+        series = angle_series(pose2d, test_name)
+        cdir = os.path.join(session.session_dir, "charts")
+        os.makedirs(cdir, exist_ok=True)
+        tag = f"{test_name}_{side}"
+        produced = {
+            "angle": C.angle_over_time(series, primary_angle(test_name), governing_idx,
+                                       os.path.join(cdir, f"{tag}_angle.png"),
+                                       title=f"{test_name.replace('_', ' ').title()} — angle over time"),
+            "velocity": C.velocity_profile(pose2d.keypoints, pose2d.fps, relevant_joints(test_name),
+                                           os.path.join(cdir, f"{tag}_vel.png")),
+            "radar": C.laban_radar(laban["effort"], os.path.join(cdir, f"{tag}_radar.png")),
+            "flexion": C.flexion_bars(flexion, os.path.join(cdir, f"{tag}_flex.png")),
+        }
+        chart_paths = {k: p for k, p in produced.items() if p}
+        return laban, flexion, chart_paths
+    except Exception as e:
+        logger.warning("analysis/charts failed: %s", e)
+        return None, None, {}
+def finish_session(session) -> tuple[ReportResult | None, str | None]:
+    """Build the composite report + PDF. Returns (report, pdf_path).
+    Returns (None, None) for an empty session."""
+    if not session.entries:
+        return None, None
+    from formscout.agents.report import ReportAgent
+    report_inputs = [{
+        "movement": e.movement, "features": e.features,
+        "rubric_score": e.rubric_score, "judge": e.judge, "side": e.side,
+    } for e in session.entries]
+    report = ReportAgent().run(report_inputs)
+    pdf_path = None
+    try:
+        from formscout.agents.pdf_report import PdfReportAgent
+        pdf_path = PdfReportAgent().run(report, session.entries, session.session_dir)
+    except Exception as e:
+        logger.warning("pdf generation failed: %s", e)
+    report = replace(report, pdf_path=pdf_path)
+    return report, pdf_path
+# ── Persistence ───────────────────────────────────────────────────────────────
+def _jsonable(d: dict) -> dict:
+    out = {}
+    for k, v in d.items():
+        if isinstance(v, float):
+            out[k] = round(v, 2)
+        elif isinstance(v, (int, str, bool)) or v is None:
+            out[k] = v
+        else:
+            out[k] = str(v)
+    return out
+def _entry_display(e: SessionEntry) -> dict:
+    return {
+        "test_name": e.test_name, "side": e.side, "score": e.score,
+        "needs_human": e.needs_human, "rationale": e.rationale,
+        "compensation_tags": list(e.compensation_tags), "corrective_hint": e.corrective_hint,
+        "measurements": _jsonable(e.measurements), "confidence": round(e.confidence, 2),
+        "view": e.view, "keyframe_path": e.keyframe_path,
+        "laban": e.laban, "flexion": _jsonable_flexion(e.flexion), "chart_paths": e.chart_paths,
+    }
+def _jsonable_flexion(flexion: dict | None) -> dict:
+    if not flexion:
+        return {}
+    return {k: {"deg": round(v["deg"], 1), "openness": v["openness"]} for k, v in flexion.items()}
+def _render_markdown(session: Session) -> str:
+    lines = ["# FormScout — Session Log", ""]
+    for e in session.entries:
+        title = e.test_name.replace("_", " ").title()
+        if e.side in ("left", "right"):
+            title += f" ({e.side})"
+        score = "Clinician review required" if e.needs_human else f"{e.score}/3"
+        lines.append(f"## {title} — {score}")
+        lines.append(f"*view: {e.view} · confidence: {e.confidence:.0%}*")
+        lines.append("")
+        lines.append(e.rationale or "")
+        if e.compensation_tags:
+            lines.append(f"- **Compensations:** {', '.join(e.compensation_tags)}")
+        if e.corrective_hint:
+            lines.append(f"- **Corrective:** {e.corrective_hint}")
+        # Relevant-joint flexion (degrees + open/closed)
+        if e.flexion:
+            lines.append("\n### Relevant joint flexion (key frame)")
+            lines.append("| Joint angle | Degrees | State |")
+            lines.append("|---|---|---|")
+            for name, v in e.flexion.items():
+                lines.append(f"| {name.replace('_', ' ')} | {v['deg']:.1f}° | {v['openness']} |")
+        # Laban Effort
+        if e.laban:
+            eff, lab = e.laban.get("effort", {}), e.laban.get("labels", {})
+            lines.append("\n### Laban Effort (kinematic estimate)")
+            lines.append("| Factor | Value | Quality |")
+            lines.append("|---|---|---|")
+            for k in ("space", "weight", "time", "flow"):
+                lines.append(f"| {k.title()} | {eff.get(k, 0):.2f} | {lab.get(k, '')} |")
+            if e.laban.get("body_emphasis"):
+                emph = ", ".join(f"{n} ({s})" for n, s in e.laban["body_emphasis"])
+                lines.append(f"\n- **Body emphasis:** {emph}")
+            if e.laban.get("leading_joint"):
+                lines.append(f"- **Leading joint:** {e.laban['leading_joint']}")
+            lines.append(f"- *{e.laban.get('notes', '')}*")
+        # Full measurement dump
+        if e.measurements:
+            lines.append("\n### All measurements")
+            lines.append("| Measurement | Value |")
+            lines.append("|---|---|")
+            for k, v in e.measurements.items():
+                val = f"{v:.2f}" if isinstance(v, float) else str(v)
+                lines.append(f"| {k.replace('_', ' ')} | {val} |")
+        if e.features and e.features.symmetry_delta is not None:
+            lines.append(f"\n- **L/R symmetry delta:** {e.features.symmetry_delta:.1f}")
+        # Artifacts
+        if e.keyframe_path:
+            lines.append(f"\n- Key frame: `{e.keyframe_path}`")
+        for kind, path in (e.chart_paths or {}).items():
+            lines.append(f"- Chart ({kind}): `{path}`")
+        lines.append("")
+    return "\n".join(lines)
+def _persist(session: Session) -> None:
+    try:
+        with open(os.path.join(session.session_dir, "session.json"), "w") as f:
+            json.dump([_entry_display(e) for e in session.entries], f, indent=2)
+        with open(os.path.join(session.session_dir, "analysis.md"), "w") as f:
+            f.write(_render_markdown(session))
+    except Exception as e:
+        logger.warning("session persist failed: %s", e)
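
Review note: the score precedence in `add_analysis` (human review first, then the judge, then the rubric) can be pinned down with a small sketch. The `Verdict` dataclass and `resolve_score` function below are hypothetical stand-ins that carry only the two fields the branch reads. They are not the project's `JudgeResult` or `ScoreResult` types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Verdict:
    """Hypothetical stand-in for a judge or rubric result."""
    score: Optional[int]
    needs_human: bool = False


def resolve_score(judge: Optional[Verdict], rubric: Verdict) -> Optional[int]:
    """Mirror the precedence used in add_analysis."""
    needs_human = bool((judge and judge.needs_human) or rubric.needs_human)
    if needs_human:
        return None  # either source asking for review suppresses the score
    if judge and judge.score is not None:
        return judge.score  # the judge's score wins when it exists
    return rubric.score  # otherwise fall back to the deterministic rubric


print(resolve_score(Verdict(3), Verdict(2)))
print(resolve_score(None, Verdict(2)))
print(resolve_score(Verdict(3), Verdict(2, needs_human=True)))
```

One consequence worth noting: a `needs_human` flag from the rubric discards a judge score even when the judge was confident.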

formscout/startup.py CHANGED

@@ -1,47 +1,47 @@
-"""
-Checkpoint bootstrap — downloads missing model files from HF model repo on first run.
-Called once at app startup before build_app(); no-ops if files already present.
-"""
-from __future__ import annotations
-import logging
-from pathlib import Path
-logger = logging.getLogger(__name__)
-CHECKPOINT_REPO = "silas-therapy/formscout-checkpoints"
-ROOT = Path(__file__).parent.parent
-_CHECKPOINTS = [
-    "checkpoints/yolo26/yolo26n-pose.pt",
-    "checkpoints/yolo26/yolo26s-pose.pt",
-    "checkpoints/yolo26/yolo26m-pose.pt",
-    "checkpoints/yolo26/yolo26l-pose.pt",
-    "checkpoints/yolo26/yolo26x-pose.pt",
-    "checkpoints/mediapipe/pose_landmarker_full.task",
-]
-def ensure_checkpoints() -> None:
-    """Download any missing checkpoints from silas-therapy/formscout-checkpoints."""
-    try:
-        from huggingface_hub import hf_hub_download
-    except ImportError:
-        logger.warning("huggingface_hub not installed — skipping checkpoint download")
-        return
-    for rel_path in _CHECKPOINTS:
-        local = ROOT / rel_path
-        if local.exists():
-            continue
-        logger.info("Downloading %s ...", rel_path)
-        try:
-            local.parent.mkdir(parents=True, exist_ok=True)
-            hf_hub_download(
-                repo_id=CHECKPOINT_REPO,
-                filename=rel_path,
-                local_dir=str(ROOT),
-            )
-            logger.info("Downloaded %s", rel_path)
-        except Exception as e:
-            logger.warning("Failed to download %s: %s", rel_path, e)

+"""
+Checkpoint bootstrap — downloads missing model files from HF model repo on first run.
+Called once at app startup before build_app(); no-ops if files already present.
+"""
+from __future__ import annotations
+import logging
+from pathlib import Path
+logger = logging.getLogger(__name__)
+CHECKPOINT_REPO = "silas-therapy/formscout-checkpoints"
+ROOT = Path(__file__).parent.parent
+_CHECKPOINTS = [
+    "checkpoints/yolo26/yolo26n-pose.pt",
+    "checkpoints/yolo26/yolo26s-pose.pt",
+    "checkpoints/yolo26/yolo26m-pose.pt",
+    "checkpoints/yolo26/yolo26l-pose.pt",
+    "checkpoints/yolo26/yolo26x-pose.pt",
+    "checkpoints/mediapipe/pose_landmarker_full.task",
+]
+def ensure_checkpoints() -> None:
+    """Download any missing checkpoints from silas-therapy/formscout-checkpoints."""
+    try:
+        from huggingface_hub import hf_hub_download
+    except ImportError:
+        logger.warning("huggingface_hub not installed — skipping checkpoint download")
+        return
+    for rel_path in _CHECKPOINTS:
+        local = ROOT / rel_path
+        if local.exists():
+            continue
+        logger.info("Downloading %s ...", rel_path)
+        try:
+            local.parent.mkdir(parents=True, exist_ok=True)
+            hf_hub_download(
+                repo_id=CHECKPOINT_REPO,
+                filename=rel_path,
+                local_dir=str(ROOT),
+            )
+            logger.info("Downloaded %s", rel_path)
+        except Exception as e:
+            logger.warning("Failed to download %s: %s", rel_path, e)
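
Note on the download-if-missing loop above: it can be exercised without network access if the downloader is passed in as a callable. The following is only a sketch under that assumption. `ensure_files` and `fetch` are hypothetical names that do not appear in this PR; in the real module, `fetch` would wrap the existing `hf_hub_download` call.

```python
import logging
from pathlib import Path
from typing import Callable, Iterable, List

logger = logging.getLogger(__name__)


def ensure_files(root: Path, rel_paths: Iterable[str],
                 fetch: Callable[[str, Path], None]) -> List[str]:
    """Fetch every path under `root` that does not exist yet.

    Hypothetical helper: `fetch(rel_path, root)` stands in for the
    hf_hub_download call in the PR. Returns the paths that were fetched.
    """
    fetched = []
    for rel_path in rel_paths:
        local = root / rel_path
        if local.exists():
            continue  # already present: no-op, as in ensure_checkpoints()
        try:
            local.parent.mkdir(parents=True, exist_ok=True)
            fetch(rel_path, root)
            fetched.append(rel_path)
        except Exception as e:  # one failed file must not block the rest
            logger.warning("Failed to download %s: %s", rel_path, e)
    return fetched
```

With a fake `fetch` that writes a placeholder file, a test can check that existing files are skipped and that one failing download does not stop the others.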

formscout/types.py CHANGED

@@ -164,6 +164,9 @@ class SessionEntry:
     features: BiomechFeatures
     rubric_score: ScoreResult
     judge: JudgeResult | None
+    laban: dict | None = None          # Laban Effort factors + labels + body emphasis
+    flexion: dict | None = None        # relevant joint angles at key frame: {name: {deg, openness}}
+    chart_paths: dict | None = None    # {"angle"|"velocity"|"radar"|"flexion": png path}
 @dataclass
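
Note on the three new fields: because they are appended after the required fields and default to `None`, constructor calls written before this change should keep working. A reduced sketch of that behaviour follows; `EntrySketch` is a hypothetical stand-in with two made-up required fields, not the real `SessionEntry`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EntrySketch:
    """Hypothetical stand-in for SessionEntry, reduced to two required fields."""
    test_name: str
    score: int
    # New optional fields go last and default to None, so call sites that
    # predate them keep working unchanged.
    laban: Optional[dict] = None
    flexion: Optional[dict] = None
    chart_paths: Optional[dict] = None


# A call site written before the change: no new arguments needed.
old_style = EntrySketch("deep_squat", 2)
# A call site that fills in one of the new fields by keyword.
new_style = EntrySketch("deep_squat", 2, chart_paths={"radar": "radar.png"})
```

Consumers of these fields then need a `None` check before use, since older entries will not carry analysis data.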

formscout/ui/theme.py CHANGED

@@ -1,250 +1,272 @@
-"""
-FormScout custom Gradio theme — scout/trail inspired.
-Earth tones, topographic accents, sturdy typography.
-"""
-from __future__ import annotations
-import gradio as gr
-def formscout_theme() -> gr.Theme:
-    """Create the FormScout scout/trail theme."""
-    return gr.themes.Soft(
-        primary_hue=gr.themes.colors.emerald,
-        secondary_hue=gr.themes.colors.amber,
-        neutral_hue=gr.themes.colors.stone,
-        font=[
-            gr.themes.GoogleFont("Inter"),
-            "ui-sans-serif",
-            "system-ui",
-            "sans-serif",
-        ],
-        font_mono=[
-            gr.themes.GoogleFont("JetBrains Mono"),
-            "ui-monospace",
-            "monospace",
-        ],
-    ).set(
-        # Background
-        body_background_fill="linear-gradient(135deg, #1a1a2e 0%, #16213e 50%, #0f3460 100%)",
-        body_background_fill_dark="linear-gradient(135deg, #0d1117 0%, #161b22 50%, #1a2332 100%)",
-        # Blocks
-        block_background_fill="rgba(30, 41, 59, 0.85)",
-        block_background_fill_dark="rgba(15, 23, 42, 0.9)",
-        block_border_width="1px",
-        block_border_color="rgba(100, 200, 150, 0.2)",
-        block_shadow="0 4px 20px rgba(0, 0, 0, 0.3)",
-        block_radius="12px",
-        # Buttons
-        button_primary_background_fill="linear-gradient(135deg, #059669 0%, #047857 100%)",
-        button_primary_background_fill_hover="linear-gradient(135deg, #047857 0%, #065f46 100%)",
-        button_primary_text_color="white",
-        button_primary_border_color="rgba(5, 150, 105, 0.5)",
-        button_secondary_background_fill="rgba(51, 65, 85, 0.8)",
-        button_secondary_text_color="#e2e8f0",
-        # Input
-        input_background_fill="rgba(15, 23, 42, 0.8)",
-        input_background_fill_dark="rgba(10, 15, 30, 0.9)",
-        input_border_color="rgba(100, 200, 150, 0.3)",
-        input_border_color_focus="rgba(5, 150, 105, 0.8)",
-        # Text
-        body_text_color="#e2e8f0",
-        body_text_color_dark="#f1f5f9",
-        block_title_text_color="#86efac",
-        block_label_text_color="#94a3b8",
-        # Spacing
-        block_padding="16px",
-        layout_gap="16px",
-    )
-FORMSCOUT_CSS = """
-/* FormScout Scout/Trail Theme CSS */
-.gradio-container {
-    max-width: 1400px !important;
-    margin: 0 auto;
-}
-/* Header styling */
-.formscout-header {
-    text-align: center;
-    padding: 20px 0;
-    border-bottom: 2px solid rgba(100, 200, 150, 0.3);
-    margin-bottom: 20px;
-}
-.formscout-header h1 {
-    font-size: 2.2em;
-    background: linear-gradient(135deg, #86efac, #059669);
-    -webkit-background-clip: text;
-    -webkit-text-fill-color: transparent;
-    background-clip: text;
-    margin-bottom: 8px;
-}
-/* Safety banner */
-.safety-banner {
-    background: linear-gradient(90deg, rgba(245, 158, 11, 0.15), rgba(245, 158, 11, 0.05));
-    border: 1px solid rgba(245, 158, 11, 0.4);
-    border-radius: 8px;
-    padding: 12px 16px;
-    margin: 12px 0;
-    font-size: 0.9em;
-    text-align: center;
-    color: #fbbf24;
-}
-/* Score display */
-.score-card {
-    background: rgba(5, 150, 105, 0.1);
-    border: 2px solid rgba(5, 150, 105, 0.4);
-    border-radius: 16px;
-    padding: 24px;
-    text-align: center;
-}
-.score-value {
-    font-size: 4em;
-    font-weight: 800;
-    background: linear-gradient(135deg, #86efac, #059669);
-    -webkit-background-clip: text;
-    -webkit-text-fill-color: transparent;
-    background-clip: text;
-}
-/* Confidence meter */
-.confidence-bar {
-    height: 8px;
-    border-radius: 4px;
-    background: rgba(100, 200, 150, 0.2);
-    overflow: hidden;
-    margin-top: 8px;
-}
-.confidence-fill {
-    height: 100%;
-    border-radius: 4px;
-    background: linear-gradient(90deg, #ef4444, #f59e0b, #059669);
-    transition: width 0.5s ease;
-}
-/* Pipeline steps indicator */
-.pipeline-steps {
-    display: flex;
-    gap: 4px;
-    align-items: center;
-    padding: 8px 0;
-}
-.pipeline-step {
-    flex: 1;
-    height: 4px;
-    border-radius: 2px;
-    background: rgba(100, 200, 150, 0.2);
-    transition: background 0.3s ease;
-}
-.pipeline-step.active {
-    background: #059669;
-}
-.pipeline-step.complete {
-    background: #86efac;
-}
-/* Asymmetry indicator */
-.asymmetry-bar {
-    display: flex;
-    align-items: center;
-    gap: 8px;
-    padding: 8px 12px;
-    background: rgba(30, 41, 59, 0.6);
-    border-radius: 8px;
-    margin: 4px 0;
-}
-.asymmetry-label {
-    min-width: 60px;
-    font-size: 0.85em;
-    color: #94a3b8;
-}
-.asymmetry-track {
-    flex: 1;
-    height: 6px;
-    background: rgba(100, 200, 150, 0.1);
-    border-radius: 3px;
-    position: relative;
-}
-.asymmetry-marker {
-    position: absolute;
-    top: -3px;
-    width: 12px;
-    height: 12px;
-    border-radius: 50%;
-    background: #059669;
-    border: 2px solid #86efac;
-}
-/* Topographic pattern accent */
-.topo-accent {
-    background-image:
-        repeating-linear-gradient(
-            0deg,
-            transparent,
-            transparent 40px,
-            rgba(100, 200, 150, 0.03) 40px,
-            rgba(100, 200, 150, 0.03) 41px
-        ),
-        repeating-linear-gradient(
-            90deg,
-            transparent,
-            transparent 40px,
-            rgba(100, 200, 150, 0.02) 40px,
-            rgba(100, 200, 150, 0.02) 41px
-        );
-}
-/* Warning/error states */
-.needs-review {
-    border-color: rgba(245, 158, 11, 0.6) !important;
-    background: rgba(245, 158, 11, 0.05) !important;
-}
-.low-confidence {
-    opacity: 0.7;
-    border-style: dashed !important;
-}
-/* Rubric drawer */
-.rubric-item {
-    display: flex;
-    align-items: center;
-    gap: 8px;
-    padding: 6px 10px;
-    border-radius: 6px;
-    margin: 2px 0;
-}
-.rubric-met {
-    background: rgba(5, 150, 105, 0.1);
-    border-left: 3px solid #059669;
-}
-.rubric-unmet {
-    background: rgba(239, 68, 68, 0.1);
-    border-left: 3px solid #ef4444;
-}
-/* Responsive */
-@media (max-width: 768px) {
-    .gradio-container {
-        padding: 8px !important;
-    }
-    .score-value {
-        font-size: 3em;
-    }
-}
-"""

+"""
+FormScout custom Gradio theme — Silas Therapy palette.
+Warm cream + sage green ground, petrol-teal primary, golden-orange accent,
+dark slate-green text. Matches https://silastherapy.sk branding.
+"""
+from __future__ import annotations
+import gradio as gr
+# ── Silas palette ──────────────────────────────────────────────────────────────
+CREAM = "#f7eedd"
+CREAM_DEEP = "#f1e4cc"
+SAGE = "#bcd3c8"
+SAGE_DEEP = "#9cbcad"
+TEAL = "#2b8a8a"
+TEAL_HOVER = "#1f6e6e"
+TEAL_DEEP = "#175757"
+GOLD = "#e0a43b"
+GOLD_DEEP = "#cf922a"
+INK = "#243a34"
+INK_MUTED = "#4a5f57"
+INK_FAINT = "#6b7d75"
+def formscout_theme() -> gr.Theme:
+    """Create the FormScout theme in the Silas Therapy palette."""
+    return gr.themes.Soft(
+        primary_hue=gr.themes.colors.teal,
+        secondary_hue=gr.themes.colors.amber,
+        neutral_hue=gr.themes.colors.stone,
+        font=[
+            gr.themes.GoogleFont("Inter"),
+            "ui-sans-serif",
+            "system-ui",
+            "sans-serif",
+        ],
+        font_mono=[
+            gr.themes.GoogleFont("JetBrains Mono"),
+            "ui-monospace",
+            "monospace",
+        ],
+    ).set(
+        # Background — warm cream into soft sage
+        body_background_fill=f"linear-gradient(135deg, {CREAM} 0%, {CREAM_DEEP} 45%, {SAGE} 100%)",
+        body_background_fill_dark=f"linear-gradient(135deg, {CREAM} 0%, {CREAM_DEEP} 45%, {SAGE} 100%)",
+        # Blocks — soft white cards over the sage ground
+        block_background_fill="rgba(255, 255, 255, 0.72)",
+        block_background_fill_dark="rgba(255, 255, 255, 0.72)",
+        block_border_width="1px",
+        block_border_color="rgba(43, 138, 138, 0.22)",
+        block_shadow="0 6px 22px rgba(36, 58, 52, 0.10)",
+        block_radius="14px",
+        # Buttons — petrol teal primary, sage secondary
+        button_primary_background_fill=f"linear-gradient(135deg, {TEAL} 0%, {TEAL_HOVER} 100%)",
+        button_primary_background_fill_hover=f"linear-gradient(135deg, {TEAL_HOVER} 0%, {TEAL_DEEP} 100%)",
+        button_primary_text_color="white",
+        button_primary_border_color="rgba(43, 138, 138, 0.45)",
+        button_secondary_background_fill="rgba(156, 188, 173, 0.55)",
+        button_secondary_text_color=INK,
+        # Inputs
+        input_background_fill="rgba(255, 255, 255, 0.92)",
+        input_background_fill_dark="rgba(255, 255, 255, 0.92)",
+        input_background_fill_focus="rgba(255, 255, 255, 1.0)",
+        input_border_color="rgba(43, 138, 138, 0.30)",
+        input_border_color_focus="rgba(43, 138, 138, 0.75)",
+        # Labels — pin light in both modes so no dark dropdown header appears
+        block_label_background_fill="rgba(188, 211, 200, 0.55)",
+        block_label_background_fill_dark="rgba(188, 211, 200, 0.55)",
+        block_label_text_color=INK,
+        block_label_text_color_dark=INK,
+        # Text
+        body_text_color=INK,
+        body_text_color_dark=INK,
+        block_title_text_color=TEAL_DEEP,
+        # Spacing
+        block_padding="16px",
+        layout_gap="16px",
+    )
+FORMSCOUT_CSS = f"""
+/* FormScout — Silas Therapy theme */
+.gradio-container {{
+    max-width: 1400px !important;
+    margin: 0 auto;
+}}
+/* Header styling */
+.formscout-header {{
+    text-align: center;
+    padding: 20px 0;
+    border-bottom: 2px solid rgba(43, 138, 138, 0.30);
+    margin-bottom: 20px;
+}}
+.formscout-header h1 {{
+    font-size: 2.2em;
+    background: linear-gradient(135deg, {TEAL} 0%, {GOLD} 100%);
+    -webkit-background-clip: text;
+    -webkit-text-fill-color: transparent;
+    background-clip: text;
+    margin-bottom: 8px;
+}}
+/* Safety banner — golden */
+.safety-banner {{
+    background: linear-gradient(90deg, rgba(224, 164, 59, 0.20), rgba(224, 164, 59, 0.08));
+    border: 1px solid rgba(224, 164, 59, 0.55);
+    border-radius: 8px;
+    padding: 12px 16px;
+    margin: 12px 0;
+    font-size: 0.9em;
+    text-align: center;
+    color: {GOLD_DEEP};
+}}
+/* Score display */
+.score-card {{
+    background: rgba(43, 138, 138, 0.08);
+    border: 2px solid rgba(43, 138, 138, 0.35);
+    border-radius: 16px;
+    padding: 24px;
+    text-align: center;
+}}
+.score-value {{
+    font-size: 4em;
+    font-weight: 800;
+    background: linear-gradient(135deg, {TEAL} 0%, {GOLD} 100%);
+    -webkit-background-clip: text;
+    -webkit-text-fill-color: transparent;
+    background-clip: text;
+}}
+/* Confidence meter */
+.confidence-bar {{
+    height: 8px;
+    border-radius: 4px;
+    background: rgba(43, 138, 138, 0.18);
+    overflow: hidden;
+    margin-top: 8px;
+}}
+.confidence-fill {{
+    height: 100%;
+    border-radius: 4px;
+    background: linear-gradient(90deg, #d9534f, {GOLD}, {TEAL});
+    transition: width 0.5s ease;
+}}
+/* Pipeline steps indicator */
+.pipeline-steps {{
+    display: flex;
+    gap: 4px;
+    align-items: center;
+    padding: 8px 0;
+}}
+.pipeline-step {{
+    flex: 1;
+    height: 4px;
+    border-radius: 2px;
+    background: rgba(43, 138, 138, 0.18);
+    transition: background 0.3s ease;
+}}
+.pipeline-step.active {{
+    background: {TEAL};
+}}
+.pipeline-step.complete {{
+    background: {GOLD};
+}}
+/* Asymmetry indicator */
+.asymmetry-bar {{
+    display: flex;
+    align-items: center;
+    gap: 8px;
+    padding: 8px 12px;
+    background: rgba(156, 188, 173, 0.30);
+    border-radius: 8px;
+    margin: 4px 0;
+}}
+.asymmetry-label {{
+    min-width: 60px;
+    font-size: 0.85em;
+    color: {INK_MUTED};
+}}
+.asymmetry-track {{
+    flex: 1;
+    height: 6px;
+    background: rgba(43, 138, 138, 0.12);
+    border-radius: 3px;
+    position: relative;
+}}
+.asymmetry-marker {{
+    position: absolute;
+    top: -3px;
+    width: 12px;
+    height: 12px;
+    border-radius: 50%;
+    background: {TEAL};
+    border: 2px solid {GOLD};
+}}
+/* Topographic pattern accent */
+.topo-accent {{
+    color: {INK_FAINT};
+    background-image:
+        repeating-linear-gradient(
+            0deg,
+            transparent,
+            transparent 40px,
+            rgba(43, 138, 138, 0.04) 40px,
+            rgba(43, 138, 138, 0.04) 41px
+        ),
+        repeating-linear-gradient(
+            90deg,
+            transparent,
+            transparent 40px,
+            rgba(43, 138, 138, 0.03) 40px,
+            rgba(43, 138, 138, 0.03) 41px
+        );
+}}
+/* Warning/error states */
+.needs-review {{
+    border-color: rgba(224, 164, 59, 0.65) !important;
+    background: rgba(224, 164, 59, 0.10) !important;
+}}
+.low-confidence {{
+    opacity: 0.7;
+    border-style: dashed !important;
+}}
+/* Rubric drawer */
+.rubric-item {{
+    display: flex;
+    align-items: center;
+    gap: 8px;
+    padding: 6px 10px;
+    border-radius: 6px;
+    margin: 2px 0;
+}}
+.rubric-met {{
+    background: rgba(43, 138, 138, 0.10);
+    border-left: 3px solid {TEAL};
+}}
+.rubric-unmet {{
+    background: rgba(217, 83, 79, 0.10);
+    border-left: 3px solid #d9534f;
+}}
+/* Responsive */
+@media (max-width: 768px) {{
+    .gradio-container {{
+        padding: 8px !important;
+    }}
+    .score-value {{
+        font-size: 3em;
+    }}
+}}
+"""
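
Note on the switch from `"""` to `f"""` for `FORMSCOUT_CSS`: every literal CSS brace has to be doubled, which is why the whole stylesheet shows as changed. The sketch below shows the escaping rule on a single selector, together with a `string.Template` variant that avoids the doubling. The variant is an alternative for comparison only, not something this PR uses.

```python
from string import Template

TEAL = "#2b8a8a"  # same value as the PR's palette constant

# In an f-string, `{{` and `}}` render as literal braces; `{TEAL}` is substituted.
CSS = f"""
.pipeline-step.active {{
    background: {TEAL};
}}
"""

# Alternative that avoids doubling: string.Template uses `$name` placeholders,
# so CSS braces need no escaping.
CSS_TEMPLATE = Template("""
.pipeline-step.active {
    background: $teal;
}
""").substitute(teal=TEAL)
```

Both strings render to the same CSS. The f-string form keeps the palette constants usable directly, at the cost of a noisier diff whenever braces are involved.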

requirements.txt CHANGED

@@ -2,16 +2,18 @@ gradio>=5.0
 ultralytics>=8.3
 torch>=2.3
 opencv-python>=4.10
-imageio-ffmpeg>=0.5
 numpy>=1.26
 scipy>=1.13
 pillow>=10.3
 reportlab>=4.0
+matplotlib>=3.8
 requests>=2.31
 pytest>=8.2
 ruff>=0.4
 black>=24.4
 huggingface_hub>=0.23
 transformers>=4.44
+accelerate>=0.30
+spaces>=0.30
 onnxruntime>=1.18
 mediapipe>=0.10

scripts/hf_upload.sh CHANGED

@@ -1,97 +1,97 @@
-#!/usr/bin/env bash
-# Upload the FormScout source tree to both the model repo and the Space.
-#
-# Usage:
-#   ./scripts/hf_upload.sh                     # message from last git commit
-#   ./scripts/hf_upload.sh "feat: my change"   # custom message
-#
-# Pushes to:
-#   silas-therapy/small-functional-movement-screening          (model repo)
-#   spaces/silas-therapy/small-functional-movement-screening   (Gradio Space)
-#
-# `hf upload` does NOT read .hfignore — it only honors .gitignore, and only at
-# commit time (after hashing and pre-uploading everything). So we parse
-# .hfignore ourselves into --exclude globs and pass them explicitly.
-#
-# If the filtered file count still exceeds LARGE_THRESHOLD, we fall back to
-# `hf upload-large-folder` (resumable, multi-threaded). Caveats of that mode:
-# no --create-pr and no custom commit message — it commits directly to main
-# in multiple commits.
-set -euo pipefail
-cd "$(dirname "$0")/.."
-MODEL_REPO="silas-therapy/small-functional-movement-screening"
-SPACE_REPO="spaces/silas-therapy/small-functional-movement-screening"
-MSG="${1:-$(git log -1 --pretty=%s)}"
-LARGE_THRESHOLD="${FORMSCOUT_HF_LARGE_THRESHOLD:-500}"
-# Belt-and-suspenders extras on top of .hfignore. `.cache/` is the resume
-# state upload-large-folder writes into the folder being uploaded.
-PATTERNS=(
-    "*.pdf"
-    "**/node_modules/**"
-    ".cache/**"
-)
-# Parse .hfignore into fnmatch-style globs. fnmatch's `*` crosses `/`, but a
-# bare name like `.DS_Store` or `dir/` only matches at the root, so emit both
-# the rooted and `**/`-prefixed forms.
-while IFS= read -r line; do
-    line="${line%%#*}"
-    line="${line#"${line%%[![:space:]]*}"}"
-    line="${line%"${line##*[![:space:]]}"}"
-    [[ -z "$line" ]] && continue
-    if [[ "$line" == */ ]]; then
-        PATTERNS+=("${line}**" "**/${line}**")
-    else
-        PATTERNS+=("$line" "**/$line")
-    fi
-done < .hfignore
-EXCLUDES=()
-for p in "${PATTERNS[@]}"; do
-    EXCLUDES+=(--exclude="$p")
-done
-# Count what would actually be uploaded, using the same filter the hub client
-# applies, so the mode decision matches reality.
-N_FILES=$(python3 - "${PATTERNS[@]}" <<'EOF'
-import sys
-from pathlib import Path
-from huggingface_hub.utils import filter_repo_objects
-patterns = sys.argv[1:]
-files = (
-    str(p) for p in Path(".").rglob("*")
-    if p.is_file() and p.parts[0] != ".git"
-)
-print(len(list(filter_repo_objects(files, ignore_patterns=patterns))))
-EOF
-)
-echo "── $N_FILES files to upload after .hfignore filtering"
-if (( N_FILES == 0 )); then
-    echo "✗ nothing to upload — check .hfignore" >&2
-    exit 1
-fi
-upload_repo() {
-    local repo="$1"
-    if (( N_FILES > LARGE_THRESHOLD )); then
-        echo "── $repo: $N_FILES files > $LARGE_THRESHOLD, using upload-large-folder"
-        echo "   (resumable; commits directly to main — no PR, no custom message)"
-        hf upload-large-folder "$repo" . "${EXCLUDES[@]}"
-    else
-        echo "── uploading to: $repo"
-        hf upload "$repo" . . \
-            "${EXCLUDES[@]}" \
-            --create-pr \
-            --commit-message="$MSG"
-    fi
-}
-upload_repo "$MODEL_REPO"
-upload_repo "$SPACE_REPO"
-echo "✓ done"

+#!/usr/bin/env bash
+# Upload the FormScout source tree to both the model repo and the Space.
+#
+# Usage:
+#   ./scripts/hf_upload.sh                     # message from last git commit
+#   ./scripts/hf_upload.sh "feat: my change"   # custom message
+#
+# Pushes to:
+#   silas-therapy/small-functional-movement-screening          (model repo)
+#   spaces/silas-therapy/small-functional-movement-screening   (Gradio Space)
+#
+# `hf upload` does NOT read .hfignore — it only honors .gitignore, and only at
+# commit time (after hashing and pre-uploading everything). So we parse
+# .hfignore ourselves into --exclude globs and pass them explicitly.
+#
+# If the filtered file count still exceeds LARGE_THRESHOLD, we fall back to
+# `hf upload-large-folder` (resumable, multi-threaded). Caveats of that mode:
+# no --create-pr and no custom commit message — it commits directly to main
+# in multiple commits.
+set -euo pipefail
+cd "$(dirname "$0")/.."
+MODEL_REPO="silas-therapy/small-functional-movement-screening"
+SPACE_REPO="spaces/silas-therapy/small-functional-movement-screening"
+MSG="${1:-$(git log -1 --pretty=%s)}"
+LARGE_THRESHOLD="${FORMSCOUT_HF_LARGE_THRESHOLD:-500}"
+# Belt-and-suspenders extras on top of .hfignore. `.cache/` is the resume
+# state upload-large-folder writes into the folder being uploaded.
+PATTERNS=(
+    "*.pdf"
+    "**/node_modules/**"
+    ".cache/**"
+)
+# Parse .hfignore into fnmatch-style globs. fnmatch's `*` crosses `/`, but a
+# bare name like `.DS_Store` or `dir/` only matches at the root, so emit both
+# the rooted and `**/`-prefixed forms.
+while IFS= read -r line; do
+    line="${line%%#*}"
+    line="${line#"${line%%[![:space:]]*}"}"
+    line="${line%"${line##*[![:space:]]}"}"
+    [[ -z "$line" ]] && continue
+    if [[ "$line" == */ ]]; then
+        PATTERNS+=("${line}**" "**/${line}**")
+    else
+        PATTERNS+=("$line" "**/$line")
+    fi
+done < .hfignore
+EXCLUDES=()
+for p in "${PATTERNS[@]}"; do
+    EXCLUDES+=(--exclude="$p")
+done
+# Count what would actually be uploaded, using the same filter the hub client
+# applies, so the mode decision matches reality.
+N_FILES=$(python3 - "${PATTERNS[@]}" <<'EOF'
+import sys
+from pathlib import Path
+from huggingface_hub.utils import filter_repo_objects
+patterns = sys.argv[1:]
+files = (
+    str(p) for p in Path(".").rglob("*")
+    if p.is_file() and p.parts[0] != ".git"
+)
+print(len(list(filter_repo_objects(files, ignore_patterns=patterns))))
+EOF
+)
+echo "── $N_FILES files to upload after .hfignore filtering"
+if (( N_FILES == 0 )); then
+    echo "✗ nothing to upload — check .hfignore" >&2
+    exit 1
+fi
+upload_repo() {
+    local repo="$1"
+    if (( N_FILES > LARGE_THRESHOLD )); then
+        echo "── $repo: $N_FILES files > $LARGE_THRESHOLD, using upload-large-folder"
+        echo "   (resumable; commits directly to main — no PR, no custom message)"
+        hf upload-large-folder "$repo" . "${EXCLUDES[@]}"
+    else
+        echo "── uploading to: $repo"
+        hf upload "$repo" . . \
+            "${EXCLUDES[@]}" \
+            --create-pr \
+            --commit-message="$MSG"
+    fi
+}
+upload_repo "$MODEL_REPO"
+upload_repo "$SPACE_REPO"
+echo "✓ done"
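
Note on the `.hfignore` parsing loop above: the same rule (strip comments, trim whitespace, emit a rooted and a `**/`-prefixed glob for each entry) can be written in Python, which makes it easier to unit-test. This is a sketch only. `hfignore_to_globs` and `is_excluded` are hypothetical names, and the standard-library `fnmatchcase` is used here as a stand-in for the hub client's filter, on the assumption that both follow fnmatch semantics as the script's comments describe.

```python
from fnmatch import fnmatchcase
from typing import List


def hfignore_to_globs(text: str) -> List[str]:
    """Turn .hfignore content into fnmatch-style exclude globs."""
    patterns = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments, trim whitespace
        if not line:
            continue
        if line.endswith("/"):
            # Directory entry: match its contents at the root and at any depth.
            patterns += [f"{line}**", f"**/{line}**"]
        else:
            # Bare name: match at the root and at any depth.
            patterns += [line, f"**/{line}"]
    return patterns


def is_excluded(path: str, patterns: List[str]) -> bool:
    """True if `path` matches any glob; fnmatch's `*` also crosses `/`."""
    return any(fnmatchcase(path, p) for p in patterns)
```

For example, an entry `build/` yields `build/**` and `**/build/**`, so both `build/x.o` and `src/build/x.o` are excluded while `src/main.py` is kept.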

scripts/serve_judge.sh CHANGED

@@ -1,35 +1,35 @@
-#!/usr/bin/env bash
-# Launch llama-server with the FormScout Judge/Classifier VLM.
-#
-# Default model: Qwen3-VL-8B-Instruct Q4_K_M (checkpoints/qwen3-vl/).
-# To serve a fine-tuned GGUF instead, set:
-#   FORMSCOUT_JUDGE_GGUF=/path/to/finetuned.gguf
-#   FORMSCOUT_JUDGE_MMPROJ=/path/to/mmproj.gguf   (only if it ships its own)
-#
-# Requires: brew install llama.cpp
-set -euo pipefail
-# Homebrew bin may be missing from non-interactive shells
-export PATH="/opt/homebrew/bin:/usr/local/bin:$PATH"
-ROOT="$(cd "$(dirname "$0")/.." && pwd)"
-GGUF="${FORMSCOUT_JUDGE_GGUF:-$ROOT/checkpoints/qwen3-vl/Qwen3VL-8B-Instruct-Q4_K_M.gguf}"
-MMPROJ="${FORMSCOUT_JUDGE_MMPROJ:-$ROOT/checkpoints/qwen3-vl/mmproj-Qwen3VL-8B-Instruct-F16.gguf}"
-HOST="${FORMSCOUT_LLAMA_HOST:-127.0.0.1}"
-PORT="${FORMSCOUT_LLAMA_PORT:-8080}"
-if [[ ! -f "$GGUF" ]]; then
-    echo "Model not found: $GGUF" >&2
-    echo "Download it with:" >&2
-    echo "  python3 -c \"from huggingface_hub import hf_hub_download; [hf_hub_download('Qwen/Qwen3-VL-8B-Instruct-GGUF', f, local_dir='$ROOT/checkpoints/qwen3-vl') for f in ['Qwen3VL-8B-Instruct-Q4_K_M.gguf', 'mmproj-Qwen3VL-8B-Instruct-F16.gguf']]\"" >&2
-    exit 1
-fi
-exec llama-server \
-    --model "$GGUF" \
-    --mmproj "$MMPROJ" \
-    --host "$HOST" \
-    --port "$PORT" \
-    --ctx-size 16384 \
-    --n-gpu-layers 99 \
-    --no-warmup

+#!/usr/bin/env bash
+# Launch llama-server with the FormScout Judge/Classifier VLM.
+#
+# Default model: Qwen3-VL-8B-Instruct Q4_K_M (checkpoints/qwen3-vl/).
+# To serve a fine-tuned GGUF instead, set:
+#   FORMSCOUT_JUDGE_GGUF=/path/to/finetuned.gguf
+#   FORMSCOUT_JUDGE_MMPROJ=/path/to/mmproj.gguf   (only if it ships its own)
+#
+# Requires: brew install llama.cpp
+set -euo pipefail
+# Homebrew bin may be missing from non-interactive shells
+export PATH="/opt/homebrew/bin:/usr/local/bin:$PATH"
+ROOT="$(cd "$(dirname "$0")/.." && pwd)"
+GGUF="${FORMSCOUT_JUDGE_GGUF:-$ROOT/checkpoints/qwen3-vl/Qwen3VL-8B-Instruct-Q4_K_M.gguf}"
+MMPROJ="${FORMSCOUT_JUDGE_MMPROJ:-$ROOT/checkpoints/qwen3-vl/mmproj-Qwen3VL-8B-Instruct-F16.gguf}"
+HOST="${FORMSCOUT_LLAMA_HOST:-127.0.0.1}"
+PORT="${FORMSCOUT_LLAMA_PORT:-8080}"
+if [[ ! -f "$GGUF" ]]; then
+    echo "Model not found: $GGUF" >&2
+    echo "Download it with:" >&2
+    echo "  python3 -c \"from huggingface_hub import hf_hub_download; [hf_hub_download('Qwen/Qwen3-VL-8B-Instruct-GGUF', f, local_dir='$ROOT/checkpoints/qwen3-vl') for f in ['Qwen3VL-8B-Instruct-Q4_K_M.gguf', 'mmproj-Qwen3VL-8B-Instruct-F16.gguf']]\"" >&2
+    exit 1
+fi
+exec llama-server \
+    --model "$GGUF" \
+    --mmproj "$MMPROJ" \
+    --host "$HOST" \
+    --port "$PORT" \
+    --ctx-size 16384 \
+    --n-gpu-layers 99 \
+    --no-warmup

tests/test_analysis.py ADDED

@@ -0,0 +1,145 @@

+"""Tests for the movement-analysis engine — no GPU, no model downloads."""
+import math
+import numpy as np
+from formscout.types import Pose2DResult
+def _pose_from(traj_by_joint, n, base_conf=0.9):
+    """Build a Pose2DResult; traj_by_joint maps joint->callable(i)->(x,y)."""
+    kps = []
+    for i in range(n):
+        frame = {}
+        for j in range(17):
+            if j in traj_by_joint:
+                x, y = traj_by_joint[j](i)
+            else:
+                x, y = float(100 + j * 5), float(100 + j * 5)
+            frame[j] = {"x": float(x), "y": float(y), "conf": base_conf}
+        kps.append(frame)
+    return Pose2DResult(keypoints=kps, fps=30.0, confidence=0.9)
+# ── relevant_joints ────────────────────────────────────────────────────────────
+def test_relevant_joints_and_primary_consistent():
+    from formscout.analysis import relevant_joints as RJ
+    for test in RJ.RELEVANT:
+        joints = RJ.relevant_joints(test)
+        angles = RJ.relevant_angles(test)
+        prim = RJ.primary_angle(test)
+        assert joints, f"{test} has no relevant joints"
+        assert prim in angles, f"{test} primary angle not in angles"
+def test_openness_label_monotonic():
+    from formscout.analysis.relevant_joints import openness_label
+    assert "open" in openness_label(175)
+    assert openness_label(30).endswith("closed")
+# ── timeseries ──────────────────────────────────────────────────────────────────
+def test_angle_series_length_matches_frames():
+    from formscout.analysis.timeseries import angle_series
+    pose = _pose_from({}, n=8)
+    series = angle_series(pose, "deep_squat")
+    assert "left_knee_flexion" in series
+    for name, vals in series.items():
+        assert len(vals) == 8
+def test_relevant_flexion_reports_degrees_and_openness():
+    from formscout.analysis.timeseries import relevant_flexion_at
+    # Straight leg: hip, knee, ankle collinear vertical → ~180°
+    straight = {
+        11: lambda i: (200, 100), 13: lambda i: (200, 200), 15: lambda i: (200, 300),
+        12: lambda i: (260, 100), 14: lambda i: (260, 200), 16: lambda i: (260, 300),
+        5: lambda i: (200, 40), 6: lambda i: (260, 40),
+    }
+    pose = _pose_from(straight, n=5)
+    flex = relevant_flexion_at(pose, "deep_squat", 2)
+    assert "left_knee_flexion" in flex
+    assert flex["left_knee_flexion"]["deg"] > 160
+    assert "open" in flex["left_knee_flexion"]["openness"]
+# ── laban ───────────────────────────────────────────────────────────────────────
+def test_laban_factors_in_unit_range():
+    from formscout.analysis.laban import compute_laban
+    pose = _pose_from({13: lambda i: (100 + i * 8, 200)}, n=20)
+    res = compute_laban(pose, "deep_squat", fps=30.0)
+    for k, v in res["effort"].items():
+        assert 0.0 <= v <= 1.0, f"{k}={v} out of range"
+    assert set(res["labels"]) == {"space", "weight", "time", "flow"}
+def test_laban_straight_line_is_direct():
+    from formscout.analysis.laban import compute_laban
+    # Knee travels in a straight horizontal line → high directness (Space)
+    pose = _pose_from({13: lambda i: (100 + i * 10, 200)}, n=20)
+    res = compute_laban(pose, "deep_squat", fps=30.0)
+    assert res["effort"]["space"] > 0.8
+def test_laban_static_clip_low_energy():
+    from formscout.analysis.laban import compute_laban
+    pose = _pose_from({13: lambda i: (200, 200)}, n=20)  # no motion
+    res = compute_laban(pose, "deep_squat", fps=30.0)
+    assert res["effort"]["weight"] < 0.2
+# ── charts ──────────────────────────────────────────────────────────────────────
+def test_angle_over_time_chart(tmp_path):
+    from formscout.analysis import charts
+    out = str(tmp_path / "angle.png")
+    series = {"left_knee_flexion": [90, 100, 110, 95], "right_knee_flexion": [88, 99, 108, 94]}
+    p = charts.angle_over_time(series, "left_knee_flexion", 2, out)
+    assert p == out
+    import os
+    assert os.path.getsize(out) > 0
+def test_velocity_profile_chart(tmp_path):
+    from formscout.analysis import charts
+    out = str(tmp_path / "vel.png")
+    kps = [{j: {"x": 100 + j + i * 3, "y": 100 + j, "conf": 0.9} for j in range(17)} for i in range(10)]
+    p = charts.velocity_profile(kps, 30.0, [13, 14, 11, 12], out)
+    import os
+    assert p == out and os.path.getsize(out) > 0
+def test_laban_radar_chart(tmp_path):
+    from formscout.analysis import charts
+    out = str(tmp_path / "radar.png")
+    p = charts.laban_radar({"space": 0.8, "weight": 0.4, "time": 0.6, "flow": 0.3}, out)
+    import os
+    assert p == out and os.path.getsize(out) > 0
+def test_flexion_bars_chart(tmp_path):
+    from formscout.analysis import charts
+    out = str(tmp_path / "flex.png")
+    flex = {"left_knee_flexion": {"deg": 95.0, "openness": "flexed"},
+            "left_hip_flexion": {"deg": 120.0, "openness": "mid-range"}}
+    p = charts.flexion_bars(flex, out)
+    import os
+    assert p == out and os.path.getsize(out) > 0
+def test_symmetry_bars_chart(tmp_path):
+    from formscout.analysis import charts
+    out = str(tmp_path / "sym.png")
+    asym = [{"test": "hurdle_step", "left_score": 2, "right_score": 3, "delta": 1}]
+    p = charts.symmetry_bars(asym, out)
+    import os
+    assert p == out and os.path.getsize(out) > 0
+def test_charts_return_none_on_empty(tmp_path):
+    from formscout.analysis import charts
+    assert charts.angle_over_time({}, None, None, str(tmp_path / "a.png")) is None
+    assert charts.flexion_bars({}, str(tmp_path / "f.png")) is None
+    assert charts.symmetry_bars([], str(tmp_path / "s.png")) is None
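
Note on the straight-leg fixture in `test_relevant_flexion_reports_degrees_and_openness`: hip, knee and ankle are placed on one vertical line, so the angle at the knee should come out near 180°. The sketch below shows one common way to compute such an angle from three 2D keypoints. `joint_angle_deg` is a hypothetical helper; the implementation inside `formscout.analysis` is not shown in this diff and may differ.

```python
import math


def joint_angle_deg(a, b, c):
    """Angle ABC in degrees at vertex b, for 2D points given as (x, y)."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        return float("nan")  # degenerate: two keypoints coincide
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp against floating-point drift before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))
```

Applied to the fixture's left leg, hip `(200, 100)`, knee `(200, 200)`, ankle `(200, 300)`, the two vectors from the knee point in opposite directions, which is the fully open case the test asserts with `> 160`.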

tests/test_judge_backend.py ADDED

@@ -0,0 +1,75 @@

+"""Judge backend selection + transformers-fallback — no GPU, no downloads."""
+import importlib
+import formscout.config as config
+from formscout.serving import get_vlm_client
+from formscout.serving.llama_cpp import LlamaCppClient
+def _reload_config(monkeypatch, **env):
+    for k, v in env.items():
+        if v is None:
+            monkeypatch.delenv(k, raising=False)
+        else:
+            monkeypatch.setenv(k, v)
+    importlib.reload(config)
+    return config
+def test_resolve_backend_default_local(monkeypatch):
+    cfg = _reload_config(monkeypatch, FORMSCOUT_JUDGE_BACKEND=None, SPACE_ID=None)
+    assert cfg.resolve_judge_backend() == "llama_cpp"
+def test_resolve_backend_auto_on_space(monkeypatch):
+    cfg = _reload_config(monkeypatch, FORMSCOUT_JUDGE_BACKEND="auto", SPACE_ID="me/space")
+    assert cfg.resolve_judge_backend() == "transformers"
+def test_resolve_backend_explicit(monkeypatch):
+    cfg = _reload_config(monkeypatch, FORMSCOUT_JUDGE_BACKEND="llama_cpp", SPACE_ID="me/space")
+    assert cfg.resolve_judge_backend() == "llama_cpp"
+    importlib.reload(config)  # restore
+def test_factory_returns_llama_cpp_locally(monkeypatch):
+    _reload_config(monkeypatch, FORMSCOUT_JUDGE_BACKEND="llama_cpp", SPACE_ID=None)
+    client = get_vlm_client()
+    assert isinstance(client, LlamaCppClient)
+    importlib.reload(config)
+def test_transformers_client_available_is_cheap_bool():
+    # available must not load/download the model — just report a bool
+    from formscout.serving.transformers_vlm import TransformersVLMClient
+    c = TransformersVLMClient()
+    assert isinstance(c.available, bool)
+    c._failed = True
+    assert c.available is False
+def test_judge_uses_rubric_on_fallback_sentinel():
+    from formscout.agents.judge import JudgeAgent
+    from formscout.types import BiomechFeatures, ScoreResult, MovementResult, IngestResult
+    import numpy as np
+    agent = JudgeAgent()
+    class _FakeClient:
+        available = True
+        def complete(self, *a, **k):
+            return {"error": "no gpu", "fallback": True, "text": ""}
+    agent._client = _FakeClient()
+    features = BiomechFeatures(test_name="deep_squat", view="2d", side="na",
+                               angles={}, alignments={}, symmetry_delta=None,
+                               timing={}, confidence=0.9)
+    rubric = ScoreResult(score=2, rationale="rubric says 2", confidence=0.8)
+    movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
+    ingest = IngestResult(frames=[np.zeros((10, 10, 3), np.uint8)], fps=30.0,
+                          duration=0.03, n_people=1, width=10, height=10)
+    res = agent.run(features, rubric, movement, ingest)
+    assert res.score == 2
+    assert "rubric-only" in res.rationale
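
Reviewer note: the three `resolve_judge_backend` tests above pin down a small decision table (unset and no `SPACE_ID` gives `llama_cpp`; `auto` on a Space gives `transformers`; an explicit value always wins). The sketch below is one way that table could be satisfied. It is not the code in `formscout/config.py`: the `env` parameter and the choice to treat an unset variable like `auto` are assumptions made for illustration only.

```python
import os
from typing import Mapping, Optional


def resolve_judge_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical resolver mirroring the behaviour the tests assert."""
    env = os.environ if env is None else env
    # Assumption: an unset FORMSCOUT_JUDGE_BACKEND behaves like "auto".
    backend = env.get("FORMSCOUT_JUDGE_BACKEND", "auto").strip().lower()
    if backend != "auto":
        # An explicit choice wins, even when SPACE_ID is set.
        return backend
    # Assumption: a non-empty SPACE_ID means the app runs on a hosted Space.
    return "transformers" if env.get("SPACE_ID") else "llama_cpp"
```

Passing the environment as a mapping would let tests skip `importlib.reload(config)` entirely, at the cost of resolving the backend at call time rather than at import time.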

tests/test_keyframe.py CHANGED

@@ -1,37 +1,37 @@
-"""Tests for PoseVisualizer.render_frame — single annotated still."""
-import os
-import numpy as np
-from formscout.types import IngestResult, Pose2DResult
-def _ingest(n=5, h=480, w=640):
-    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
-    return IngestResult(frames=frames, fps=30.0, duration=n / 30.0, n_people=1, width=w, height=h)
-def _pose(n=5):
-    kps = []
-    for i in range(n):
-        kps.append({j: {"x": float(50 + j * 25), "y": float(80 + j * 18), "conf": 0.9}
-                    for j in range(17)})
-    return Pose2DResult(keypoints=kps, fps=30.0, confidence=0.9)
-def test_render_frame_writes_png(tmp_path):
-    from formscout.agents.visualizer import PoseVisualizer
-    out = str(tmp_path / "key.png")
-    path = PoseVisualizer().render_frame(_ingest(), _pose(), frame_idx=2,
-                                         layers={"skeleton"}, caption="Deep Squat — heels elevated",
-                                         out_png=out)
-    assert path == out
-    assert os.path.exists(out)
-    assert os.path.getsize(out) > 0
-def test_render_frame_bad_index_returns_none(tmp_path):
-    from formscout.agents.visualizer import PoseVisualizer
-    out = str(tmp_path / "key.png")
-    path = PoseVisualizer().render_frame(_ingest(n=3), _pose(n=3), frame_idx=99,
-                                         layers={"skeleton"}, caption="", out_png=out)
-    assert path is None

+"""Tests for PoseVisualizer.render_frame — single annotated still."""
+import os
+import numpy as np
+from formscout.types import IngestResult, Pose2DResult
+def _ingest(n=5, h=480, w=640):
+    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
+    return IngestResult(frames=frames, fps=30.0, duration=n / 30.0, n_people=1, width=w, height=h)
+def _pose(n=5):
+    kps = []
+    for i in range(n):
+        kps.append({j: {"x": float(50 + j * 25), "y": float(80 + j * 18), "conf": 0.9}
+                    for j in range(17)})
+    return Pose2DResult(keypoints=kps, fps=30.0, confidence=0.9)
+def test_render_frame_writes_png(tmp_path):
+    from formscout.agents.visualizer import PoseVisualizer
+    out = str(tmp_path / "key.png")
+    path = PoseVisualizer().render_frame(_ingest(), _pose(), frame_idx=2,
+                                         layers={"skeleton"}, caption="Deep Squat — heels elevated",
+                                         out_png=out)
+    assert path == out
+    assert os.path.exists(out)
+    assert os.path.getsize(out) > 0
+def test_render_frame_bad_index_returns_none(tmp_path):
+    from formscout.agents.visualizer import PoseVisualizer
+    out = str(tmp_path / "key.png")
+    path = PoseVisualizer().render_frame(_ingest(n=3), _pose(n=3), frame_idx=99,
+                                         layers={"skeleton"}, caption="", out_png=out)
+    assert path is None

tests/test_pdf_report.py CHANGED

@@ -1,51 +1,51 @@
-"""Tests for PdfReportAgent — no GPU, no model downloads."""
-import os
-from formscout.types import (
-    ReportResult, SessionEntry, MovementResult, BiomechFeatures, ScoreResult, JudgeResult,
-)
-def _entry(test_name="deep_squat", score=2, needs_human=False):
-    movement = MovementResult(test_name=test_name, side="na", confidence=1.0)
-    features = BiomechFeatures(
-        test_name=test_name, view="2d", side="na",
-        angles={"left_knee_flexion_deg": 95.0}, alignments={"knees_tracking_over_feet": False},
-        symmetry_delta=None, timing={"deepest_frame": 1}, confidence=0.9,
-    )
-    rubric = ScoreResult(score=2, rationale="rubric ok", confidence=0.8)
-    judge = JudgeResult(score=None if needs_human else score, rationale="judge rationale",
-                        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
-                        confidence=0.85, needs_human=needs_human)
-    return SessionEntry(
-        test_name=test_name, side="na", score=None if needs_human else score,
-        needs_human=needs_human, rationale="judge rationale",
-        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
-        measurements={"left_knee_flexion_deg": 95.0, "knees_tracking_over_feet": False},
-        confidence=0.85, view="2d", keyframe_path=None,
-        movement=movement, features=features, rubric_score=rubric, judge=judge,
-    )
-def _report(composite=2):
-    return ReportResult(
-        per_test=[], composite=composite, asymmetries=[],
-        overlay_video_path=None, pdf_path=None,
-        low_confidence_flags=[], disagreement_flags=[],
-    )
-def test_pdf_is_created(tmp_path):
-    from formscout.agents.pdf_report import PdfReportAgent
-    path = PdfReportAgent().run(_report(2), [_entry()], str(tmp_path))
-    assert path is not None
-    assert os.path.exists(path)
-    assert os.path.getsize(path) > 1000  # a real PDF, not an empty file
-    with open(path, "rb") as f:
-        assert f.read(5) == b"%PDF-"
-def test_pdf_handles_incomplete_composite(tmp_path):
-    from formscout.agents.pdf_report import PdfReportAgent
-    path = PdfReportAgent().run(_report(None), [_entry(needs_human=True)], str(tmp_path))
-    assert path is not None and os.path.exists(path)

+"""Tests for PdfReportAgent — no GPU, no model downloads."""
+import os
+from formscout.types import (
+    ReportResult, SessionEntry, MovementResult, BiomechFeatures, ScoreResult, JudgeResult,
+)
+def _entry(test_name="deep_squat", score=2, needs_human=False):
+    movement = MovementResult(test_name=test_name, side="na", confidence=1.0)
+    features = BiomechFeatures(
+        test_name=test_name, view="2d", side="na",
+        angles={"left_knee_flexion_deg": 95.0}, alignments={"knees_tracking_over_feet": False},
+        symmetry_delta=None, timing={"deepest_frame": 1}, confidence=0.9,
+    )
+    rubric = ScoreResult(score=2, rationale="rubric ok", confidence=0.8)
+    judge = JudgeResult(score=None if needs_human else score, rationale="judge rationale",
+                        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
+                        confidence=0.85, needs_human=needs_human)
+    return SessionEntry(
+        test_name=test_name, side="na", score=None if needs_human else score,
+        needs_human=needs_human, rationale="judge rationale",
+        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
+        measurements={"left_knee_flexion_deg": 95.0, "knees_tracking_over_feet": False},
+        confidence=0.85, view="2d", keyframe_path=None,
+        movement=movement, features=features, rubric_score=rubric, judge=judge,
+    )
+def _report(composite=2):
+    return ReportResult(
+        per_test=[], composite=composite, asymmetries=[],
+        overlay_video_path=None, pdf_path=None,
+        low_confidence_flags=[], disagreement_flags=[],
+    )
+def test_pdf_is_created(tmp_path):
+    from formscout.agents.pdf_report import PdfReportAgent
+    path = PdfReportAgent().run(_report(2), [_entry()], str(tmp_path))
+    assert path is not None
+    assert os.path.exists(path)
+    assert os.path.getsize(path) > 1000  # a real PDF, not an empty file
+    with open(path, "rb") as f:
+        assert f.read(5) == b"%PDF-"
+def test_pdf_handles_incomplete_composite(tmp_path):
+    from formscout.agents.pdf_report import PdfReportAgent
+    path = PdfReportAgent().run(_report(None), [_entry(needs_human=True)], str(tmp_path))
+    assert path is not None and os.path.exists(path)
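
Reviewer note: `test_pdf_is_created` checks both a size floor and the `%PDF-` signature, while `test_pdf_handles_incomplete_composite` only checks that the file exists. A small shared helper could apply the same check in both tests. The name `looks_like_pdf` and the `min_bytes` default are hypothetical and not part of this PR.

```python
import os


def looks_like_pdf(path: str, min_bytes: int = 1000) -> bool:
    """Return True if `path` is a file larger than `min_bytes` that starts with the PDF signature."""
    if not os.path.isfile(path) or os.path.getsize(path) <= min_bytes:
        return False
    with open(path, "rb") as f:
        # Every PDF file begins with the ASCII bytes "%PDF-".
        return f.read(5) == b"%PDF-"
```

With such a helper, both tests could end in `assert looks_like_pdf(path)`.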

tests/test_phase2.py CHANGED

@@ -1,354 +1,354 @@
-"""Tests for all rubric scorers and Phase 2 agents."""
-from formscout.types import (
-    BiomechFeatures, ScoreResult, MovementResult, JudgeResult, ReportResult,
-)
-from formscout.rubric import score_test, SCORERS
-from formscout.rubric.hurdle_step import score_hurdle_step
-from formscout.rubric.inline_lunge import score_inline_lunge
-from formscout.rubric.shoulder_mobility import score_shoulder_mobility
-from formscout.rubric.active_slr import score_active_slr
-from formscout.rubric.trunk_stability_pushup import score_trunk_stability_pushup
-from formscout.rubric.rotary_stability import score_rotary_stability
-from formscout.agents.judge import JudgeAgent
-from formscout.agents.report import ReportAgent
-def _make_features(test_name, angles=None, alignments=None, side="na", sym_delta=None):
-    return BiomechFeatures(
-        test_name=test_name, view="2d", side=side,
-        angles=angles or {}, alignments=alignments or {},
-        symmetry_delta=sym_delta, timing={}, confidence=0.8,
-    )
-# ─── Rubric dispatch ─────────────────────────────────────────────────────────
-class TestRubricDispatch:
-    def test_all_tests_have_scorers(self):
-        expected = {"deep_squat", "hurdle_step", "inline_lunge", "shoulder_mobility",
-                    "active_slr", "trunk_stability_pushup", "rotary_stability"}
-        assert set(SCORERS.keys()) == expected
-    def test_dispatch_unknown_test(self):
-        f = _make_features("unknown_test")
-        r = score_test(f)
-        assert r.confidence == 0.0
-# ─── Hurdle Step ──────────────────────────────────────────────────────────────
-class TestHurdleStep:
-    def test_score_3_good_form(self):
-        f = _make_features("hurdle_step", angles={
-            "step_hip_flexion_deg": 100.0, "stance_knee_angle_deg": 175.0,
-            "shoulder_tilt_deg": 5.0,
-        }, alignments={"trunk_stable": True, "stance_knee_extended": True})
-        r = score_hurdle_step(f)
-        assert r.score == 3
-    def test_score_2_compensation(self):
-        f = _make_features("hurdle_step", angles={
-            "step_hip_flexion_deg": 80.0, "stance_knee_angle_deg": 170.0,
-        }, alignments={"trunk_stable": True, "stance_knee_extended": True})
-        r = score_hurdle_step(f)
-        assert r.score == 2
-    def test_score_1_poor(self):
-        f = _make_features("hurdle_step", angles={
-            "step_hip_flexion_deg": 50.0, "stance_knee_angle_deg": 140.0,
-        }, alignments={"trunk_stable": False, "stance_knee_extended": False})
-        r = score_hurdle_step(f)
-        assert r.score == 1
-    def test_never_scores_zero(self):
-        f = _make_features("hurdle_step", angles={
-            "step_hip_flexion_deg": 30.0,
-        }, alignments={"trunk_stable": False, "stance_knee_extended": False})
-        r = score_hurdle_step(f)
-        assert r.score >= 1
-# ─── Inline Lunge ─────────────────────────────────────────────────────────────
-class TestInlineLunge:
-    def test_score_3_deep_and_aligned(self):
-        f = _make_features("inline_lunge", angles={
-            "front_knee_flexion_deg": 85.0, "trunk_lean_from_vertical_deg": 5.0,
-        }, alignments={"trunk_upright": True, "knee_over_ankle": True})
-        r = score_inline_lunge(f)
-        assert r.score == 3
-    def test_score_1_shallow(self):
-        f = _make_features("inline_lunge", angles={
-            "front_knee_flexion_deg": 140.0,
-        }, alignments={"trunk_upright": False, "knee_over_ankle": False})
-        r = score_inline_lunge(f)
-        assert r.score == 1
-# ─── Shoulder Mobility ────────────────────────────────────────────────────────
-class TestShoulderMobility:
-    def test_score_3_close(self):
-        f = _make_features("shoulder_mobility", angles={
-            "inter_fist_normalized": 0.25,
-        }, alignments={"fists_within_one_hand": True, "fists_within_1_5_hand": True})
-        r = score_shoulder_mobility(f)
-        assert r.score == 3
-    def test_score_2_moderate(self):
-        f = _make_features("shoulder_mobility", angles={
-            "inter_fist_normalized": 0.45,
-        }, alignments={"fists_within_one_hand": False, "fists_within_1_5_hand": True})
-        r = score_shoulder_mobility(f)
-        assert r.score == 2
-    def test_score_1_far(self):
-        f = _make_features("shoulder_mobility", angles={
-            "inter_fist_normalized": 0.8,
-        }, alignments={"fists_within_one_hand": False, "fists_within_1_5_hand": False})
-        r = score_shoulder_mobility(f)
-        assert r.score == 1
-# ─── Active SLR ───────────────────────────────────────────────────────────────
-class TestActiveSLR:
-    def test_score_3_high_raise(self):
-        f = _make_features("active_slr", angles={
-            "raised_leg_angle_deg": 80.0,
-        }, alignments={"past_contralateral_knee": True, "past_mid_thigh": True, "down_leg_flat": True})
-        r = score_active_slr(f)
-        assert r.score == 3
-    def test_score_2_moderate_raise(self):
-        f = _make_features("active_slr", angles={
-            "raised_leg_angle_deg": 55.0,
-        }, alignments={"past_contralateral_knee": False, "past_mid_thigh": True, "down_leg_flat": True})
-        r = score_active_slr(f)
-        assert r.score == 2
-    def test_score_1_low_raise(self):
-        f = _make_features("active_slr", angles={
-            "raised_leg_angle_deg": 30.0,
-        }, alignments={"past_contralateral_knee": False, "past_mid_thigh": False, "down_leg_flat": True})
-        r = score_active_slr(f)
-        assert r.score == 1
-# ─── Trunk Stability Push-Up ─────────────────────────────────────────────────
-class TestTrunkStabilityPushup:
-    def test_score_3_rigid_hands_high(self):
-        f = _make_features("trunk_stability_pushup", angles={
-            "max_sag_px": 10.0, "trunk_variance_px": 5.0,
-        }, alignments={"body_rigid": True, "no_sag": True, "hands_at_forehead": True})
-        r = score_trunk_stability_pushup(f)
-        assert r.score == 3
-    def test_score_1_sag(self):
-        f = _make_features("trunk_stability_pushup", angles={
-            "max_sag_px": 50.0, "trunk_variance_px": 25.0,
-        }, alignments={"body_rigid": False, "no_sag": False, "hands_at_forehead": True})
-        r = score_trunk_stability_pushup(f)
-        assert r.score == 1
-# ─── Rotary Stability ────────────────────────────────────────────────────────
-class TestRotaryStability:
-    def test_score_2_stable(self):
-        f = _make_features("rotary_stability", angles={
-            "trunk_stability_std_px": 8.0, "shoulder_level_diff_px": 10.0, "hip_level_diff_px": 12.0,
-        }, alignments={"trunk_stable": True, "shoulders_level": True, "hips_level": True})
-        r = score_rotary_stability(f)
-        assert r.score == 2  # Default to 2 (contralateral assumption)
-    def test_score_1_unstable(self):
-        f = _make_features("rotary_stability", angles={
-            "trunk_stability_std_px": 30.0, "shoulder_level_diff_px": 35.0, "hip_level_diff_px": 30.0,
-        }, alignments={"trunk_stable": False, "shoulders_level": False, "hips_level": False})
-        r = score_rotary_stability(f)
-        assert r.score == 1
-# ─── JudgeAgent fallback ─────────────────────────────────────────────────────
-class TestJudgeAgent:
-    def test_fallback_when_judge_disabled(self, monkeypatch):
-        """When ENABLE_JUDGE=False, judge promotes rubric score."""
-        from formscout import config
-        monkeypatch.setattr(config, "ENABLE_JUDGE", False)
-        agent = JudgeAgent()
-        features = _make_features("deep_squat", angles={"left_femur_from_horizontal_deg": 70.0})
-        rubric = ScoreResult(score=3, rationale="all good", confidence=0.9)
-        movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
-        result = agent.run(features, rubric, movement)
-        assert isinstance(result, JudgeResult)
-        assert result.score == 3
-        assert "[rubric-only]" in result.rationale
-    def test_fallback_when_server_unavailable(self, monkeypatch):
-        """ENABLE_JUDGE=True but llama-server down → rubric fallback, never a crash."""
-        from unittest.mock import PropertyMock, patch
-        from formscout import config
-        monkeypatch.setattr(config, "ENABLE_JUDGE", True)
-        agent = JudgeAgent()
-        with patch.object(type(agent._client), "available", new_callable=PropertyMock, return_value=False):
-            features = _make_features("deep_squat")
-            rubric = ScoreResult(score=2, rationale="heels up", confidence=0.8)
-            movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
-            result = agent.run(features, rubric, movement)
-        assert result.score == 2
-        assert "[rubric-only]" in result.rationale
-    def test_vlm_response_parsed_into_judge_result(self, monkeypatch):
-        """ENABLE_JUDGE=True with live client → VLM JSON becomes JudgeResult."""
-        from unittest.mock import PropertyMock, patch
-        from formscout import config
-        monkeypatch.setattr(config, "ENABLE_JUDGE", True)
-        agent = JudgeAgent()
-        vlm_json = {
-            "test": "deep_squat", "side": "na", "score": 2, "needs_human": False,
-            "rationale": "Femur 5° above horizontal; 2D estimate.",
-            "compensation_tags": ["forward_lean"], "corrective_hint": "Sit back into heels.",
-            "confidence": 0.78,
-        }
-        with patch.object(type(agent._client), "available", new_callable=PropertyMock, return_value=True), \
-             patch.object(agent._client, "complete", return_value=vlm_json):
-            features = _make_features("deep_squat")
-            rubric = ScoreResult(score=2, rationale="ok", confidence=0.8)
-            movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
-            result = agent.run(features, rubric, movement)
-        assert result.score == 2
-        assert result.compensation_tags == ["forward_lean"]
-        assert result.needs_human is False
-    def test_vlm_needs_human_yields_no_score(self, monkeypatch):
-        """needs_human=True from the VLM must produce score=None."""
-        from unittest.mock import PropertyMock, patch
-        from formscout import config
-        monkeypatch.setattr(config, "ENABLE_JUDGE", True)
-        agent = JudgeAgent()
-        vlm_json = {"score": 1, "needs_human": True, "rationale": "Possible pain.", "confidence": 0.9}
-        with patch.object(type(agent._client), "available", new_callable=PropertyMock, return_value=True), \
-             patch.object(agent._client, "complete", return_value=vlm_json):
-            result = agent.run(
-                _make_features("deep_squat"),
-                ScoreResult(score=1, rationale="x", confidence=0.5),
-                MovementResult(test_name="deep_squat", side="na", confidence=1.0),
-            )
-        assert result.needs_human is True
-        assert result.score is None
-# ─── LlamaCppClient (chat-completions endpoint) ──────────────────────────────
-class TestLlamaCppClient:
-    def test_parse_plain_json(self):
-        from formscout.serving.llama_cpp import LlamaCppClient
-        assert LlamaCppClient._parse_json_reply('{"score": 3}') == {"score": 3}
-    def test_parse_fenced_json(self):
-        from formscout.serving.llama_cpp import LlamaCppClient
-        fenced = '```json\n{"score": 2, "needs_human": false}\n```'
-        assert LlamaCppClient._parse_json_reply(fenced) == {"score": 2, "needs_human": False}
-    def test_parse_non_json_returns_text(self):
-        from formscout.serving.llama_cpp import LlamaCppClient
-        assert LlamaCppClient._parse_json_reply("not json") == {"text": "not json"}
-    def test_complete_posts_chat_endpoint_with_images(self):
-        from unittest.mock import MagicMock, patch
-        from formscout.serving.llama_cpp import LlamaCppClient
-        client = LlamaCppClient(port=8080)
-        resp = MagicMock()
-        resp.json.return_value = {"choices": [{"message": {"content": '{"ok": true}'}}]}
-        resp.raise_for_status.return_value = None
-        with patch("formscout.serving.llama_cpp.requests.post", return_value=resp) as mock_post:
-            result = client.complete("score this", images=["aGVsbG8=" * 600])
-        assert result == {"ok": True}
-        url = mock_post.call_args.args[0] if mock_post.call_args.args else mock_post.call_args.kwargs.get("url")
-        assert url.endswith("/v1/chat/completions")
-        payload = mock_post.call_args.kwargs["json"]
-        content = payload["messages"][0]["content"]
-        assert content[0] == {"type": "text", "text": "score this"}
-        assert content[1]["type"] == "image_url"
-        assert content[1]["image_url"]["url"].startswith("data:image/jpeg;base64,")
-    def test_complete_connection_error_returns_safe_dict(self):
-        from unittest.mock import patch
-        import requests as _requests
-        from formscout.serving.llama_cpp import LlamaCppClient
-        client = LlamaCppClient(port=8080)
-        with patch("formscout.serving.llama_cpp.requests.post", side_effect=_requests.ConnectionError):
-            result = client.complete("hello")
-        assert "error" in result
-# ─── ReportAgent ──────────────────────────────────────────────────────────────
-class TestReportAgent:
-    def test_single_test_report(self):
-        agent = ReportAgent()
-        entries = [{
-            "movement": MovementResult(test_name="deep_squat", side="na", confidence=1.0),
-            "features": _make_features("deep_squat"),
-            "rubric_score": ScoreResult(score=3, rationale="ok", confidence=0.9),
-            "judge": JudgeResult(
-                score=3, rationale="good", compensation_tags=[], corrective_hint="",
-                confidence=0.9,
-            ),
-            "side": "na",
-        }]
-        result = agent.run(entries)
-        assert isinstance(result, ReportResult)
-        assert len(result.per_test) == 1
-        assert result.per_test[0]["score"] == 3
-    def test_bilateral_reports_lower_score(self):
-        agent = ReportAgent()
-        entries = [
-            {
-                "movement": MovementResult(test_name="hurdle_step", side="left", confidence=1.0),
-                "features": _make_features("hurdle_step", side="left"),
-                "rubric_score": ScoreResult(score=3, rationale="ok", confidence=0.9),
-                "judge": JudgeResult(
-                    score=3, rationale="", compensation_tags=[], corrective_hint="",
-                    confidence=0.9,
-                ),
-                "side": "left",
-            },
-            {
-                "movement": MovementResult(test_name="hurdle_step", side="right", confidence=1.0),
-                "features": _make_features("hurdle_step", side="right"),
-                "rubric_score": ScoreResult(score=2, rationale="comp", confidence=0.8),
-                "judge": JudgeResult(
-                    score=2, rationale="", compensation_tags=[], corrective_hint="",
-                    confidence=0.8,
-                ),
-                "side": "right",
-            },
-        ]
-        result = agent.run(entries)
-        assert result.per_test[0]["score"] == 2  # lower of 3 and 2
-        assert len(result.asymmetries) == 1
-        assert result.asymmetries[0]["delta"] == 1
-    def test_composite_none_when_unscored(self):
-        agent = ReportAgent()
-        entries = [{
-            "movement": MovementResult(test_name="deep_squat", side="na", confidence=1.0),
-            "features": _make_features("deep_squat"),
-            "rubric_score": ScoreResult(score=1, rationale="", confidence=0.5),
-            "judge": JudgeResult(
-                score=None, rationale="pain", compensation_tags=[], corrective_hint="",
-                confidence=0.0, needs_human=True,
-            ),
-            "side": "na",
-        }]
-        result = agent.run(entries)
-        assert result.composite is None

+"""Tests for all rubric scorers and Phase 2 agents."""
+from formscout.types import (
+    BiomechFeatures, ScoreResult, MovementResult, JudgeResult, ReportResult,
+)
+from formscout.rubric import score_test, SCORERS
+from formscout.rubric.hurdle_step import score_hurdle_step
+from formscout.rubric.inline_lunge import score_inline_lunge
+from formscout.rubric.shoulder_mobility import score_shoulder_mobility
+from formscout.rubric.active_slr import score_active_slr
+from formscout.rubric.trunk_stability_pushup import score_trunk_stability_pushup
+from formscout.rubric.rotary_stability import score_rotary_stability
+from formscout.agents.judge import JudgeAgent
+from formscout.agents.report import ReportAgent
+def _make_features(test_name, angles=None, alignments=None, side="na", sym_delta=None):
+    return BiomechFeatures(
+        test_name=test_name, view="2d", side=side,
+        angles=angles or {}, alignments=alignments or {},
+        symmetry_delta=sym_delta, timing={}, confidence=0.8,
+    )
+# ─── Rubric dispatch ─────────────────────────────────────────────────────────
+class TestRubricDispatch:
+    def test_all_tests_have_scorers(self):
+        expected = {"deep_squat", "hurdle_step", "inline_lunge", "shoulder_mobility",
+                    "active_slr", "trunk_stability_pushup", "rotary_stability"}
+        assert set(SCORERS.keys()) == expected
+    def test_dispatch_unknown_test(self):
+        f = _make_features("unknown_test")
+        r = score_test(f)
+        assert r.confidence == 0.0
+# ─── Hurdle Step ──────────────────────────────────────────────────────────────
+class TestHurdleStep:
+    def test_score_3_good_form(self):
+        f = _make_features("hurdle_step", angles={
+            "step_hip_flexion_deg": 100.0, "stance_knee_angle_deg": 175.0,
+            "shoulder_tilt_deg": 5.0,
+        }, alignments={"trunk_stable": True, "stance_knee_extended": True})
+        r = score_hurdle_step(f)
+        assert r.score == 3
+    def test_score_2_compensation(self):
+        f = _make_features("hurdle_step", angles={
+            "step_hip_flexion_deg": 80.0, "stance_knee_angle_deg": 170.0,
+        }, alignments={"trunk_stable": True, "stance_knee_extended": True})
+        r = score_hurdle_step(f)
+        assert r.score == 2
+    def test_score_1_poor(self):
+        f = _make_features("hurdle_step", angles={
+            "step_hip_flexion_deg": 50.0, "stance_knee_angle_deg": 140.0,
+        }, alignments={"trunk_stable": False, "stance_knee_extended": False})
+        r = score_hurdle_step(f)
+        assert r.score == 1
+    def test_never_scores_zero(self):
+        f = _make_features("hurdle_step", angles={
+            "step_hip_flexion_deg": 30.0,
+        }, alignments={"trunk_stable": False, "stance_knee_extended": False})
+        r = score_hurdle_step(f)
+        assert r.score >= 1
+# ─── Inline Lunge ─────────────────────────────────────────────────────────────
+class TestInlineLunge:
+    def test_score_3_deep_and_aligned(self):
+        f = _make_features("inline_lunge", angles={
+            "front_knee_flexion_deg": 85.0, "trunk_lean_from_vertical_deg": 5.0,
+        }, alignments={"trunk_upright": True, "knee_over_ankle": True})
+        r = score_inline_lunge(f)
+        assert r.score == 3
+    def test_score_1_shallow(self):
+        f = _make_features("inline_lunge", angles={
+            "front_knee_flexion_deg": 140.0,
+        }, alignments={"trunk_upright": False, "knee_over_ankle": False})
+        r = score_inline_lunge(f)
+        assert r.score == 1
+# ─── Shoulder Mobility ────────────────────────────────────────────────────────
+class TestShoulderMobility:
+    def test_score_3_close(self):
+        f = _make_features("shoulder_mobility", angles={
+            "inter_fist_normalized": 0.25,
+        }, alignments={"fists_within_one_hand": True, "fists_within_1_5_hand": True})
+        r = score_shoulder_mobility(f)
+        assert r.score == 3
+    def test_score_2_moderate(self):
+        f = _make_features("shoulder_mobility", angles={
+            "inter_fist_normalized": 0.45,
+        }, alignments={"fists_within_one_hand": False, "fists_within_1_5_hand": True})
+        r = score_shoulder_mobility(f)
+        assert r.score == 2
+    def test_score_1_far(self):
+        f = _make_features("shoulder_mobility", angles={
+            "inter_fist_normalized": 0.8,
+        }, alignments={"fists_within_one_hand": False, "fists_within_1_5_hand": False})
+        r = score_shoulder_mobility(f)
+        assert r.score == 1
+# ─── Active SLR ───────────────────────────────────────────────────────────────
+class TestActiveSLR:
+    def test_score_3_high_raise(self):
+        f = _make_features("active_slr", angles={
+            "raised_leg_angle_deg": 80.0,
+        }, alignments={"past_contralateral_knee": True, "past_mid_thigh": True, "down_leg_flat": True})
+        r = score_active_slr(f)
+        assert r.score == 3
+    def test_score_2_moderate_raise(self):
+        f = _make_features("active_slr", angles={
+            "raised_leg_angle_deg": 55.0,
+        }, alignments={"past_contralateral_knee": False, "past_mid_thigh": True, "down_leg_flat": True})
+        r = score_active_slr(f)
+        assert r.score == 2
+    def test_score_1_low_raise(self):
+        f = _make_features("active_slr", angles={
+            "raised_leg_angle_deg": 30.0,
+        }, alignments={"past_contralateral_knee": False, "past_mid_thigh": False, "down_leg_flat": True})
+        r = score_active_slr(f)
+        assert r.score == 1
+# ─── Trunk Stability Push-Up ─────────────────────────────────────────────────
+class TestTrunkStabilityPushup:
+    def test_score_3_rigid_hands_high(self):
+        f = _make_features("trunk_stability_pushup", angles={
+            "max_sag_px": 10.0, "trunk_variance_px": 5.0,
+        }, alignments={"body_rigid": True, "no_sag": True, "hands_at_forehead": True})
+        r = score_trunk_stability_pushup(f)
+        assert r.score == 3
+    def test_score_1_sag(self):
+        f = _make_features("trunk_stability_pushup", angles={
+            "max_sag_px": 50.0, "trunk_variance_px": 25.0,
+        }, alignments={"body_rigid": False, "no_sag": False, "hands_at_forehead": True})
+        r = score_trunk_stability_pushup(f)
+        assert r.score == 1
+# ─── Rotary Stability ────────────────────────────────────────────────────────
+class TestRotaryStability:
+    def test_score_2_stable(self):
+        f = _make_features("rotary_stability", angles={
+            "trunk_stability_std_px": 8.0, "shoulder_level_diff_px": 10.0, "hip_level_diff_px": 12.0,
+        }, alignments={"trunk_stable": True, "shoulders_level": True, "hips_level": True})
+        r = score_rotary_stability(f)
+        assert r.score == 2  # Default to 2 (contralateral assumption)
+    def test_score_1_unstable(self):
+        f = _make_features("rotary_stability", angles={
+            "trunk_stability_std_px": 30.0, "shoulder_level_diff_px": 35.0, "hip_level_diff_px": 30.0,
+        }, alignments={"trunk_stable": False, "shoulders_level": False, "hips_level": False})
+        r = score_rotary_stability(f)
+        assert r.score == 1
+# ─── JudgeAgent fallback ─────────────────────────────────────────────────────
+class TestJudgeAgent:
+    def test_fallback_when_judge_disabled(self, monkeypatch):
+        """When ENABLE_JUDGE=False, judge promotes rubric score."""
+        from formscout import config
+        monkeypatch.setattr(config, "ENABLE_JUDGE", False)
+        agent = JudgeAgent()
+        features = _make_features("deep_squat", angles={"left_femur_from_horizontal_deg": 70.0})
+        rubric = ScoreResult(score=3, rationale="all good", confidence=0.9)
+        movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
+        result = agent.run(features, rubric, movement)
+        assert isinstance(result, JudgeResult)
+        assert result.score == 3
+        assert "[rubric-only]" in result.rationale
+    def test_fallback_when_server_unavailable(self, monkeypatch):
+        """ENABLE_JUDGE=True but llama-server down → rubric fallback, never a crash."""
+        from unittest.mock import PropertyMock, patch
+        from formscout import config
+        monkeypatch.setattr(config, "ENABLE_JUDGE", True)
+        agent = JudgeAgent()
+        with patch.object(type(agent._client), "available", new_callable=PropertyMock, return_value=False):
+            features = _make_features("deep_squat")
+            rubric = ScoreResult(score=2, rationale="heels up", confidence=0.8)
+            movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
+            result = agent.run(features, rubric, movement)
+        assert result.score == 2
+        assert "[rubric-only]" in result.rationale
+    def test_vlm_response_parsed_into_judge_result(self, monkeypatch):
+        """ENABLE_JUDGE=True with live client → VLM JSON becomes JudgeResult."""
+        from unittest.mock import PropertyMock, patch
+        from formscout import config
+        monkeypatch.setattr(config, "ENABLE_JUDGE", True)
+        agent = JudgeAgent()
+        vlm_json = {
+            "test": "deep_squat", "side": "na", "score": 2, "needs_human": False,
+            "rationale": "Femur 5° above horizontal; 2D estimate.",
+            "compensation_tags": ["forward_lean"], "corrective_hint": "Sit back into heels.",
+            "confidence": 0.78,
+        }
+        with patch.object(type(agent._client), "available", new_callable=PropertyMock, return_value=True), \
+             patch.object(agent._client, "complete", return_value=vlm_json):
+            features = _make_features("deep_squat")
+            rubric = ScoreResult(score=2, rationale="ok", confidence=0.8)
+            movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
+            result = agent.run(features, rubric, movement)
+        assert result.score == 2
+        assert result.compensation_tags == ["forward_lean"]
+        assert result.needs_human is False
+    def test_vlm_needs_human_yields_no_score(self, monkeypatch):
+        """needs_human=True from the VLM must produce score=None."""
+        from unittest.mock import PropertyMock, patch
+        from formscout import config
+        monkeypatch.setattr(config, "ENABLE_JUDGE", True)
+        agent = JudgeAgent()
+        vlm_json = {"score": 1, "needs_human": True, "rationale": "Possible pain.", "confidence": 0.9}
+        with patch.object(type(agent._client), "available", new_callable=PropertyMock, return_value=True), \
+             patch.object(agent._client, "complete", return_value=vlm_json):
+            result = agent.run(
+                _make_features("deep_squat"),
+                ScoreResult(score=1, rationale="x", confidence=0.5),
+                MovementResult(test_name="deep_squat", side="na", confidence=1.0),
+            )
+        assert result.needs_human is True
+        assert result.score is None
+# ─── LlamaCppClient (chat-completions endpoint) ──────────────────────────────
+class TestLlamaCppClient:
+    def test_parse_plain_json(self):
+        from formscout.serving.llama_cpp import LlamaCppClient
+        assert LlamaCppClient._parse_json_reply('{"score": 3}') == {"score": 3}
+    def test_parse_fenced_json(self):
+        from formscout.serving.llama_cpp import LlamaCppClient
+        fenced = '```json\n{"score": 2, "needs_human": false}\n```'
+        assert LlamaCppClient._parse_json_reply(fenced) == {"score": 2, "needs_human": False}
+    def test_parse_non_json_returns_text(self):
+        from formscout.serving.llama_cpp import LlamaCppClient
+        assert LlamaCppClient._parse_json_reply("not json") == {"text": "not json"}
+    def test_complete_posts_chat_endpoint_with_images(self):
+        from unittest.mock import MagicMock, patch
+        from formscout.serving.llama_cpp import LlamaCppClient
+        client = LlamaCppClient(port=8080)
+        resp = MagicMock()
+        resp.json.return_value = {"choices": [{"message": {"content": '{"ok": true}'}}]}
+        resp.raise_for_status.return_value = None
+        with patch("formscout.serving.llama_cpp.requests.post", return_value=resp) as mock_post:
+            result = client.complete("score this", images=["aGVsbG8=" * 600])
+        assert result == {"ok": True}
+        url = mock_post.call_args.args[0] if mock_post.call_args.args else mock_post.call_args.kwargs.get("url")
+        assert url.endswith("/v1/chat/completions")
+        payload = mock_post.call_args.kwargs["json"]
+        content = payload["messages"][0]["content"]
+        assert content[0] == {"type": "text", "text": "score this"}
+        assert content[1]["type"] == "image_url"
+        assert content[1]["image_url"]["url"].startswith("data:image/jpeg;base64,")
+    def test_complete_connection_error_returns_safe_dict(self):
+        from unittest.mock import patch
+        import requests as _requests
+        from formscout.serving.llama_cpp import LlamaCppClient
+        client = LlamaCppClient(port=8080)
+        with patch("formscout.serving.llama_cpp.requests.post", side_effect=_requests.ConnectionError):
+            result = client.complete("hello")
+        assert "error" in result
+# ─── ReportAgent ──────────────────────────────────────────────────────────────
+class TestReportAgent:
+    def test_single_test_report(self):
+        agent = ReportAgent()
+        entries = [{
+            "movement": MovementResult(test_name="deep_squat", side="na", confidence=1.0),
+            "features": _make_features("deep_squat"),
+            "rubric_score": ScoreResult(score=3, rationale="ok", confidence=0.9),
+            "judge": JudgeResult(
+                score=3, rationale="good", compensation_tags=[], corrective_hint="",
+                confidence=0.9,
+            ),
+            "side": "na",
+        }]
+        result = agent.run(entries)
+        assert isinstance(result, ReportResult)
+        assert len(result.per_test) == 1
+        assert result.per_test[0]["score"] == 3
+    def test_bilateral_reports_lower_score(self):
+        agent = ReportAgent()
+        entries = [
+            {
+                "movement": MovementResult(test_name="hurdle_step", side="left", confidence=1.0),
+                "features": _make_features("hurdle_step", side="left"),
+                "rubric_score": ScoreResult(score=3, rationale="ok", confidence=0.9),
+                "judge": JudgeResult(
+                    score=3, rationale="", compensation_tags=[], corrective_hint="",
+                    confidence=0.9,
+                ),
+                "side": "left",
+            },
+            {
+                "movement": MovementResult(test_name="hurdle_step", side="right", confidence=1.0),
+                "features": _make_features("hurdle_step", side="right"),
+                "rubric_score": ScoreResult(score=2, rationale="comp", confidence=0.8),
+                "judge": JudgeResult(
+                    score=2, rationale="", compensation_tags=[], corrective_hint="",
+                    confidence=0.8,
+                ),
+                "side": "right",
+            },
+        ]
+        result = agent.run(entries)
+        assert result.per_test[0]["score"] == 2  # lower of 3 and 2
+        assert len(result.asymmetries) == 1
+        assert result.asymmetries[0]["delta"] == 1
+    def test_composite_none_when_unscored(self):
+        agent = ReportAgent()
+        entries = [{
+            "movement": MovementResult(test_name="deep_squat", side="na", confidence=1.0),
+            "features": _make_features("deep_squat"),
+            "rubric_score": ScoreResult(score=1, rationale="", confidence=0.5),
+            "judge": JudgeResult(
+                score=None, rationale="pain", compensation_tags=[], corrective_hint="",
+                confidence=0.0, needs_human=True,
+            ),
+            "side": "na",
+        }]
+        result = agent.run(entries)
+        assert result.composite is None

tests/test_session.py CHANGED

@@ -1,94 +1,94 @@
-"""Tests for the FMS session accumulator — no GPU, no model downloads."""
-import numpy as np
-from formscout.types import (
-    IngestResult, Pose2DResult, BiomechFeatures, ScoreResult, JudgeResult,
-    MovementResult, SessionEntry,
-)
-def test_session_entry_holds_typed_objects():
-    movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
-    features = BiomechFeatures(
-        test_name="deep_squat", view="2d", side="na",
-        angles={"left_knee_flexion_deg": 95.0}, alignments={"knees_tracking_over_feet": True},
-        symmetry_delta=None, timing={"deepest_frame": 2}, confidence=0.9,
-    )
-    rubric = ScoreResult(score=2, rationale="ok", confidence=0.8)
-    judge = JudgeResult(score=2, rationale="ok", compensation_tags=["heels elevated"],
-                        corrective_hint="ankle mobility", confidence=0.85)
-    entry = SessionEntry(
-        test_name="deep_squat", side="na", score=2, needs_human=False,
-        rationale="ok", compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
-        measurements={"left_knee_flexion_deg": 95.0}, confidence=0.85, view="2d",
-        keyframe_path=None, movement=movement, features=features,
-        rubric_score=rubric, judge=judge,
-    )
-    assert entry.score == 2
-    assert entry.movement.test_name == "deep_squat"
-    assert entry.rubric_score.score == 2
-    assert entry.judge.compensation_tags == ["heels elevated"]
-def _ingest(n=5, h=480, w=640):
-    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
-    return IngestResult(frames=frames, fps=30.0, duration=n / 30.0, n_people=1, width=w, height=h)
-def _pose(n=5):
-    kps = []
-    for i in range(n):
-        kps.append({j: {"x": float(50 + j * 25), "y": float(80 + j * 18), "conf": 0.9}
-                    for j in range(17)})
-    return Pose2DResult(keypoints=kps, fps=30.0, confidence=0.9)
-def _features(test_name="deep_squat", side="na", frame_key="deepest_frame"):
-    return BiomechFeatures(
-        test_name=test_name, view="2d", side=side,
-        angles={"left_knee_flexion_deg": 95.0},
-        alignments={"knees_tracking_over_feet": False},
-        symmetry_delta=None, timing={frame_key: 2}, confidence=0.9,
-    )
-def _judge(score=2, needs_human=False):
-    return JudgeResult(
-        score=None if needs_human else score, rationale="r",
-        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
-        confidence=0.85, needs_human=needs_human,
-    )
-def test_add_analysis_appends_entry_and_writes_files():
-    import os
-    from formscout import session as S
-    sess = S.new_session()
-    entry = S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(),
-                           features=_features(), judge=_judge(), test_name="deep_squat", side="na")
-    assert len(sess.entries) == 1
-    assert entry.score == 2
-    assert os.path.exists(os.path.join(sess.session_dir, "session.json"))
-    assert os.path.exists(os.path.join(sess.session_dir, "analysis.md"))
-    # key-frame still written (deepest_frame=2 is valid)
-    assert entry.keyframe_path and os.path.exists(entry.keyframe_path)
-def test_finish_composite_null_when_needs_human():
-    from formscout import session as S
-    sess = S.new_session()
-    S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(), features=_features(),
-                   judge=_judge(score=3), test_name="deep_squat", side="na")
-    S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(),
-                   features=_features("trunk_stability_pushup", frame_key="max_sag_frame"),
-                   judge=_judge(needs_human=True), test_name="trunk_stability_pushup", side="na")
-    report, pdf_path = S.finish_session(sess)
-    assert report is not None
-    assert report.composite is None  # one test needs_human
-def test_finish_empty_session_returns_none():
-    from formscout import session as S
-    sess = S.new_session()
-    report, pdf_path = S.finish_session(sess)
-    assert report is None and pdf_path is None

+"""Tests for the FMS session accumulator — no GPU, no model downloads."""
+import numpy as np
+from formscout.types import (
+    IngestResult, Pose2DResult, BiomechFeatures, ScoreResult, JudgeResult,
+    MovementResult, SessionEntry,
+)
+def test_session_entry_holds_typed_objects():
+    movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
+    features = BiomechFeatures(
+        test_name="deep_squat", view="2d", side="na",
+        angles={"left_knee_flexion_deg": 95.0}, alignments={"knees_tracking_over_feet": True},
+        symmetry_delta=None, timing={"deepest_frame": 2}, confidence=0.9,
+    )
+    rubric = ScoreResult(score=2, rationale="ok", confidence=0.8)
+    judge = JudgeResult(score=2, rationale="ok", compensation_tags=["heels elevated"],
+                        corrective_hint="ankle mobility", confidence=0.85)
+    entry = SessionEntry(
+        test_name="deep_squat", side="na", score=2, needs_human=False,
+        rationale="ok", compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
+        measurements={"left_knee_flexion_deg": 95.0}, confidence=0.85, view="2d",
+        keyframe_path=None, movement=movement, features=features,
+        rubric_score=rubric, judge=judge,
+    )
+    assert entry.score == 2
+    assert entry.movement.test_name == "deep_squat"
+    assert entry.rubric_score.score == 2
+    assert entry.judge.compensation_tags == ["heels elevated"]
+def _ingest(n=5, h=480, w=640):
+    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
+    return IngestResult(frames=frames, fps=30.0, duration=n / 30.0, n_people=1, width=w, height=h)
+def _pose(n=5):
+    kps = []
+    for i in range(n):
+        kps.append({j: {"x": float(50 + j * 25), "y": float(80 + j * 18), "conf": 0.9}
+                    for j in range(17)})
+    return Pose2DResult(keypoints=kps, fps=30.0, confidence=0.9)
+def _features(test_name="deep_squat", side="na", frame_key="deepest_frame"):
+    return BiomechFeatures(
+        test_name=test_name, view="2d", side=side,
+        angles={"left_knee_flexion_deg": 95.0},
+        alignments={"knees_tracking_over_feet": False},
+        symmetry_delta=None, timing={frame_key: 2}, confidence=0.9,
+    )
+def _judge(score=2, needs_human=False):
+    return JudgeResult(
+        score=None if needs_human else score, rationale="r",
+        compensation_tags=["heels elevated"], corrective_hint="ankle mobility",
+        confidence=0.85, needs_human=needs_human,
+    )
+def test_add_analysis_appends_entry_and_writes_files():
+    import os
+    from formscout import session as S
+    sess = S.new_session()
+    entry = S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(),
+                           features=_features(), judge=_judge(), test_name="deep_squat", side="na")
+    assert len(sess.entries) == 1
+    assert entry.score == 2
+    assert os.path.exists(os.path.join(sess.session_dir, "session.json"))
+    assert os.path.exists(os.path.join(sess.session_dir, "analysis.md"))
+    # key-frame still written (deepest_frame=2 is valid)
+    assert entry.keyframe_path and os.path.exists(entry.keyframe_path)
+def test_finish_composite_null_when_needs_human():
+    from formscout import session as S
+    sess = S.new_session()
+    S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(), features=_features(),
+                   judge=_judge(score=3), test_name="deep_squat", side="na")
+    S.add_analysis(sess, ingest=_ingest(), pose2d=_pose(),
+                   features=_features("trunk_stability_pushup", frame_key="max_sag_frame"),
+                   judge=_judge(needs_human=True), test_name="trunk_stability_pushup", side="na")
+    report, pdf_path = S.finish_session(sess)
+    assert report is not None
+    assert report.composite is None  # one test needs_human
+def test_finish_empty_session_returns_none():
+    from formscout import session as S
+    sess = S.new_session()
+    report, pdf_path = S.finish_session(sess)
+    assert report is None and pdf_path is None

tests/test_visualizer.py CHANGED

@@ -1,176 +1,176 @@
-"""Tests for PoseVisualizer — no GPU, no model downloads."""
-import numpy as np
-import pytest
-from formscout.types import IngestResult, Pose2DResult
-def _make_ingest(n=5, h=480, w=640, fps=30.0):
-    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
-    return IngestResult(frames=frames, fps=fps, duration=n / fps, n_people=1, width=w, height=h)
-def _make_pose(n=5, w=640, h=480):
-    """Synthetic Pose2DResult: 17 joints at fixed pixel positions, conf=0.9."""
-    kps_per_frame = []
-    for i in range(n):
-        frame_kps = {}
-        for j in range(17):
-            frame_kps[j] = {
-                "x": float(50 + j * 30 + i * 2),
-                "y": float(100 + j * 20),
-                "conf": 0.9,
-            }
-        kps_per_frame.append(frame_kps)
-    return Pose2DResult(keypoints=kps_per_frame, fps=30.0, confidence=0.9, notes="")
-class TestComputeJointVelocity:
-    def test_returns_17_joints(self):
-        from formscout.agents.visualizer import compute_joint_velocity
-        pose = _make_pose(n=5)
-        result = compute_joint_velocity(pose.keypoints, fps=30.0)
-        assert len(result) == 17
-    def test_each_list_has_n_frames(self):
-        from formscout.agents.visualizer import compute_joint_velocity
-        pose = _make_pose(n=5)
-        result = compute_joint_velocity(pose.keypoints, fps=30.0)
-        for joint_idx, speeds in result.items():
-            assert len(speeds) == 5, f"joint {joint_idx} has {len(speeds)} speeds, expected 5"
-    def test_speeds_are_non_negative(self):
-        from formscout.agents.visualizer import compute_joint_velocity
-        pose = _make_pose(n=5)
-        result = compute_joint_velocity(pose.keypoints, fps=30.0)
-        for speeds in result.values():
-            assert all(s >= 0.0 for s in speeds)
-    def test_missing_keypoints_give_zero_speed(self):
-        from formscout.agents.visualizer import compute_joint_velocity
-        empty_kps = [{} for _ in range(5)]
-        result = compute_joint_velocity(empty_kps, fps=30.0)
-        for speeds in result.values():
-            assert all(s == 0.0 for s in speeds)
-class TestDrawSkeleton:
-    def test_skeleton_draws_without_error(self):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        kps = {j: {"x": float(50 + j * 30), "y": float(100 + j * 20), "conf": 0.9}
-               for j in range(17)}
-        result = vis._draw_skeleton(frame.copy(), kps)
-        assert result.shape == frame.shape
-        assert not np.array_equal(result, frame)
-    def test_low_confidence_keypoints_not_drawn(self):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        kps = {j: {"x": float(50 + j * 30), "y": 100.0, "conf": 0.1} for j in range(17)}
-        result = vis._draw_skeleton(frame.copy(), kps)
-        assert np.array_equal(result, frame)
-class TestDrawTrails:
-    def test_trails_draw_without_error(self):
-        from formscout.agents.visualizer import PoseVisualizer, TRAIL_LENGTH
-        from collections import deque
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        trail_history = {
-            0: deque([(100 + i * 5, 200 + i * 3) for i in range(5)], maxlen=TRAIL_LENGTH)
-        }
-        result = vis._draw_trails(frame.copy(), trail_history)
-        assert result.shape == frame.shape
-        assert not np.array_equal(result, frame)
-    def test_short_trail_no_crash(self):
-        from formscout.agents.visualizer import PoseVisualizer, TRAIL_LENGTH
-        from collections import deque
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        trail_history = {0: deque([(100, 200)], maxlen=TRAIL_LENGTH)}
-        result = vis._draw_trails(frame.copy(), trail_history)
-        assert np.array_equal(result, frame)
-class TestDrawVelocityArrows:
-    def test_arrows_draw_without_error(self):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        kps = {j: {"x": float(50 + j * 30), "y": float(100 + j * 20), "conf": 0.9}
-               for j in range(17)}
-        prev_kps = {j: {"x": float(48 + j * 30), "y": float(98 + j * 20), "conf": 0.9}
-                    for j in range(17)}
-        velocities = {j: [0.0] * 5 for j in range(17)}
-        velocities[5] = [0.0, 10.0, 50.0, 80.0, 120.0]
-        result = vis._draw_velocity_arrows(frame.copy(), kps, prev_kps, velocities, frame_idx=4)
-        assert result.shape == frame.shape
-    def test_no_prev_kps_no_crash(self):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        frame = np.zeros((480, 640, 3), dtype=np.uint8)
-        kps = {j: {"x": float(50 + j * 30), "y": 100.0, "conf": 0.9} for j in range(17)}
-        velocities = {j: [50.0] * 5 for j in range(17)}
-        result = vis._draw_velocity_arrows(frame.copy(), kps, None, velocities, frame_idx=0)
-        assert result.shape == frame.shape
-class TestRenderVideo:
-    def test_creates_mp4_file(self, tmp_path):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        ingest = _make_ingest(n=5)
-        pose = _make_pose(n=5)
-        out = str(tmp_path / "out.mp4")
-        result = vis.render_video(ingest, pose, {"skeleton"}, out)
-        assert result is not None
-        import os
-        assert os.path.exists(result)
-        assert os.path.getsize(result) > 0
-    def test_empty_layers_returns_none(self, tmp_path):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        out = str(tmp_path / "out.mp4")
-        result = vis.render_video(_make_ingest(), _make_pose(), set(), out)
-        assert result is None
-    def test_no_detections_returns_none(self, tmp_path):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        ingest = _make_ingest(n=5)
-        empty_pose = Pose2DResult(
-            keypoints=[{} for _ in range(5)], fps=30.0, confidence=0.0, notes=""
-        )
-        out = str(tmp_path / "out.mp4")
-        result = vis.render_video(ingest, empty_pose, {"skeleton"}, out)
-        assert result is None
-    def test_last_velocities_set_after_render(self, tmp_path):
-        from formscout.agents.visualizer import PoseVisualizer
-        vis = PoseVisualizer()
-        out = str(tmp_path / "out.mp4")
-        vis.render_video(_make_ingest(n=5), _make_pose(n=5), {"skeleton"}, out)
-        assert len(vis.last_velocities) == 17
-class TestBuildVelocitySummary:
-    def test_returns_markdown_table(self):
-        from formscout.agents.visualizer import build_velocity_summary, compute_joint_velocity
-        pose = _make_pose(n=10)
-        vels = compute_joint_velocity(pose.keypoints, fps=30.0)
-        result = build_velocity_summary(pose.keypoints, vels)
-        assert "|" in result
-        assert any(name in result for name in ["knee", "shoulder", "hip", "ankle"])
-    def test_empty_keypoints_returns_empty_string(self):
-        from formscout.agents.visualizer import build_velocity_summary
-        empty_kps = [{} for _ in range(5)]
-        vels = {j: [0.0] * 5 for j in range(17)}
-        result = build_velocity_summary(empty_kps, vels)
-        assert result == ""

+"""Tests for PoseVisualizer — no GPU, no model downloads."""
+import numpy as np
+import pytest
+from formscout.types import IngestResult, Pose2DResult
+def _make_ingest(n=5, h=480, w=640, fps=30.0):
+    frames = [np.zeros((h, w, 3), dtype=np.uint8) for _ in range(n)]
+    return IngestResult(frames=frames, fps=fps, duration=n / fps, n_people=1, width=w, height=h)
+def _make_pose(n=5, w=640, h=480):
+    """Synthetic Pose2DResult: 17 joints at fixed pixel positions, conf=0.9."""
+    kps_per_frame = []
+    for i in range(n):
+        frame_kps = {}
+        for j in range(17):
+            frame_kps[j] = {
+                "x": float(50 + j * 30 + i * 2),
+                "y": float(100 + j * 20),
+                "conf": 0.9,
+            }
+        kps_per_frame.append(frame_kps)
+    return Pose2DResult(keypoints=kps_per_frame, fps=30.0, confidence=0.9, notes="")
+class TestComputeJointVelocity:
+    def test_returns_17_joints(self):
+        from formscout.agents.visualizer import compute_joint_velocity
+        pose = _make_pose(n=5)
+        result = compute_joint_velocity(pose.keypoints, fps=30.0)
+        assert len(result) == 17
+    def test_each_list_has_n_frames(self):
+        from formscout.agents.visualizer import compute_joint_velocity
+        pose = _make_pose(n=5)
+        result = compute_joint_velocity(pose.keypoints, fps=30.0)
+        for joint_idx, speeds in result.items():
+            assert len(speeds) == 5, f"joint {joint_idx} has {len(speeds)} speeds, expected 5"
+    def test_speeds_are_non_negative(self):
+        from formscout.agents.visualizer import compute_joint_velocity
+        pose = _make_pose(n=5)
+        result = compute_joint_velocity(pose.keypoints, fps=30.0)
+        for speeds in result.values():
+            assert all(s >= 0.0 for s in speeds)
+    def test_missing_keypoints_give_zero_speed(self):
+        from formscout.agents.visualizer import compute_joint_velocity
+        empty_kps = [{} for _ in range(5)]
+        result = compute_joint_velocity(empty_kps, fps=30.0)
+        for speeds in result.values():
+            assert all(s == 0.0 for s in speeds)
+class TestDrawSkeleton:
+    def test_skeleton_draws_without_error(self):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        kps = {j: {"x": float(50 + j * 30), "y": float(100 + j * 20), "conf": 0.9}
+               for j in range(17)}
+        result = vis._draw_skeleton(frame.copy(), kps)
+        assert result.shape == frame.shape
+        assert not np.array_equal(result, frame)
+    def test_low_confidence_keypoints_not_drawn(self):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        kps = {j: {"x": float(50 + j * 30), "y": 100.0, "conf": 0.1} for j in range(17)}
+        result = vis._draw_skeleton(frame.copy(), kps)
+        assert np.array_equal(result, frame)
+class TestDrawTrails:
+    def test_trails_draw_without_error(self):
+        from formscout.agents.visualizer import PoseVisualizer, TRAIL_LENGTH
+        from collections import deque
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        trail_history = {
+            0: deque([(100 + i * 5, 200 + i * 3) for i in range(5)], maxlen=TRAIL_LENGTH)
+        }
+        result = vis._draw_trails(frame.copy(), trail_history)
+        assert result.shape == frame.shape
+        assert not np.array_equal(result, frame)
+    def test_short_trail_no_crash(self):
+        from formscout.agents.visualizer import PoseVisualizer, TRAIL_LENGTH
+        from collections import deque
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        trail_history = {0: deque([(100, 200)], maxlen=TRAIL_LENGTH)}
+        result = vis._draw_trails(frame.copy(), trail_history)
+        assert np.array_equal(result, frame)
+class TestDrawVelocityArrows:
+    def test_arrows_draw_without_error(self):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        kps = {j: {"x": float(50 + j * 30), "y": float(100 + j * 20), "conf": 0.9}
+               for j in range(17)}
+        prev_kps = {j: {"x": float(48 + j * 30), "y": float(98 + j * 20), "conf": 0.9}
+                    for j in range(17)}
+        velocities = {j: [0.0] * 5 for j in range(17)}
+        velocities[5] = [0.0, 10.0, 50.0, 80.0, 120.0]
+        result = vis._draw_velocity_arrows(frame.copy(), kps, prev_kps, velocities, frame_idx=4)
+        assert result.shape == frame.shape
+    def test_no_prev_kps_no_crash(self):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        frame = np.zeros((480, 640, 3), dtype=np.uint8)
+        kps = {j: {"x": float(50 + j * 30), "y": 100.0, "conf": 0.9} for j in range(17)}
+        velocities = {j: [50.0] * 5 for j in range(17)}
+        result = vis._draw_velocity_arrows(frame.copy(), kps, None, velocities, frame_idx=0)
+        assert result.shape == frame.shape
+class TestRenderVideo:
+    def test_creates_mp4_file(self, tmp_path):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        ingest = _make_ingest(n=5)
+        pose = _make_pose(n=5)
+        out = str(tmp_path / "out.mp4")
+        result = vis.render_video(ingest, pose, {"skeleton"}, out)
+        assert result is not None
+        import os
+        assert os.path.exists(result)
+        assert os.path.getsize(result) > 0
+    def test_empty_layers_returns_none(self, tmp_path):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        out = str(tmp_path / "out.mp4")
+        result = vis.render_video(_make_ingest(), _make_pose(), set(), out)
+        assert result is None
+    def test_no_detections_returns_none(self, tmp_path):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        ingest = _make_ingest(n=5)
+        empty_pose = Pose2DResult(
+            keypoints=[{} for _ in range(5)], fps=30.0, confidence=0.0, notes=""
+        )
+        out = str(tmp_path / "out.mp4")
+        result = vis.render_video(ingest, empty_pose, {"skeleton"}, out)
+        assert result is None
+    def test_last_velocities_set_after_render(self, tmp_path):
+        from formscout.agents.visualizer import PoseVisualizer
+        vis = PoseVisualizer()
+        out = str(tmp_path / "out.mp4")
+        vis.render_video(_make_ingest(n=5), _make_pose(n=5), {"skeleton"}, out)
+        assert len(vis.last_velocities) == 17
+class TestBuildVelocitySummary:
+    def test_returns_markdown_table(self):
+        from formscout.agents.visualizer import build_velocity_summary, compute_joint_velocity
+        pose = _make_pose(n=10)
+        vels = compute_joint_velocity(pose.keypoints, fps=30.0)
+        result = build_velocity_summary(pose.keypoints, vels)
+        assert "|" in result
+        assert any(name in result for name in ["knee", "shoulder", "hip", "ankle"])
+    def test_empty_keypoints_returns_empty_string(self):
+        from formscout.agents.visualizer import build_velocity_summary
+        empty_kps = [{} for _ in range(5)]
+        vels = {j: [0.0] * 5 for j in range(17)}
+        result = build_velocity_summary(empty_kps, vels)
+        assert result == ""