Update Claude code and hf upload file

#4
by BladeSzaSza - opened
.gitattributes CHANGED
@@ -35,4 +35,3 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
  docs/FormScout-FMS-Spec.md.pdf filter=lfs diff=lfs merge=lfs -text
37
  docs/plans/FormScout-Build-Prompt.md.pdf filter=lfs diff=lfs merge=lfs -text
38
- checkpoints/mediapipe/pose_landmarker_full.task filter=lfs diff=lfs merge=lfs -text
 
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
  docs/FormScout-FMS-Spec.md.pdf filter=lfs diff=lfs merge=lfs -text
37
  docs/plans/FormScout-Build-Prompt.md.pdf filter=lfs diff=lfs merge=lfs -text
 
.hfignore ADDED
@@ -0,0 +1,37 @@
1
+ # Python
2
+ __pycache__/
3
+ *.py[cod]
4
+ *.egg-info/
5
+ dist/
6
+ build/
7
+ .eggs/
8
+ *.egg
9
+
10
+ # Virtual environments
11
+ .venv/
12
+ venv/
13
+ env/
14
+
15
+ # Secrets / local config
16
+ .env
17
+ .env.*
18
+
19
+ # Model weights (managed separately)
20
+ checkpoints/
21
+ *.pt
22
+ *.pth
23
+ *.gguf
24
+ *.bin
25
+
26
+ # Run artifacts
27
+ traces/
28
+ *.mp4
29
+
30
+ # Dev tooling
31
+ .pytest_cache/
32
+ .ruff_cache/
33
+ .DS_Store
34
+ .claude/
35
+
36
+ # Git
37
+ .git/
CLAUDE.md CHANGED
@@ -6,7 +6,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
6
 
7
  FormScout is a Gradio app (Hugging Face Space) that scores Functional Movement Screen (FMS) videos 0–3 per test with a written rationale and an annotated overlay. It is a **screening aid** — not a diagnosis, not an injury predictor. Built for the Build Small Hackathon (Backyard AI track). Full product spec is in `docs/FormScout-FMS-Spec.md`; the engineering contract is in `docs/plans/FormScout-Build-Prompt.md`.
8
 
9
- **Current status:** Phase 2 complete. All 7 FMS test rubric scorers, JudgeAgent, MovementClassifierAgent, and ReportAgent are implemented and tested (45/46 passing). Phase 3 is next (ST-GCN fine-tune + RAG retrieval).
10
 
11
  ## Common commands
12
 
@@ -27,6 +27,12 @@ pytest tests/test_biomechanics.py::TestBiomechanicsAgent::test_deep_squat_score
27
  # Lint / format
28
  ruff check . && ruff format .
29
 
30
  # Run Svelte component tests (when frontend work is added)
31
  npx vitest run
32
  ```
@@ -43,7 +49,9 @@ IngestAgent → Pose2DAgent → [Body3DAgent — optional]
43
  → rubric/score_test() → JudgeAgent → ReportAgent
44
  ```
45
 
46
- The **Director** (`pipeline.py`) owns the flow. `app.py` creates one `Director()` instance and calls `director.run(video_path, test_name, side)` per submission. The Gradio UI passes `test_name` directly (from dropdown), bypassing the classifier.
 
 
47
 
48
  ### The tiering rule (most important invariant)
49
 
@@ -53,7 +61,7 @@ The **Director** (`pipeline.py`) owns the flow. `app.py` creates one `Director()
53
 
54
  | Flag | Default | Meaning |
55
  |------|---------|---------|
56
- | `ENABLE_JUDGE` | `False` | When False, JudgeAgent falls back to rubric score — no llama.cpp needed |
57
  | `ENABLE_3D` | `False` | When False, Body3DAgent returns `used=False` immediately |
58
  | `ENABLE_STGCN` | `False` | Phase 3 — ST-GCN learned scoring head |
59
  | `ENABLE_RAG` | `False` | Phase 3 — RetrievalAgent exemplar lookup |
@@ -66,6 +74,8 @@ All model IDs, thresholds, k-values, and feature flags live in `config.py` — n
66
  2. `ENABLE_JUDGE=True` + llama.cpp server unreachable → same fallback, logs a warning
67
  3. `ENABLE_JUDGE=True` + server available → calls Qwen3-VL-8B-Instruct at `127.0.0.1:8080`
68
 
 
 
69
  This means the app is **fully functional without any GPU or llama.cpp** — rubric scoring is pure Python.
70
 
71
  ### Rubric scorers
@@ -99,14 +109,20 @@ Every agent I/O is a frozen dataclass from `formscout/types.py`. Key types:
99
 
100
  `MovementResult` and `JudgeResult` validate their fields in `__post_init__` — passing invalid values raises immediately.
101
 
102
- ### YOLO checkpoint location
103
 
104
- `config.YOLO_POSE_MODEL` points to `checkpoints/yolo26/yolo26l-pose.pt` (absolute path). Both `yolo26l-pose.pt` and `yolo26x-pose.pt` are committed to the repo. Models load once at module scope via `_get_model()` in `pose2d.py`.
 
 
105
 
106
  ### llama.cpp serving
107
 
108
  `formscout/serving/llama_cpp.py` provides `LlamaCppClient` (VLM, port 8080) and `EmbeddingClient` (embeddings, port 8081). Both check `/health` before use and return safe error dicts when unavailable. Only active when the corresponding `ENABLE_*` flag is True.
109
 
 
 
 
 
110
  ## Key constraints and invariants
111
 
112
  - **No cloud model APIs.** All inference runs on-Space (ZeroGPU). No OpenAI/Anthropic/Gemini calls.
@@ -132,13 +148,13 @@ Every agent I/O is a frozen dataclass from `formscout/types.py`. Key types:
132
 
133
  | Component | Model | Params | Status |
134
  |---|---|---|---|
135
- | 2D pose (primary) | YOLO26l-Pose | 0.026B | Ready (checkpoint committed) |
136
- | 2D pose (HQ alt) | YOLO26x-Pose | 0.058B | Ready (checkpoint committed) |
137
- | 2D pose (fallback) | `noahcao/sapiens-pose-coco` | ~0.6B | Access accepted |
138
  | Segmentation | SAM 3.1 base | ~0.85B | Access accepted |
139
  | 3D biomechanics | `facebook/sam-3d-body-dinov3` | ~0.84B | **Access ACCEPTED Jun 4 2026** |
140
  | Learned scoring | ST-GCN (pyskl) | ~0.03B | Phase 3 |
141
- | Judge + Classifier | Qwen3-VL-8B-Instruct (llama.cpp) | 8B | Ready (ENABLE_JUDGE=False for now) |
142
  | Retrieval | Qwen3-VL-Embedding-8B (llama.cpp) | 8B | Phase 3 |
143
 
144
  Track the running sum in `MODEL_BUDGET.md`. The two Qwen3-VL-8B models share a backbone.
@@ -156,7 +172,7 @@ The UI uses **Gradio `gr.Blocks`** with custom CSS/theme (`formscout/ui/theme.py
156
  2. **Phase 1 — Spine:** ✅ Complete. Deep Squat end-to-end.
157
  3. **Phase 2 — All 7 tests:** ✅ Complete. Classifier, Judge, Report agents; all rubric scorers; Gradio UI.
158
  4. **Phase 3 — Learned scoring + retrieval:** ST-GCN fine-tune on physio clips, publish to Hub. RetrievalAgent with embedding index.
159
- 5. **Phase 4 — Polish + ship:** Custom Svelte UI components, overlay video, PDF export, agent trace to Hub, blog post.
160
 
161
  ## Known issues
162
 
 
6
 
7
  FormScout is a Gradio app (Hugging Face Space) that scores Functional Movement Screen (FMS) videos 0–3 per test with a written rationale and an annotated overlay. It is a **screening aid** — not a diagnosis, not an injury predictor. Built for the Build Small Hackathon (Backyard AI track). Full product spec is in `docs/FormScout-FMS-Spec.md`; the engineering contract is in `docs/plans/FormScout-Build-Prompt.md`.
8
 
9
+ **Current status:** Phase 2 complete. All 7 FMS test rubric scorers, JudgeAgent, MovementClassifierAgent, ReportAgent, PoseVisualizer (overlay video), and a user-selectable pose-model registry are implemented and tested (86/87 passing). Phase 3 is next (ST-GCN fine-tune + RAG retrieval).
10
 
11
  ## Common commands
12
 
 
27
  # Lint / format
28
  ruff check . && ruff format .
29
 
30
+ # Start the local VLM judge server (llama.cpp, port 8080)
31
+ ./scripts/serve_judge.sh
32
+
33
+ # Push source tree to the HF model repo + Space (PRs; message from last commit)
34
+ ./scripts/hf_upload.sh
35
+
36
  # Run Svelte component tests (when frontend work is added)
37
  npx vitest run
38
  ```
 
49
  → rubric/score_test() → JudgeAgent → ReportAgent
50
  ```
51
 
52
+ The **Director** (`pipeline.py`) owns the flow. `app.py` creates one `Director()` instance and calls `director.run(video_path, test_name, side, model_key)` per submission. The Gradio UI passes `test_name` directly (from dropdown), bypassing the classifier; `model_key` selects the pose backend from `config.POSE_MODELS`.
53
+
54
+ `PoseVisualizer` (`formscout/agents/visualizer.py`) renders the annotated overlay video (skeleton, trails, velocity arrows) from `IngestResult` + `Pose2DResult`. It is called from `app.py` after the pipeline run — it is a UI-layer component, not a Director stage. It returns `None` on failure, never raises.
55
 
56
  ### The tiering rule (most important invariant)
57
 
 
61
 
62
  | Flag | Default | Meaning |
63
  |------|---------|---------|
64
+ | `ENABLE_JUDGE` | `True` | Judge/Classifier call Qwen3-VL via llama-server; graceful rubric fallback when the server is down |
65
  | `ENABLE_3D` | `False` | When False, Body3DAgent returns `used=False` immediately |
66
  | `ENABLE_STGCN` | `False` | Phase 3 — ST-GCN learned scoring head |
67
  | `ENABLE_RAG` | `False` | Phase 3 — RetrievalAgent exemplar lookup |
 
74
  2. `ENABLE_JUDGE=True` + llama.cpp server unreachable → same fallback, logs a warning
75
  3. `ENABLE_JUDGE=True` + server available → calls Qwen3-VL-8B-Instruct at `127.0.0.1:8080`
76
 
77
+ Start the VLM server with `scripts/serve_judge.sh` (downloads live in `checkpoints/qwen3-vl/`, gitignored). To use a fine-tuned GGUF, set `FORMSCOUT_JUDGE_GGUF` (and `FORMSCOUT_JUDGE_MMPROJ` if it ships its own projector) — no code change needed. Multimodal requests go through the OpenAI-compatible `/v1/chat/completions` endpoint (the legacy `/completion` + `image_data` path does not work with modern llama-server).
78
+
79
  This means the app is **fully functional without any GPU or llama.cpp** — rubric scoring is pure Python.
80
 
81
  ### Rubric scorers
 
109
 
110
  `MovementResult` and `JudgeResult` validate their fields in `__post_init__` — passing invalid values raises immediately.
111
 
112
+ ### Pose model selection and checkpoints
113
 
114
+ `config.POSE_MODELS` is a registry of pose backends: MediaPipe (CPU-friendly), five YOLO26 sizes (n/s/m/l/x), and Sapiens2 variants (Phase 3, need the custom `sapiens` repo installed). `config.DEFAULT_POSE_MODEL` is YOLO26n. The Gradio UI exposes a dropdown built from `config.available_pose_models()` (filters to checkpoints actually present) and passes the chosen `model_key` through `Director.run` to `Pose2DAgent`. `config.YOLO_POSE_MODEL` is a backward-compat alias only.
115
+
116
+ Checkpoints are **not** committed (`checkpoints/` is gitignored). `formscout/startup.py:ensure_checkpoints()` downloads missing YOLO26/MediaPipe files from the `silas-therapy/formscout-checkpoints` HF repo once at app startup. Models load once per process and are cached — never inside the inference hot path.
117
 
118
  ### llama.cpp serving
119
 
120
  `formscout/serving/llama_cpp.py` provides `LlamaCppClient` (VLM, port 8080) and `EmbeddingClient` (embeddings, port 8081). Both check `/health` before use and return safe error dicts when unavailable. Only active when the corresponding `ENABLE_*` flag is True.
121
 
122
+ ### Deploying to Hugging Face
123
+
124
+ The repo deploys to both `silas-therapy/small-functional-movement-screening` (model repo) and the Space of the same name (README frontmatter is the Space config). Use `./scripts/hf_upload.sh` — never raw `hf upload .`: the `hf` CLI does **not** read `.hfignore`, so a raw upload hashes the entire `.venv` (~44k files) and pushes torch binaries. The script parses `.hfignore` into `--exclude` globs, preflights the file count, creates PRs on both repos, and auto-switches to `hf upload-large-folder` (resumable, but no PR / no commit message) above 500 files.
125
+
126
  ## Key constraints and invariants
127
 
128
  - **No cloud model APIs.** All inference runs on-Space (ZeroGPU). No OpenAI/Anthropic/Gemini calls.
 
148
 
149
  | Component | Model | Params | Status |
150
  |---|---|---|---|
151
+ | 2D pose (primary) | YOLO26-Pose n/s/m/l/x (default: n) | 0.0007–0.058B | Ready (auto-downloaded at startup) |
152
+ | 2D pose (CPU alt) | MediaPipe Pose Landmarker (full) | ~0.004B | Ready (auto-downloaded at startup) |
153
+ | 2D pose (HQ alt) | `facebook/sapiens2-pose-0.4b/0.8b/1b/5b` | 0.4–5B | Phase 3 — needs custom `sapiens` repo |
154
  | Segmentation | SAM 3.1 base | ~0.85B | Access accepted |
155
  | 3D biomechanics | `facebook/sam-3d-body-dinov3` | ~0.84B | **Access ACCEPTED Jun 4 2026** |
156
  | Learned scoring | ST-GCN (pyskl) | ~0.03B | Phase 3 |
157
+ | Judge + Classifier | Qwen3-VL-8B-Instruct (llama.cpp) | 8B | **Online** — `scripts/serve_judge.sh`, ENABLE_JUDGE=True |
158
  | Retrieval | Qwen3-VL-Embedding-8B (llama.cpp) | 8B | Phase 3 |
159
 
160
  Track the running sum in `MODEL_BUDGET.md`. The two Qwen3-VL-8B models share a backbone.
 
172
  2. **Phase 1 — Spine:** ✅ Complete. Deep Squat end-to-end.
173
  3. **Phase 2 — All 7 tests:** ✅ Complete. Classifier, Judge, Report agents; all rubric scorers; Gradio UI.
174
  4. **Phase 3 — Learned scoring + retrieval:** ST-GCN fine-tune on physio clips, publish to Hub. RetrievalAgent with embedding index.
175
+ 5. **Phase 4 — Polish + ship:** Custom Svelte UI components, PDF export, agent trace to Hub, blog post. (Overlay video already done via `PoseVisualizer`.)
176
 
177
  ## Known issues
178
 
MODEL_BUDGET.md CHANGED
@@ -10,7 +10,7 @@ Running sum must stay ≤ 32B params.
10
  | Segmentation | SAM 3.1 base | 0.85B |
11
  | 3D Body (optional) | SAM 3D Body DINOv3-H+ | 0.84B |
12
  | Scoring Head | ST-GCN (pyskl) | 0.03B |
13
- | Judge/Classifier | Qwen3-VL-8B-Instruct | 8B |
14
  | Retrieval | Qwen3-VL-Embedding-8B | 8B |
15
  | **Total** | | **~18.37B** |
16
 
@@ -18,3 +18,7 @@ Headroom: ~13.63B under 32B cap.
18
 
19
  Note: The two Qwen3-VL-8B models share a backbone (counted separately here for safety).
20
  Only one pose backend runs at a time (YOLO or Sapiens2, not both).
 
 
 
 
 
10
  | Segmentation | SAM 3.1 base | 0.85B |
11
  | 3D Body (optional) | SAM 3D Body DINOv3-H+ | 0.84B |
12
  | Scoring Head | ST-GCN (pyskl) | 0.03B |
13
+ | Judge/Classifier | Qwen3-VL-8B-Instruct (Q4_K_M GGUF + F16 mmproj, llama.cpp) | 8B |
14
  | Retrieval | Qwen3-VL-Embedding-8B | 8B |
15
  | **Total** | | **~18.37B** |
16
 
 
18
 
19
  Note: The two Qwen3-VL-8B models share a backbone (counted separately here for safety).
20
  Only one pose backend runs at a time (YOLO or Sapiens2, not both).
21
+
22
+ Judge/Classifier serving: `scripts/serve_judge.sh` (llama-server, port 8080).
23
+ Default GGUF: `Qwen/Qwen3-VL-8B-Instruct-GGUF` → `checkpoints/qwen3-vl/` (gitignored).
24
+ Fine-tuned swap: set `FORMSCOUT_JUDGE_GGUF` (+ `FORMSCOUT_JUDGE_MMPROJ`) — no code change.
README.md CHANGED
@@ -1,51 +1,118 @@
1
- ---
2
- title: FormScout
3
- emoji: 🏔️
4
- colorFrom: green
5
- colorTo: green
6
- sdk: gradio
7
- app_file: app.py
8
- pinned: false
9
- license: apache-2.0
10
- short_description: FMS video scoring — movement screen aid
11
- ---
12
-
13
- # FormScout
14
-
15
- FMS (Functional Movement Screen) scoring pipeline — a screening aid that scores movement videos 0–3 per test with a written rationale and annotated overlay.
16
-
17
- **⚠️ Screening aid — not a diagnosis. Pain or clearing tests require a clinician.**
18
-
19
- ## Quick Start
20
-
21
- ```bash
22
- # Install dependencies
23
- pip install -r requirements.txt
24
-
25
- # Run headless on a video
26
- python -m formscout.run sample.mp4
27
-
28
- # Launch Gradio app
29
- python app.py
30
-
31
- # Run tests
32
- pytest tests/ -v
33
- ```
34
-
35
- ## Architecture
36
-
37
- Typed specialist agents orchestrated by a deterministic Director:
38
-
39
- ```
40
- Ingest → Pose2D → [Body3D optional] → Biomechanics → Rubric Score → [Judge] → Report
41
- ```
42
-
43
- See [CLAUDE.md](CLAUDE.md) for full architecture details.
44
-
45
- ## Model Budget
46
-
47
- ~18B params total (under 32B cap). See [MODEL_BUDGET.md](MODEL_BUDGET.md).
48
-
49
- ## License
50
-
51
- Built for the Build Small Hackathon (Backyard AI track).
 
1
+ ---
2
+ title: FormScout
3
+ emoji: 🏔️
4
+ colorFrom: green
5
+ colorTo: green
6
+ sdk: gradio
7
+ app_file: app.py
8
+ pinned: false
9
+ license: apache-2.0
10
+ short_description: FMS video scoring — movement screen aid
11
+ ---
12
+
13
+ # FormScout
14
+
15
+ FMS (Functional Movement Screen) scoring pipeline — a screening aid that scores movement videos 0–3 per test with a written rationale and annotated overlay.
16
+
17
+ **⚠️ Screening aid — not a diagnosis. Pain or clearing tests require a clinician.**
18
+
19
+ ## Running locally
20
+
21
+ ### 1. Clone and install
22
+
23
+ ```bash
24
+ git clone https://huggingface.co/silas-therapy/small-functional-movement-screening
25
+ cd small-functional-movement-screening
26
+ python3 -m venv .venv && source .venv/bin/activate
27
+ pip install -r requirements.txt
28
+ ```
29
+
30
+ ### 2. Start the VLM judge (optional but recommended)
31
+
32
+ The judge uses Qwen3-VL-8B-Instruct via llama.cpp. Without it the app falls back to the deterministic rubric score — fully functional, no GPU needed.
33
+
34
+ ```bash
35
+ # Install llama.cpp once
36
+ brew install llama.cpp
37
+
38
+ # Download the model (one-time, ~6 GB)
39
+ python3 -c "
40
+ from huggingface_hub import hf_hub_download
41
+ for f in ['Qwen3VL-8B-Instruct-Q4_K_M.gguf', 'mmproj-Qwen3VL-8B-Instruct-F16.gguf']:
42
+     hf_hub_download('Qwen/Qwen3-VL-8B-Instruct-GGUF', f, local_dir='checkpoints/qwen3-vl')
43
+ "
44
+
45
+ # Start the server (keep this terminal open)
46
+ ./scripts/serve_judge.sh
47
+ ```
48
+
49
+ To use a fine-tuned GGUF instead of the default:
50
+ ```bash
51
+ FORMSCOUT_JUDGE_GGUF=/path/to/finetuned.gguf ./scripts/serve_judge.sh
52
+ ```
53
+
54
+ ### 3. Launch the Gradio app
55
+
56
+ ```bash
57
+ python3 app.py
58
+ # → http://127.0.0.1:7860
59
+ ```
60
+
61
+ Upload a video, select the FMS test from the dropdown, and click **Analyze**.
62
+
63
+ ### 4. Headless pipeline (no Gradio)
64
+
65
+ ```bash
66
+ python3 -m formscout.run sample.mp4
67
+ ```
68
+
69
+ ### 5. Tests
70
+
71
+ ```bash
72
+ pytest tests/ -v
73
+ ```
74
+
75
+ ### 6. Upload to Hugging Face
76
+
77
+ ```bash
78
+ # Pushes source to both model repo and Space, opens a PR on each
79
+ ./scripts/hf_upload.sh
80
+
81
+ # Or with a custom commit message
82
+ ./scripts/hf_upload.sh "feat: my change"
83
+ ```
84
+
85
+ ## Architecture
86
+
87
+ Typed specialist agents orchestrated by a deterministic Director:
88
+
89
+ ```
90
+ Ingest → Pose2D → [Body3D optional] → Biomechanics → Rubric Score → [Judge] → Report
91
+ ```
92
+
93
+ | Agent | Model | Status |
94
+ |---|---|---|
95
+ | Pose2D | YOLO26l-Pose (0.026B) + MediaPipe fallback | ✅ |
96
+ | Body3D | SAM 3D Body DINOv3 (0.84B) | gated, off by default |
97
+ | Judge + Classifier | Qwen3-VL-8B-Instruct via llama.cpp (8B) | ✅ |
98
+ | Scoring Head | ST-GCN (0.03B) | Phase 3 |
99
+ | Retrieval | Qwen3-VL-Embedding-8B (8B) | Phase 3 |
100
+
101
+ See [CLAUDE.md](CLAUDE.md) for full architecture and invariants.
102
+
103
+ ## Feature flags (`formscout/config.py`)
104
+
105
+ | Flag | Default | Meaning |
106
+ |---|---|---|
107
+ | `ENABLE_JUDGE` | `True` | VLM judge via llama-server; rubric fallback when server is down |
108
+ | `ENABLE_3D` | `False` | SAM 3D Body — off until integrated |
109
+ | `ENABLE_STGCN` | `False` | Phase 3 |
110
+ | `ENABLE_RAG` | `False` | Phase 3 |
111
+
112
+ ## Model budget
113
+
114
+ ~18B params total (under 32B cap). See [MODEL_BUDGET.md](MODEL_BUDGET.md).
115
+
116
+ ## License
117
+
118
+ Apache-2.0. Built for the Build Small Hackathon (Backyard AI track).
formscout/agents/classifier.py CHANGED
@@ -9,7 +9,6 @@ Gated: No.
9
  """
10
  from __future__ import annotations
11
 
12
- import json
13
  import logging
14
  from pathlib import Path
15
 
 
9
  """
10
  from __future__ import annotations
11
 
 
12
  import logging
13
  from pathlib import Path
14
 
formscout/config.py CHANGED
@@ -3,6 +3,7 @@ FormScout pipeline configuration.
3
  All model IDs, thresholds, k-values, and feature flags live here.
4
  No scattered literals elsewhere in the codebase.
5
  """
 
6
  from pathlib import Path
7
 
8
  ROOT = Path(__file__).parent.parent
@@ -95,7 +96,20 @@ SAM_CHECKPOINT = "sam2.1_hiera_base_plus.pt"
95
  SAM_3D_CHECKPOINT = ROOT / "checkpoints" / "sam-3d-body-dinov3" / "model.ckpt"
96
  SAM_3D_HF_REPO = "facebook/sam-3d-body-dinov3"
97
  SAM_3D_MHR_PATH = ROOT / "checkpoints" / "sam-3d-body-dinov3" / "assets" / "mhr_model.pt"
98
- QWEN_VLM_GGUF = "Qwen3-VL-8B-Instruct-Q4_K_M.gguf"
99
  QWEN_EMBED_GGUF = "Qwen3-VL-Embedding-8B-Q4_K_M.gguf"
100
  STGCN_CHECKPOINT = ROOT / "checkpoints" / "stgcn_fms.pth"
101
 
@@ -103,7 +117,7 @@ STGCN_CHECKPOINT = ROOT / "checkpoints" / "stgcn_fms.pth"
103
  ENABLE_3D = False # SAM 3D Body — access granted Jun 2026, off until integrated
104
  ENABLE_STGCN = False # Phase 3
105
  ENABLE_RAG = False # Phase 3
106
- ENABLE_JUDGE = False # Phase 2
107
 
108
  # ─── Thresholds ──────────────────────────────────────────────────────────────
109
  MIN_CONFIDENCE = 0.6
 
3
  All model IDs, thresholds, k-values, and feature flags live here.
4
  No scattered literals elsewhere in the codebase.
5
  """
6
+ import os
7
  from pathlib import Path
8
 
9
  ROOT = Path(__file__).parent.parent
 
96
  SAM_3D_CHECKPOINT = ROOT / "checkpoints" / "sam-3d-body-dinov3" / "model.ckpt"
97
  SAM_3D_HF_REPO = "facebook/sam-3d-body-dinov3"
98
  SAM_3D_MHR_PATH = ROOT / "checkpoints" / "sam-3d-body-dinov3" / "assets" / "mhr_model.pt"
99
+ # ─── Judge / Classifier VLM (Qwen3-VL-8B-Instruct via llama.cpp) ────────────
100
+ # Default: stock Qwen3-VL-8B-Instruct Q4_K_M. To swap in a fine-tuned GGUF,
101
+ # set FORMSCOUT_JUDGE_GGUF (and FORMSCOUT_JUDGE_MMPROJ if it has its own
102
+ # projector) — no code change needed.
103
+ _QWEN_DIR = ROOT / "checkpoints" / "qwen3-vl"
104
+ JUDGE_GGUF = Path(os.environ.get(
105
+ "FORMSCOUT_JUDGE_GGUF", _QWEN_DIR / "Qwen3VL-8B-Instruct-Q4_K_M.gguf"
106
+ ))
107
+ JUDGE_MMPROJ = Path(os.environ.get(
108
+ "FORMSCOUT_JUDGE_MMPROJ", _QWEN_DIR / "mmproj-Qwen3VL-8B-Instruct-F16.gguf"
109
+ ))
110
+ JUDGE_HF_REPO = "Qwen/Qwen3-VL-8B-Instruct-GGUF"
111
+
112
+ QWEN_VLM_GGUF = str(JUDGE_GGUF) # backward-compat alias
113
  QWEN_EMBED_GGUF = "Qwen3-VL-Embedding-8B-Q4_K_M.gguf"
114
  STGCN_CHECKPOINT = ROOT / "checkpoints" / "stgcn_fms.pth"
115
 
 
117
  ENABLE_3D = False # SAM 3D Body — access granted Jun 2026, off until integrated
118
  ENABLE_STGCN = False # Phase 3
119
  ENABLE_RAG = False # Phase 3
120
+ ENABLE_JUDGE = True # VLM judge/classifier — falls back to rubric when llama-server is down
121
 
122
  # ─── Thresholds ──────────────────────────────────────────────────────────────
123
  MIN_CONFIDENCE = 0.6
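The `FORMSCOUT_JUDGE_GGUF` / `FORMSCOUT_JUDGE_MMPROJ` override added to `formscout/config.py` above can be exercised in isolation. The following is a minimal sketch, not the project's code: `resolve_checkpoint` and its `env` parameter are hypothetical names introduced here so the lookup can be tried without touching the real process environment, and `/models/finetuned.gguf` is a made-up path.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_checkpoint(
    var: str,
    default: Path,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Return the path named by environment variable `var`, else `default`.

    Hypothetical helper: config.py in the diff inlines the equivalent
    expression `Path(os.environ.get(var, default))` at module scope.
    """
    source = os.environ if env is None else env
    return Path(source.get(var, default))


# Default location used by the diff (relative here; config.py prefixes ROOT).
default_gguf = Path("checkpoints") / "qwen3-vl" / "Qwen3VL-8B-Instruct-Q4_K_M.gguf"

# No override set: the stock checkpoint path is returned unchanged.
stock = resolve_checkpoint("FORMSCOUT_JUDGE_GGUF", default_gguf, env={})

# Override set: the fine-tuned path wins, with no code change.
tuned = resolve_checkpoint(
    "FORMSCOUT_JUDGE_GGUF",
    default_gguf,
    env={"FORMSCOUT_JUDGE_GGUF": "/models/finetuned.gguf"},
)
```

Because the expression in `config.py` sits at module scope, the variable has to be set before the module is first imported; changing it afterwards would not affect `JUDGE_GGUF`.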
formscout/serving/llama_cpp.py CHANGED
@@ -52,49 +52,48 @@ class LlamaCppClient:
52
  stop: list[str] | None = None,
53
  ) -> dict[str, Any]:
54
  """
55
- Send a completion request. Returns parsed JSON if the response is JSON,
 
 
56
  otherwise returns {"text": raw_text}.
57
 
58
  Args:
59
  prompt: The text prompt (system + user combined).
60
- images: Optional list of base64-encoded images or file paths.
61
  max_tokens: Max generation tokens.
62
  temperature: Sampling temperature.
63
- stop: Stop sequences.
64
  """
65
  payload: dict[str, Any] = {
66
- "prompt": prompt,
67
- "n_predict": max_tokens,
68
  "temperature": temperature,
69
- "stop": stop or ["\n\n"],
70
  }
71
-
72
- # Add images for multimodal (Qwen3-VL via llama.cpp mmproj)
73
- if images:
74
- image_data = []
75
- for img in images:
76
- if Path(img).exists():
77
- with open(img, "rb") as f:
78
- image_data.append({"data": base64.b64encode(f.read()).decode()})
79
- else:
80
- # Assume already base64
81
- image_data.append({"data": img})
82
- payload["image_data"] = image_data
83
 
84
  try:
85
  r = requests.post(
86
- f"{self.base_url}/completion",
87
  json=payload,
88
  timeout=_TIMEOUT,
89
  )
90
  r.raise_for_status()
91
  result = r.json()
92
- content = result.get("content", "")
93
- # Try to parse as JSON
94
- try:
95
- return json.loads(content)
96
- except (json.JSONDecodeError, TypeError):
97
- return {"text": content}
98
  except requests.ConnectionError:
99
  return {"error": "llama.cpp server not available", "text": ""}
100
  except requests.Timeout:
@@ -102,6 +101,21 @@ class LlamaCppClient:
102
  except Exception as e:
103
  return {"error": str(e), "text": ""}
104
 
105
 
106
  class EmbeddingClient:
107
  """HTTP client for the llama.cpp embedding server."""
 
52
  stop: list[str] | None = None,
53
  ) -> dict[str, Any]:
54
  """
55
+ Send a chat-completion request (OpenAI-compatible /v1/chat/completions —
56
+ required for multimodal: llama-server routes images through the mmproj
57
+ only on this endpoint). Returns parsed JSON if the response is JSON,
58
  otherwise returns {"text": raw_text}.
59
 
60
  Args:
61
  prompt: The text prompt (system + user combined).
62
+ images: Optional list of base64-encoded JPEGs or file paths.
63
  max_tokens: Max generation tokens.
64
  temperature: Sampling temperature.
65
+ stop: Stop sequences (default: none — JSON output must not be truncated).
66
  """
67
+ content: list[dict[str, Any]] = [{"type": "text", "text": prompt}]
68
+ for img in images or []:
69
+ if len(img) < 4096 and Path(img).exists():
70
+ with open(img, "rb") as f:
71
+ b64 = base64.b64encode(f.read()).decode()
72
+ else:
73
+ b64 = img # already base64
74
+ content.append({
75
+ "type": "image_url",
76
+ "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
77
+ })
78
+
79
  payload: dict[str, Any] = {
80
+ "messages": [{"role": "user", "content": content}],
81
+ "max_tokens": max_tokens,
82
  "temperature": temperature,
 
83
  }
84
+ if stop:
85
+ payload["stop"] = stop
86
 
87
  try:
88
  r = requests.post(
89
+ f"{self.base_url}/v1/chat/completions",
90
  json=payload,
91
  timeout=_TIMEOUT,
92
  )
93
  r.raise_for_status()
94
  result = r.json()
95
+ text = result["choices"][0]["message"]["content"] or ""
96
+ return self._parse_json_reply(text)
 
 
 
 
97
  except requests.ConnectionError:
98
  return {"error": "llama.cpp server not available", "text": ""}
99
  except requests.Timeout:
 
101
  except Exception as e:
102
  return {"error": str(e), "text": ""}
103
 
104
+     @staticmethod
105
+     def _parse_json_reply(text: str) -> dict[str, Any]:
106
+         """Parse model output as JSON, tolerating markdown fences."""
107
+         stripped = text.strip()
108
+         if stripped.startswith("```"):
109
+             stripped = stripped.split("\n", 1)[-1]
110
+             stripped = stripped.rsplit("```", 1)[0].strip()
111
+         try:
112
+             parsed = json.loads(stripped)
113
+             if isinstance(parsed, dict):
114
+                 return parsed
115
+         except (json.JSONDecodeError, TypeError):
116
+             pass
117
+         return {"text": text}
118
+
119
 
120
  class EmbeddingClient:
121
  """HTTP client for the llama.cpp embedding server."""
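The request and reply handling changed in `formscout/serving/llama_cpp.py` can be tried without a running server. This is a minimal sketch under stated assumptions: `build_chat_payload` and `parse_json_reply` are hypothetical standalone functions that mirror the diff's logic, the default `max_tokens` and `temperature` values are placeholders, and images are assumed to be already base64-encoded (the file-path branch is left out).

```python
from __future__ import annotations

import json
from typing import Any


def build_chat_payload(
    prompt: str,
    images_b64: list[str] | None = None,
    max_tokens: int = 512,       # placeholder default, not taken from the diff
    temperature: float = 0.0,    # placeholder default, not taken from the diff
    stop: list[str] | None = None,
) -> dict[str, Any]:
    """Build an OpenAI-style chat-completions body with optional images."""
    content: list[dict[str, Any]] = [{"type": "text", "text": prompt}]
    for b64 in images_b64 or []:
        # Each image travels as a data URL inside an image_url content part.
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
        })
    payload: dict[str, Any] = {
        "messages": [{"role": "user", "content": content}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    if stop:
        # Only sent when given, so JSON replies are not cut short by default.
        payload["stop"] = stop
    return payload


def parse_json_reply(text: str) -> dict[str, Any]:
    """Parse a model reply as a JSON object, tolerating a markdown fence."""
    stripped = text.strip()
    if stripped.startswith("```"):
        stripped = stripped.split("\n", 1)[-1]            # drop the opening fence line
        stripped = stripped.rsplit("```", 1)[0].strip()   # drop the closing fence
    try:
        parsed = json.loads(stripped)
        if isinstance(parsed, dict):
            return parsed
    except (json.JSONDecodeError, TypeError):
        pass
    return {"text": text}   # non-JSON or non-object replies fall through as raw text
```

Returning `{"text": ...}` for anything that is not a JSON object keeps the caller's contract the same whether the model answered with fenced JSON, bare JSON, or plain prose.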
scripts/hf_upload.sh ADDED
@@ -0,0 +1,97 @@
1
+ #!/usr/bin/env bash
2
+ # Upload the FormScout source tree to both the model repo and the Space.
3
+ #
4
+ # Usage:
5
+ # ./scripts/hf_upload.sh # message from last git commit
6
+ # ./scripts/hf_upload.sh "feat: my change" # custom message
7
+ #
8
+ # Pushes to:
9
+ # silas-therapy/small-functional-movement-screening (model repo)
10
+ # spaces/silas-therapy/small-functional-movement-screening (Gradio Space)
11
+ #
12
+ # `hf upload` does NOT read .hfignore — it only honors .gitignore, and only at
13
+ # commit time (after hashing and pre-uploading everything). So we parse
14
+ # .hfignore ourselves into --exclude globs and pass them explicitly.
15
+ #
16
+ # If the filtered file count still exceeds LARGE_THRESHOLD, we fall back to
17
+ # `hf upload-large-folder` (resumable, multi-threaded). Caveats of that mode:
18
+ # no --create-pr and no custom commit message β€” it commits directly to main
19
+ # in multiple commits.
20
+ set -euo pipefail
21
+
22
+ cd "$(dirname "$0")/.."
23
+
24
+ MODEL_REPO="silas-therapy/small-functional-movement-screening"
25
+ SPACE_REPO="spaces/silas-therapy/small-functional-movement-screening"
26
+ MSG="${1:-$(git log -1 --pretty=%s)}"
27
+ LARGE_THRESHOLD="${FORMSCOUT_HF_LARGE_THRESHOLD:-500}"
28
+
29
+ # Belt-and-suspenders extras on top of .hfignore. `.cache/` is the resume
30
+ # state upload-large-folder writes into the folder being uploaded.
31
+ PATTERNS=(
32
+ "*.pdf"
33
+ "**/node_modules/**"
34
+ ".cache/**"
35
+ )
36
+
37
+ # Parse .hfignore into fnmatch-style globs. fnmatch's `*` crosses `/`, but a
38
+ # bare name like `.DS_Store` or `dir/` only matches at the root, so emit both
39
+ # the rooted and `**/`-prefixed forms.
40
+ while IFS= read -r line; do
41
+ line="${line%%#*}"
42
+ line="${line#"${line%%[![:space:]]*}"}"
43
+ line="${line%"${line##*[![:space:]]}"}"
44
+ [[ -z "$line" ]] && continue
45
+ if [[ "$line" == */ ]]; then
46
+ PATTERNS+=("${line}**" "**/${line}**")
47
+ else
48
+ PATTERNS+=("$line" "**/$line")
49
+ fi
50
+ done < .hfignore
51
+
52
+ EXCLUDES=()
53
+ for p in "${PATTERNS[@]}"; do
54
+ EXCLUDES+=(--exclude="$p")
55
+ done
56
+
57
+ # Count what would actually be uploaded, using the same filter the hub client
58
+ # applies, so the mode decision matches reality.
59
+ N_FILES=$(python3 - "${PATTERNS[@]}" <<'EOF'
60
+ import sys
61
+ from pathlib import Path
62
+ from huggingface_hub.utils import filter_repo_objects
63
+
64
+ patterns = sys.argv[1:]
65
+ files = (
66
+ str(p) for p in Path(".").rglob("*")
67
+ if p.is_file() and p.parts[0] != ".git"
68
+ )
69
+ print(len(list(filter_repo_objects(files, ignore_patterns=patterns))))
70
+ EOF
71
+ )
72
+ echo "── $N_FILES files to upload after .hfignore filtering"
73
+
74
+ if (( N_FILES == 0 )); then
75
+ echo "✗ nothing to upload — check .hfignore" >&2
76
+ exit 1
77
+ fi
78
+
79
+ upload_repo() {
80
+ local repo="$1"
81
+ if (( N_FILES > LARGE_THRESHOLD )); then
82
+ echo "── $repo: $N_FILES files > $LARGE_THRESHOLD, using upload-large-folder"
83
+ echo " (resumable; commits directly to main — no PR, no custom message)"
84
+ hf upload-large-folder "$repo" . "${EXCLUDES[@]}"
85
+ else
86
+ echo "── uploading to: $repo"
87
+ hf upload "$repo" . . \
88
+ "${EXCLUDES[@]}" \
89
+ --create-pr \
90
+ --commit-message="$MSG"
91
+ fi
92
+ }
93
+
94
+ upload_repo "$MODEL_REPO"
95
+ upload_repo "$SPACE_REPO"
96
+
97
+ echo "✓ done"
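The `.hfignore` parsing loop in `scripts/hf_upload.sh` above can be sketched in Python to see which globs it emits. This is an illustration only: `hfignore_to_patterns` and `is_excluded` are hypothetical names, and the matching uses the standard-library `fnmatch` module on the assumption, taken from the script's own comment, that the hub client applies fnmatch-style globs.

```python
from fnmatch import fnmatchcase


def hfignore_to_patterns(text: str) -> list[str]:
    """Turn .hfignore content into rooted and `**/`-prefixed glob patterns."""
    patterns: list[str] = []
    for raw in text.splitlines():
        # Drop everything from the first '#' and trim, as the bash loop does.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line.endswith("/"):
            # Directory entry: match its contents at the root and at any depth.
            patterns += [f"{line}**", f"**/{line}**"]
        else:
            # File entry: match at the root and at any depth.
            patterns += [line, f"**/{line}"]
    return patterns


def is_excluded(path: str, patterns: list[str]) -> bool:
    """True if `path` matches any pattern (fnmatch `*` also crosses `/`)."""
    return any(fnmatchcase(path, p) for p in patterns)


sample = "# Python\n__pycache__/\n*.egg\n\n# Secrets\n.env\n"
globs = hfignore_to_patterns(sample)
```

The rooted form is needed because a pattern such as `**/__pycache__/**` requires a `/` before the directory name, so it would miss a `__pycache__/` sitting at the top of the tree.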
scripts/serve_judge.sh ADDED
@@ -0,0 +1,35 @@
+ #!/usr/bin/env bash
+ # Launch llama-server with the FormScout Judge/Classifier VLM.
+ #
+ # Default model: Qwen3-VL-8B-Instruct Q4_K_M (checkpoints/qwen3-vl/).
+ # To serve a fine-tuned GGUF instead, set:
+ #   FORMSCOUT_JUDGE_GGUF=/path/to/finetuned.gguf
+ #   FORMSCOUT_JUDGE_MMPROJ=/path/to/mmproj.gguf (only if it ships its own)
+ #
+ # Requires: brew install llama.cpp
+ set -euo pipefail
+
+ # Homebrew bin may be missing from non-interactive shells
+ export PATH="/opt/homebrew/bin:/usr/local/bin:$PATH"
+
+ ROOT="$(cd "$(dirname "$0")/.." && pwd)"
+ GGUF="${FORMSCOUT_JUDGE_GGUF:-$ROOT/checkpoints/qwen3-vl/Qwen3VL-8B-Instruct-Q4_K_M.gguf}"
+ MMPROJ="${FORMSCOUT_JUDGE_MMPROJ:-$ROOT/checkpoints/qwen3-vl/mmproj-Qwen3VL-8B-Instruct-F16.gguf}"
+ HOST="${FORMSCOUT_LLAMA_HOST:-127.0.0.1}"
+ PORT="${FORMSCOUT_LLAMA_PORT:-8080}"
+
+ if [[ ! -f "$GGUF" ]]; then
+   echo "Model not found: $GGUF" >&2
+   echo "Download it with:" >&2
+   echo " python3 -c \"from huggingface_hub import hf_hub_download; [hf_hub_download('Qwen/Qwen3-VL-8B-Instruct-GGUF', f, local_dir='$ROOT/checkpoints/qwen3-vl') for f in ['Qwen3VL-8B-Instruct-Q4_K_M.gguf', 'mmproj-Qwen3VL-8B-Instruct-F16.gguf']]\"" >&2
+   exit 1
+ fi
+
+ exec llama-server \
+   --model "$GGUF" \
+   --mmproj "$MMPROJ" \
+   --host "$HOST" \
+   --port "$PORT" \
+   --ctx-size 16384 \
+   --n-gpu-layers 99 \
+   --no-warmup
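To illustrate how a caller might check that the server launched by `scripts/serve_judge.sh` is actually reachable before relying on it, here is a minimal sketch. It assumes the same `FORMSCOUT_LLAMA_HOST` and `FORMSCOUT_LLAMA_PORT` defaults as the script and probes llama-server's `/health` endpoint. The function name `server_available` is hypothetical and is not taken from this PR.

```python
import os
import urllib.error
import urllib.request


def server_available(host=None, port=None, timeout=1.0):
    """Return True if something answers GET /health with HTTP 200.

    Hypothetical helper; defaults mirror scripts/serve_judge.sh.
    """
    host = host or os.environ.get("FORMSCOUT_LLAMA_HOST", "127.0.0.1")
    port = int(port or os.environ.get("FORMSCOUT_LLAMA_PORT", "8080"))
    url = f"http://{host}:{port}/health"
    # Bypass any configured HTTP proxy: this is a local health probe.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, non-2xx status, or a malformed URL.
        return False
```

Returning `False` on every failure mode, rather than raising, fits a design where an unreachable server degrades to a rubric-only result.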
tests/test_phase2.py CHANGED
@@ -1,9 +1,7 @@
  """Tests for all rubric scorers and Phase 2 agents."""
- import pytest

  from formscout.types import (
-     BiomechFeatures, ScoreResult, MovementResult, IngestResult,
-     Pose2DResult, JudgeResult, ReportResult,
+     BiomechFeatures, ScoreResult, MovementResult, JudgeResult, ReportResult,
  )
  from formscout.rubric import score_test, SCORERS
  from formscout.rubric.hurdle_step import score_hurdle_step
@@ -178,8 +176,10 @@ class TestRotaryStability:
  # ─── JudgeAgent fallback ─────────────────────────────────────────────────────

  class TestJudgeAgent:
-     def test_fallback_when_judge_disabled(self):
+     def test_fallback_when_judge_disabled(self, monkeypatch):
          """When ENABLE_JUDGE=False, judge promotes rubric score."""
+         from formscout import config
+         monkeypatch.setattr(config, "ENABLE_JUDGE", False)
          agent = JudgeAgent()
          features = _make_features("deep_squat", angles={"left_femur_from_horizontal_deg": 70.0})
          rubric = ScoreResult(score=3, rationale="all good", confidence=0.9)
@@ -189,6 +189,105 @@ class TestJudgeAgent:
          assert result.score == 3
          assert "[rubric-only]" in result.rationale

+     def test_fallback_when_server_unavailable(self, monkeypatch):
+         """ENABLE_JUDGE=True but llama-server down → rubric fallback, never a crash."""
+         from unittest.mock import PropertyMock, patch
+         from formscout import config
+         monkeypatch.setattr(config, "ENABLE_JUDGE", True)
+         agent = JudgeAgent()
+         with patch.object(type(agent._client), "available", new_callable=PropertyMock, return_value=False):
+             features = _make_features("deep_squat")
+             rubric = ScoreResult(score=2, rationale="heels up", confidence=0.8)
+             movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
+             result = agent.run(features, rubric, movement)
+         assert result.score == 2
+         assert "[rubric-only]" in result.rationale
+
+     def test_vlm_response_parsed_into_judge_result(self, monkeypatch):
+         """ENABLE_JUDGE=True with live client → VLM JSON becomes JudgeResult."""
+         from unittest.mock import PropertyMock, patch
+         from formscout import config
+         monkeypatch.setattr(config, "ENABLE_JUDGE", True)
+         agent = JudgeAgent()
+         vlm_json = {
+             "test": "deep_squat", "side": "na", "score": 2, "needs_human": False,
+             "rationale": "Femur 5° above horizontal; 2D estimate.",
+             "compensation_tags": ["forward_lean"], "corrective_hint": "Sit back into heels.",
+             "confidence": 0.78,
+         }
+         with patch.object(type(agent._client), "available", new_callable=PropertyMock, return_value=True), \
+              patch.object(agent._client, "complete", return_value=vlm_json):
+             features = _make_features("deep_squat")
+             rubric = ScoreResult(score=2, rationale="ok", confidence=0.8)
+             movement = MovementResult(test_name="deep_squat", side="na", confidence=1.0)
+             result = agent.run(features, rubric, movement)
+         assert result.score == 2
+         assert result.compensation_tags == ["forward_lean"]
+         assert result.needs_human is False
+
+     def test_vlm_needs_human_yields_no_score(self, monkeypatch):
+         """needs_human=True from the VLM must produce score=None."""
+         from unittest.mock import PropertyMock, patch
+         from formscout import config
+         monkeypatch.setattr(config, "ENABLE_JUDGE", True)
+         agent = JudgeAgent()
+         vlm_json = {"score": 1, "needs_human": True, "rationale": "Possible pain.", "confidence": 0.9}
+         with patch.object(type(agent._client), "available", new_callable=PropertyMock, return_value=True), \
+              patch.object(agent._client, "complete", return_value=vlm_json):
+             result = agent.run(
+                 _make_features("deep_squat"),
+                 ScoreResult(score=1, rationale="x", confidence=0.5),
+                 MovementResult(test_name="deep_squat", side="na", confidence=1.0),
+             )
+         assert result.needs_human is True
+         assert result.score is None
+
+
+ # ─── LlamaCppClient (chat-completions endpoint) ──────────────────────────────
+
+ class TestLlamaCppClient:
+     def test_parse_plain_json(self):
+         from formscout.serving.llama_cpp import LlamaCppClient
+         assert LlamaCppClient._parse_json_reply('{"score": 3}') == {"score": 3}
+
+     def test_parse_fenced_json(self):
+         from formscout.serving.llama_cpp import LlamaCppClient
+         fenced = '```json\n{"score": 2, "needs_human": false}\n```'
+         assert LlamaCppClient._parse_json_reply(fenced) == {"score": 2, "needs_human": False}
+
+     def test_parse_non_json_returns_text(self):
+         from formscout.serving.llama_cpp import LlamaCppClient
+         assert LlamaCppClient._parse_json_reply("not json") == {"text": "not json"}
+
+     def test_complete_posts_chat_endpoint_with_images(self):
+         from unittest.mock import MagicMock, patch
+         from formscout.serving.llama_cpp import LlamaCppClient
+
+         client = LlamaCppClient(port=8080)
+         resp = MagicMock()
+         resp.json.return_value = {"choices": [{"message": {"content": '{"ok": true}'}}]}
+         resp.raise_for_status.return_value = None
+         with patch("formscout.serving.llama_cpp.requests.post", return_value=resp) as mock_post:
+             result = client.complete("score this", images=["aGVsbG8=" * 600])
+         assert result == {"ok": True}
+         url = mock_post.call_args.args[0] if mock_post.call_args.args else mock_post.call_args.kwargs.get("url")
+         assert url.endswith("/v1/chat/completions")
+         payload = mock_post.call_args.kwargs["json"]
+         content = payload["messages"][0]["content"]
+         assert content[0] == {"type": "text", "text": "score this"}
+         assert content[1]["type"] == "image_url"
+         assert content[1]["image_url"]["url"].startswith("data:image/jpeg;base64,")
+
+     def test_complete_connection_error_returns_safe_dict(self):
+         from unittest.mock import patch
+         import requests as _requests
+         from formscout.serving.llama_cpp import LlamaCppClient
+
+         client = LlamaCppClient(port=8080)
+         with patch("formscout.serving.llama_cpp.requests.post", side_effect=_requests.ConnectionError):
+             result = client.complete("hello")
+         assert "error" in result
+

  # ─── ReportAgent ──────────────────────────────────────────────────────────────
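The three `_parse_json_reply` tests above pin down a small contract: plain JSON parses to a dict, a Markdown-fenced JSON block is unwrapped first, and anything else comes back as `{"text": ...}`. A minimal sketch that satisfies those assertions follows. It is not the code in `formscout/serving/llama_cpp.py`, which this diff does not show. The standalone name `parse_json_reply` is hypothetical, and the handling of non-dict JSON is an assumption.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def parse_json_reply(text):
    """Parse a model reply into a dict, tolerating a Markdown code fence.

    Hypothetical sketch of the behavior the tests describe.
    """
    body = text.strip()
    if len(body) >= 2 * len(FENCE) and body.startswith(FENCE) and body.endswith(FENCE):
        inner = body[len(FENCE):-len(FENCE)]
        # The opening fence line may carry a language tag such as "json".
        first_line, newline, rest = inner.partition("\n")
        body = (rest if newline else first_line).strip()
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all: hand the raw reply back under a "text" key.
        return {"text": text}
    # Assumption: valid JSON that is not an object is treated as plain text too.
    return parsed if isinstance(parsed, dict) else {"text": text}
```

Falling back to a `{"text": ...}` dict instead of raising keeps the return type uniform, so a caller can always look up keys without wrapping the call in a `try` block.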