Commit e07820e · 1 Parent(s): 03d531b
Committed by ChatGPT

feat: render supervised edits into artifacts
README.md CHANGED
@@ -46,18 +46,19 @@ Implemented:
  - accept/favorite hit,
  - suppress hit as bleed,
  - lock/unlock cluster,
- - suggestion inbox,
+ - suggestion inbox with exact diff previews,
  - cluster explanation drawer,
+ - force-onset waveform mode,
+ - restore suppressed hits,
+ - edited sample-pack export,
  - constraint/event log.
  - Documentation for features, progress, tasks, API, timing, hit review, realtime suitability, UI, remaining work, and interactive UX.
  - Legacy Gradio apps preserved in `legacy/` for reference only.

  Not fully complete yet:

- - Semantic edits do not yet regenerate WAV/MIDI/ZIP exports.
- - No force-onset/click-to-add missed onset yet.
- - No restore for suppressed hits yet.
  - No true cached feature-vector local reclustering yet.
+ - No cluster merge/split/relabel workflow beyond move/pull-to-new-cluster.
  - No frontend TypeScript build/test harness yet.
  - Demucs remains offline/batch by design.

@@ -69,6 +70,7 @@ See:
  - `docs/API.md`
  - `docs/interactive-ux/README.md`
  - `docs/REMAINING_WORK.md`
+ - `docs/SUPERVISED_EXPORT_AND_FORCE_ONSET.md`

  ## Run locally

@@ -91,10 +93,11 @@ That bypasses Demucs and uses the near-realtime clustering path.
  ## Run checks

  ```bash
- python3 -m py_compile app.py pipeline_runner.py sample_extractor.py supervised_state.py scripts/*.py
+ python3 -m py_compile app.py pipeline_runner.py sample_extractor.py supervised_state.py supervised_export.py scripts/*.py
  node --check web/app.js
  python3 scripts/test_sse_and_review_hits.py
  python3 scripts/test_interactive_supervision.py
+ python3 scripts/test_supervised_export_and_force_onset.py
  ```

  ## Run benchmarks

@@ -148,10 +151,12 @@ curl http://127.0.0.1:7860/api/jobs
  | `app.py` | FastAPI app, static UI serving, job API, run history, artifact downloads, supervised editing endpoints |
  | `pipeline_runner.py` | Timed extraction pipeline, disk stem/source cache, batch/online clustering routing |
  | `sample_extractor.py` | Core DSP/sample extraction implementation |
- | `supervised_state.py` | Persistent semantic state, confidence, constraints, events, suggestions, undo |
- | `web/` | Custom no-build browser frontend with waveform, hit review, sample audition, and supervision panel |
+ | `supervised_state.py` | Persistent semantic state, confidence, constraints, events, suggestions, force-onset, restore, undo |
+ | `supervised_export.py` | Renders edited semantic state into supervised WAV/MIDI/reconstruction/ZIP artifacts |
+ | `web/` | Custom no-build browser frontend with waveform, hit review, sample audition, add-onset mode, edited export, and supervision panel |
  | `scripts/benchmark_subprocesses.py` | Synthetic benchmark runner for stage timings |
  | `scripts/test_interactive_supervision.py` | Smoke test for supervised state endpoints |
+ | `scripts/test_supervised_export_and_force_onset.py` | Smoke test for force-onset, restore, suggestion diffs, and edited exports |
  | `docs/interactive-ux/` | Supplied interactive UX docs aligned to current implementation |
  | `docs/` | Review, timing, API, UI, feature, task, progress, and remaining-work documentation |
  | `legacy/` | Previous Gradio apps retained for reference |

@@ -168,6 +173,11 @@ Each run is stored under `.runs/<job-id>/output/`:
  - `review/hits/*.wav`
  - `manifest.json`
  - `supervision_state.json`
+ - `supervised/manifest.json` after edited export
+ - `supervised/sample-pack.zip` after edited export
+ - `supervised/samples/*.wav` after edited export
+ - `supervised/reconstruction.mid` after edited export
+ - `supervised/reconstruction.wav` after edited export

  Generated runtime directories are ignored by git:
app.py CHANGED
@@ -29,16 +29,19 @@ from sample_extractor import DEMUCS_MODELS, DEMUCS_STEMS, cache_clear
  from supervised_state import (
      accept_suggestion,
      explain_cluster as build_cluster_explanation,
+     force_onset as apply_force_onset,
      load_or_create_state,
      lock_cluster as apply_cluster_lock,
      move_hit as apply_hit_move,
      public_state,
      pull_hit_to_new_cluster,
      reject_suggestion,
+     restore_hit as apply_hit_restore,
      set_hit_review_status,
      suppress_hit as apply_hit_suppression,
      undo_last as apply_undo,
  )
+ from supervised_export import export_supervised_state

  ROOT = Path(__file__).resolve().parent
  WEB_DIR = ROOT / "web"

@@ -63,6 +66,16 @@ def _job_url(job_id: str, relative_path: str) -> str:
      return f"/api/jobs/{job_id}/files/{relative_path}"


+ def _serialise_export(job_id: str, export_manifest: dict[str, Any]) -> dict[str, Any]:
+     payload = dict(export_manifest)
+     payload["file_urls"] = {key: _job_url(job_id, path) for key, path in payload.get("files", {}).items()}
+     payload["samples"] = [
+         {**sample, "url": _job_url(job_id, sample["file"])}
+         for sample in payload.get("samples", [])
+     ]
+     return payload
+
+
  def _serialise_job(job: dict[str, Any]) -> dict[str, Any]:
      payload = {key: value for key, value in job.items() if key not in {"input_path", "output_dir"}}
      if payload.get("result"):

@@ -334,6 +347,46 @@ def get_job_state(job_id: str) -> dict[str, Any]:
      return _state_payload(job_id)


+ @app.post("/api/jobs/{job_id}/export")
+ def post_supervised_export(job_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
+     patch = _json_patch(payload)
+     try:
+         export_manifest = export_supervised_state(
+             _job_output_dir(job_id),
+             job_id,
+             synthesize=bool(patch.get("synthesize", True)),
+             quantize=patch.get("quantize"),
+             subdivision=patch.get("subdivision"),
+         )
+     except Exception as exc:
+         raise HTTPException(status_code=500, detail=str(exc)) from exc
+     return {"export": _serialise_export(job_id, export_manifest), "state": _state_payload(job_id)}
+
+
+ @app.post("/api/jobs/{job_id}/hits/force-onset")
+ def post_force_onset(job_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
+     patch = _json_patch(payload)
+     if "onset_sec" not in patch:
+         raise HTTPException(status_code=400, detail="onset_sec is required")
+     try:
+         apply_force_onset(
+             _job_output_dir(job_id),
+             job_id,
+             float(patch["onset_sec"]),
+             duration_ms=patch.get("duration_ms"),
+             label=patch.get("label"),
+             target_cluster_id=patch.get("target_cluster_id"),
+             pre_pad_sec=float(patch.get("pre_pad_sec", 0.003)),
+         )
+     except KeyError as exc:
+         raise HTTPException(status_code=404, detail=str(exc)) from exc
+     except ValueError as exc:
+         raise HTTPException(status_code=400, detail=str(exc)) from exc
+     except Exception as exc:
+         raise HTTPException(status_code=500, detail=str(exc)) from exc
+     return _state_payload(job_id)
+
+
  @app.post("/api/jobs/{job_id}/hits/{hit_id}/move")
  def post_move_hit(job_id: str, hit_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
      target_cluster_id = _json_patch(payload).get("target_cluster_id")

@@ -372,6 +425,17 @@ def post_suppress_hit(job_id: str, hit_id: str, payload: dict[str, Any] = Body(d
      return _state_payload(job_id)


+ @app.post("/api/jobs/{job_id}/hits/{hit_id}/restore")
+ def post_restore_hit(job_id: str, hit_id: str) -> dict[str, Any]:
+     try:
+         apply_hit_restore(_job_output_dir(job_id), job_id, hit_id)
+     except KeyError as exc:
+         raise HTTPException(status_code=404, detail=str(exc)) from exc
+     except Exception as exc:
+         raise HTTPException(status_code=500, detail=str(exc)) from exc
+     return _state_payload(job_id)
+
+
  @app.post("/api/jobs/{job_id}/hits/{hit_id}/review")
  def post_review_hit(job_id: str, hit_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
      status = str(_json_patch(payload).get("status") or "accepted")
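The new `_serialise_export` helper only rewrites manifest-relative paths into per-job download URLs. A self-contained sketch of that behavior (the simplified `job_url` stand-in and the sample manifest data are illustrative assumptions, not the project's exact code):

```python
from typing import Any


def job_url(job_id: str, relative_path: str) -> str:
    # Simplified stand-in for app.py's _job_url helper.
    return f"/api/jobs/{job_id}/files/{relative_path}"


def serialise_export(job_id: str, export_manifest: dict[str, Any]) -> dict[str, Any]:
    # Copy the manifest, then attach download URLs for every file entry
    # and for each per-sample "file" path, without mutating the input.
    payload = dict(export_manifest)
    payload["file_urls"] = {key: job_url(job_id, path) for key, path in payload.get("files", {}).items()}
    payload["samples"] = [
        {**sample, "url": job_url(job_id, sample["file"])}
        for sample in payload.get("samples", [])
    ]
    return payload


manifest = {
    "files": {"archive": "supervised/sample-pack.zip"},
    "samples": [{"label": "snare", "file": "supervised/samples/snare.wav"}],
}
out = serialise_export("job-1", manifest)
print(out["file_urls"]["archive"])  # /api/jobs/job-1/files/supervised/sample-pack.zip
```

Keeping URL construction in one place means the export manifest on disk stays path-relative while API responses carry ready-to-use links.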
docs/API.md CHANGED
@@ -221,7 +221,7 @@ The interactive supervision API is backed by `supervised_state.py` and persists
  .runs/<job_id>/output/supervision_state.json
  ```

- The batch `manifest.json` remains immutable. Supervised edits currently update semantic state only; they do not yet regenerate WAV/MIDI/ZIP artifacts.
+ The batch `manifest.json` remains immutable. Supervised edits update semantic state and can be rendered into a separate edited export under `supervised/` without mutating original artifacts.

  ### `GET /api/jobs/{job_id}/state`

@@ -231,18 +231,105 @@ Response keys:

  | Key | Meaning |
  |---|---|
- | `summary` | Counts for hits, clusters, constraints, events, suggestions, suppressed hits, locked clusters, undo availability. |
+ | `summary` | Counts for hits, clusters, constraints, events, suggestions, suppressed/forced hits, locked clusters, latest export, undo availability. |
  | `hits` | Semantic hit rows with confidence, suppression/favorite/review flags, file URLs, and current cluster assignment. |
  | `clusters` | Semantic clusters with hit IDs, representative hit, confidence, locked state, and suppressed count. |
  | `review_queue` | Low-confidence/high-priority hits sorted for review. |
  | `constraints` | Recent replayable constraints. |
  | `events` | Recent state mutation events. |
- | `suggestions` | Open move/split/suppress suggestions. |
+ | `suggestions` | Open move/split/suppress suggestions, including exact `diff` previews. |

  ```bash
  curl http://127.0.0.1:7860/api/jobs/<job-id>/state
  ```

+ ### `POST /api/jobs/{job_id}/hits/force-onset`
+
+ Creates a user-forced hit slice from `stem.wav` and adds it to semantic state.
+
+ Body:
+
+ ```json
+ {
+   "onset_sec": 0.123,
+   "duration_ms": 160,
+   "target_cluster_id": "cluster:0",
+   "label": "snare"
+ }
+ ```
+
+ Fields:
+
+ | Field | Required | Meaning |
+ |---|---:|---|
+ | `onset_sec` | yes | Onset location in seconds. |
+ | `duration_ms` | no | Slice length. If omitted, the system slices until the next active onset or a bounded default. |
+ | `target_cluster_id` | no | Existing cluster to place the hit into. If omitted, a new user cluster is created. |
+ | `label` | no | Override label. If omitted, the rule-based classifier labels the forced slice. |
+
+ Effects:
+
+ - writes `review/hits/hit_NNNNN_<label>_forced.wav`,
+ - creates a semantic hit with `source=forced`,
+ - creates `force-onset` and `force-cluster` constraints,
+ - appends a `hit.force_onset` event,
+ - recomputes confidence and the review queue.
+
+ ### `POST /api/jobs/{job_id}/hits/{hit_id}/restore`
+
+ Restores a suppressed hit.
+
+ Effects:
+
+ - sets `suppressed=false`,
+ - resets `review_status` from `suppressed` back to `unreviewed`,
+ - creates a `restore-hit` constraint,
+ - appends a `hit.restored` event,
+ - recomputes confidence and the review queue.
+
+ ### `POST /api/jobs/{job_id}/export`
+
+ Renders the current semantic state into edited artifacts under `supervised/`. This does not modify the original `manifest.json`, original samples, or original ZIP.
+
+ Body:
+
+ ```json
+ {
+   "synthesize": true,
+   "quantize": true,
+   "subdivision": 16
+ }
+ ```
+
+ Response shape:
+
+ ```json
+ {
+   "export": {
+     "kind": "supervised-export",
+     "hit_count": 17,
+     "cluster_count": 10,
+     "files": {
+       "archive": "supervised/sample-pack.zip",
+       "midi": "supervised/reconstruction.mid",
+       "reconstruction": "supervised/reconstruction.wav"
+     },
+     "file_urls": {}
+   },
+   "state": {}
+ }
+ ```
+
+ Export rules:
+
+ - suppressed hits are excluded,
+ - forced hits are included,
+ - moved/pulled hits use current semantic cluster membership,
+ - favorite/pinned representatives are honored before quality scoring,
+ - cluster labels are sanitized for filenames,
+ - `supervision_state.json` receives `latest_export` and a `supervised.exported` event.
+
  ### `POST /api/jobs/{job_id}/hits/{hit_id}/move`

  Moves a hit into an existing target cluster.
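The `duration_ms` fallback documented for force-onset (slice until the next active onset, bounded by a default) can be sketched as a pure function. The 0.5 s default bound and the exact tie rules here are assumptions, not the implementation's values:

```python
def forced_slice_bounds(onset_sec, active_onsets, duration_ms=None,
                        pre_pad_sec=0.003, max_default_sec=0.5):
    """Compute (start_sec, end_sec) for a forced hit slice.

    An explicit duration_ms wins; otherwise the slice runs until the
    next active onset, capped by an assumed default maximum length.
    """
    start = max(0.0, onset_sec - pre_pad_sec)
    if duration_ms is not None:
        return start, onset_sec + duration_ms / 1000.0
    later = [t for t in active_onsets if t > onset_sec]
    end = min(later) if later else onset_sec + max_default_sec
    return start, min(end, onset_sec + max_default_sec)


print(forced_slice_bounds(1.0, [0.5, 1.2, 2.0]))         # bounded by the next onset at 1.2 s
print(forced_slice_bounds(1.0, [0.5], duration_ms=160))  # explicit 160 ms slice
```

The small `pre_pad_sec` mirrors the endpoint's default and keeps the transient's attack inside the slice.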
docs/FEATURES.md CHANGED
@@ -38,6 +38,10 @@ Turn an input audio file into a practical drum sample pack: detected hits, group
  | Pipeline | Reconstruction render | Implemented | Renders MIDI-like reconstruction using selected samples. |
  | Pipeline | Per-hit review export | Implemented | Writes every accepted detected hit to `review/hits/*.wav` and records rows in the manifest. |
  | Pipeline | Sample pack ZIP | Implemented | Includes WAVs, index JSON, MIDI, rendered reconstruction. |
+ | Supervision | Edited artifact re-export | Implemented | `supervised_export.py` writes edited samples, MIDI, reconstruction, ZIP, and `supervised/manifest.json`. |
+ | Supervision | Force-onset from waveform | Implemented | Adds user-forced hit slices from cached `stem.wav`; UI add-onset mode posts to `/hits/force-onset`. |
+ | Supervision | Suppressed-hit restore | Implemented | Restore endpoint and UI button reverse suppression without undoing unrelated edits. |
+ | Supervision | Suggestion diff previews | Implemented | Open suggestions include exact hit/cluster before-after previews and a UI `Diff` button. |
  | Docs | Project review | Implemented | `docs/PROJECT_REVIEW.md`. |
  | Docs | Timing/realtime analysis | Implemented | `docs/PIPELINE_TIMING_AND_REALTIME.md`. |
  | Docs | API docs | Implemented | `docs/API.md`. |

@@ -77,8 +81,8 @@ Turn an input audio file into a practical drum sample pack: detected hits, group
  | Supervision | Pull hit into new cluster | Implemented | Creates a user cluster and cannot-link/force-cluster constraints. |
  | Supervision | Lock cluster | Implemented | Lock state persists and updates confidence/UI. |
  | Supervision | Suppress hit as bleed | Implemented | Marks hit suppressed, stores suppress-pattern, may suggest similar suppressions. |
- | Supervision | Favorite representative | Partial | Pins semantic representative; supervised export does not yet honor it. |
- | Supervision | Suggestion inbox | Partial | Move/split/suppress suggestions can be accepted/rejected; exact diff preview is not implemented. |
+ | Supervision | Favorite representative | Implemented | Pins semantic representative and supervised export honors it before quality scoring. |
+ | Supervision | Suggestion inbox | Implemented | Move/split/suppress suggestions can be accepted/rejected and inspected with exact diff previews. |
  | Supervision | Cluster explanation | Implemented | Backend and UI show confidence reasons, label distribution, outliers, and constraints. |
- | Supervision | Edited artifact re-export | Not implemented | Semantic edits do not yet regenerate sample WAVs, MIDI, reconstruction, or ZIP. |
- | Supervision | Force-onset from waveform | Not implemented | Waveform click currently auditions nearest existing hit only. |
+ | Supervision | Edited artifact re-export | Implemented | Exports edited state into `supervised/` without mutating original batch artifacts. |
+ | Supervision | Force-onset from waveform | Implemented | Add-onset mode turns waveform clicks into forced hit slices from `stem.wav`. |
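The favorite-representative rows above describe a precedence order: a pinned representative first, then favorited hits, then quality scoring. A minimal sketch under assumed data shapes (hit dicts with `favorite`/`suppressed` flags and the tie-break by hit id are illustrative assumptions):

```python
def pick_representative(cluster, hits_by_id, quality_score):
    """Pick a cluster's representative hit id.

    Precedence: explicit representative_hit_id pin, then any favorited
    member, then the highest quality score among remaining members.
    Suppressed hits are never eligible.
    """
    pinned = cluster.get("representative_hit_id")
    if pinned and pinned in hits_by_id and not hits_by_id[pinned].get("suppressed"):
        return pinned
    members = [hits_by_id[h] for h in cluster["hit_ids"] if not hits_by_id[h].get("suppressed")]
    favorites = [h for h in members if h.get("favorite")]
    pool = favorites or members
    # Quality scoring only decides among the remaining candidates.
    return max(pool, key=lambda h: (quality_score(h), h["id"]))["id"]


hits = {
    "hit:1": {"id": "hit:1", "favorite": False},
    "hit:2": {"id": "hit:2", "favorite": True},
    "hit:3": {"id": "hit:3", "favorite": False, "suppressed": True},
}
cluster = {"hit_ids": ["hit:1", "hit:2", "hit:3"], "representative_hit_id": None}
print(pick_representative(cluster, hits, quality_score=lambda h: 0.0))  # hit:2
```

This matches the table's claim that pins and favorites are honored before quality scoring rather than competing with it.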
docs/PROGRESS.md CHANGED
@@ -134,14 +134,56 @@ analyze audio
  → reload completed run with decisions intact
  ```

- It does not yet satisfy the full workstation loop because edited semantic state is not yet rendered into updated sample WAVs, MIDI, reconstruction, or ZIP output.
+ It now satisfies the first full semantic-edit loop because edited semantic state can be rendered into separate supervised sample WAVs, MIDI, reconstruction, and ZIP output.

  ## Next recommended pass after Pass 4

- 1. Add supervised re-export endpoint.
- 2. Exclude suppressed hits from supervised exports.
- 3. Honor favorite/pinned representatives in supervised sample WAVs.
- 4. Add force-onset endpoint using cached `stem.wav`.
- 5. Add add-onset mode to the waveform UI.
- 6. Add restore suppressed hit and batch restore.
- 7. Add feature-vector cache for true local reclustering.
+ 1. Add cluster merge/relabel/split workflows.
+ 2. Add feature-vector cache for true local reclustering.
+ 3. Add edited-vs-original run comparison.
+ 4. Add browser-level UI tests and migrate the frontend to TypeScript/Vite after the UX stabilizes.
+
+ ## Pass 5: supervised export, force-onset, restore, and suggestion diffs
+
+ Completed in this pass:
+
+ 1. Added `supervised_export.py` to render `supervision_state.json` into edited artifacts under `supervised/`.
+ 2. Added `POST /api/jobs/{job_id}/export` for edited sample-pack export.
+ 3. Added `POST /api/jobs/{job_id}/hits/force-onset` to create user-forced hit slices from `stem.wav`.
+ 4. Added an add-onset waveform mode in the frontend.
+ 5. Added `POST /api/jobs/{job_id}/hits/{hit_id}/restore` and a restore button for suppressed hits.
+ 6. Added exact suggestion diff previews through `suggestion.diff` and a UI `Diff` action.
+ 7. Updated supervised export to exclude suppressed hits and honor favorite/pinned representatives.
+ 8. Added `scripts/test_supervised_export_and_force_onset.py`.
+ 9. Added `docs/SUPERVISED_EXPORT_AND_FORCE_ONSET.md` and updated feature/API/task/progress docs.
+
+ Outcome:
+
+ The project now closes the main semantic-edit loop:
+
+ ```text
+ analyze audio
+ → inspect hits/clusters
+ → move/pull/suppress/restore/favorite/lock/force-onset
+ → inspect suggestions and diffs
+ → export edited WAV/MIDI/reconstruction/ZIP artifacts
+ ```
+
+ The original batch artifacts remain immutable. Edited outputs are written separately under `supervised/`.
+
+ Validation performed in this pass:
+
+ - `python3 -m py_compile app.py pipeline_runner.py sample_extractor.py supervised_state.py supervised_export.py scripts/*.py`
+ - `node --check web/app.js`
+ - `python3 scripts/test_supervised_export_and_force_onset.py`
+ - `python3 scripts/test_sse_and_review_hits.py`
+ - `python3 scripts/test_interactive_supervision.py`
+ - `python3 scripts/test_api_job.py`
+
+ Next recommended pass after Pass 5:
+
+ 1. Add cluster merge/relabel/split workflows.
+ 2. Add cached feature-vector local reclustering around edited hits.
+ 3. Add edited-vs-original run comparison.
+ 4. Add browser-level UI tests and migrate the frontend to TypeScript/Vite once the UX stops shifting.
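The `quantize`/`subdivision` options accepted by the Pass 5 export endpoint imply snapping onsets to a metrical grid. A minimal sketch assuming a fixed tempo and a sixteenth-note grid for `subdivision=16` (the real export's grid semantics may differ):

```python
def quantize_onsets(onsets_sec, tempo_bpm, subdivision=16):
    """Snap onset times to the nearest grid line.

    With subdivision=16, the grid step is one sixteenth note:
    (60 / tempo_bpm) seconds per beat, divided by 4 steps per beat.
    The exact grid semantics here are an assumption.
    """
    step = (60.0 / tempo_bpm) / (subdivision / 4)
    return [round(t / step) * step for t in onsets_sec]


# 120 BPM gives a sixteenth-note grid step of 0.125 s.
print(quantize_onsets([0.01, 0.13, 0.26], tempo_bpm=120))  # [0.0, 0.125, 0.25]
```

Passing `"quantize": false` (or omitting it) would leave the edited reconstruction at the hits' detected times.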
docs/REMAINING_WORK.md CHANGED
@@ -54,3 +54,21 @@ Highest-priority remaining work now:
  5. **Suggestion diff preview**: show exact before/after membership changes before accepting a suggestion.
  6. **Constraint violation detection**: explicitly report conflicting user constraints.
  7. **Frontend tests and TypeScript migration**: harden the increasingly stateful UI.
+
+ ## Closed in Pass 5
+
+ - Supervised edited-state export now writes `supervised/manifest.json`, edited samples, edited MIDI, edited reconstruction WAV, and an edited ZIP.
+ - Suppressed hits are excluded from edited exports.
+ - Favorite/pinned representatives are honored by edited exports.
+ - Add-onset mode writes forced hit slices from `stem.wav`.
+ - Suppressed hits can be restored without undoing unrelated edits.
+ - Suggestions expose exact before/after diffs and the UI can preview them.
+
+ ## Current top remaining gaps
+
+ 1. Cluster merge/relabel/split workflows.
+ 2. Cached feature-vector local reclustering around edited hits.
+ 3. Edited-vs-original comparison view.
+ 4. Batch restore / bulk operations for suppressed hits.
+ 5. Browser-level UI tests and TypeScript/Vite hardening.
docs/SUPERVISED_EXPORT_AND_FORCE_ONSET.md ADDED
@@ -0,0 +1,110 @@
+ # Supervised export and force-onset workflow
+
+ Last updated: 2026-05-12
+
+ ## Purpose
+
+ The batch extraction manifest is immutable. Interactive edits are stored in `supervision_state.json`. This pass adds the missing rendering step that turns those semantic edits into new downloadable artifacts without rerunning Demucs or onset detection.
+
+ ```text
+ manifest.json + review/hits/*.wav + supervision_state.json
+ → supervised/manifest.json
+ → supervised/samples/*.wav
+ → supervised/reconstruction.mid
+ → supervised/reconstruction.wav
+ → supervised/sample-pack.zip
+ ```
+
+ ## Implemented features
+
+ | Feature | Status | Files |
+ |---|---:|---|
+ | Edited-state export | Implemented | `supervised_export.py`, `POST /api/jobs/{job_id}/export` |
+ | Suppressed-hit exclusion | Implemented | Export ignores hits with `suppressed=true`. |
+ | Favorite/pinned representative export | Implemented | Export honors `representative_hit_id` / favorite hits before quality scoring. |
+ | Force-onset from existing stem audio | Implemented | `POST /api/jobs/{job_id}/hits/force-onset` |
+ | Forced-hit review WAV writing | Implemented | `review/hits/hit_NNNNN_<label>_forced.wav` |
+ | Suppressed-hit restore | Implemented | `POST /api/jobs/{job_id}/hits/{hit_id}/restore` |
+ | Exact suggestion diff preview | Implemented | `suggestion.diff` in state responses and UI diff button. |
+ | UI add-onset mode | Implemented | Toggle in supervision header; waveform clicks add forced hits. |
+ | UI edited export downloads | Implemented | Edited ZIP/MIDI/reconstruction links render after export. |
+
+ ## Export behavior
+
+ The supervised export builds clusters from current semantic state:
+
+ 1. Skip suppressed hits.
+ 2. Load hit audio from each hit's `file` path under the job output directory.
+ 3. Sanitize cluster labels for output filenames.
+ 4. Preserve forced hits and moved/pulled hits through current cluster membership.
+ 5. Pick representatives from semantic `representative_hit_id` or favorite hits first.
+ 6. Quality-score representatives only for unpinned clusters.
+ 7. Write edited samples, MIDI, reconstruction WAV, ZIP, and `supervised/manifest.json`.
+ 8. Append a `supervised.exported` event and `latest_export` entry to `supervision_state.json`.
+
+ The original `manifest.json`, original `sample-pack.zip`, and original `samples/*.wav` are not modified.
+
+ ## Force-onset behavior
+
+ `POST /api/jobs/{job_id}/hits/force-onset` requires a completed run with `stem.wav`.
+
+ Body:
+
+ ```json
+ {
+   "onset_sec": 0.123,
+   "duration_ms": 160,
+   "target_cluster_id": "cluster:0",
+   "label": "snare"
+ }
+ ```
+
+ Rules:
+
+ - `onset_sec` is required.
+ - `duration_ms` is optional. If omitted, the system slices until the next active onset or a bounded default duration.
+ - `target_cluster_id` is optional. If omitted, a new user cluster is created.
+ - `label` is optional. If omitted, the current rule-based classifier labels the slice.
+ - A short fade-out is applied to avoid clicks.
+ - The forced hit is marked `source="forced"`, `explicit=true`, and `review_status="accepted"`.
+
+ ## Suggestion diffs
+
+ Open suggestions now include exact previews:
+
+ ```json
+ {
+   "type": "move-hits",
+   "affected_hit_count": 3,
+   "hits": [
+     {
+       "hit_id": "hit:00007",
+       "from_cluster_label": "bright_1",
+       "to_cluster_label": "snare_user_1",
+       "before_suppressed": false,
+       "after_suppressed": false
+     }
+   ],
+   "clusters_before": {},
+   "clusters_after": {}
+ }
+ ```
+
+ The frontend exposes this through the `Diff` button in the suggestion inbox.
+
+ ## Validation
+
+ Covered by:
+
+ ```bash
+ python3 scripts/test_supervised_export_and_force_onset.py
+ ```
+
+ This test verifies:
+
+ - suppression and restore,
+ - forced-hit creation and download,
+ - suggestion diff presence when suggestions exist,
+ - supervised export creation,
+ - artifact download URLs for edited ZIP/MIDI/reconstruction,
+ - latest export state metadata.
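The suggestion diff payload documented above can be assembled from simple before/after membership data. A minimal sketch (field names follow the documented example; the assembly logic and defaulted suppressed flags are assumptions):

```python
def build_suggestion_diff(suggestion_type, moves, clusters_before, clusters_after):
    """Assemble a diff preview for a move-hits style suggestion.

    `moves` is a list of (hit_id, from_label, to_label) tuples; the
    suppressed flags default to False here for brevity (an assumption).
    """
    return {
        "type": suggestion_type,
        "affected_hit_count": len(moves),
        "hits": [
            {
                "hit_id": hit_id,
                "from_cluster_label": src,
                "to_cluster_label": dst,
                "before_suppressed": False,
                "after_suppressed": False,
            }
            for hit_id, src, dst in moves
        ],
        # Hit counts per cluster label, before and after applying the suggestion.
        "clusters_before": clusters_before,
        "clusters_after": clusters_after,
    }


diff = build_suggestion_diff(
    "move-hits",
    [("hit:00007", "bright_1", "snare_user_1")],
    clusters_before={"bright_1": 4, "snare_user_1": 2},
    clusters_after={"bright_1": 3, "snare_user_1": 3},
)
print(diff["affected_hit_count"])  # 1
```

Computing the diff at suggestion time, rather than at accept time, is what lets the inbox preview a change without mutating state.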
docs/TASKS.md CHANGED
@@ -76,9 +76,10 @@ Last updated: 2026-05-12
  | Add suggestion inbox | Done/Partial | UI/API supports accept/reject; exact diff preview still open. |
  | Add cluster explanation drawer | Done | `GET /api/jobs/{job_id}/explain/cluster/{cluster_id}` plus UI drawer. |
  | Add semantic undo | Done | `POST /api/jobs/{job_id}/undo`. |
- | Add supervised export from edited state | Todo | Needed so corrections affect ZIP/MIDI/WAV outputs. |
- | Add click-to-add missed onset | Todo | Needed for `force-onset` constraints and direct onset correction. |
- | Add suppressed-hit restore | Todo | Needed as the safety counterpart to suppression. |
+ | Add supervised export from edited state | Done | `supervised_export.py`; `POST /api/jobs/{job_id}/export`; UI edited download links. |
+ | Add click-to-add missed onset | Done | Add-onset waveform mode creates forced hits from `stem.wav`. |
+ | Add suppressed-hit restore | Done | `POST /api/jobs/{job_id}/hits/{hit_id}/restore`; UI restore button. |
+ | Add exact suggestion diff previews | Done | Suggestions expose `diff`; UI has `Diff` preview. |
  | Add true local feature-neighborhood reclustering | Todo | Requires cached feature vectors and constraint-aware assignment. |

  ## Latest validation tasks
@@ -87,3 +88,4 @@ Last updated: 2026-05-12
  - [x] `node --check web/app.js`
  - [x] `python3 scripts/test_sse_and_review_hits.py`
  - [x] `python3 scripts/test_interactive_supervision.py`
+ - [x] `python3 scripts/test_supervised_export_and_force_onset.py`
docs/interactive-ux/ARCHITECTURE_NOTES.md CHANGED
@@ -14,7 +14,7 @@ Current interactive foundation:
  audio/cache → immutable manifest/artifacts → supervision_state.json → reactive UI → user constraints/events/suggestions
  ```
  
- The current implementation deliberately keeps the batch extraction artifacts immutable. Interactive edits mutate `supervision_state.json`, not the original `manifest.json`, hit WAVs, representative WAVs, MIDI, reconstruction, or ZIP. This keeps edits cheap and reversible, but supervised re-export is the next architectural step.
+ The current implementation keeps the batch extraction artifacts immutable. Interactive edits mutate `supervision_state.json`, then `supervised_export.py` can render those edits into a separate `supervised/` artifact tree. The original `manifest.json`, original sample WAVs, original MIDI/reconstruction, and original ZIP remain untouched.
  
  ## Implemented modules
  
@@ -22,8 +22,9 @@ The current implementation deliberately keeps the batch extraction artifacts imm
  |---|---|
  | `pipeline_runner.py` | Batch extraction, timing, manifests, review-hit WAV exports |
  | `sample_extractor.py` | Audio analysis, classification, batch/online clustering, export helpers |
- | `supervised_state.py` | Persistent semantic job state, constraints, events, confidence, suggestions, undo |
- | `app.py` | FastAPI endpoints for batch jobs and supervised state mutations |
+ | `supervised_state.py` | Persistent semantic job state, constraints, events, confidence, suggestions, force-onset, restore, undo |
+ | `app.py` | FastAPI endpoints for batch jobs, supervised state mutations, force-onset, restore, and edited export |
+ | `supervised_export.py` | Converts semantic state into edited WAV/MIDI/reconstruction/ZIP artifacts under `supervised/` |
  | `web/app.js` | Browser state rendering, review queue, cluster board, suggestions, actions |
  | `web/index.html` | Workstation layout and interactive supervision panel |
  | `web/styles.css` | Visual treatment for low confidence, suppression, locks, panels |
@@ -85,6 +86,8 @@ Each completed run now gets:
  ```text
  .runs/<job_id>/output/manifest.json
  .runs/<job_id>/output/supervision_state.json
+ .runs/<job_id>/output/supervised/manifest.json     # after edited export
+ .runs/<job_id>/output/supervised/sample-pack.zip   # after edited export
  ```
  
  `manifest.json` is the immutable batch result. `supervision_state.json` is the mutable, replayable semantic state.
@@ -97,6 +100,9 @@ POST /api/jobs/{job_id}/hits/{hit_id}/move
  POST /api/jobs/{job_id}/hits/{hit_id}/pull-out
  POST /api/jobs/{job_id}/hits/{hit_id}/suppress
  POST /api/jobs/{job_id}/hits/{hit_id}/review
+ POST /api/jobs/{job_id}/hits/{hit_id}/restore
+ POST /api/jobs/{job_id}/hits/force-onset
+ POST /api/jobs/{job_id}/export
  POST /api/jobs/{job_id}/clusters/{cluster_id}/lock
  GET /api/jobs/{job_id}/suggestions
  POST /api/jobs/{job_id}/suggestions/{suggestion_id}/accept
@@ -132,7 +138,7 @@ type Suggestion =
  | { type: "suppress-hits"; hit_ids: string[]; confidence: number; reason: string };
  ```
  
- Suggestion generation currently uses label, spectral centroid, and RMS-energy similarity. Accepted suggestions become explicit constraints/examples.
+ Suggestion generation currently uses label, spectral centroid, and RMS-energy similarity. Open suggestions include exact `diff` previews showing affected hits and cluster counts before/after. Accepted suggestions become explicit constraints/examples.
  
  ## Event log
  
@@ -151,6 +157,9 @@ suggestion.created
  suggestion.accepted
  suggestion.rejected
  state.undo
+ hit.force_onset
+ hit.restored
+ supervised.exported
  ```
  
  The UI renders recent events and constraints in the supervision panel.
@@ -167,15 +176,22 @@ semantic edit
  → recompute hit/cluster confidence and review queue
  ```
  
- Not implemented yet:
+ Still not implemented:
  
  ```text
  semantic edit
  → load cached feature vectors
  → choose affected neighborhood
  → run constrained local reclustering
- → create preview diff
- → optionally apply/re-export artifacts
+ → update suggestion/recluster preview from real feature margins
+ ```
+
+ Implemented now:
+
+ ```text
+ semantic edit
+ → export supervised state
+ → write edited WAV/MIDI/reconstruction/ZIP under supervised/
  ```
@@ -194,10 +210,9 @@ Implemented panels:
  
  Still missing:
  
- - edited export panel,
- - force-onset mode,
- - suppression restore UI,
- - side-by-side before/after diff preview.
+ - cluster merge/relabel/split controls,
+ - edited-vs-original comparison view,
+ - cached feature-neighborhood local reclustering.
  
  ## Implementation warning
  
docs/interactive-ux/FEASIBILITY_MATRIX.md CHANGED
@@ -98,3 +98,21 @@ That foundation should be implemented before adding higher-risk semantic-space o
  | 5 | Diff preview for suggestions | Makes batch suggestions safer and more trustworthy. |
  | 6 | Constraint violation detection | Prevents silent conflicts once constraints become richer. |
  | 7 | Browser tests | Protects the increasingly stateful UI from regressions. |
+
+
+ ## Pass 5 implementation status
+
+ Implemented after initial alignment:
+
+ - supervised edited-state export under `supervised/`,
+ - add-onset waveform mode backed by `POST /api/jobs/{job_id}/hits/force-onset`,
+ - suppressed-hit restore backed by `POST /api/jobs/{job_id}/hits/{hit_id}/restore`,
+ - exact suggestion diff previews in API state and UI,
+ - validation via `scripts/test_supervised_export_and_force_onset.py`.
+
+ Still open:
+
+ - cluster merge/relabel/split workflows,
+ - cached feature-vector local reclustering,
+ - edited-vs-original comparison,
+ - browser-level UI tests.
docs/interactive-ux/FEATURE_REQUIREMENTS.md CHANGED
@@ -36,10 +36,10 @@ Partially implemented:
  
  - Local recomputation is currently semantic-state recomputation, not full feature-neighborhood reclustering.
  - Suggestions are heuristic and preview-count based, not full diff previews.
- - Favorite/pin changes semantic representative but does not yet regenerate the sample pack.
+ - Favorite/pin changes semantic representative and supervised export honors it when generating the edited sample pack.
  - Confidence scoring is heuristic, not feature-margin/stability based.
  
- Not implemented yet:
+ Implemented in Pass 5 where noted; remaining items listed below:
  
  - Click-to-add missed onset.
  - Restore suppressed hit.
@@ -160,3 +160,21 @@ Status: **partial**. Constraints/events persist and can be reloaded. A dedicated
  ### NFR-006: No silent override of explicit user intent
  
  Status: **implemented in current semantic layer**. Explicit moves, locks, suppressions, and favorites persist unless undone or explicitly changed.
+
+
+ ## Pass 5 implementation status
+
+ Implemented after initial alignment:
+
+ - supervised edited-state export under `supervised/`,
+ - add-onset waveform mode backed by `POST /api/jobs/{job_id}/hits/force-onset`,
+ - suppressed-hit restore backed by `POST /api/jobs/{job_id}/hits/{hit_id}/restore`,
+ - exact suggestion diff previews in API state and UI,
+ - validation via `scripts/test_supervised_export_and_force_onset.py`.
+
+ Still open:
+
+ - cluster merge/relabel/split workflows,
+ - cached feature-vector local reclustering,
+ - edited-vs-original comparison,
+ - browser-level UI tests.
docs/interactive-ux/PROGRESS.md CHANGED
@@ -85,3 +85,10 @@ analyze audio
  ```
  
  The remaining missing piece is that edited semantic state is not yet reflected in a regenerated sample pack.
+
+
+ ## Pass 5 alignment
+
+ The interactive UX docs are now aligned with the implemented semantic edit/export loop. The project supports move, pull-out, suppress, restore, favorite, lock, force-onset, suggestion diff preview, undo, and edited artifact export. The current boundary is no longer “semantic only”; edits can now produce separate supervised WAV/MIDI/reconstruction/ZIP artifacts while original batch outputs remain immutable.
+
+ Remaining UX work is concentrated around cluster-level editing and comparison: merge/relabel/split, feature-vector local reclustering, edited-vs-original diff views, and browser tests.
docs/interactive-ux/README.md CHANGED
@@ -57,3 +57,21 @@ Next implementation should close the gap between semantic edits and artifact out
  6. Add browser-level tests for the interactive supervision panel.
  
  The strongest technical foundation remains constraint-aware clustering plus uncertainty-driven review. The current pass implements the persistent state and UI/API shell needed for that foundation.
+
+
+ ## Pass 5 implementation status
+
+ Implemented after initial alignment:
+
+ - supervised edited-state export under `supervised/`,
+ - add-onset waveform mode backed by `POST /api/jobs/{job_id}/hits/force-onset`,
+ - suppressed-hit restore backed by `POST /api/jobs/{job_id}/hits/{hit_id}/restore`,
+ - exact suggestion diff previews in API state and UI,
+ - validation via `scripts/test_supervised_export_and_force_onset.py`.
+
+ Still open:
+
+ - cluster merge/relabel/split workflows,
+ - cached feature-vector local reclustering,
+ - edited-vs-original comparison,
+ - browser-level UI tests.
docs/interactive-ux/SCOPE.md CHANGED
@@ -171,3 +171,21 @@ Not achieved yet:
  → edited state can be exported reproducibly as updated WAV/MIDI/ZIP artifacts
  → local feature reclustering updates related hits without manually accepting suggestions
  ```
+
+
+ ## Pass 5 implementation status
+
+ Implemented after initial alignment:
+
+ - supervised edited-state export under `supervised/`,
+ - add-onset waveform mode backed by `POST /api/jobs/{job_id}/hits/force-onset`,
+ - suppressed-hit restore backed by `POST /api/jobs/{job_id}/hits/{hit_id}/restore`,
+ - exact suggestion diff previews in API state and UI,
+ - validation via `scripts/test_supervised_export_and_force_onset.py`.
+
+ Still open:
+
+ - cluster merge/relabel/split workflows,
+ - cached feature-vector local reclustering,
+ - edited-vs-original comparison,
+ - browser-level UI tests.
docs/interactive-ux/TASKS.md CHANGED
@@ -102,4 +102,20 @@ Start with `UX-401` plus supervised export.
  
  Reason:
  
- The project now has a replayable state/events/constraints foundation. The largest UX gap is that semantic edits do not yet regenerate edited artifacts. Force-onset is the next direct correction primitive after move/pull/lock/suppress.
+ The project now has a replayable state/events/constraints foundation plus supervised edited artifact export. The largest UX gaps are cluster merge/relabel/split, cached feature-neighborhood reclustering, and edited-vs-original comparison.
+
+
+ ## Pass 5 completed
+
+ - [x] Render semantic edits into `supervised/` artifacts.
+ - [x] Add force-onset endpoint and waveform add-onset mode.
+ - [x] Add suppressed-hit restore endpoint and button.
+ - [x] Add exact suggestion diff previews.
+ - [x] Add validation script for supervised export and force-onset.
+
+ ## Next interactive tasks
+
+ - [ ] Cluster merge/relabel/split controls.
+ - [ ] Cached feature-vector local reclustering.
+ - [ ] Edited-vs-original comparison view.
+ - [ ] Browser tests for waveform add-onset and edited export.
scripts/test_supervised_export_and_force_onset.py ADDED
@@ -0,0 +1,113 @@
+ #!/usr/bin/env python3
+ """Smoke-test force-onset, restore, suggestion diffs, and supervised export."""
+
+ from __future__ import annotations
+
+ import io
+ import json
+ import sys
+ import time
+ from pathlib import Path
+ from urllib.parse import quote
+
+ import soundfile as sf
+ from fastapi.testclient import TestClient
+
+ sys.path.insert(0, str(Path(__file__).resolve().parents[1]))
+
+ from app import app  # noqa: E402
+ from synth_generator import generate_test_song  # noqa: E402
+
+
+ def wait_for_job(client: TestClient, job_id: str) -> dict:
+     for _ in range(100):
+         payload = client.get(f"/api/jobs/{job_id}").json()
+         if payload["status"] in {"complete", "error"}:
+             return payload
+         time.sleep(0.12)
+     raise TimeoutError(job_id)
+
+
+ def post_json(client: TestClient, path: str, body: dict | None = None) -> dict:
+     response = client.post(path, json=body or {})
+     response.raise_for_status()
+     return response.json()
+
+
+ def main() -> int:
+     song = generate_test_song(pattern_name="funk", bars=1, bpm=124, add_bass=False)
+     buf = io.BytesIO()
+     sf.write(buf, song.drums_only, song.sr, format="WAV")
+     buf.seek(0)
+
+     client = TestClient(app)
+     response = client.post(
+         "/api/jobs",
+         files={"file": ("supervised.wav", buf, "audio/wav")},
+         data={"params": json.dumps({"stem": "all", "clustering_mode": "online_preview", "target_min": 3, "target_max": 10})},
+     )
+     response.raise_for_status()
+     job_id = response.json()["id"]
+     job = wait_for_job(client, job_id)
+     assert job["status"] == "complete", job.get("error")
+
+     state = client.get(f"/api/jobs/{job_id}/state").json()
+     assert state["summary"]["hit_count"] > 0
+     assert state["summary"]["cluster_count"] > 0
+     first_hit = state["hits"][0]
+     first_cluster = state["clusters"][0]
+
+     # Suppression is reversible.
+     q_hit = quote(first_hit["id"], safe="")
+     state = post_json(client, f"/api/jobs/{job_id}/hits/{q_hit}/suppress", {"reason": "bleed"})
+     assert state["summary"]["suppressed_hit_count"] >= 1
+     state = post_json(client, f"/api/jobs/{job_id}/hits/{q_hit}/restore", {})
+     assert state["summary"]["suppressed_hit_count"] == 0
+
+     # Forced onset writes a review WAV and appears in semantic state.
+     state = post_json(
+         client,
+         f"/api/jobs/{job_id}/hits/force-onset",
+         {"onset_sec": 0.123, "target_cluster_id": first_cluster["id"], "duration_ms": 160},
+     )
+     forced = [hit for hit in state["hits"] if hit.get("source") == "forced"]
+     assert forced, "expected forced hit"
+     forced_url = forced[-1]["url"]
+     forced_response = client.get(forced_url)
+     forced_response.raise_for_status()
+     assert forced_response.content[:4] == b"RIFF"
+
+     # Suggestion diffs are exact previews when suggestions exist.
+     suggestions = state.get("suggestions", [])
+     if suggestions:
+         assert "diff" in suggestions[0]
+         assert "affected_hit_count" in suggestions[0]["diff"]
+
+     # Supervised export excludes suppressed hits, includes forced hit, and writes downloadable artifacts.
+     export_response = post_json(client, f"/api/jobs/{job_id}/export", {"synthesize": True})
+     export = export_response["export"]
+     assert export["kind"] == "supervised-export"
+     assert export["hit_count"] == state["summary"]["hit_count"] - state["summary"].get("suppressed_hit_count", 0)
+     assert export["cluster_count"] >= 1
+     for key in ["archive", "midi", "reconstruction"]:
+         url = export["file_urls"][key]
+         file_response = client.get(url)
+         file_response.raise_for_status()
+         assert file_response.content, key
+
+     refreshed = export_response["state"]
+     assert refreshed["summary"]["latest_export"]["path"] == "supervised/manifest.json"
+
+     print(json.dumps({
+         "status": "ok",
+         "job_id": job_id,
+         "forced_hit_count": refreshed["summary"].get("forced_hit_count"),
+         "export_hit_count": export["hit_count"],
+         "export_cluster_count": export["cluster_count"],
+         "archive": export["files"]["archive"],
+     }, indent=2))
+     return 0
+
+
+ if __name__ == "__main__":
+     raise SystemExit(main())
supervised_export.py ADDED
@@ -0,0 +1,261 @@
+ #!/usr/bin/env python3
+ """Render supervised semantic state into edited sample-pack artifacts.
+
+ The batch manifest remains immutable. This module takes the mutable
+ ``supervision_state.json`` layer, excludes suppressed hits, honors explicit
+ representatives/favorites, and writes a separate ``supervised/`` export tree.
+ """
+
+ from __future__ import annotations
+
+ import json
+ import os
+ import re
+ import shutil
+ import time
+ from dataclasses import asdict
+ from pathlib import Path
+ from typing import Any
+
+ import numpy as np
+ import soundfile as sf
+
+ from sample_extractor import (
+     Cluster,
+     Hit,
+     build_archive,
+     export_midi,
+     render_midi_with_samples,
+     sample_quality_score,
+     select_best,
+     synthesize_from_cluster,
+ )
+ from supervised_state import load_manifest, load_or_create_state, now, recompute_scores
+
+
+ def _safe_label(value: Any, fallback: str) -> str:
+     text = str(value or fallback).strip() or fallback
+     text = re.sub(r"[^A-Za-z0-9._-]+", "_", text)
+     text = re.sub(r"_+", "_", text).strip("._-")
+     return text or fallback
+
+
+ def _read_hit_audio(output_dir: Path, hit: dict[str, Any]) -> tuple[np.ndarray, int]:
+     rel = hit.get("file")
+     if not rel:
+         raise FileNotFoundError(f"Hit {hit.get('id')} does not have a file path")
+     path = (output_dir / rel).resolve()
+     path.relative_to(output_dir.resolve())
+     if not path.exists():
+         raise FileNotFoundError(f"Hit audio missing for {hit.get('id')}: {rel}")
+     audio, sr = sf.read(path, dtype="float32", always_2d=False)
+     if audio.ndim > 1:
+         audio = audio.mean(axis=1)
+     return np.asarray(audio, dtype=np.float32), int(sr)
+
+
+ def _state_to_clusters(output_dir: Path, state: dict[str, Any]) -> list[Cluster]:
+     hits_by_id = state.get("hits", {})
+     clusters: list[Cluster] = []
+     used_labels: set[str] = set()
+
+     for ordinal, raw_cluster in enumerate(state.get("clusters", {}).values()):
+         active_hit_ids = [
+             hid
+             for hid in raw_cluster.get("hit_ids", [])
+             if hid in hits_by_id and not hits_by_id[hid].get("suppressed")
+         ]
+         if not active_hit_ids:
+             continue
+
+         label = _safe_label(raw_cluster.get("label"), f"cluster_{ordinal}")
+         base = label
+         suffix = 1
+         while label in used_labels:
+             suffix += 1
+             label = f"{base}_{suffix}"
+         used_labels.add(label)
+
+         converted_hits: list[Hit] = []
+         for hid in active_hit_ids:
+             raw_hit = hits_by_id[hid]
+             audio, sr = _read_hit_audio(output_dir, raw_hit)
+             converted_hits.append(
+                 Hit(
+                     audio=audio,
+                     sr=sr,
+                     onset_time=float(raw_hit.get("onset_sec") or 0.0),
+                     duration=float(raw_hit.get("duration_ms") or 0.0) / 1000.0 or (len(audio) / max(sr, 1)),
+                     index=int(raw_hit.get("index") or 0),
+                     rms_energy=float(raw_hit.get("rms_energy") or 0.0),
+                     spectral_centroid=float(raw_hit.get("spectral_centroid_hz") or 0.0),
+                     label=str(raw_hit.get("label") or raw_cluster.get("classification") or "other"),
+                     cluster_id=ordinal,
+                 )
+             )
+
+         cluster = Cluster(cluster_id=ordinal, label=label, hits=converted_hits)
+         representative = raw_cluster.get("representative_hit_id")
+         pinned = False
+         if representative in active_hit_ids:
+             cluster.best_hit_idx = active_hit_ids.index(representative)
+             pinned = True
+         else:
+             favorite_idx = next((i for i, hid in enumerate(active_hit_ids) if hits_by_id[hid].get("favorite")), None)
+             if favorite_idx is not None:
+                 cluster.best_hit_idx = favorite_idx
+                 pinned = True
+         setattr(cluster, "_pinned_by_state", pinned)
+         clusters.append(cluster)
+
+     # Score representatives only where the supervision state did not pin one.
+     for cluster in clusters:
+         if cluster.count <= 1:
+             cluster.best_hit_idx = 0
+     unpinned = [cluster for cluster in clusters if cluster.count > 1 and not getattr(cluster, "_pinned_by_state", False)]
+     if unpinned:
+         select_best(unpinned)
+     return clusters
+
+
+ def _write_audio(path: Path, audio: np.ndarray, sr: int, subtype: str = "PCM_16") -> None:
+     path.parent.mkdir(parents=True, exist_ok=True)
+     sf.write(path, audio, sr, subtype=subtype)
+
+
+ def export_supervised_state(
+     output_dir: str | os.PathLike[str],
+     job_id: str,
+     *,
+     synthesize: bool = True,
+     quantize: bool | None = None,
+     subdivision: int | None = None,
+ ) -> dict[str, Any]:
+     """Create edited artifacts from ``supervision_state.json``.
+
+     Returns the JSON manifest written to ``supervised/manifest.json``.
+     """
+     out = Path(output_dir)
+     manifest = load_manifest(out)
+     state = load_or_create_state(job_id, out)
+     recompute_scores(state)
+
+     export_dir = out / "supervised"
+     if export_dir.exists():
+         shutil.rmtree(export_dir)
+     samples_dir = export_dir / "samples"
+     samples_dir.mkdir(parents=True, exist_ok=True)
+
+     clusters = _state_to_clusters(out, state)
+     bpm = float(manifest.get("bpm") or 120.0)
+     sr = int(manifest.get("sample_rate") or 44100)
+     params = manifest.get("params") or {}
+     if quantize is None:
+         quantize = bool(params.get("quantize_midi", True))
+     if subdivision is None:
+         subdivision = int(params.get("subdivision", 16))
+
+     started = time.perf_counter()
+     files: dict[str, str] = {}
+     samples: list[dict[str, Any]] = []
+
+     midi_path = export_dir / "reconstruction.mid"
+     if clusters:
+         export_midi(clusters, str(midi_path), bpm=bpm, quantize=quantize, subdivision=int(subdivision))
+         rendered = render_midi_with_samples(clusters, sr=sr)
+         if synthesize:
+             for cluster in clusters:
+                 if cluster.count >= 2:
+                     cluster.synthesized = synthesize_from_cluster(cluster)
+     else:
+         midi_path.write_bytes(b"")
+         rendered = np.zeros(sr, dtype=np.float32)
+
+     _write_audio(export_dir / "reconstruction.wav", rendered, sr, subtype="PCM_16")
+     files["midi"] = "supervised/reconstruction.mid"
+     files["reconstruction"] = "supervised/reconstruction.wav"
+
+     for cluster in sorted(clusters, key=lambda item: item.count, reverse=True):
+         best = cluster.best_hit
+         sample_file = f"supervised/samples/{cluster.label}.wav"
+         best.save(str(out / sample_file))
+         quality = sample_quality_score(best.audio, best.sr, cluster.label.rsplit("_", 1)[0])
+         samples.append(
+             {
+                 "label": cluster.label,
+                 "classification": cluster.label.rsplit("_", 1)[0],
+                 "hits": int(cluster.count),
+                 "midi_note": int(cluster.midi_note),
+                 "score": round(float(quality["total"]), 2),
+                 "cleanness": round(float(quality["cleanness"]), 4),
+                 "completeness": round(float(quality["completeness"]), 4),
+                 "duration_ms": round(float(best.duration * 1000), 1),
+                 "first_onset_sec": round(float(min(hit.onset_time for hit in cluster.hits)), 4),
+                 "file": sample_file,
+             }
+         )
+         if cluster.synthesized is not None:
+             _write_audio(out / f"supervised/samples/{cluster.label}__synth.wav", cluster.synthesized, sr, subtype="PCM_24")
+
+     archive_tmp = build_archive(clusters, bpm, sr, midi_path=str(midi_path), rendered_audio=rendered)
+     archive_rel = "supervised/sample-pack.zip"
+     shutil.copyfile(archive_tmp, out / archive_rel)
+     try:
+         os.unlink(archive_tmp)
+     except OSError:
+         pass
+     files["archive"] = archive_rel
+
+     active_hits = [hit for hit in state.get("hits", {}).values() if not hit.get("suppressed")]
+     export_manifest = {
+         "kind": "supervised-export",
+         "job_id": job_id,
+         "created_at": now(),
+         "duration_sec": round(time.perf_counter() - started, 6),
+         "source_manifest_fingerprint": state.get("manifest_fingerprint"),
+         "state_updated_at": state.get("updated_at"),
+         "bpm": bpm,
+         "sample_rate": sr,
+         "hit_count": len(active_hits),
+         "suppressed_hit_count": sum(1 for hit in state.get("hits", {}).values() if hit.get("suppressed")),
+         "cluster_count": len(clusters),
+         "quantize_midi": bool(quantize),
+         "subdivision": int(subdivision),
+         "samples": samples,
+         "files": files,
+         "state_summary": {
+             "constraint_count": len(state.get("constraints", [])),
+             "event_count": len(state.get("events", [])),
+             "open_suggestion_count": len([s for s in state.get("suggestions", []) if s.get("status") == "open"]),
+         },
+     }
+     (export_dir / "manifest.json").write_text(json.dumps(export_manifest, indent=2, sort_keys=True), encoding="utf-8")
+
+     state.setdefault("exports", []).append(
+         {
+             "created_at": export_manifest["created_at"],
+             "path": "supervised/manifest.json",
+             "hit_count": export_manifest["hit_count"],
+             "cluster_count": export_manifest["cluster_count"],
+             "suppressed_hit_count": export_manifest["suppressed_hit_count"],
+         }
+     )
+     state["latest_export"] = state["exports"][-1]
+     state.setdefault("events", []).append(
+         {
+             "id": f"event:export:{int(export_manifest['created_at'] * 1000)}",
+             "type": "supervised.exported",
+             "source": "system",
+             "created_at": export_manifest["created_at"],
+             "payload": {
+                 "hit_count": export_manifest["hit_count"],
+                 "cluster_count": export_manifest["cluster_count"],
+                 "archive": archive_rel,
+             },
+         }
+     )
+     state_path = out / "supervision_state.json"
+     state["updated_at"] = now()
+     state_path.write_text(json.dumps(state, indent=2, sort_keys=True), encoding="utf-8")
+
+     return export_manifest
supervised_state.py CHANGED
```diff
@@ -5,9 +5,9 @@ The extraction pipeline produces immutable audio artifacts and a batch manifest.
 This module layers replayable semantic state on top of that manifest: hits,
 clusters, constraints, events, suggestions, confidence, and undo snapshots.
 
-The first implementation intentionally avoids rewriting audio artifacts. It makes
-supervised edits cheap, explicit, inspectable, and reproducible, then leaves
-artifact re-export as a later step.
+Supervised edits are cheap, explicit, inspectable, and reproducible. A
+separate supervised export step renders the mutable state into edited WAV/MIDI/ZIP
+artifacts without mutating the original batch manifest.
 """
 
 from __future__ import annotations
@@ -70,6 +70,21 @@ def _safe_float(value: Any, default: float = 0.0) -> float:
         pass
     return default
 
+
+def _safe_int(value: Any, default: int = 0) -> int:
+    try:
+        return int(value)
+    except Exception:
+        return default
+
+
+def _safe_file_component(value: str) -> str:
+    import re
+
+    text = str(value or "hit").strip().lower()
+    text = re.sub(r"[^a-z0-9._-]+", "_", text)
+    text = re.sub(r"_+", "_", text).strip("._-")
+    return text or "hit"
+
 
 def _snapshot(state: dict[str, Any]) -> dict[str, Any]:
     snap = copy.deepcopy(state)
@@ -367,6 +382,77 @@ def _find_similar_hits(state: dict[str, Any], hit_id: str, *, exclude_cluster: s
     return scored[:limit]
 
 
+def suggestion_diff(state: dict[str, Any], suggestion: dict[str, Any]) -> dict[str, Any]:
+    """Build an exact before/after preview for a suggestion against current state."""
+    hits = state.get("hits", {})
+    clusters = state.get("clusters", {})
+    stype = suggestion.get("type")
+    hit_ids = [hid for hid in suggestion.get("hit_ids", []) if hid in hits]
+
+    def cluster_snapshot(cluster_id: str | None) -> dict[str, Any]:
+        cluster = clusters.get(cluster_id or "", {})
+        members = [hid for hid in cluster.get("hit_ids", []) if hid in hits]
+        active = [hid for hid in members if not hits[hid].get("suppressed")]
+        return {
+            "cluster_id": cluster_id,
+            "label": cluster.get("label", cluster_id),
+            "active_count": len(active),
+            "total_count": len(members),
+            "suppressed_count": sum(1 for hid in members if hits[hid].get("suppressed")),
+        }
+
+    rows = []
+    cluster_ids: set[str] = set()
+    for hid in hit_ids:
+        hit = hits[hid]
+        source_cluster_id = hit.get("cluster_id")
+        target_cluster_id = suggestion.get("target_cluster_id") if stype in {"move-hits", "split-hits"} else source_cluster_id
+        cluster_ids.add(str(source_cluster_id))
+        if target_cluster_id:
+            cluster_ids.add(str(target_cluster_id))
+        rows.append(
+            {
+                "hit_id": hid,
+                "hit_index": hit.get("index"),
+                "label": hit.get("label"),
+                "from_cluster_id": source_cluster_id,
+                "from_cluster_label": clusters.get(source_cluster_id, {}).get("label"),
+                "to_cluster_id": target_cluster_id,
+                "to_cluster_label": clusters.get(target_cluster_id, {}).get("label") if target_cluster_id else None,
+                "before_suppressed": bool(hit.get("suppressed")),
+                "after_suppressed": bool(hit.get("suppressed")) or stype == "suppress-hits",
+                "confidence": hit.get("confidence"),
+            }
+        )
+
+    before = {cid: cluster_snapshot(cid) for cid in sorted(cluster_ids)}
+    after = copy.deepcopy(before)
+    if stype in {"move-hits", "split-hits"}:
+        target = suggestion.get("target_cluster_id")
+        for row in rows:
+            source = row.get("from_cluster_id")
+            if source in after and source != target:
+                after[source]["active_count"] = max(0, after[source]["active_count"] - 1)
+                after[source]["total_count"] = max(0, after[source]["total_count"] - 1)
+            if target in after and source != target:
+                after[target]["active_count"] += 1
+                after[target]["total_count"] += 1
+    elif stype == "suppress-hits":
+        for row in rows:
+            source = row.get("from_cluster_id")
+            if source in after and not row.get("before_suppressed"):
+                after[source]["active_count"] = max(0, after[source]["active_count"] - 1)
+                after[source]["suppressed_count"] += 1
+
+    return {
+        "type": stype,
+        "affected_hit_count": len(rows),
+        "hits": rows,
+        "clusters_before": before,
+        "clusters_after": after,
+    }
+
+
 def _add_suggestion(state: dict[str, Any], suggestion_type: str, payload: dict[str, Any], confidence: float, reason: str) -> dict[str, Any]:
     suggestion = {
         "id": _new_id("suggestion"),
@@ -377,6 +463,7 @@ def _add_suggestion(state: dict[str, Any], suggestion_type: str, payload: dict[s
         "reason": reason,
         **payload,
     }
+    suggestion["diff"] = suggestion_diff(state, suggestion)
     state.setdefault("suggestions", []).append(suggestion)
     _event(state, "suggestion.created", {"suggestion_id": suggestion["id"], "type": suggestion_type, "reason": reason})
     return suggestion
@@ -528,6 +615,143 @@ def suppress_hit(output_dir: str | Path, job_id: str, hit_id: str, reason: str =
     return _write_state(output_dir, state)
 
 
+def restore_hit(output_dir: str | Path, job_id: str, hit_id: str, source: str = "user") -> dict[str, Any]:
+    state = load_or_create_state(job_id, output_dir)
+    hits = state.get("hits", {})
+    if hit_id not in hits:
+        raise KeyError(f"Unknown hit: {hit_id}")
+    _push_undo(state)
+    hit = hits[hit_id]
+    hit["suppressed"] = False
+    hit["review_status"] = "unreviewed" if hit.get("review_status") == "suppressed" else hit.get("review_status", "unreviewed")
+    hit["explicit"] = True
+    _constraint(state, "restore-hit", {"hit_id": hit_id}, source=source)
+    _event(state, "hit.restored", {"hit_id": hit_id}, source=source)
+    recompute_scores(state)
+    return _write_state(output_dir, state)
+
+
+def force_onset(
+    output_dir: str | Path,
+    job_id: str,
+    onset_sec: float,
+    *,
+    duration_ms: float | None = None,
+    label: str | None = None,
+    target_cluster_id: str | None = None,
+    pre_pad_sec: float = 0.003,
+    source: str = "user",
+) -> dict[str, Any]:
+    """Create a user-forced hit from ``stem.wav`` and add it to semantic state."""
+    import librosa
+    import numpy as np
+    import soundfile as sf
+
+    from sample_extractor import Hit as AudioHit, classify_hit
+
+    out = Path(output_dir)
+    stem_path = out / "stem.wav"
+    if not stem_path.exists():
+        raise FileNotFoundError("stem.wav is required before forcing onsets")
+
+    state = load_or_create_state(job_id, out)
+    hits = state.setdefault("hits", {})
+    clusters = state.setdefault("clusters", {})
+    onset = max(0.0, _safe_float(onset_sec))
+
+    audio, sr = sf.read(stem_path, dtype="float32", always_2d=False)
+    if audio.ndim > 1:
+        audio = audio.mean(axis=1)
+    audio = np.asarray(audio, dtype=np.float32)
+    duration = (_safe_float(duration_ms, 0.0) / 1000.0) if duration_ms else 0.0
+    if duration <= 0:
+        future_onsets = sorted(
+            _safe_float(hit.get("onset_sec"))
+            for hit in hits.values()
+            if not hit.get("suppressed") and _safe_float(hit.get("onset_sec")) > onset + 0.01
+        )
+        next_onset = future_onsets[0] if future_onsets else None
+        duration = min(1.5, max(0.08, (next_onset - onset) if next_onset is not None else 0.45))
+    duration = max(0.02, min(10.0, duration))
+
+    start = max(0, int((onset - max(0.0, pre_pad_sec)) * sr))
+    end = min(len(audio), start + int(duration * sr))
+    if end <= start:
+        raise ValueError("Forced onset is outside the available stem audio")
+    segment = audio[start:end].copy()
+    fade_len = min(int(0.003 * sr), len(segment) // 4)
+    if fade_len > 0:
+        segment[-fade_len:] *= np.linspace(1, 0, fade_len)
+    rms = float(np.sqrt(np.mean(segment**2))) if len(segment) else 0.0
+    spectral_centroid = float(librosa.feature.spectral_centroid(y=segment, sr=sr).mean()) if len(segment) >= 32 else 0.0
+
+    index = max((_safe_int(hit.get("index"), -1) for hit in hits.values()), default=-1) + 1
+    tmp_hit = AudioHit(audio=segment, sr=sr, onset_time=onset, duration=len(segment) / sr, index=index, rms_energy=rms, spectral_centroid=spectral_centroid)
+    inferred_label = label or classify_hit(tmp_hit)
+    tmp_hit.label = inferred_label
+    hit_id = _hit_id({"index": index})
+    safe_label = _safe_file_component(inferred_label or "forced")
+    rel_file = f"review/hits/hit_{index:05d}_{safe_label}_forced.wav"
+    full_path = out / rel_file
+    full_path.parent.mkdir(parents=True, exist_ok=True)
+    sf.write(full_path, segment, sr, subtype="PCM_24")
+
+    _push_undo(state)
+    if target_cluster_id and target_cluster_id not in clusters:
+        raise KeyError(f"Unknown cluster: {target_cluster_id}")
+    if not target_cluster_id:
+        state.setdefault("counters", {})["user_clusters"] = int(state.get("counters", {}).get("user_clusters", 0)) + 1
+        target_cluster_id = _new_id("cluster:user")
+        cluster_label = f"{_safe_file_component(inferred_label)}_forced_{state['counters']['user_clusters']}"
+        clusters[target_cluster_id] = {
+            "id": target_cluster_id,
+            "label": cluster_label,
+            "classification": _base_label(cluster_label),
+            "hit_ids": [],
+            "representative_hit_id": hit_id,
+            "locked": False,
+            "user_named": bool(label),
+            "confidence": 0.0,
+            "confidence_reasons": [],
+            "suppressed_count": 0,
+            "original_id": None,
+        }
+    cluster_label = clusters[target_cluster_id].get("label", target_cluster_id)
+    hits[hit_id] = {
+        "id": hit_id,
+        "index": index,
+        "label": str(inferred_label or "other"),
+        "cluster_id": target_cluster_id,
+        "original_cluster_id": None,
+        "cluster_label": cluster_label,
+        "onset_sec": round(onset, 6),
+        "duration_ms": round((len(segment) / sr) * 1000.0, 1),
+        "rms_energy": round(rms, 6),
+        "spectral_centroid_hz": round(spectral_centroid, 1),
+        "file": rel_file,
+        "is_representative": False,
+        "source": "forced",
+        "suppressed": False,
+        "favorite": False,
+        "review_status": "accepted",
+        "confidence": 0.0,
+        "confidence_reasons": [],
+        "explicit": True,
+    }
+    clusters[target_cluster_id].setdefault("hit_ids", [])
+    if hit_id not in clusters[target_cluster_id]["hit_ids"]:
+        clusters[target_cluster_id]["hit_ids"].append(hit_id)
+    if not clusters[target_cluster_id].get("representative_hit_id"):
+        clusters[target_cluster_id]["representative_hit_id"] = hit_id
+
+    _constraint(state, "force-onset", {"hit_id": hit_id, "onset_sec": round(onset, 6)}, source=source)
+    _constraint(state, "force-cluster", {"hit_id": hit_id, "cluster_id": target_cluster_id}, source=source)
+    _event(state, "hit.force_onset", {"hit_id": hit_id, "onset_sec": round(onset, 6), "cluster_id": target_cluster_id}, source=source)
+    _rebuild_cluster_labels(state)
+    recompute_scores(state)
+    return _write_state(out, state)
+
+
 def set_hit_review_status(output_dir: str | Path, job_id: str, hit_id: str, status: str = "accepted", source: str = "user") -> dict[str, Any]:
     if status not in {"unreviewed", "accepted", "favorite"}:
         raise ValueError("status must be unreviewed, accepted, or favorite")
@@ -649,8 +873,13 @@ def public_state(state: dict[str, Any], url_for: Callable[[str], str] | None = N
             hit["url"] = url_for(hit["file"])
     clusters.sort(key=lambda c: (-len(c.get("hit_ids", [])), c.get("label", "")))
     hits.sort(key=lambda h: h.get("index", 0))
-    open_suggestions = [s for s in state.get("suggestions", []) if s.get("status") == "open"]
+    open_suggestions = [copy.deepcopy(s) for s in state.get("suggestions", []) if s.get("status") == "open"]
+    for suggestion in open_suggestions:
+        suggestion["diff"] = suggestion.get("diff") or suggestion_diff(state, suggestion)
     open_suggestions.sort(key=lambda s: (-_safe_float(s.get("confidence")), s.get("created_at", 0)))
+    latest_export = copy.deepcopy(state.get("latest_export"))
+    if latest_export and url_for and latest_export.get("path"):
+        latest_export["url"] = url_for(latest_export["path"])
     return {
         "version": state.get("version"),
         "job_id": state.get("job_id"),
@@ -665,6 +894,8 @@ def public_state(state: dict[str, Any], url_for: Callable[[str], str] | None = N
             "suppressed_hit_count": sum(1 for h in hits if h.get("suppressed")),
             "locked_cluster_count": sum(1 for c in clusters if c.get("locked")),
             "undo_available": bool(state.get("undo_stack")),
+            "forced_hit_count": sum(1 for h in hits if h.get("source") == "forced"),
+            "latest_export": latest_export,
         },
        "hits": hits,
        "clusters": clusters,
```
web/app.js CHANGED
@@ -15,6 +15,7 @@ let lastResult = null;
15
  let lastSupervisionState = null;
16
  let activeJobId = null;
17
  let selectedHitIndex = null;
 
18
 
19
  function esc(value) {
20
  return String(value ?? "").replace(/[&<>'"]/g, (c) => ({ "&": "&amp;", "<": "&lt;", ">": "&gt;", "'": "&#39;", '"': "&quot;" }[c]));
@@ -90,16 +91,20 @@ function currentTargetCluster() {
90
  function setActionButtons() {
91
  const hasState = Boolean(activeJobId && lastSupervisionState);
92
  const hasHit = hasState && selectedHitIndex !== null;
93
- for (const id of ["moveHitButton", "pullHitButton", "acceptHitButton", "favoriteHitButton", "suppressHitButton"]) {
94
  const button = $(id);
95
  if (button) button.disabled = !hasHit;
96
  }
97
- for (const id of ["refreshStateButton", "undoButton", "lockClusterButton", "explainClusterButton"]) {
98
  const button = $(id);
99
  if (button) button.disabled = !hasState;
100
  }
101
  const target = currentTargetCluster();
102
  if ($("lockClusterButton")) $("lockClusterButton").textContent = target?.locked ? "Unlock target cluster" : "Lock target cluster";
 
 
 
 
103
  if ($("undoButton") && lastSupervisionState) $("undoButton").disabled = !lastSupervisionState.summary?.undo_available;
104
  }
105
 
@@ -203,6 +208,17 @@ function drawWaveform(overview) {
203
  ctx.lineTo(x, selected ? h - 3 : h - 10);
204
  ctx.stroke();
205
  }
 
 
 
 
 
 
 
 
 
 
 
206
  }
207
 
208
  function playAudio(el, url) {
@@ -213,9 +229,27 @@ function playAudio(el, url) {
213
  if (promise && typeof promise.catch === "function") promise.catch(() => {});
214
  }
215
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
216
  function selectHit(index, shouldPlay = true) {
217
- if (!lastResult) return;
218
- const rawHit = (lastResult.hits ?? []).find((item) => Number(item.index) === Number(index));
219
  if (!rawHit) return;
220
  const hit = decorateHit(rawHit);
221
  selectedHitIndex = hit.index;
@@ -260,7 +294,12 @@ function renderSamples(result) {
260
 
261
  function renderHits(result) {
262
  const tbody = $("hitsTable").querySelector("tbody");
263
- const hits = (result.hits ?? []).map(decorateHit);
 
 
 
 
 
264
  tbody.innerHTML = hits.map((hit) => {
265
  const confidence = hit.confidence === undefined ? "—" : `${Math.round(Number(hit.confidence) * 100)}%`;
266
  const flags = [hit.is_representative ? "rep" : null, hit.favorite ? "fav" : null, hit.suppressed ? "suppressed" : null, hit.review_status !== "unreviewed" ? hit.review_status : null].filter(Boolean);
@@ -301,8 +340,13 @@ function renderSupervisionState(state) {
301
  <span>${esc(summary.constraint_count ?? 0)} constraints</span>
302
  <span>${esc(summary.open_suggestion_count ?? 0)} suggestions</span>
303
  <span>${esc(summary.suppressed_hit_count ?? 0)} suppressed</span>
 
304
  <span>${esc(summary.locked_cluster_count ?? 0)} locked</span>
 
305
  `;
 
 
 
306
 
307
  const currentTarget = $("targetClusterSelect").value;
308
  $("targetClusterSelect").innerHTML = (state.clusters ?? []).map((cluster) => `
@@ -334,15 +378,30 @@ function renderSupervisionState(state) {
334
  });
335
  }
336
 
337
- $("suggestionInbox").innerHTML = (state.suggestions ?? []).map((suggestion) => `
338
- <div class="suggestion-row">
339
- <div><strong>${esc(suggestion.type)}</strong><small>${esc(suggestion.reason)} · ${Math.round(Number(suggestion.confidence ?? 0) * 100)}% · ${esc(suggestion.preview_count ?? suggestion.hit_ids?.length ?? 0)} hits</small></div>
340
- <div class="row-actions">
341
- <button class="mini-button" type="button" data-accept-suggestion="${esc(suggestion.id)}">Accept</button>
342
- <button class="mini-button" type="button" data-reject-suggestion="${esc(suggestion.id)}">Reject</button>
 
 
 
 
 
 
 
 
343
  </div>
344
- </div>
345
- `).join("") || `<p class="empty">No open suggestions.</p>`;
 
 
 
 
 
 
 
346
  for (const button of $("suggestionInbox").querySelectorAll("[data-accept-suggestion]")) {
347
  button.addEventListener("click", () => acceptSuggestion(button.dataset.acceptSuggestion));
348
  }
@@ -390,6 +449,12 @@ async function suppressSelectedHit() {
390
  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/hits/${encodeURIComponent(hitId)}/suppress`, { reason: "bleed" });
391
  }
392
 
 
 
 
 
 
 
393
  async function reviewSelectedHit(status) {
394
  const hitId = hitIdFromIndex(selectedHitIndex);
395
  if (!activeJobId || !hitId) return;
@@ -425,6 +490,40 @@ async function undoLastEdit() {
425
  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/undo`, {});
426
  }
427
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
428
  function renderResult(job) {
429
  const result = job.result;
430
  if (!result) return;
@@ -580,10 +679,15 @@ function setFile(file) {
580
  }
581
 
582
  function selectNearestWaveformHit(event) {
583
- if (!lastResult?.overview?.onsets?.length) return;
584
  const rect = $("waveform").getBoundingClientRect();
585
  const ratio = Math.min(1, Math.max(0, (event.clientX - rect.left) / Math.max(1, rect.width)));
586
  const time = ratio * Math.max(lastResult.overview.duration_sec, 0.001);
 
 
 
 
 
587
  let best = null;
588
  let bestDelta = Infinity;
589
  for (const onset of lastResult.overview.onsets) {
@@ -639,6 +743,9 @@ $("pullHitButton").addEventListener("click", () => pullSelectedHit().catch((erro
639
  $("acceptHitButton").addEventListener("click", () => reviewSelectedHit("accepted").catch((error) => { $("clusterExplanation").textContent = error.message; }));
640
  $("favoriteHitButton").addEventListener("click", () => reviewSelectedHit("favorite").catch((error) => { $("clusterExplanation").textContent = error.message; }));
641
  $("suppressHitButton").addEventListener("click", () => suppressSelectedHit().catch((error) => { $("clusterExplanation").textContent = error.message; }));
 
 
 
642
  $("lockClusterButton").addEventListener("click", () => toggleTargetClusterLock().catch((error) => { $("clusterExplanation").textContent = error.message; }));
643
  $("explainClusterButton").addEventListener("click", () => explainTargetCluster().catch((error) => { $("clusterExplanation").textContent = error.message; }));
644
  $("targetClusterSelect").addEventListener("change", setActionButtons);
 
15
  let lastSupervisionState = null;
16
  let activeJobId = null;
17
  let selectedHitIndex = null;
18
+ let forceOnsetMode = false;
19
 
20
  function esc(value) {
21
  return String(value ?? "").replace(/[&<>'"]/g, (c) => ({ "&": "&amp;", "<": "&lt;", ">": "&gt;", "'": "&#39;", '"': "&quot;" }[c]));
 
91
  function setActionButtons() {
92
  const hasState = Boolean(activeJobId && lastSupervisionState);
93
  const hasHit = hasState && selectedHitIndex !== null;
94
+ for (const id of ["moveHitButton", "pullHitButton", "acceptHitButton", "favoriteHitButton", "suppressHitButton", "restoreHitButton"]) {
95
  const button = $(id);
96
  if (button) button.disabled = !hasHit;
97
  }
98
+ for (const id of ["refreshStateButton", "undoButton", "lockClusterButton", "explainClusterButton", "exportStateButton", "forceOnsetButton"]) {
99
  const button = $(id);
100
  if (button) button.disabled = !hasState;
101
  }
102
  const target = currentTargetCluster();
103
  if ($("lockClusterButton")) $("lockClusterButton").textContent = target?.locked ? "Unlock target cluster" : "Lock target cluster";
104
+ if ($("forceOnsetButton")) {
105
+ $("forceOnsetButton").textContent = forceOnsetMode ? "Add-onset mode on" : "Add-onset mode off";
106
+ $("forceOnsetButton").classList.toggle("active", forceOnsetMode);
107
+ }
108
  if ($("undoButton") && lastSupervisionState) $("undoButton").disabled = !lastSupervisionState.summary?.undo_available;
109
  }
110
 
 
208
  ctx.lineTo(x, selected ? h - 3 : h - 10);
209
  ctx.stroke();
210
  }
211
+ for (const hit of lastSupervisionState?.hits ?? []) {
212
+ if (hit.source !== "forced") continue;
213
+ const x = (Number(hit.onset_sec) / Math.max(overview.duration_sec, 0.001)) * w;
214
+ const selected = Number(hit.index) === Number(selectedHitIndex);
215
+ ctx.strokeStyle = selected ? "rgba(255,255,255,.98)" : "rgba(85,230,165,.9)";
216
+ ctx.lineWidth = selected ? 2.8 : 1.6;
217
+ ctx.beginPath();
218
+ ctx.moveTo(x, 2);
219
+ ctx.lineTo(x, h - 2);
220
+ ctx.stroke();
221
+ }
222
  }
223
 
224
  function playAudio(el, url) {
 
229
  if (promise && typeof promise.catch === "function") promise.catch(() => {});
230
  }
231
 
232
+ function stateOnlyHitByIndex(index) {
233
+ const stateHit = stateHitByIndex(index);
234
+ if (!stateHit) return null;
235
+ return {
236
+ index: stateHit.index,
237
+ label: stateHit.label,
238
+ cluster_id: stateHit.cluster_id,
239
+ cluster_label: stateHit.cluster_label,
240
+ is_representative: false,
241
+ onset_sec: stateHit.onset_sec,
242
+ duration_ms: stateHit.duration_ms,
243
+ rms_energy: stateHit.rms_energy,
244
+ spectral_centroid_hz: stateHit.spectral_centroid_hz,
245
+ file: stateHit.file,
246
+ url: stateHit.url,
247
+ };
248
+ }
249
+
250
  function selectHit(index, shouldPlay = true) {
251
+ if (!lastResult && !lastSupervisionState) return;
252
+ const rawHit = (lastResult?.hits ?? []).find((item) => Number(item.index) === Number(index)) ?? stateOnlyHitByIndex(index);
253
  if (!rawHit) return;
254
  const hit = decorateHit(rawHit);
255
  selectedHitIndex = hit.index;
 
294
 
295
  function renderHits(result) {
296
  const tbody = $("hitsTable").querySelector("tbody");
297
+ const baseHits = (result.hits ?? []).map(decorateHit);
298
+ const seen = new Set(baseHits.map((hit) => Number(hit.index)));
299
+ const stateOnlyHits = (lastSupervisionState?.hits ?? [])
300
+ .filter((hit) => !seen.has(Number(hit.index)))
301
+ .map((hit) => decorateHit(stateOnlyHitByIndex(hit.index)));
302
+ const hits = [...baseHits, ...stateOnlyHits].sort((a, b) => Number(a.index) - Number(b.index));
303
  tbody.innerHTML = hits.map((hit) => {
304
  const confidence = hit.confidence === undefined ? "—" : `${Math.round(Number(hit.confidence) * 100)}%`;
305
  const flags = [hit.is_representative ? "rep" : null, hit.favorite ? "fav" : null, hit.suppressed ? "suppressed" : null, hit.review_status !== "unreviewed" ? hit.review_status : null].filter(Boolean);
 
340
  <span>${esc(summary.constraint_count ?? 0)} constraints</span>
341
  <span>${esc(summary.open_suggestion_count ?? 0)} suggestions</span>
342
  <span>${esc(summary.suppressed_hit_count ?? 0)} suppressed</span>
343
+ <span>${esc(summary.forced_hit_count ?? 0)} forced</span>
344
  <span>${esc(summary.locked_cluster_count ?? 0)} locked</span>
345
+ ${summary.latest_export ? `<span>latest edited export · ${esc(summary.latest_export.hit_count)} hits / ${esc(summary.latest_export.cluster_count)} clusters</span>` : ""}
346
  `;
347
+ if (summary.latest_export?.url) {
348
+ $("editedDownloads").innerHTML = `<a href="${esc(summary.latest_export.url)}" download>Edited export manifest</a>`;
349
+ }
350
 
351
  const currentTarget = $("targetClusterSelect").value;
352
  $("targetClusterSelect").innerHTML = (state.clusters ?? []).map((cluster) => `
 
378
  });
379
  }
380
 
381
+ $("suggestionInbox").innerHTML = (state.suggestions ?? []).map((suggestion) => {
382
+ const diff = suggestion.diff ?? {};
383
+ const diffText = diff.affected_hit_count == null
384
+ ? "no diff"
385
+ : `${diff.affected_hit_count} affected · ${Object.keys(diff.clusters_before ?? {}).length} clusters`;
386
+ const hitPreview = (diff.hits ?? []).slice(0, 4).map((hit) => `#${hit.hit_index}: ${hit.from_cluster_label ?? hit.from_cluster_id} → ${hit.after_suppressed ? "suppressed" : (hit.to_cluster_label ?? hit.to_cluster_id)}`).join("; ");
387
+ return `
388
+ <div class="suggestion-row">
389
+ <div><strong>${esc(suggestion.type)}</strong><small>${esc(suggestion.reason)} · ${Math.round(Number(suggestion.confidence ?? 0) * 100)}% · ${esc(diffText)}${hitPreview ? ` · ${esc(hitPreview)}` : ""}</small></div>
390
+ <div class="row-actions">
391
+ <button class="mini-button" type="button" data-preview-suggestion="${esc(suggestion.id)}">Diff</button>
392
+ <button class="mini-button" type="button" data-accept-suggestion="${esc(suggestion.id)}">Accept</button>
393
+ <button class="mini-button" type="button" data-reject-suggestion="${esc(suggestion.id)}">Reject</button>
394
+ </div>
395
  </div>
396
+ `;
397
+ }).join("") || `<p class="empty">No open suggestions.</p>`;
398
+ for (const button of $("suggestionInbox").querySelectorAll("[data-preview-suggestion]")) {
399
+ button.addEventListener("click", () => {
400
+ const suggestion = (lastSupervisionState?.suggestions ?? []).find((item) => item.id === button.dataset.previewSuggestion);
401
+ $("clusterExplanation").classList.remove("empty");
402
+ $("clusterExplanation").textContent = JSON.stringify(suggestion?.diff ?? {}, null, 2);
403
+ });
404
+ }
405
  for (const button of $("suggestionInbox").querySelectorAll("[data-accept-suggestion]")) {
406
  button.addEventListener("click", () => acceptSuggestion(button.dataset.acceptSuggestion));
407
  }
 
449
  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/hits/${encodeURIComponent(hitId)}/suppress`, { reason: "bleed" });
450
  }
451
 
452
+ async function restoreSelectedHit() {
453
+ const hitId = hitIdFromIndex(selectedHitIndex);
454
+ if (!activeJobId || !hitId) return;
455
+ await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/hits/${encodeURIComponent(hitId)}/restore`, {});
456
+ }
457
+
458
  async function reviewSelectedHit(status) {
459
  const hitId = hitIdFromIndex(selectedHitIndex);
460
  if (!activeJobId || !hitId) return;
 
490
  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/undo`, {});
491
  }
492
 
493
+ function renderEditedExport(exportPayload) {
494
+ const fileUrls = exportPayload?.file_urls ?? {};
495
+ const labels = { archive: "Edited sample pack ZIP", midi: "Edited MIDI", reconstruction: "Edited reconstruction WAV" };
496
+ $("editedDownloads").innerHTML = Object.entries(fileUrls)
497
+ .map(([key, url]) => `<a href="${esc(url)}" download>${esc(labels[key] ?? key)}</a>`)
498
+ .join("");
499
+ }
500
+
501
+ async function exportEditedPack() {
502
+ if (!activeJobId) return;
503
+ $("exportStateButton").disabled = true;
504
+ try {
505
+ const payload = await jsonApi(`/api/jobs/${encodeURIComponent(activeJobId)}/export`, { synthesize: true });
506
+ renderEditedExport(payload.export);
507
+ renderSupervisionState(payload.state);
508
+ $("clusterExplanation").classList.remove("empty");
509
+ $("clusterExplanation").textContent = JSON.stringify(payload.export, null, 2);
510
+ } finally {
511
+ setActionButtons();
512
+ }
513
+ }
514
+
515
+ async function forceOnsetAtTime(timeSec) {
516
+ if (!activeJobId) return;
517
+ const body = { onset_sec: Number(timeSec) };
518
+ const target = currentTargetCluster();
519
+ if (target) body.target_cluster_id = target.id;
520
+ const before = new Set((lastSupervisionState?.hits ?? []).map((hit) => hit.id));
521
+ const state = await jsonApi(`/api/jobs/${encodeURIComponent(activeJobId)}/hits/force-onset`, body);
522
+ renderSupervisionState(state);
523
+ const added = (state.hits ?? []).find((hit) => !before.has(hit.id) && hit.source === "forced");
524
+ if (added) selectHit(added.index);
525
+ }
526
+
527
  function renderResult(job) {
528
  const result = job.result;
529
  if (!result) return;
 
679
  }
680
 
681
  function selectNearestWaveformHit(event) {
682
+ if (!lastResult?.overview) return;
683
  const rect = $("waveform").getBoundingClientRect();
684
  const ratio = Math.min(1, Math.max(0, (event.clientX - rect.left) / Math.max(1, rect.width)));
685
  const time = ratio * Math.max(lastResult.overview.duration_sec, 0.001);
686
+ if (forceOnsetMode) {
687
+ forceOnsetAtTime(time).catch((error) => { $("clusterExplanation").textContent = error.message; });
688
+ return;
689
+ }
690
+ if (!lastResult.overview.onsets?.length) return;
691
  let best = null;
692
  let bestDelta = Infinity;
693
  for (const onset of lastResult.overview.onsets) {
 
743
  $("acceptHitButton").addEventListener("click", () => reviewSelectedHit("accepted").catch((error) => { $("clusterExplanation").textContent = error.message; }));
744
  $("favoriteHitButton").addEventListener("click", () => reviewSelectedHit("favorite").catch((error) => { $("clusterExplanation").textContent = error.message; }));
745
  $("suppressHitButton").addEventListener("click", () => suppressSelectedHit().catch((error) => { $("clusterExplanation").textContent = error.message; }));
746
+ $("restoreHitButton").addEventListener("click", () => restoreSelectedHit().catch((error) => { $("clusterExplanation").textContent = error.message; }));
747
+ $("exportStateButton").addEventListener("click", () => exportEditedPack().catch((error) => { $("clusterExplanation").textContent = error.message; setActionButtons(); }));
748
+ $("forceOnsetButton").addEventListener("click", () => { forceOnsetMode = !forceOnsetMode; setActionButtons(); });
749
  $("lockClusterButton").addEventListener("click", () => toggleTargetClusterLock().catch((error) => { $("clusterExplanation").textContent = error.message; }));
750
  $("explainClusterButton").addEventListener("click", () => explainTargetCluster().catch((error) => { $("clusterExplanation").textContent = error.message; }));
751
  $("targetClusterSelect").addEventListener("change", setActionButtons);
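The force-onset response handling selects the newly created hit by snapshotting hit IDs before the call and then looking for an ID that is absent from the snapshot and carries `source === "forced"`. That selection can be sketched as a pure function (the name `findForcedHit` is hypothetical; the logic mirrors the diff):

```javascript
// Hypothetical helper mirroring the force-onset flow: given the hit lists from
// before and after the API call, return the newly added forced hit (or null).
function findForcedHit(beforeHits, afterHits) {
  const before = new Set((beforeHits ?? []).map((hit) => hit.id));
  return (afterHits ?? []).find((hit) => !before.has(hit.id) && hit.source === "forced") ?? null;
}

const beforeHits = [{ id: "a", source: "detected" }];
const afterHits = [
  { id: "a", source: "detected" },
  { id: "b", source: "forced", index: 7 },
];
findForcedHit(beforeHits, afterHits); // returns the { id: "b", … } hit, whose index drives selectHit
```

Diffing by ID rather than by array position keeps the selection correct even if the server re-sorts the hit list when inserting the forced onset.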
web/index.html CHANGED
@@ -196,10 +196,13 @@
   </div>
   <div class="supervision-actions">
     <button id="refreshStateButton" class="ghost-button" type="button">Refresh state</button>
+    <button id="exportStateButton" class="ghost-button" type="button" disabled>Export edited pack</button>
+    <button id="forceOnsetButton" class="ghost-button" type="button" disabled>Add-onset mode off</button>
     <button id="undoButton" class="ghost-button" type="button" disabled>Undo edit</button>
   </div>
 </div>
 <div id="supervisionSummary" class="state-summary">No interactive state loaded.</div>
+<div id="editedDownloads" class="downloads edited-downloads"></div>
 <div class="supervision-tools">
   <label>Target cluster
     <select id="targetClusterSelect"></select>
@@ -209,6 +212,7 @@
   <button id="acceptHitButton" class="secondary-button" type="button" disabled>Accept hit</button>
   <button id="favoriteHitButton" class="secondary-button" type="button" disabled>Favorite as representative</button>
   <button id="suppressHitButton" class="secondary-button danger-button" type="button" disabled>Suppress as bleed</button>
+  <button id="restoreHitButton" class="secondary-button" type="button" disabled>Restore selected hit</button>
   <button id="lockClusterButton" class="secondary-button" type="button" disabled>Lock target cluster</button>
   <button id="explainClusterButton" class="secondary-button" type="button" disabled>Explain target cluster</button>
 </div>
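The `#forceOnsetButton` label ("Add-onset mode off") doubles as the mode indicator, so the click handler that flips `forceOnsetMode` needs a render step to keep the label and styling in sync. A minimal sketch of that step (the helper name `renderForceOnsetButton` and the `fakeButton` stand-in are hypothetical; the `active` class is assumed to carry the highlighted styling):

```javascript
// Hypothetical render step: reflect forceOnsetMode in the toggle button's
// label and in an assumed "active" CSS class.
function renderForceOnsetButton(button, forceOnsetMode) {
  button.textContent = forceOnsetMode ? "Add-onset mode on" : "Add-onset mode off";
  button.classList.toggle("active", forceOnsetMode); // second arg forces add/remove
}

// Minimal stand-in for a DOM button so the sketch runs outside a browser.
function fakeButton() {
  const classes = new Set();
  return {
    textContent: "",
    classList: {
      toggle(name, on) { on ? classes.add(name) : classes.delete(name); },
      contains(name) { return classes.has(name); },
    },
  };
}

const button = fakeButton();
renderForceOnsetButton(button, true);
// button.textContent is now "Add-onset mode on" and the "active" class is set
```

Calling this from `setActionButtons()` (which already runs after every toggle) keeps the button truthful without any extra state.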
web/styles.css CHANGED
@@ -120,3 +120,6 @@ tr.low-confidence td { background: rgba(255,202,107,.06); }
 tr.low-confidence.selected td { background: rgba(139,211,255,.15); }
 @media (max-width: 1320px) { .supervision-tools { grid-template-columns: repeat(3, minmax(0, 1fr)); } .supervision-grid { grid-template-columns: repeat(2, minmax(0, 1fr)); } }
 @media (max-width: 760px) { .supervision-header { display: block; } .supervision-actions { justify-content: flex-start; margin-top: 10px; } .supervision-tools, .supervision-grid { grid-template-columns: 1fr; } }
+.ghost-button.active, .secondary-button.active { border-color: rgba(85,230,165,.7); background: rgba(85,230,165,.13); color: #d9ffe9; }
+.edited-downloads { margin: -4px 0 14px; }
+.edited-downloads:empty { display: none; }
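The `.edited-downloads:empty { display: none; }` rule means the `#editedDownloads` container needs no show/hide JavaScript: rendering an empty string collapses it, and rendering any link markup reveals it. A sketch of the markup builder under that design (the function name `editedDownloadsHtml` and the `{ label, url }` artifact shape are assumptions, not the repo's actual export payload):

```javascript
// Hypothetical builder for #editedDownloads: an empty artifact list yields "",
// so the CSS :empty rule hides the container until an export has produced files.
function editedDownloadsHtml(artifacts) {
  return (artifacts ?? [])
    .map(({ label, url }) => `<a href="${url}" download>${label}</a>`)
    .join("");
}

editedDownloadsHtml([]); // returns "" → container stays hidden
editedDownloadsHtml([{ label: "Edited WAV pack", url: "/api/jobs/1/edited.zip" }]);
// returns one <a> tag → container becomes visible
```

Note that `:empty` matches only an element with no child nodes at all, which is why the initial markup leaves the div completely empty rather than holding placeholder text.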