Space: rikhoffbauer2/drum-sample-extractor

ChatGPT committed 18 days ago

Commit 03d531b (1 parent: 3703c4e)

feat: add supervised interactive editing state

Files changed (19)

README.md +59 -13
app.py +164 -2
docs/API.md +167 -0
docs/FEATURES.md +22 -0
docs/PROGRESS.md +64 -0
docs/REMAINING_WORK.md +21 -0
docs/TASKS.md +29 -0
docs/interactive-ux/ARCHITECTURE_NOTES.md +204 -0
docs/interactive-ux/FEASIBILITY_MATRIX.md +100 -0
docs/interactive-ux/FEATURE_REQUIREMENTS.md +162 -0
docs/interactive-ux/PROGRESS.md +87 -0
docs/interactive-ux/README.md +59 -0
docs/interactive-ux/SCOPE.md +173 -0
docs/interactive-ux/TASKS.md +105 -0
scripts/test_interactive_supervision.py +112 -0
supervised_state.py +675 -0
web/app.js +232 -16
web/index.html +46 -1
web/styles.css +21 -0

README.md CHANGED

@@ -10,18 +10,18 @@ pinned: false
 # Drum Sample Extractor
-A custom FastAPI + browser workstation for extracting reusable drum samples from an audio file.
-The pipeline can isolate a stem with Demucs, detect onsets, classify hits, cluster similar transients, choose representative samples, optionally synthesize alternate samples, and export WAVs, MIDI, reconstruction audio, manifests, and a complete ZIP sample pack.
 ## Current status
 The project is usable as a local/Hugging Face Space application. Gradio is no longer the active UI; the active app is a custom FastAPI backend plus a no-build browser frontend.
-Implemented in the current development pass:
 - Custom web frontend in `web/`, served by `app.py`.
-- FastAPI job API with upload, polling, safe artifact downloads, config, health, cache clearing, and run-history listing.
 - Timed pipeline runner in `pipeline_runner.py`.
 - Per-stage timing in every `manifest.json`.
 - Two clustering modes:
@@ -31,15 +31,34 @@ Implemented in the current development pass:
 - Run history panel indexing `.runs/*/output/manifest.json`.
 - Individual review WAVs for every detected hit under `review/hits/`.
 - Click-to-audition workflow for waveform onsets, detected hit rows, and representative sample rows.
-- Server-sent-events progress endpoint with frontend `EventSource` support and polling fallback.
-- Documentation for features, progress, tasks, API, timing, hit review, realtime suitability, UI, and remaining work.
 - Legacy Gradio apps preserved in `legacy/` for reference only.
 Not fully complete yet:
-- No interactive waveform editing of onsets/clusters.
-- No interactive onset/cluster editing yet.
-- No frontend TypeScript build/test harness.
 - Demucs remains offline/batch by design.
 See:
@@ -47,7 +66,8 @@ See:
 - `docs/FEATURES.md`
 - `docs/TASKS.md`
 - `docs/PROGRESS.md`
-- `docs/HIT_REVIEW_AND_STREAMING.md`
 - `docs/REMAINING_WORK.md`
 ## Run locally
@@ -68,11 +88,19 @@ For fast iteration, set:
 That bypasses Demucs and uses the near-realtime clustering path.
 ## Run benchmarks
 ```bash
 python3 scripts/benchmark_subprocesses.py --runs 2 --bars 4 --output docs/benchmark-subprocesses.json
-python3 scripts/test_sse_and_review_hits.py
 ```
 The benchmark uses synthetic drum fixtures and `stem=all` so the DSP stages are measured without Demucs model download/runtime noise.
@@ -93,6 +121,20 @@ Then poll the returned job id:
 curl http://127.0.0.1:7860/api/jobs/<job-id>
 ```
 List active/completed runs:
 ```bash
@@ -103,11 +145,14 @@ curl http://127.0.0.1:7860/api/jobs
 | Path | Purpose |
 |---|---|
-| `app.py` | FastAPI app, static UI serving, job API, run history, artifact downloads |
 | `pipeline_runner.py` | Timed extraction pipeline, disk stem/source cache, batch/online clustering routing |
 | `sample_extractor.py` | Core DSP/sample extraction implementation |
-| `web/` | Custom no-build browser frontend with waveform, hit review, and sample audition |
 | `scripts/benchmark_subprocesses.py` | Synthetic benchmark runner for stage timings |
 | `docs/` | Review, timing, API, UI, feature, task, progress, and remaining-work documentation |
 | `legacy/` | Previous Gradio apps retained for reference |
@@ -122,6 +167,7 @@ Each run is stored under `.runs/<job-id>/output/`:
 - `samples/*.wav`
 - `review/hits/*.wav`
 - `manifest.json`
 Generated runtime directories are ignored by git:

 # Drum Sample Extractor
+A custom FastAPI + browser workstation for extracting, reviewing, and now semantically supervising reusable drum samples from an audio file.
+The pipeline can isolate a stem with Demucs, detect onsets, classify hits, cluster similar transients, choose representative samples, optionally synthesize alternate samples, and export WAVs, MIDI, reconstruction audio, manifests, and a complete ZIP sample pack. The interactive layer stores user corrections as replayable semantic state beside each run manifest.
 ## Current status
 The project is usable as a local/Hugging Face Space application. Gradio is no longer the active UI; the active app is a custom FastAPI backend plus a no-build browser frontend.
+Implemented:
 - Custom web frontend in `web/`, served by `app.py`.
+- FastAPI job API with upload, polling, safe artifact downloads, config, health, cache clearing, run history, and SSE progress.
 - Timed pipeline runner in `pipeline_runner.py`.
 - Per-stage timing in every `manifest.json`.
 - Two clustering modes:
 - Run history panel indexing `.runs/*/output/manifest.json`.
 - Individual review WAVs for every detected hit under `review/hits/`.
 - Click-to-audition workflow for waveform onsets, detected hit rows, and representative sample rows.
+- Interactive supervised state in `supervised_state.py`:
+  - persisted `supervision_state.json`,
+  - hit/cluster confidence,
+  - outlier-first review queue,
+  - constraints,
+  - event log,
+  - suggestions,
+  - undo stack.
+- Supervision UI:
+  - selected-hit actions,
+  - move hit to cluster,
+  - pull hit into a new cluster,
+  - accept/favorite hit,
+  - suppress hit as bleed,
+  - lock/unlock cluster,
+  - suggestion inbox,
+  - cluster explanation drawer,
+  - constraint/event log.
+- Documentation for features, progress, tasks, API, timing, hit review, realtime suitability, UI, remaining work, and interactive UX.
 - Legacy Gradio apps preserved in `legacy/` for reference only.
 Not fully complete yet:
+- Semantic edits do not yet regenerate WAV/MIDI/ZIP exports.
+- No force-onset/click-to-add missed onset yet.
+- No restore for suppressed hits yet.
+- No true cached feature-vector local reclustering yet.
+- No frontend TypeScript build/test harness yet.
 - Demucs remains offline/batch by design.
 See:
 - `docs/FEATURES.md`
 - `docs/TASKS.md`
 - `docs/PROGRESS.md`
+- `docs/API.md`
+- `docs/interactive-ux/README.md`
 - `docs/REMAINING_WORK.md`
 ## Run locally
 That bypasses Demucs and uses the near-realtime clustering path.
+## Run checks
+```bash
+python3 -m py_compile app.py pipeline_runner.py sample_extractor.py supervised_state.py scripts/*.py
+node --check web/app.js
+python3 scripts/test_sse_and_review_hits.py
+python3 scripts/test_interactive_supervision.py
+```
 ## Run benchmarks
 ```bash
 python3 scripts/benchmark_subprocesses.py --runs 2 --bars 4 --output docs/benchmark-subprocesses.json
 ```
 The benchmark uses synthetic drum fixtures and `stem=all` so the DSP stages are measured without Demucs model download/runtime noise.
 curl http://127.0.0.1:7860/api/jobs/<job-id>
 ```
+Read supervised state:
+```bash
+curl http://127.0.0.1:7860/api/jobs/<job-id>/state
+```
+Move a hit into a target cluster:
+```bash
+curl -X POST http://127.0.0.1:7860/api/jobs/<job-id>/hits/hit%3A00003/move \
+  -H 'Content-Type: application/json' \
+  -d '{"target_cluster_id":"cluster:0"}'
+```
 List active/completed runs:
 ```bash
 | Path | Purpose |
 |---|---|
+| `app.py` | FastAPI app, static UI serving, job API, run history, artifact downloads, supervised editing endpoints |
 | `pipeline_runner.py` | Timed extraction pipeline, disk stem/source cache, batch/online clustering routing |
 | `sample_extractor.py` | Core DSP/sample extraction implementation |
+| `supervised_state.py` | Persistent semantic state, confidence, constraints, events, suggestions, undo |
+| `web/` | Custom no-build browser frontend with waveform, hit review, sample audition, and supervision panel |
 | `scripts/benchmark_subprocesses.py` | Synthetic benchmark runner for stage timings |
+| `scripts/test_interactive_supervision.py` | Smoke test for supervised state endpoints |
+| `docs/interactive-ux/` | Supplied interactive UX docs aligned to current implementation |
 | `docs/` | Review, timing, API, UI, feature, task, progress, and remaining-work documentation |
 | `legacy/` | Previous Gradio apps retained for reference |
 - `samples/*.wav`
 - `review/hits/*.wav`
 - `manifest.json`
+- `supervision_state.json`
 Generated runtime directories are ignored by git:

app.py CHANGED

@@ -19,20 +19,33 @@ from pathlib import Path
 from threading import Lock
 from typing import Any
-from fastapi import FastAPI, File, Form, HTTPException, UploadFile
 from fastapi.middleware.cors import CORSMiddleware
 from fastapi.responses import FileResponse, JSONResponse, StreamingResponse
 from fastapi.staticfiles import StaticFiles
 from pipeline_runner import PipelineParams, clear_disk_cache, initial_stages, run_extraction_pipeline
 from sample_extractor import DEMUCS_MODELS, DEMUCS_STEMS, cache_clear
 ROOT = Path(__file__).resolve().parent
 WEB_DIR = ROOT / "web"
 RUNS_DIR = ROOT / ".runs"
 RUNS_DIR.mkdir(exist_ok=True)
-app = FastAPI(title="Drum Sample Extractor", version="11.1.0")
 app.add_middleware(
     CORSMiddleware,
     allow_origins=["*"],
@@ -155,6 +168,7 @@ def _run_job(job_id: str) -> None:
     try:
         result = run_extraction_pipeline(input_path, output_dir, PipelineParams.from_mapping(params), progress_cb=progress)
         _update_job(job_id, status="complete", result=asdict(result), error=None)
     except Exception as exc:  # deliberately explicit for UI diagnostics
         _update_job(job_id, status="error", error=str(exc), traceback=traceback.format_exc())
@@ -248,6 +262,33 @@ def get_job(job_id: str) -> dict[str, Any]:
     raise HTTPException(status_code=404, detail="Job not found")
 @app.get("/api/jobs/{job_id}/events")
 def get_job_events(job_id: str) -> StreamingResponse:
     with jobs_lock:
@@ -286,6 +327,127 @@ def get_job_events(job_id: str) -> StreamingResponse:
     )
 @app.get("/api/jobs/{job_id}/files/{relative_path:path}")
 def get_job_file(job_id: str, relative_path: str) -> FileResponse:
     root = (RUNS_DIR / job_id / "output").resolve()

 from threading import Lock
 from typing import Any
+from fastapi import Body, FastAPI, File, Form, HTTPException, UploadFile
 from fastapi.middleware.cors import CORSMiddleware
 from fastapi.responses import FileResponse, JSONResponse, StreamingResponse
 from fastapi.staticfiles import StaticFiles
 from pipeline_runner import PipelineParams, clear_disk_cache, initial_stages, run_extraction_pipeline
 from sample_extractor import DEMUCS_MODELS, DEMUCS_STEMS, cache_clear
+from supervised_state import (
+    accept_suggestion,
+    explain_cluster as build_cluster_explanation,
+    load_or_create_state,
+    lock_cluster as apply_cluster_lock,
+    move_hit as apply_hit_move,
+    public_state,
+    pull_hit_to_new_cluster,
+    reject_suggestion,
+    set_hit_review_status,
+    suppress_hit as apply_hit_suppression,
+    undo_last as apply_undo,
+)
 ROOT = Path(__file__).resolve().parent
 WEB_DIR = ROOT / "web"
 RUNS_DIR = ROOT / ".runs"
 RUNS_DIR.mkdir(exist_ok=True)
+app = FastAPI(title="Drum Sample Extractor", version="12.0.0")
 app.add_middleware(
     CORSMiddleware,
     allow_origins=["*"],
     try:
         result = run_extraction_pipeline(input_path, output_dir, PipelineParams.from_mapping(params), progress_cb=progress)
+        load_or_create_state(job_id, output_dir)
         _update_job(job_id, status="complete", result=asdict(result), error=None)
     except Exception as exc:  # deliberately explicit for UI diagnostics
         _update_job(job_id, status="error", error=str(exc), traceback=traceback.format_exc())
     raise HTTPException(status_code=404, detail="Job not found")
+def _job_output_dir(job_id: str) -> Path:
+    with jobs_lock:
+        job = jobs.get(job_id)
+        if job and job.get("output_dir"):
+            return Path(job["output_dir"])
+    manifest = _manifest_path(job_id)
+    if manifest.exists():
+        return manifest.parent
+    raise HTTPException(status_code=404, detail="Job not found")
+def _state_payload(job_id: str) -> dict[str, Any]:
+    out = _job_output_dir(job_id)
+    try:
+        state = load_or_create_state(job_id, out)
+    except FileNotFoundError as exc:
+        raise HTTPException(status_code=409, detail="Job has no manifest yet; wait until extraction completes") from exc
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+    return public_state(state, url_for=lambda rel: _job_url(job_id, rel))
+def _json_patch(payload: dict[str, Any] | None) -> dict[str, Any]:
+    return dict(payload or {})
 @app.get("/api/jobs/{job_id}/events")
 def get_job_events(job_id: str) -> StreamingResponse:
     with jobs_lock:
     )
+@app.get("/api/jobs/{job_id}/state")
+def get_job_state(job_id: str) -> dict[str, Any]:
+    return _state_payload(job_id)
+@app.post("/api/jobs/{job_id}/hits/{hit_id}/move")
+def post_move_hit(job_id: str, hit_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
+    target_cluster_id = _json_patch(payload).get("target_cluster_id")
+    if not target_cluster_id:
+        raise HTTPException(status_code=400, detail="target_cluster_id is required")
+    try:
+        apply_hit_move(_job_output_dir(job_id), job_id, hit_id, str(target_cluster_id))
+    except KeyError as exc:
+        raise HTTPException(status_code=404, detail=str(exc)) from exc
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+    return _state_payload(job_id)
+@app.post("/api/jobs/{job_id}/hits/{hit_id}/pull-out")
+def post_pull_hit(job_id: str, hit_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
+    label = _json_patch(payload).get("label")
+    try:
+        pull_hit_to_new_cluster(_job_output_dir(job_id), job_id, hit_id, label=label)
+    except KeyError as exc:
+        raise HTTPException(status_code=404, detail=str(exc)) from exc
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+    return _state_payload(job_id)
+@app.post("/api/jobs/{job_id}/hits/{hit_id}/suppress")
+def post_suppress_hit(job_id: str, hit_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
+    reason = str(_json_patch(payload).get("reason") or "bleed")
+    try:
+        apply_hit_suppression(_job_output_dir(job_id), job_id, hit_id, reason=reason)
+    except KeyError as exc:
+        raise HTTPException(status_code=404, detail=str(exc)) from exc
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+    return _state_payload(job_id)
+@app.post("/api/jobs/{job_id}/hits/{hit_id}/review")
+def post_review_hit(job_id: str, hit_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
+    status = str(_json_patch(payload).get("status") or "accepted")
+    try:
+        set_hit_review_status(_job_output_dir(job_id), job_id, hit_id, status=status)
+    except KeyError as exc:
+        raise HTTPException(status_code=404, detail=str(exc)) from exc
+    except ValueError as exc:
+        raise HTTPException(status_code=400, detail=str(exc)) from exc
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+    return _state_payload(job_id)
+@app.post("/api/jobs/{job_id}/clusters/{cluster_id:path}/lock")
+def post_lock_cluster(job_id: str, cluster_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
+    locked = bool(_json_patch(payload).get("locked", True))
+    try:
+        apply_cluster_lock(_job_output_dir(job_id), job_id, cluster_id, locked=locked)
+    except KeyError as exc:
+        raise HTTPException(status_code=404, detail=str(exc)) from exc
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+    return _state_payload(job_id)
+@app.get("/api/jobs/{job_id}/suggestions")
+def get_suggestions(job_id: str) -> dict[str, Any]:
+    state = _state_payload(job_id)
+    return {"suggestions": state.get("suggestions", []), "summary": state.get("summary", {})}
+@app.post("/api/jobs/{job_id}/suggestions/{suggestion_id}/accept")
+def post_accept_suggestion(job_id: str, suggestion_id: str) -> dict[str, Any]:
+    try:
+        accept_suggestion(_job_output_dir(job_id), job_id, suggestion_id)
+    except KeyError as exc:
+        raise HTTPException(status_code=404, detail=str(exc)) from exc
+    except ValueError as exc:
+        raise HTTPException(status_code=400, detail=str(exc)) from exc
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+    return _state_payload(job_id)
+@app.post("/api/jobs/{job_id}/suggestions/{suggestion_id}/reject")
+def post_reject_suggestion(job_id: str, suggestion_id: str) -> dict[str, Any]:
+    try:
+        reject_suggestion(_job_output_dir(job_id), job_id, suggestion_id)
+    except KeyError as exc:
+        raise HTTPException(status_code=404, detail=str(exc)) from exc
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+    return _state_payload(job_id)
+@app.get("/api/jobs/{job_id}/explain/cluster/{cluster_id:path}")
+def get_cluster_explanation(job_id: str, cluster_id: str) -> dict[str, Any]:
+    out = _job_output_dir(job_id)
+    state = load_or_create_state(job_id, out)
+    try:
+        explanation = build_cluster_explanation(state, cluster_id)
+    except KeyError as exc:
+        raise HTTPException(status_code=404, detail=str(exc)) from exc
+    return explanation
+@app.post("/api/jobs/{job_id}/undo")
+def post_undo(job_id: str) -> dict[str, Any]:
+    try:
+        apply_undo(_job_output_dir(job_id), job_id)
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+    return _state_payload(job_id)
 @app.get("/api/jobs/{job_id}/files/{relative_path:path}")
 def get_job_file(job_id: str, relative_path: str) -> FileResponse:
     root = (RUNS_DIR / job_id / "output").resolve()

docs/API.md CHANGED

@@ -212,3 +212,170 @@ Defined in `pipeline_runner.PipelineParams`.
 | `subdivision` | `16` | MIDI grid subdivision. |
 | `device` | `cpu` | Torch device for Demucs. |
 | `use_disk_cache` | `true` | Cache decoded full mix/stems by source digest and extraction settings. |

 | `subdivision` | `16` | MIDI grid subdivision. |
 | `device` | `cpu` | Torch device for Demucs. |
 | `use_disk_cache` | `true` | Cache decoded full mix/stems by source digest and extraction settings. |
+## Interactive supervision API
+The interactive supervision API is backed by `supervised_state.py` and persists state as:
+```text
+.runs/<job_id>/output/supervision_state.json
+```
+The batch `manifest.json` remains immutable. Supervised edits currently update semantic state only; they do not yet regenerate WAV/MIDI/ZIP artifacts.
+### `GET /api/jobs/{job_id}/state`
+Returns the supervised state for a completed job. If the state file does not exist yet, it is created from the batch manifest.
+Response keys:
+| Key | Meaning |
+|---|---|
+| `summary` | Counts for hits, clusters, constraints, events, suggestions, suppressed hits, locked clusters, undo availability. |
+| `hits` | Semantic hit rows with confidence, suppression/favorite/review flags, file URLs, and current cluster assignment. |
+| `clusters` | Semantic clusters with hit IDs, representative hit, confidence, locked state, and suppressed count. |
+| `review_queue` | Low-confidence/high-priority hits sorted for review. |
+| `constraints` | Recent replayable constraints. |
+| `events` | Recent state mutation events. |
+| `suggestions` | Open move/split/suppress suggestions. |
+```bash
+curl http://127.0.0.1:7860/api/jobs/<job-id>/state
+```
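The `review_queue` key above is described as low-confidence hits sorted for review. A minimal sketch of that ordering, not part of the commit: the field names `id`, `confidence`, and `suppressed` and the helper `build_review_queue` are assumptions for illustration, and the real scoring in `supervised_state.py` may weigh other signals.

```python
from typing import Any


def build_review_queue(hits: list[dict[str, Any]], limit: int = 10) -> list[dict[str, Any]]:
    """Return unsuppressed hits, lowest confidence first (hypothetical schema)."""
    candidates = [h for h in hits if not h.get("suppressed", False)]
    # Tie-break on the hit id so the ordering is deterministic.
    candidates.sort(key=lambda h: (h.get("confidence", 0.0), h["id"]))
    return candidates[:limit]


hits = [
    {"id": "hit:00001", "confidence": 0.91, "suppressed": False},
    {"id": "hit:00002", "confidence": 0.35, "suppressed": False},
    {"id": "hit:00003", "confidence": 0.12, "suppressed": True},
    {"id": "hit:00004", "confidence": 0.58, "suppressed": False},
]
queue = build_review_queue(hits)
print([h["id"] for h in queue])
```

Suppressed hits are skipped here on the assumption that they no longer need review; the commit itself does not state whether they stay in the queue.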
+### `POST /api/jobs/{job_id}/hits/{hit_id}/move`
+Moves a hit into an existing target cluster.
+Body:
+```json
+{"target_cluster_id":"cluster:0"}
+```
+Effects:
+- updates hit membership in `supervision_state.json`,
+- creates `force-cluster`,
+- creates `must-link` to the target representative when possible,
+- appends events,
+- recomputes confidence/review queue,
+- may create similar-hit move suggestions,
+- pushes an undo snapshot.
+Example:
+```bash
+curl -X POST http://127.0.0.1:7860/api/jobs/<job-id>/hits/hit%3A00003/move \
+  -H 'Content-Type: application/json' \
+  -d '{"target_cluster_id":"cluster:0"}'
+```
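The same move request can be built from Python. This is a sketch using only the standard library, not part of the commit; `build_move_request` and the job id `job-123` are hypothetical. It shows why the hit id appears as `hit%3A00003` in the curl example: the colon is percent-encoded as a path segment.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://127.0.0.1:7860"  # local dev server address used in the curl examples


def build_move_request(job_id: str, hit_id: str, target_cluster_id: str) -> Request:
    """Build (but do not send) a POST request that moves a hit into a cluster."""
    # safe="" percent-encodes every reserved character, including ":" and "/".
    url = f"{BASE_URL}/api/jobs/{quote(job_id, safe='')}/hits/{quote(hit_id, safe='')}/move"
    body = json.dumps({"target_cluster_id": target_cluster_id}).encode("utf-8")
    return Request(url, data=body, method="POST", headers={"Content-Type": "application/json"})


req = build_move_request("job-123", "hit:00003", "cluster:0")
print(req.get_method(), req.full_url)
```

With the server running, `urllib.request.urlopen(req)` would send the request; the response body is the updated supervised state, as the endpoint returns `_state_payload(job_id)`.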
+### `POST /api/jobs/{job_id}/hits/{hit_id}/pull-out`
+Pulls a hit into a new user cluster.
+Optional body:
+```json
+{"label":"snare_user_1"}
+```
+Effects:
+- creates a new `cluster:user:*` cluster,
+- creates `cannot-link` from the source representative when possible,
+- creates `force-cluster`,
+- may create split suggestions,
+- pushes an undo snapshot.
+### `POST /api/jobs/{job_id}/hits/{hit_id}/suppress`
+Marks a hit as bleed/noise/non-sample material.
+Body:
+```json
+{"reason":"bleed"}
+```
+Effects:
+- marks the hit `suppressed`,
+- creates `suppress-pattern`,
+- may create similar suppression suggestions,
+- recomputes confidence and review priority.
+### `POST /api/jobs/{job_id}/hits/{hit_id}/review`
+Stores a review decision for a hit.
+Body:
+```json
+{"status":"accepted"}
+```
+Supported statuses:
+| Status | Meaning |
+|---|---|
+| `unreviewed` | Clear explicit review status. |
+| `accepted` | Mark the hit as reviewed/accepted. |
+| `favorite` | Mark as favorite and pin as semantic representative for its cluster. |
+### `POST /api/jobs/{job_id}/clusters/{cluster_id}/lock`
+Locks or unlocks a cluster.
+Body:
+```json
+{"locked":true}
+```
+Lock state is persisted and shown in the cluster board. It does not yet alter future full pipeline reruns.
+### `GET /api/jobs/{job_id}/suggestions`
+Returns open suggestions and the state summary.
+```bash
+curl http://127.0.0.1:7860/api/jobs/<job-id>/suggestions
+```
+### `POST /api/jobs/{job_id}/suggestions/{suggestion_id}/accept`
+Applies a suggestion and records accepted constraints/examples.
+Supported suggestion types:
+- `move-hits`,
+- `split-hits`,
+- `suppress-hits`.
+### `POST /api/jobs/{job_id}/suggestions/{suggestion_id}/reject`
+Marks a suggestion rejected and records an event.
+### `GET /api/jobs/{job_id}/explain/cluster/{cluster_id}`
+Returns explanation data for one cluster:
+- label,
+- locked state,
+- confidence and reasons,
+- representative hit,
+- hit counts,
+- label distribution,
+- lowest-confidence outliers,
+- relevant constraints,
+- summary string.
+### `POST /api/jobs/{job_id}/undo`
+Restores the previous semantic state snapshot if available.
+```bash
+curl -X POST http://127.0.0.1:7860/api/jobs/<job-id>/undo
+```
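Several endpoints above "push an undo snapshot", and undo restores the previous one. A minimal sketch of snapshot-based undo, not part of the commit: `UndoableState`, its methods, and the snapshot cap are assumptions for illustration, and the real implementation persists to `supervision_state.json` rather than holding state in memory.

```python
import copy
from typing import Any, Callable


class UndoableState:
    """Keeps deep-copied snapshots so each edit can be rolled back (hypothetical)."""

    def __init__(self, state: dict[str, Any], max_snapshots: int = 20) -> None:
        self.state = state
        self._undo: list[dict[str, Any]] = []
        self._max = max_snapshots

    def mutate(self, edit: Callable[[dict[str, Any]], None]) -> None:
        # Snapshot before the edit so undo restores the pre-edit state.
        self._undo.append(copy.deepcopy(self.state))
        if len(self._undo) > self._max:
            self._undo.pop(0)  # drop the oldest snapshot
        edit(self.state)

    def undo(self) -> bool:
        """Restore the previous snapshot; return False when none is available."""
        if not self._undo:
            return False
        self.state = self._undo.pop()
        return True


store = UndoableState({"hits": {"hit:00003": {"cluster_id": "cluster:1"}}})
store.mutate(lambda s: s["hits"]["hit:00003"].update(cluster_id="cluster:0"))
moved_to = store.state["hits"]["hit:00003"]["cluster_id"]
undone = store.undo()
restored_to = store.state["hits"]["hit:00003"]["cluster_id"]
print(moved_to, undone, restored_to)
```

Full snapshots are simple and hard to get wrong, at the cost of memory that grows with state size; an inverse-event log would be a lighter alternative.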

docs/FEATURES.md CHANGED

@@ -60,3 +60,25 @@ Turn an input audio file into a practical drum sample pack: detected hits, group
 - Realtime Demucs. It is not realistic for this use-case and should remain offline/cached.
 - Perfect source separation. Stem quality depends on model choice and input material.
 - Full DAW/sample-editor UX. This pass creates the workstation foundation; detailed editing is next.

 - Realtime Demucs. It is not realistic for this use-case and should remain offline/cached.
 - Perfect source separation. Stem quality depends on model choice and input material.
 - Full DAW/sample-editor UX. This pass creates the workstation foundation; detailed editing is next.
+## Interactive supervised UX features
+| Area | Feature | Status | Notes |
+|---|---|---|---|
+| Supervision | Supplied UX docs embedded | Implemented | Added and aligned under `docs/interactive-ux/`. |
+| Supervision | Persistent semantic state | Implemented | `supervision_state.json` is created beside each run manifest. |
+| Supervision | Hit/cluster state model | Implemented | State tracks current cluster assignment, confidence, suppression, favorite/review flags, and representatives. |
+| Supervision | Constraint store | Implemented | Stores `force-cluster`, `must-link`, `cannot-link`, `lock-cluster`, `suppress-pattern`, and `pin-representative`. |
+| Supervision | Event log | Implemented | State changes append replay/audit events. |
+| Supervision | Undo stack | Implemented | Last semantic edit can be undone. |
+| Supervision | Confidence scoring | Partial | Heuristic and deterministic; does not yet use cached mel/transient feature margins. |
+| Supervision | Outlier-first review queue | Implemented | UI prioritizes low-confidence/singleton/unstable hits. |
+| Supervision | Move hit to cluster | Implemented | Creates supervision constraints and may produce suggestions. |
+| Supervision | Pull hit into new cluster | Implemented | Creates a user cluster and cannot-link/force-cluster constraints. |
+| Supervision | Lock cluster | Implemented | Lock state persists and updates confidence/UI. |
+| Supervision | Suppress hit as bleed | Implemented | Marks hit suppressed, stores suppress-pattern, may suggest similar suppressions. |
+| Supervision | Favorite representative | Partial | Pins semantic representative; supervised export does not yet honor it. |
+| Supervision | Suggestion inbox | Partial | Move/split/suppress suggestions can be accepted/rejected; exact diff preview is not implemented. |
+| Supervision | Cluster explanation | Implemented | Backend and UI show confidence reasons, label distribution, outliers, and constraints. |
+| Supervision | Edited artifact re-export | Not implemented | Semantic edits do not yet regenerate sample WAVs, MIDI, reconstruction, or ZIP. |
+| Supervision | Force-onset from waveform | Not implemented | Waveform click currently auditions nearest existing hit only. |

docs/PROGRESS.md CHANGED

@@ -81,3 +81,67 @@ Completed in this pass:
 Outcome:
 The app now supports a real review loop for inspecting what the onset detector and clustering produced. Users can audition individual detected slices, representative samples, stem audio, and reconstruction audio from one screen. Progress updates are lower-latency and less wasteful via SSE while still remaining robust in browsers that need polling fallback.

 Outcome:
 The app now supports a real review loop for inspecting what the onset detector and clustering produced. Users can audition individual detected slices, representative samples, stem audio, and reconstruction audio from one screen. Progress updates are lower-latency and less wasteful via SSE while still remaining robust in browsers that need polling fallback.
+## Pass 4: interactive supervised UX foundation
+Completed in this pass:
+1. Added the supplied interactive UX document set under `docs/interactive-ux/`.
+2. Read and aligned the UX documents with the project as currently implemented.
+3. Added `supervised_state.py` for persistent semantic state beside each completed run manifest.
+4. Added `supervision_state.json` generation after each successful extraction.
+5. Added state schema for hits, clusters, constraints, events, suggestions, confidence, review queue, and undo snapshots.
+6. Added supervised editing endpoints:
+   - `GET /api/jobs/{job_id}/state`
+   - `POST /api/jobs/{job_id}/hits/{hit_id}/move`
+   - `POST /api/jobs/{job_id}/hits/{hit_id}/pull-out`
+   - `POST /api/jobs/{job_id}/hits/{hit_id}/suppress`
+   - `POST /api/jobs/{job_id}/hits/{hit_id}/review`
+   - `POST /api/jobs/{job_id}/clusters/{cluster_id}/lock`
+   - `GET /api/jobs/{job_id}/suggestions`
+   - `POST /api/jobs/{job_id}/suggestions/{suggestion_id}/accept`
+   - `POST /api/jobs/{job_id}/suggestions/{suggestion_id}/reject`
+   - `GET /api/jobs/{job_id}/explain/cluster/{cluster_id}`
+   - `POST /api/jobs/{job_id}/undo`
+7. Added an interactive supervision UI panel with:
+   - state summary,
+   - selected-hit actions,
+   - target cluster picker,
+   - outlier-first review queue,
+   - cluster board,
+   - suggestion inbox,
+   - constraint/event log,
+   - cluster explanation drawer.
+8. Added `scripts/test_interactive_supervision.py` to verify the supervised API loop.
+Outcome:
+The app is now an extraction and supervised-review workstation at the semantic-state level. User corrections are persisted as constraints/events and can be inspected, suggested from, and undone. The next required step is edited-state export so these decisions affect downloadable artifacts.
+## Current assessment after Pass 4
+The project now satisfies the first interactive UX milestone for replayable supervised state:
+```text
+analyze audio
+→ inspect hits/clusters
+→ move/pull/suppress/favorite/lock
+→ persist constraints/events
+→ update confidence and review queue
+→ generate/accept/reject suggestions
+→ explain clusters
+→ undo semantic edits
+→ reload completed run with decisions intact
+```
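The loop above depends on edits being stored as replayable events while the batch manifest stays immutable. A sketch of what replay could look like, not part of the commit: the event type names `move-hit` and `suppress-hit`, the event fields, and the `replay` helper are all assumptions, since the commit's actual event schema is not shown in this diff view.

```python
from typing import Any


def replay(
    initial_assignments: dict[str, str],
    events: list[dict[str, Any]],
) -> tuple[dict[str, str], set[str]]:
    """Rebuild cluster assignments and suppressions from an event log (hypothetical schema)."""
    assignments = dict(initial_assignments)  # copy so the manifest-derived input is untouched
    suppressed: set[str] = set()
    for event in events:
        kind = event["type"]
        if kind == "move-hit":
            assignments[event["hit_id"]] = event["target_cluster_id"]
        elif kind == "suppress-hit":
            suppressed.add(event["hit_id"])
        # Unknown event types are skipped so older logs stay replayable.
    return assignments, suppressed


manifest_assignments = {"hit:00003": "cluster:1", "hit:00007": "cluster:2"}
event_log = [
    {"type": "move-hit", "hit_id": "hit:00003", "target_cluster_id": "cluster:0"},
    {"type": "suppress-hit", "hit_id": "hit:00007"},
]
assignments, suppressed = replay(manifest_assignments, event_log)
print(assignments, suppressed)
```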
+It does not yet satisfy the full workstation loop because edited semantic state is not yet rendered into updated sample WAVs, MIDI, reconstruction, or ZIP output.
+## Next recommended pass after Pass 4
+1. Add supervised re-export endpoint.
+2. Exclude suppressed hits from supervised exports.
+3. Honor favorite/pinned representatives in supervised sample WAVs.
+4. Add force-onset endpoint using cached `stem.wav`.
+5. Add add-onset mode to the waveform UI.
+6. Add restore suppressed hit and batch restore.
+7. Add feature-vector cache for true local reclustering.

docs/REMAINING_WORK.md CHANGED

@@ -33,3 +33,24 @@ The project is now a usable extraction workstation, not a complete interactive s
 5. Add lower-level progress hooks inside expensive stages where practical.
 6. Convert frontend to TypeScript and add UI tests.
 7. Add an in-app benchmark/parameter profile panel.

 5. Add lower-level progress hooks inside expensive stages where practical.
 6. Convert frontend to TypeScript and add UI tests.
 7. Add an in-app benchmark/parameter profile panel.
+## Remaining after interactive UX foundation
+Completed since the previous remaining-work snapshot:
+- Supplied `docs/interactive-ux` document set embedded and aligned.
+- Persistent supervised state added via `supervised_state.py` and `supervision_state.json`.
+- Constraint store and event log added.
+- Hit/cluster confidence and outlier-first review queue added.
+- Move hit, pull-out, lock/unlock, suppress, review/favorite, suggestions, explanations, and undo endpoints added.
+- Interactive supervision UI panel added.
+Highest-priority remaining work now:
+1. **Supervised artifact export**: regenerate edited sample WAVs, MIDI, reconstruction, manifest, and ZIP from `supervision_state.json` without rerunning Demucs/onset detection.
+2. **Force-onset correction**: add an onset by clicking/shift-clicking the waveform, slice from cached `stem.wav`, classify, assign, and store a `force-onset` constraint.
+3. **Suppression restore**: restore suppressed hits individually and in batches.
+4. **Real constrained local reclustering**: cache hit feature vectors and recompute affected neighborhoods after edits.
+5. **Suggestion diff preview**: show exact before/after membership changes before accepting a suggestion.
+6. **Constraint violation detection**: explicitly report conflicting user constraints.
+7. **Frontend tests and TypeScript migration**: harden the increasingly stateful UI.

docs/TASKS.md CHANGED

@@ -58,3 +58,32 @@ Last updated: 2026-05-12
 - [ ] Add side-by-side run comparison.
 - [ ] Convert frontend to TypeScript with a small Vite build once UX stabilizes.
 - [ ] Add automated browser-level UI tests.

+## Interactive UX continuation tasks
+| Task | Status | Evidence |
+|---|---:|---|
+| Add supplied interactive UX docs under `docs/interactive-ux/` | Done | `docs/interactive-ux/*.md`. |
+| Read and align UX docs with current implementation | Done | Status sections updated in every interactive UX document. |
+| Add persistent semantic job state | Done | `supervised_state.py`, `supervision_state.json`. |
+| Add event log and constraint store | Done | `supervised_state.py`; tested by `scripts/test_interactive_supervision.py`. |
+| Add hit/cluster confidence and review queue | Done/Partial | Heuristic confidence and review queue implemented; feature-margin confidence remains open. |
+| Add move hit to cluster | Done | `POST /api/jobs/{job_id}/hits/{hit_id}/move`. |
+| Add pull hit into new cluster | Done | `POST /api/jobs/{job_id}/hits/{hit_id}/pull-out`. |
+| Add cluster lock/unlock | Done | `POST /api/jobs/{job_id}/clusters/{cluster_id}/lock`. |
+| Add suppress hit as bleed/noise | Done | `POST /api/jobs/{job_id}/hits/{hit_id}/suppress`. |
+| Add accept/favorite hit action | Done/Partial | `POST /api/jobs/{job_id}/hits/{hit_id}/review`; artifact re-export still open. |
+| Add suggestion inbox | Done/Partial | UI/API supports accept/reject; exact diff preview still open. |
+| Add cluster explanation drawer | Done | `GET /api/jobs/{job_id}/explain/cluster/{cluster_id}` plus UI drawer. |
+| Add semantic undo | Done | `POST /api/jobs/{job_id}/undo`. |
+| Add supervised export from edited state | Todo | Needed so corrections affect ZIP/MIDI/WAV outputs. |
+| Add click-to-add missed onset | Todo | Needed for `force-onset` constraints and direct onset correction. |
+| Add suppressed-hit restore | Todo | Needed as the safety counterpart to suppression. |
+| Add true local feature-neighborhood reclustering | Todo | Requires cached feature vectors and constraint-aware assignment. |
+## Latest validation tasks
+- [x] `python3 -m py_compile app.py pipeline_runner.py sample_extractor.py supervised_state.py scripts/*.py`
+- [x] `node --check web/app.js`
+- [x] `python3 scripts/test_sse_and_review_hits.py`
+- [x] `python3 scripts/test_interactive_supervision.py`

docs/interactive-ux/ARCHITECTURE_NOTES.md ADDED Viewed

	@@ -0,0 +1,204 @@

+# Architecture notes for supervised interactive extraction
+## Required shift
+Original extraction flow:
+```text
+audio → pipeline → result artifacts
+```
+Current interactive foundation:
+```text
+audio/cache → immutable manifest/artifacts → supervision_state.json → reactive UI → user constraints/events/suggestions
+```
+The current implementation deliberately keeps the batch extraction artifacts immutable. Interactive edits mutate `supervision_state.json`, not the original `manifest.json`, hit WAVs, representative WAVs, MIDI, reconstruction, or ZIP. This keeps edits cheap and reversible, but supervised re-export is the next architectural step.
+## Implemented modules
+| Module/file | Responsibility |
+|---|---|
+| `pipeline_runner.py` | Batch extraction, timing, manifests, review-hit WAV exports |
+| `sample_extractor.py` | Audio analysis, classification, batch/online clustering, export helpers |
+| `supervised_state.py` | Persistent semantic job state, constraints, events, confidence, suggestions, undo |
+| `app.py` | FastAPI endpoints for batch jobs and supervised state mutations |
+| `web/app.js` | Browser state rendering, review queue, cluster board, suggestions, actions |
+| `web/index.html` | Workstation layout and interactive supervision panel |
+| `web/styles.css` | Visual treatment for low confidence, suppression, locks, panels |
+## Core entities as implemented
+`supervised_state.py` stores JSON dictionaries equivalent to these shapes:
+```ts
+type Hit = {
+  id: string;
+  index: number;
+  label: string;
+  cluster_id: string;
+  original_cluster_id: string;
+  cluster_label: string;
+  onset_sec: number;
+  duration_ms: number;
+  rms_energy: number;
+  spectral_centroid_hz: number;
+  file: string;
+  is_representative: boolean;
+  source: "detected" | "forced";
+  suppressed: boolean;
+  favorite: boolean;
+  review_status: "unreviewed" | "accepted" | "favorite" | "suppressed";
+  confidence: number;
+  confidence_reasons: string[];
+  explicit: boolean;
+};
+type Cluster = {
+  id: string;
+  label: string;
+  classification: string;
+  hit_ids: string[];
+  representative_hit_id: string | null;
+  locked: boolean;
+  user_named: boolean;
+  confidence: number;
+  confidence_reasons: string[];
+  suppressed_count: number;
+  original_id: string | null;
+};
+type Constraint =
+  | { id: string; type: "must-link"; a: string; b: string; source: string }
+  | { id: string; type: "cannot-link"; a: string; b: string; source: string }
+  | { id: string; type: "force-cluster"; hit_id: string; cluster_id: string; source: string }
+  | { id: string; type: "lock-cluster"; cluster_id: string; locked: boolean; source: string }
+  | { id: string; type: "suppress-pattern"; example_hit_id: string; reason: string; source: string }
+  | { id: string; type: "pin-representative"; hit_id: string; cluster_id: string; source: string };
+```
+## Current state file
+Each completed run now gets:
+```text
+.runs/<job_id>/output/manifest.json
+.runs/<job_id>/output/supervision_state.json
+```
+`manifest.json` is the immutable batch result. `supervision_state.json` is the mutable, replayable semantic state.
+## Implemented API additions
+```text
+GET    /api/jobs/{job_id}/state
+POST   /api/jobs/{job_id}/hits/{hit_id}/move
+POST   /api/jobs/{job_id}/hits/{hit_id}/pull-out
+POST   /api/jobs/{job_id}/hits/{hit_id}/suppress
+POST   /api/jobs/{job_id}/hits/{hit_id}/review
+POST   /api/jobs/{job_id}/clusters/{cluster_id}/lock
+GET    /api/jobs/{job_id}/suggestions
+POST   /api/jobs/{job_id}/suggestions/{suggestion_id}/accept
+POST   /api/jobs/{job_id}/suggestions/{suggestion_id}/reject
+GET    /api/jobs/{job_id}/explain/cluster/{cluster_id}
+POST   /api/jobs/{job_id}/undo
+```
+## Current confidence scoring
+Initial confidence is heuristic and deterministic. It combines:
+- cluster size,
+- cluster label purity,
+- representative presence,
+- lock state,
+- hit label agreement with cluster label,
+- energy rank,
+- rough duration reasonableness,
+- representative/favorite/explicit assignment state,
+- suppression state.
+This is good enough to drive the review queue but should be replaced or supplemented by cached feature-vector margins.
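A minimal sketch of how a few of the listed signals might combine into one deterministic score. The weights, the base value, and the name `hit_confidence` are assumptions for illustration; the actual formula in `supervised_state.py` is not shown here and may differ.

```python
from typing import List, Tuple


def hit_confidence(hit: dict, cluster: dict) -> Tuple[float, List[str]]:
    """Return (score in [0, 1], reasons). All weights are hypothetical."""
    if hit["suppressed"]:
        return 0.0, ["suppressed"]

    score = 0.5
    reasons: List[str] = []

    # Hit label agreement with the cluster label.
    if hit["label"] == cluster["classification"]:
        score += 0.2
        reasons.append("label agrees with cluster")
    else:
        score -= 0.2
        reasons.append("label disagrees with cluster")

    # Cluster size: singletons are less trustworthy.
    if len(cluster["hit_ids"]) == 1:
        score -= 0.15
        reasons.append("singleton cluster")

    # Lock state.
    if cluster["locked"]:
        score += 0.1
        reasons.append("cluster locked")

    # Representative / favorite / explicit assignment state.
    if hit["is_representative"] or hit["favorite"] or hit["explicit"]:
        score += 0.2
        reasons.append("user-confirmed or representative")

    return round(min(1.0, max(0.0, score)), 2), reasons
```

Returning the reasons alongside the score matches the `confidence_reasons` field on `Hit` and `Cluster`.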
+## Current suggestion engine
+Implemented suggestion types:
+```ts
+type Suggestion =
+  | { type: "move-hits"; hit_ids: string[]; target_cluster_id: string; confidence: number; reason: string }
+  | { type: "split-hits"; hit_ids: string[]; source_cluster_id: string; target_cluster_id: string; confidence: number; reason: string }
+  | { type: "suppress-hits"; hit_ids: string[]; confidence: number; reason: string };
+```
+Suggestion generation currently uses label, spectral centroid, and RMS-energy similarity. Accepted suggestions become explicit constraints/examples.
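The similarity test could look roughly like the sketch below, which proposes a `move-hits` suggestion after one hit has been moved. The tolerances, the fixed confidence value, and the function name `suggest_similar_moves` are illustrative assumptions, not values from the implementation.

```python
from typing import List, Optional


def _close(a: float, b: float, tol: float) -> bool:
    """Relative closeness check; `tol` is a fraction of the larger value."""
    return abs(a - b) <= tol * max(abs(a), abs(b), 1e-9)


def suggest_similar_moves(
    moved: dict,
    hits: List[dict],
    target_cluster_id: str,
    centroid_tol: float = 0.15,  # hypothetical tolerance
    rms_tol: float = 0.25,       # hypothetical tolerance
) -> Optional[dict]:
    """Propose moving hits that resemble `moved` into the same cluster."""
    hit_ids = [
        h["id"]
        for h in hits
        if h["id"] != moved["id"]
        and not h["suppressed"]
        and not h["explicit"]  # never override explicit user assignments
        and h["cluster_id"] != target_cluster_id
        and h["label"] == moved["label"]
        and _close(h["spectral_centroid_hz"], moved["spectral_centroid_hz"], centroid_tol)
        and _close(h["rms_energy"], moved["rms_energy"], rms_tol)
    ]
    if not hit_ids:
        return None
    return {
        "type": "move-hits",
        "hit_ids": hit_ids,
        "target_cluster_id": target_cluster_id,
        "confidence": 0.5,  # placeholder value
        "reason": f"similar label, centroid, and energy to {moved['id']}",
    }
```

The function only returns a suggestion; applying it stays a separate, user-approved step.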
+## Event log
+Examples now emitted:
+```text
+job.state.created
+constraint.created
+hit.moved
+hit.pulled_out
+cluster.locked
+cluster.unlocked
+hit.suppressed
+hit.reviewed
+suggestion.created
+suggestion.accepted
+suggestion.rejected
+state.undo
+```
+The UI renders recent events and constraints in the supervision panel.
+## Local recomputation boundary
+Implemented now:
+```text
+semantic edit
+→ update hit/cluster membership in supervision_state.json
+→ append constraints/events
+→ generate heuristic suggestions
+→ recompute hit/cluster confidence and review queue
+```
+Not implemented yet:
+```text
+semantic edit
+→ load cached feature vectors
+→ choose affected neighborhood
+→ run constrained local reclustering
+→ create preview diff
+→ optionally apply/re-export artifacts
+```
+## UI state implications
+Implemented panels:
+- waveform + onset audition,
+- representative samples,
+- detected hit review,
+- outlier-first review queue,
+- cluster board,
+- suggestion inbox,
+- constraint/history inspector,
+- cluster explanation drawer,
+- export/download panel for the original batch run.
+Still missing:
+- edited export panel,
+- force-onset mode,
+- suppression restore UI,
+- side-by-side before/after diff preview.
+## Implementation warning
+Automatic propagation must stay conservative. The current implementation follows this by creating suggestions rather than silently moving/suppressing batches. Every semantic edit is undoable.
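Undo for semantic edits can be pictured as a snapshot stack. The sketch below is an assumption about the general shape, not the code in `supervised_state.py`; the class name `UndoableState` and the depth limit are hypothetical.

```python
import copy
from typing import Callable, List


class UndoableState:
    """Keeps deep-copied snapshots so every semantic edit can be reverted."""

    def __init__(self, state: dict, max_depth: int = 50) -> None:
        self.state = state
        self._snapshots: List[dict] = []
        self._max_depth = max_depth  # hypothetical cap on stored snapshots

    def apply(self, mutate: Callable[[dict], None]) -> None:
        """Snapshot the current state, then run the edit in place."""
        self._snapshots.append(copy.deepcopy(self.state))
        if len(self._snapshots) > self._max_depth:
            self._snapshots.pop(0)  # drop the oldest snapshot
        mutate(self.state)

    def undo(self) -> bool:
        """Restore the latest snapshot; return False if nothing to undo."""
        if not self._snapshots:
            return False
        self.state = self._snapshots.pop()
        return True
```

Full snapshots are simple and safe for small JSON states; an event-replay undo would use less memory but needs an inverse for every event type.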

docs/interactive-ux/FEASIBILITY_MATRIX.md ADDED Viewed

	@@ -0,0 +1,100 @@

+# Feature feasibility matrix
+## Scoring convention
+All scores are between `0.0` and `1.0`.
+For positive dimensions, higher is better:
+- **Feasibility**: likelihood this can be implemented robustly with the current project direction.
+- **Value UX**: how much nicer/faster the application feels.
+- **Value quality**: expected improvement in extraction correctness.
+- **Realtime fit**: how well the interaction can update from cached features without full reruns.
+- **MVP fit**: whether it belongs in the first serious interactive version.
+For cost dimensions, higher is harder/more expensive:
+- **Complexity**: algorithmic/product complexity.
+- **Effort**: implementation effort.
+## Matrix
+| Interaction | Feasibility | Complexity | Effort | Value UX | Value quality | Realtime fit | MVP fit | Verdict |
+|---|---:|---:|---:|---:|---:|---:|---:|---|
+| Move sample to cluster → auto-tune clustering | 0.90 | 0.55 | 0.45 | 0.95 | 0.90 | 0.85 | 0.95 | Build early |
+| Pull sample out → protect distinction | 0.90 | 0.55 | 0.45 | 0.90 | 0.90 | 0.85 | 0.95 | Build early |
+| Lock confirmed cluster identity | 0.95 | 0.25 | 0.20 | 0.80 | 0.75 | 0.95 | 1.00 | Build immediately |
+| Outlier-first review queue | 0.95 | 0.30 | 0.25 | 0.90 | 0.80 | 0.95 | 1.00 | Build immediately |
+| Low-confidence visual emphasis | 0.95 | 0.25 | 0.20 | 0.85 | 0.65 | 0.95 | 0.95 | Build immediately |
+| Bleed brush/suppression | 0.85 | 0.55 | 0.50 | 0.90 | 0.85 | 0.80 | 0.85 | Build early |
+| Click-to-add missed onset | 0.90 | 0.45 | 0.40 | 0.90 | 0.80 | 0.90 | 0.90 | Build early |
+| Cluster naming as reusable semantic hint | 0.75 | 0.65 | 0.55 | 0.75 | 0.65 | 0.70 | 0.40 | Useful, not first |
+| Star/favorite sample → optimize around it | 0.95 | 0.35 | 0.30 | 0.75 | 0.70 | 0.90 | 0.75 | Build early |
+| Explain this cluster | 0.85 | 0.50 | 0.45 | 0.80 | 0.65 | 0.85 | 0.80 | Build early |
+| Live counterfactual parameter previews | 0.80 | 0.70 | 0.65 | 0.90 | 0.75 | 0.75 | 0.55 | High value, later |
+| Temporal pattern supervision | 0.75 | 0.70 | 0.65 | 0.80 | 0.85 | 0.70 | 0.45 | Later |
+| Reconstruction-error-driven correction | 0.70 | 0.80 | 0.75 | 0.85 | 0.90 | 0.55 | 0.35 | Later, powerful |
+| Multi-resolution semantic clustering | 0.80 | 0.70 | 0.65 | 0.85 | 0.80 | 0.75 | 0.50 | Later |
+| Auto-clean this family | 0.80 | 0.60 | 0.55 | 0.75 | 0.80 | 0.75 | 0.50 | Later |
+| Drag clusters in semantic space | 0.70 | 0.75 | 0.75 | 0.90 | 0.65 | 0.65 | 0.30 | Cool, not first |
+| Cluster gravity / physics metaphor | 0.60 | 0.80 | 0.80 | 0.80 | 0.50 | 0.60 | 0.20 | Risky novelty |
+| Context-aware classification | 0.65 | 0.80 | 0.75 | 0.65 | 0.80 | 0.55 | 0.25 | Researchy |
+| Teach mode across songs | 0.70 | 0.85 | 0.80 | 0.85 | 0.85 | 0.60 | 0.25 | Later platform feature |
+| Predictive batch questions | 0.85 | 0.55 | 0.50 | 0.90 | 0.80 | 0.80 | 0.70 | Build after uncertainty scoring |
+## Highest ROI set
+| Rank | Feature | Why |
+|---:|---|---|
+| 1 | Lock confirmed cluster identity | Easy and prevents frustrating reclustering drift |
+| 2 | Outlier-first review queue | Huge UX gain from simple uncertainty ranking |
+| 3 | Move sample to cluster as supervision | Core differentiator; directly improves results |
+| 4 | Pull sample out / protect distinction | Required counterpart to positive supervision |
+| 5 | Click-to-add missed onset | Direct correction beats indirect threshold tweaking |
+| 6 | Bleed brush | Removes many false positives quickly |
+| 7 | Explain cluster | Makes the system debuggable and trustworthy |
+| 8 | Predictive batch questions | Multiplies the effect of each correction |
+| 9 | Counterfactual previews | Makes advanced tuning understandable |
+| 10 | Reconstruction-error correction | Very powerful, but architecturally heavier |
+## Technical conclusion
+The most feasible "magic" is not heavy ML. It is constraint-aware clustering, cached feature vectors, uncertainty scoring, and local recomputation.
+That foundation should be implemented before adding higher-risk semantic-space or personalized-model features.
+## Implementation alignment as of 2026-05-12
+| Interaction | Current status | Notes |
+|---|---|---|
+| Move sample to cluster → auto-tune clustering | partial | Implemented as semantic state mutation with `force-cluster`/`must-link` constraints and heuristic move suggestions. True constrained local reclustering is still open. |
+| Pull sample out → protect distinction | partial | Implemented as a new user cluster plus `cannot-link` and split suggestions. True constrained reclustering is still open. |
+| Lock confirmed cluster identity | done | Lock/unlock persists in `supervision_state.json` and appears in the UI. Replay into future reruns is still open. |
+| Outlier-first review queue | done | Implemented with heuristic confidence/priority. Feature-margin and reconstruction-impact ranking remain open. |
+| Low-confidence visual emphasis | done | Low-confidence and suppressed hits are visually distinguished in the hit table and review queue. |
+| Bleed brush/suppression | partial | Suppress selected hit and similar suppression suggestions are implemented. Region brush and restore are still open. |
+| Click-to-add missed onset | todo | A waveform click currently only auditions the nearest existing hit. |
+| Cluster naming as reusable semantic hint | todo | User clusters receive generated labels; explicit rename/semantic hinting is not implemented. |
+| Star/favorite sample → optimize around it | partial | Favorite pins representative in semantic state; artifact re-export does not yet honor it. |
+| Explain this cluster | done | Explanation endpoint and UI drawer are implemented. |
+| Predictive batch questions | partial | Suggestion inbox exists; exact diff previews and richer question phrasing are open. |
+| Live counterfactual parameter previews | todo | Not implemented. |
+| Reconstruction-error-driven correction | todo | Not implemented. |
+| Multi-resolution semantic clustering | todo | Not implemented. |
+| Auto-clean this family | todo | Not implemented. |
+| Drag clusters in semantic space | backlog | Not implemented. |
+| Cluster gravity / physics metaphor | backlog | Not implemented. |
+| Context-aware classification | backlog | Not implemented. |
+| Teach mode across songs | backlog | Not implemented. |
+## Revised highest-ROI next set
+| Rank | Feature | Why now |
+|---:|---|---|
+| 1 | Supervised re-export | Makes current semantic edits affect the downloadable sample pack. |
+| 2 | Force-onset from waveform | Adds the missing direct correction primitive for missed hits. |
+| 3 | Suppression restore | Required safety counterpart to suppression. |
+| 4 | Cached feature refs | Unlocks real local reclustering and better confidence. |
+| 5 | Diff preview for suggestions | Makes batch suggestions safer and more trustworthy. |
+| 6 | Constraint violation detection | Prevents silent conflicts once constraints become richer. |
+| 7 | Browser tests | Protects the increasingly stateful UI from regressions. |

docs/interactive-ux/FEATURE_REQUIREMENTS.md ADDED Viewed

	@@ -0,0 +1,162 @@

+# Interactive UX feature requirements
+## Goal
+Add interactions that make extraction faster and more accurate by converting user actions into reusable supervision signals.
+The application should progressively converge toward the user's intended drum vocabulary with minimal manual cleanup.
+## Success criteria
+- User corrections affect more than the single edited item when safe.
+- Explicit user intent is preserved across reclustering and reloads.
+- The system surfaces uncertain/high-leverage items before stable ones.
+- The user can understand why items are grouped or separated.
+- Cached stem/source analysis can be reused while downstream parameters update quickly.
+- The UI supports fast audition, correction, locking, and export loops.
+## Current implementation summary
+Implemented now:
+- Persistent semantic state per run: `supervision_state.json`.
+- Event log and constraint store.
+- Confidence-weighted hit/cluster state.
+- Outlier-first review queue.
+- Move hit to cluster.
+- Pull hit into new cluster.
+- Lock/unlock cluster.
+- Suppress hit as bleed/noise.
+- Accept/favorite selected hit.
+- Suggestion inbox with accept/reject.
+- Cluster explanation endpoint and UI drawer.
+- Undo for the last semantic edit.
+Partially implemented:
+- Local recomputation is currently semantic-state recomputation, not full feature-neighborhood reclustering.
+- Suggestions are heuristic and show only a preview count, not a full before/after diff.
+- Favorite/pin changes semantic representative but does not yet regenerate the sample pack.
+- Confidence scoring is heuristic, not feature-margin/stability based.
+Not implemented yet:
+- Click-to-add missed onset.
+- Restore suppressed hit.
+- Supervised re-export from edited state.
+- Counterfactual parameter previews.
+- Reconstruction-error correction.
+- Teach mode across songs.
+## Functional requirements and status
+### FR-001: Cluster move as positive supervision
+Status: **implemented as semantic-state edit**.
+When a user moves a hit/sample into a cluster, the backend creates `force-cluster` and, when a representative exists, `must-link`. State confidence is recomputed and similar hit suggestions may be generated.
+Remaining gap: true local feature-neighborhood reclustering and exact before/after diff preview.
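The move described in FR-001 can be sketched as a pure state mutation. The state layout (`hits` and `clusters` keyed by ID, plus `constraints` and `events` lists), the constraint ID scheme, and the name `move_hit` are assumptions made for this example.

```python
from typing import List


def move_hit(state: dict, hit_id: str, target_cluster_id: str) -> List[dict]:
    """Move a hit and record the resulting constraints and event."""
    hit = state["hits"][hit_id]
    source = state["clusters"][hit["cluster_id"]]
    target = state["clusters"][target_cluster_id]

    # Update membership in the semantic state only; no audio is rewritten.
    source["hit_ids"].remove(hit_id)
    target["hit_ids"].append(hit_id)
    if source["representative_hit_id"] == hit_id:
        source["representative_hit_id"] = source["hit_ids"][0] if source["hit_ids"] else None
    hit["cluster_id"] = target_cluster_id
    hit["explicit"] = True

    def new_id() -> str:
        # Hypothetical ID scheme: sequential within this state.
        return f"con-{len(state['constraints']) + len(created) + 1}"

    created: List[dict] = []
    created.append({"id": new_id(), "type": "force-cluster", "hit_id": hit_id,
                    "cluster_id": target_cluster_id, "source": "user"})

    # must-link is only created when the target has a representative.
    rep = target["representative_hit_id"]
    if rep and rep != hit_id:
        created.append({"id": new_id(), "type": "must-link",
                        "a": hit_id, "b": rep, "source": "user"})

    state["constraints"].extend(created)
    state["events"].append({"type": "hit.moved", "hit_id": hit_id,
                            "cluster_id": target_cluster_id})
    return created
```

Confidence recomputation and suggestion generation would run after this step.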
+### FR-002: Pull-out as negative supervision
+Status: **implemented as semantic-state edit**.
+Pulling a hit out creates a new user cluster, records a `cannot-link` to the source representative when possible, and stores a `force-cluster` assignment for the new cluster.
+Remaining gap: automatic split suggestions are heuristic and do not yet run constrained reclustering.
+### FR-003: Lock confirmed cluster identity
+Status: **implemented**.
+Clusters can be locked/unlocked through the API/UI. Lock state is persisted and influences confidence.
+Remaining gap: future full reruns do not yet replay locks into batch clustering.
+### FR-004: Outlier-first review queue
+Status: **implemented**.
+The backend returns a `review_queue` sorted by low confidence, singleton status, review status, and suppression state. The UI renders the queue and lets the user select/audition items.
+Remaining gap: expected-impact ranking should eventually use feature margin and reconstruction contribution.
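One way to express that ordering is a composite sort key, as in the sketch below. The precedence of the four criteria and the name `build_review_queue` are assumptions; the backend's actual ordering may weigh them differently.

```python
from typing import Dict, List


def build_review_queue(hits: List[dict], clusters: Dict[str, dict]) -> List[str]:
    """Return hit IDs ordered so the most review-worthy hits come first."""

    def priority(hit: dict) -> tuple:
        singleton = len(clusters[hit["cluster_id"]]["hit_ids"]) == 1
        return (
            hit["suppressed"],                     # suppressed hits go last
            hit["review_status"] != "unreviewed",  # unreviewed before reviewed
            hit["confidence"],                     # low confidence first
            not singleton,                         # singletons break ties
        )

    return [h["id"] for h in sorted(hits, key=priority)]
```

Because Python's sort is stable, hits with identical keys keep their original detection order.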
+### FR-005: Confidence-weighted visual emphasis
+Status: **implemented**.
+Hit rows display confidence and flags. Low-confidence rows get emphasis; suppressed rows visually recede.
+### FR-006: Click-to-add missed onset
+Status: **not implemented**.
+Current waveform click auditions the nearest existing hit only.
+Required next behavior:
+- Add `force-onset` constraint at selected time.
+- Slice candidate hit from cached `stem.wav`.
+- Classify and assign locally.
+- Store it as a forced hit in `supervision_state.json`.
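The slicing step might look like the sketch below, which works on an already-decoded mono sample sequence. The pre-roll and maximum length defaults, the `force-onset` constraint fields, and the name `force_onset` are all assumptions, since this behavior is not implemented yet.

```python
from typing import Sequence


def force_onset(
    samples: Sequence[float],
    sample_rate: int,
    onset_sec: float,
    pre_ms: float = 5.0,    # hypothetical pre-roll before the click position
    max_ms: float = 250.0,  # hypothetical maximum hit length
) -> dict:
    """Slice a candidate hit around a user-chosen onset time."""
    start = max(0, int(round((onset_sec - pre_ms / 1000.0) * sample_rate)))
    end = min(len(samples), start + int(round(max_ms / 1000.0 * sample_rate)))
    return {
        # Hypothetical constraint shape; only the type name comes from the docs.
        "constraint": {"type": "force-onset", "onset_sec": onset_sec, "source": "user"},
        "start_frame": start,
        "end_frame": end,
        "clip": samples[start:end],
    }
```

Classification and cluster assignment would then run on `clip`, and the resulting hit would carry `source: "forced"`.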
+### FR-007: Bleed brush / false-positive suppression
+Status: **implemented for selected hits; brush region not implemented**.
+Selected hits can be suppressed as bleed/noise. The system stores `suppress-pattern` and proposes similar suppressions.
+Remaining gap: region brush, restore, and supervised export exclusion.
+### FR-008: Favorite/pin sample optimization
+Status: **partial**.
+Favorite action records `pin-representative` and updates the semantic representative. Exported WAV/ZIP does not yet change.
+### FR-009: Explain cluster
+Status: **implemented**.
+Cluster explanation includes representative, hit counts, confidence reasons, label distribution, outliers, and relevant constraints.
+### FR-010: Predictive batch questions
+Status: **partial**.
+Suggestions exist for move/split/suppress patterns and can be accepted/rejected. They do not yet show exact diff previews.
+### FR-011: Live counterfactual parameter preview
+Status: **not implemented**.
+### FR-012: Reconstruction-error correction
+Status: **not implemented**.
+## Non-functional requirements and status
+### NFR-001: Reversibility
+Status: **implemented for semantic edits** via an undo snapshot stack.
+### NFR-002: Explainability
+Status: **partial**. Events, constraints, confidence reasons, and cluster explanations are visible. Suggestion diff previews are not yet implemented.
+### NFR-003: Local recomputation first
+Status: **partial**. Current recomputation is cheap semantic-state recomputation. True local feature reclustering remains open.
+### NFR-004: Cached preprocessing
+Status: **partial**. Stems/source loads are cached and hit audio is exported. Feature-vector caching is still open.
+### NFR-005: Deterministic replay
+Status: **partial**. Constraints/events persist and can be reloaded. A dedicated replay command/export pipeline is still open.
+### NFR-006: No silent override of explicit user intent
+Status: **implemented in current semantic layer**. Explicit moves, locks, suppressions, and favorites persist unless undone or explicitly changed.

docs/interactive-ux/PROGRESS.md ADDED Viewed

	@@ -0,0 +1,87 @@

+# Interactive UX progress
+## Last updated
+2026-05-12
+## Current phase
+Implementation is now in **Phase 1–3 foundation**: persistent state, constraints, events, confidence/review queue, and supervised cluster interactions are implemented at the semantic-state layer.
+## Completed in this pass
+| Item | Status | Notes |
+|---|---|---|
+| Add supplied docs to `docs/interactive-ux/` | done | All supplied Markdown docs were copied into this directory and aligned with the implemented project. |
+| Persistent job state | done | `supervision_state.json` is created beside `manifest.json` for each completed run. |
+| Hit/cluster state schema | done | Implemented in `supervised_state.py` using JSON-serializable dictionaries. |
+| Event log | done | State mutations append `job.state.created`, `constraint.created`, `hit.moved`, `hit.pulled_out`, `cluster.locked`, `hit.suppressed`, `suggestion.created`, etc. |
+| Constraint store | done | Supports `force-cluster`, `must-link`, `cannot-link`, `lock-cluster`, `suppress-pattern`, and `pin-representative`. |
+| Confidence scoring | partial | Heuristic scores based on cluster size, label agreement, energy rank, representative/favorite/explicit state, suppression, and lock state. Feature-vector margin scoring is not implemented yet. |
+| Outlier-first review queue | done | Backend computes `review_queue`; UI renders it and lets the user jump to the selected hit. |
+| Move hit to cluster | done | Endpoint creates constraints and updates state; it also proposes similar move suggestions. |
+| Pull hit into new cluster | done | Endpoint creates `cannot-link` and `force-cluster` constraints and a user cluster. |
+| Lock cluster | done | Endpoint toggles lock state and records a lock constraint. |
+| Suppress hit as bleed/noise | done | Endpoint creates `suppress-pattern`, marks the hit suppressed, and proposes similar suppressions. |
+| Favorite sample / pin representative | partial | Endpoint supports `review` status `favorite`, records `pin-representative`, and updates the representative hit in semantic state. Audio artifact selection is not re-exported yet. |
+| Suggestion inbox | partial | Open suggestions render in the UI and can be accepted/rejected. Suggestion generation is heuristic and limited to move/split/suppress patterns. |
+| Cluster explanation drawer | done | Endpoint and UI show representative, confidence reasons, outliers, relevant constraints, and label distribution. |
+| Undo | done | Last semantic edit can be restored using an undo snapshot stack. |
+| Validation script | done | Added `scripts/test_interactive_supervision.py`. |
+## Not yet implemented
+- Real cached feature-vector store for local reclustering.
+- Artifact re-export after semantic edits.
+- Waveform click-to-add missed onset.
+- Restore suppressed hit/batch restore.
+- Real local neighborhood reclustering that changes assignments beyond explicit move/suggestion acceptance.
+- Constraint violation detection and reporting.
+- Predictive diff preview before accepting suggestions.
+- Reconstruction-error-driven correction.
+- Multi-resolution/hierarchical clusters.
+- User correction profiles / teach mode across songs.
+- Frontend TypeScript migration and browser automation tests.
+## Current risks
+| Risk | Impact | Mitigation |
+|---|---|---|
+| Semantic edits do not rewrite exports yet | User may expect moved/suppressed hits to affect ZIP/MIDI immediately | Next task should be edited-state export. |
+| Confidence scores are heuristic | Review queue may sometimes prioritize the wrong hits | Add cached mel/transient features and margin-to-next-cluster scoring. |
+| Suggestions are simple | May over-suggest or under-suggest | Keep them previewable, explicit, and undoable; never silently apply. |
+| Locks are semantic only | Batch reruns do not yet replay constraints | Add deterministic replay/local recluster using constraints. |
+| No browser tests | UI regressions are easy to introduce | Add Playwright or lightweight DOM tests. |
+## Next implementation milestone
+Milestone: **edited-state export and force-onset correction**.
+Minimum deliverables:
+1. Add `POST /api/jobs/{job_id}/export/supervised` that creates a ZIP/MIDI/manifest from `supervision_state.json`.
+2. Exclude suppressed hits/clusters from the supervised export.
+3. Honor favorite/pinned representatives in exported samples.
+4. Add force-onset endpoint that slices a new hit from cached `stem.wav`.
+5. Add waveform shift-click or add-onset mode in the UI.
+6. Add tests proving semantic edits change the supervised export without rerunning stem extraction.
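Deliverables 2 and 3 amount to a selection step that runs before any audio is written. The sketch below plans the export from semantic state alone. The state layout and the fallback of picking the highest-confidence remaining hit are assumptions, and `plan_supervised_export` is a hypothetical name.

```python
from typing import List


def plan_supervised_export(state: dict) -> List[dict]:
    """Decide which clusters and hits a supervised export would contain."""
    hits = state["hits"]
    plan: List[dict] = []
    for cluster in state["clusters"].values():
        # Exclude suppressed hits; skip clusters left with nothing.
        active = [hid for hid in cluster["hit_ids"] if not hits[hid]["suppressed"]]
        if not active:
            continue
        # Honor the pinned/favorite representative when it is still active.
        rep = cluster["representative_hit_id"]
        if rep not in active:
            # Assumed fallback: best remaining hit by confidence.
            rep = max(active, key=lambda hid: hits[hid]["confidence"])
        plan.append({
            "cluster_id": cluster["id"],
            "representative_hit_id": rep,
            "hit_ids": active,
        })
    return plan
```

The plan can then drive WAV, MIDI, and ZIP writing from the existing review-hit files, so stem extraction and onset detection are not rerun.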
+## Definition of done for the current foundation
+This loop now works:
+```text
+analyze audio
+→ inspect clusters
+→ load semantic state
+→ move one wrong hit
+→ store constraints/events
+→ see confidence/review queue update
+→ lock corrected cluster
+→ suppress bleed
+→ inspect explanations and suggestions
+→ undo semantic edits
+→ reload the job and preserve explicit decisions
+```
+The remaining missing piece is that edited semantic state is not yet reflected in a regenerated sample pack.

docs/interactive-ux/README.md ADDED Viewed

	@@ -0,0 +1,59 @@

+# Interactive supervised extraction UX
+This directory contains the supplied interactive-UX design documents, aligned with the implementation as of 2026-05-12.
+The product direction is to turn drum sample extraction from a one-shot batch process into an interactive, supervised, progressively improving workflow. User edits should become constraints, preferences, and examples that improve clustering, review priority, labeling, and cleanup instead of only changing a visible table row.
+## Current implementation status
+The project now has a first supervised-editing foundation layered on top of the immutable extraction manifest:
+- `supervised_state.py` persists `supervision_state.json` beside each completed run manifest.
+- The state contains hits, clusters, confidence scores, review queue entries, constraints, events, suggestions, and undo snapshots.
+- The FastAPI backend exposes state, move, pull-out, lock, suppress, review/favorite, suggestion, explanation, and undo endpoints.
+- The browser UI includes an interactive supervision panel with a review queue, cluster board, suggestion inbox, constraint/event log, and cluster explanation drawer.
+- The current supervised layer updates semantic state only. It does not yet rewrite sample WAVs, MIDI, reconstruction audio, or the ZIP after edits.
+## Documents
+| Document | Purpose | Alignment status |
+|---|---|---|
+| [`FEATURE_REQUIREMENTS.md`](./FEATURE_REQUIREMENTS.md) | Functional requirements for interaction-driven quality improvements | Updated with implemented/partial/not-started status |
+| [`SCOPE.md`](./SCOPE.md) | MVP, v2, and research/backlog boundaries | Updated to reflect delivered MVP foundation |
+| [`FEASIBILITY_MATRIX.md`](./FEASIBILITY_MATRIX.md) | Feasibility, complexity, effort, UX value, quality value table | Kept as design prioritization with implementation notes |
+| [`TASKS.md`](./TASKS.md) | Implementation task breakdown with dependencies and acceptance criteria | Updated with current task statuses |
+| [`PROGRESS.md`](./PROGRESS.md) | Current status, completed work, open work, next actions | Updated after this development pass |
+| [`ARCHITECTURE_NOTES.md`](./ARCHITECTURE_NOTES.md) | Required state model and technical approach | Updated with actual module/API mapping |
+## Principle
+Good interaction:
+```text
+user moves one hit into another cluster
+→ system creates force-cluster/must-link constraints
+→ semantic state recomputes confidence and review priority
+→ similar ambiguous hits are proposed as suggestions
+→ the UI explains the state/event/constraint changes
+```
+Bad interaction:
+```text
+user moves one hit
+→ only that one visible DOM row changes
+→ future clustering ignores the correction
+```
+## Current build recommendation
+Next implementation should close the gap between semantic edits and artifact output:
+1. Re-export edited sample pack from `supervision_state.json` without rerunning Demucs/onset detection.
+2. Add waveform click-to-add missed onset backed by `force-onset` constraints.
+3. Add restore for suppressed hits, including hits suppressed through batch-accepted suggestions.
+4. Make move/pull-out trigger a real local reclustering pass using cached feature vectors.
+5. Add visible diff previews before accepting grouped suggestions.
+6. Add browser-level tests for the interactive supervision panel.
+The strongest technical foundation remains constraint-aware clustering plus uncertainty-driven review. The current pass implements the persistent state and UI/API shell needed for that foundation.

docs/interactive-ux/SCOPE.md ADDED Viewed

	@@ -0,0 +1,173 @@

+# Interactive UX scope
+## Scope model
+Features are grouped by implementation risk and product leverage.
+- **MVP**: should be built first; high value, technically feasible, foundational.
+- **V2**: valuable after the core supervised editing loop exists.
+- **Research/backlog**: plausible, but higher risk or dependent on stronger state/model infrastructure.
+## MVP scope and status
+### 1. Constraint-aware cluster editing
+Status: **partial / foundation implemented**.
+Implemented:
+- Move hit/sample to cluster.
+- Pull hit/sample out into a new cluster.
+- Create `must-link`, `cannot-link`, and `force-cluster` constraints.
+- Preserve explicit edits in `supervision_state.json` across reload.
+- Generate basic suggestions after edits.
+Remaining:
+- True local neighborhood reclustering from cached feature vectors.
+- Constraint violation detection.
+- Edited artifact re-export.
+### 2. Lock confirmed clusters
+Status: **implemented**.
+Implemented:
+- Lock/unlock cluster.
+- Persist lock state.
+- Show lock state in the cluster board and target cluster control.
+- Confidence scoring accounts for locked state.
+Remaining:
+- Prevent future batch reruns from violating locked cluster identity through deterministic replay.
+### 3. Outlier-first review queue
+Status: **implemented**.
+Implemented:
+- Confidence score per hit and cluster.
+- Review list sorted by uncertainty priority.
+- Quick actions: accept, favorite, move, pull out, suppress.
+Remaining:
+- Split/restore/batch actions.
+- Expected-impact scoring using feature margin and reconstruction contribution.
+### 4. Confidence-weighted UI
+Status: **implemented**.
+Implemented:
+- Confidence per hit and cluster.
+- Low-confidence row emphasis.
+- Suppressed rows recede.
+- State summary badges.
+### 5. Click-to-add missed onset
+Status: **not implemented**.
+Currently, waveform clicks only audition existing hits. An add-onset mode is the next direct correction primitive.
+### 6. Bleed suppression examples
+Status: **partial**.
+Implemented:
+- Mark selected hit as bleed/noise.
+- Store `suppress-pattern`.
+- Generate similar suppression suggestions.
+Remaining:
+- Region brush.
+- Restore suppressed hits.
+- Exclude suppressed hits from supervised export.
+### 7. Explain this cluster
+Status: **implemented**.
+Implemented:
+- Representative ID.
+- Members/outliers.
+- Confidence reasons.
+- Label distribution.
+- Relevant constraints.
+- Locked/suppressed counts.
+## V2 scope
+### 1. Predictive batch questions
+Status: **partial**.
+Basic suggestions exist. Exact diff previews, grouped approval UX, and richer reasoning are not complete.
+### 2. Live counterfactual previews
+Status: **not implemented**.
+### 3. Auto-clean this family
+Status: **not implemented**.
+### 4. Multi-resolution semantic clustering
+Status: **not implemented**.
+### 5. Run-to-run preference memory
+Status: **not implemented**.
+## Research/backlog scope
+Unchanged from design intent:
+- Reconstruction-error-driven correction.
+- Drag clusters in semantic space.
+- Cluster gravity / physics metaphor.
+- Context-aware classification.
+- Teach mode across songs.
+## Explicitly out of scope for MVP
+- Realtime Demucs.
+- Training neural models from scratch.
+- Cloud multi-user collaboration.
+- DAW plugin integration.
+- Full Ableton/Logic export format support.
+- Automatic perfect drum transcription claims.
+## MVP acceptance criteria status
+Target loop:
+```text
+cached stem is analyzed
+→ clusters appear
+→ uncertain items are prioritized
+→ user moves one wrong hit
+→ system stores a constraint
+→ affected state recomputes
+→ related mistakes are suggested
+→ locked clusters remain stable in state
+→ semantic decisions reload reproducibly
+```
+Status: **mostly achieved for semantic state**.
+Not achieved yet:
+```text
+→ edited state can be exported reproducibly as updated WAV/MIDI/ZIP artifacts
+→ local feature reclustering updates related hits without manually accepting suggestions
+```
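Review note: constraint violation detection is listed as remaining work in this scope. The sketch below shows one possible approach. It assumes constraints keep the shape used by `supervised_state.py` in this commit (`type`, plus `a` and `b` for link constraints). The function name `find_violations` is hypothetical. It merges `must-link` pairs with a union-find and reports any `cannot-link` whose endpoints end up in the same group.

```python
def find_violations(constraints):
    """Return cannot-link constraints contradicted by the must-link closure."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge every must-link pair into one group.
    for c in constraints:
        if c["type"] == "must-link":
            parent[find(c["a"])] = find(c["b"])

    # A cannot-link is violated when both ends share a group.
    return [
        c for c in constraints
        if c["type"] == "cannot-link" and find(c["a"]) == find(c["b"])
    ]
```

Other constraint types, such as `force-cluster`, are ignored here. A fuller check would probably also flag must-linked hits that are forced into different clusters.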

docs/interactive-ux/TASKS.md ADDED

	@@ -0,0 +1,105 @@

+# Interactive UX tasks
+## Status legend
+| Status | Meaning |
+|---|---|
+| `todo` | Not started |
+| `doing` | In progress |
+| `partial` | Implemented as a useful first version but not complete |
+| `done` | Completed for the current architecture |
+| `blocked` | Waiting on prerequisite decision/work |
+## Phase 0: Documentation and design capture
+| ID | Task | Status | Acceptance criteria |
+|---|---|---|---|
+| UX-000 | Capture feature requirements | done | Requirements documented in `FEATURE_REQUIREMENTS.md` |
+| UX-001 | Capture scope boundaries | done | MVP/V2/backlog documented in `SCOPE.md` |
+| UX-002 | Capture feasibility matrix | done | Scores documented in `FEASIBILITY_MATRIX.md` |
+| UX-003 | Capture implementation tasks | done | Task backlog documented here |
+| UX-004 | Capture progress | done | Progress documented in `PROGRESS.md` |
+| UX-005 | Align supplied docs with implemented project | done | All documents updated to reflect current code/API/UI behavior |
+## Phase 1: State model and persistence
+| ID | Task | Status | Acceptance criteria |
+|---|---|---|---|
+| UX-101 | Define persisted job state schema | done | `supervised_state.py` creates `supervision_state.json` with hits, clusters, constraints, events, suggestions, and undo state |
+| UX-102 | Add event log | done | User/system changes append events and the event log is stored with job output |
+| UX-103 | Add constraint store | done | `must-link`, `cannot-link`, `force-cluster`, `lock-cluster`, `suppress-pattern`, and `pin-representative` are saved |
+| UX-104 | Add artifact/cache references | partial | State stores hit audio file refs and manifest fingerprint; feature refs and content-addressed feature cache are not implemented yet |
+| UX-105 | Add deterministic replay command | todo | A job can be regenerated from source digest, params, and constraints |
+| UX-106 | Add semantic undo stack | done | `POST /api/jobs/{job_id}/undo` restores the previous semantic state snapshot |
+## Phase 2: Confidence and review queue
+| ID | Task | Status | Acceptance criteria |
+|---|---|---|---|
+| UX-201 | Compute hit assignment confidence | partial | Each hit has confidence from cluster confidence, active cluster size, label agreement, energy rank, duration, and representative/explicit/suppressed state; feature-vector margin is not implemented |
+| UX-202 | Compute cluster confidence | partial | Each cluster has confidence from purity, size, representative, lock state; true feature cohesion/stability is not implemented |
+| UX-203 | Add uncertainty ranking | done | Backend returns `review_queue` sorted by priority |
+| UX-204 | Add low-confidence UI emphasis | done | Hit rows and review queue emphasize low-confidence and suppressed items |
+| UX-205 | Add review quick actions | partial | Accept, favorite, move, pull-out, suppress are available; split/restore/batch actions are still incomplete |
+## Phase 3: Constraint-aware editing
+| ID | Task | Status | Acceptance criteria |
+|---|---|---|---|
+| UX-301 | Move hit to cluster endpoint | done | Moving a hit creates a `force-cluster` constraint, plus a `must-link` to the target cluster's representative when one exists and differs from the moved hit |
+| UX-302 | Pull hit into new cluster endpoint | done | Pulling a hit creates a new user cluster and a `force-cluster` constraint, plus a `cannot-link` to the source cluster's representative when one exists and differs from the pulled hit |
+| UX-303 | Lock cluster endpoint | done | Locked cluster state persists and is shown in the UI |
+| UX-304 | Local neighborhood recomputation | partial | Semantic state and confidence are recomputed; true cached feature-neighborhood reclustering is not implemented |
+| UX-305 | Constraint violation detection | todo | Backend reports attempted changes that violate user constraints |
+| UX-306 | Undo last interaction | done | User can reverse the previous semantic edit |
+| UX-307 | Suggest similar moves from a hit move | partial | Heuristic suggestions are generated using label, centroid, and energy similarity |
+## Phase 4: Missed onset and bleed interactions
+| ID | Task | Status | Acceptance criteria |
+|---|---|---|---|
+| UX-401 | Add force-onset endpoint | todo | User can add onset by time; hit is sliced/classified/clustered |
+| UX-402 | Add waveform click interaction | partial | Existing waveform clicks audition nearest hit; add-onset mode is not implemented |
+| UX-403 | Add bleed suppression endpoint | done | User can mark hit as bleed/noise and create a suppress-pattern example |
+| UX-404 | Implement similar-false-positive search | partial | Heuristic suppress suggestions are generated from energy/centroid/label similarity |
+| UX-405 | Add suppression restore | todo | Suppressed hits can be restored individually or in batches |
+## Phase 5: Suggestions and explanations
+| ID | Task | Status | Acceptance criteria |
+|---|---|---|---|
+| UX-501 | Add suggestion model | done | Suggestions have type, confidence, reason, preview count, status, accept/reject state |
+| UX-502 | Generate move suggestions from cluster edits | partial | Basic suggestions generated after move edits |
+| UX-503 | Generate split suggestions from cannot-link edits | partial | Basic split suggestions generated after pull-out edits |
+| UX-504 | Generate suppression suggestions | partial | Basic suppression suggestions generated after suppress edits |
+| UX-505 | Add suggestion inbox UI | done | Suggestions are visible and can be accepted/rejected |
+| UX-506 | Add explain-cluster endpoint | done | Backend returns representative, confidence reasons, outliers, constraints, label distribution |
+| UX-507 | Add explanation drawer UI | done | UI renders explanation JSON for the selected cluster |
+| UX-508 | Add diff previews before suggestion acceptance | todo | Suggestions show exact before/after cluster membership changes before acceptance |
+## Phase 6: Counterfactuals and advanced quality features
+| ID | Task | Status | Acceptance criteria |
+|---|---|---|---|
+| UX-601 | Add parameter diff estimator | todo | UI can preview approximate effect of parameter changes |
+| UX-602 | Add cached local parameter recompute | todo | Parameter changes reuse hit features where possible |
+| UX-603 | Add reconstruction error map | todo | Backend computes original-vs-reconstruction mismatch by region |
+| UX-604 | Add reconstruction correction workflow | todo | User can select bad region and get likely causes/fixes |
+| UX-605 | Add multi-resolution cluster hierarchy | todo | Clusters can be browsed coarse-to-fine |
+## Phase 7: Preference memory / teach mode
+| ID | Task | Status | Acceptance criteria |
+|---|---|---|---|
+| UX-701 | Persist user correction profiles | todo | Reusable preference profile stores accepted correction patterns |
+| UX-702 | Apply profile to new jobs | todo | New jobs can opt into prior preferences |
+| UX-703 | Add profile safety controls | todo | User can inspect, disable, delete, and scope learned preferences |
+| UX-704 | Evaluate profile impact | todo | Benchmarks compare profile-on vs profile-off results |
+## Current recommended next task
+Start with `UX-401` plus supervised export.
+Reason:
+The project now has a persisted state/events/constraints foundation designed for replay (the deterministic replay command itself, `UX-105`, is still `todo`). The largest UX gap is that semantic edits do not yet regenerate edited artifacts. Force-onset is the next direct correction primitive after move/pull/lock/suppress.
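Review note: the recommended next task pairs force-onset with supervised export. As a starting point for the export half, the sketch below derives an export plan from the persisted state. It assumes the `hits` and `clusters` dictionaries shaped as `build_initial_state` writes them in this commit. The function name `build_export_plan` and the plan layout are hypothetical. No audio is written. The sketch only decides which hit files a re-export would include.

```python
def build_export_plan(state):
    """Map each non-empty cluster to the hit files a re-export would include."""
    hits = state.get("hits", {})
    plan = []
    for cid, cluster in sorted(state.get("clusters", {}).items()):
        # Suppressed hits are excluded from supervised export.
        active = [
            hits[hid]
            for hid in cluster.get("hit_ids", [])
            if hid in hits and not hits[hid].get("suppressed")
        ]
        if not active:
            continue  # nothing to export for empty or fully suppressed clusters
        active_ids = [hit["id"] for hit in active]
        rep = cluster.get("representative_hit_id")
        if rep not in active_ids:
            rep = active_ids[0]  # fall back when the representative was suppressed or moved
        plan.append(
            {
                "cluster_id": cid,
                "label": cluster.get("label"),
                "representative_hit_id": rep,
                "files": [hit.get("file") for hit in active],
            }
        )
    return plan
```

Building the plan as plain data first keeps the export step replayable. The same state should always yield the same plan, and audio rendering can be layered on top later.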

scripts/test_interactive_supervision.py ADDED

	@@ -0,0 +1,112 @@

+#!/usr/bin/env python3
+"""Smoke-test manifest-backed interactive supervision endpoints."""
+from __future__ import annotations
+import io
+import json
+import sys
+import time
+from pathlib import Path
+from urllib.parse import quote
+import soundfile as sf
+from fastapi.testclient import TestClient
+sys.path.insert(0, str(Path(__file__).resolve().parents[1]))
+from app import app  # noqa: E402
+from synth_generator import generate_test_song  # noqa: E402
+def wait_for_job(client: TestClient, job_id: str) -> dict:
+    for _ in range(80):
+        payload = client.get(f"/api/jobs/{job_id}").json()
+        if payload["status"] in {"complete", "error"}:
+            return payload
+        time.sleep(0.15)
+    raise TimeoutError(f"job {job_id} did not finish within the polling window")
+def post_json(client: TestClient, path: str, body: dict | None = None) -> dict:
+    response = client.post(path, json=body or {})
+    response.raise_for_status()
+    return response.json()
+def main() -> int:
+    song = generate_test_song(pattern_name="funk", bars=1, bpm=124, add_bass=False)
+    buf = io.BytesIO()
+    sf.write(buf, song.drums_only, song.sr, format="WAV")
+    buf.seek(0)
+    client = TestClient(app)
+    response = client.post(
+        "/api/jobs",
+        files={"file": ("interactive.wav", buf, "audio/wav")},
+        data={"params": json.dumps({"stem": "all", "clustering_mode": "online_preview", "target_min": 3, "target_max": 10})},
+    )
+    response.raise_for_status()
+    job_id = response.json()["id"]
+    job = wait_for_job(client, job_id)
+    assert job["status"] == "complete", job.get("error")
+    state = client.get(f"/api/jobs/{job_id}/state").json()
+    assert state["summary"]["hit_count"] > 0
+    assert state["summary"]["cluster_count"] > 0
+    assert state["review_queue"], "expected uncertainty review queue"
+    hit_id = state["hits"][0]["id"]
+    cluster_id = state["clusters"][0]["id"]
+    q_hit = quote(hit_id, safe="")
+    q_cluster = quote(cluster_id, safe="")
+    state = post_json(client, f"/api/jobs/{job_id}/clusters/{q_cluster}/lock", {"locked": True})
+    assert state["summary"]["locked_cluster_count"] >= 1
+    state = post_json(client, f"/api/jobs/{job_id}/hits/{q_hit}/review", {"status": "favorite"})
+    assert state["summary"]["constraint_count"] >= 1
+    explanation = client.get(f"/api/jobs/{job_id}/explain/cluster/{q_cluster}")
+    explanation.raise_for_status()
+    assert explanation.json()["cluster_id"] == cluster_id
+    state = post_json(client, f"/api/jobs/{job_id}/hits/{q_hit}/pull-out", {})
+    assert state["summary"]["cluster_count"] >= 1
+    assert state["summary"]["undo_available"] is True
+    assert any(c["type"] in {"cannot-link", "force-cluster"} for c in state["constraints"])
+    state = post_json(client, f"/api/jobs/{job_id}/undo", {})
+    assert state["summary"]["hit_count"] > 0
+    if len(state["clusters"]) > 1:
+        target = next(c for c in state["clusters"] if c["id"] != state["hits"][0]["cluster_id"])
+        state = post_json(
+            client,
+            f"/api/jobs/{job_id}/hits/{q_hit}/move",
+            {"target_cluster_id": target["id"]},
+        )
+        assert any(c["type"] == "force-cluster" for c in state["constraints"])
+    if len(state["hits"]) > 1:
+        suppress_hit = quote(state["hits"][1]["id"], safe="")
+        state = post_json(client, f"/api/jobs/{job_id}/hits/{suppress_hit}/suppress", {"reason": "bleed"})
+        assert state["summary"]["suppressed_hit_count"] >= 1
+    suggestions = client.get(f"/api/jobs/{job_id}/suggestions")
+    suggestions.raise_for_status()
+    print(json.dumps({
+        "status": "ok",
+        "job_id": job_id,
+        "hit_count": state["summary"]["hit_count"],
+        "cluster_count": state["summary"]["cluster_count"],
+        "constraints": state["summary"]["constraint_count"],
+        "events": state["summary"]["event_count"],
+        "suggestions": state["summary"]["open_suggestion_count"],
+    }, indent=2))
+    return 0
+if __name__ == "__main__":
+    raise SystemExit(main())
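Review note: most assertions in this smoke test check summary counters. A structural check on each returned state payload could catch dangling references as well. The sketch below assumes only the fields the script already reads (`hits[*].id`, `hits[*].cluster_id`, `clusters[*].id`). The helper name `check_state_invariants` is hypothetical.

```python
def check_state_invariants(state):
    """Return a list of problems found in a state payload (empty when consistent)."""
    problems = []
    cluster_ids = {cluster["id"] for cluster in state["clusters"]}
    seen = set()
    for hit in state["hits"]:
        if hit["id"] in seen:
            problems.append(f"duplicate hit id: {hit['id']}")
        seen.add(hit["id"])
        if hit["cluster_id"] not in cluster_ids:
            problems.append(f"{hit['id']} references unknown cluster {hit['cluster_id']}")
    return problems
```

It could be called after each `post_json(...)` in `main()`, for example `assert not check_state_invariants(state)`, so that a move, pull-out, or undo which leaves a hit pointing at a missing cluster fails the test.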

supervised_state.py ADDED

	@@ -0,0 +1,675 @@

+#!/usr/bin/env python3
+"""Persistent supervised-editing state for interactive extraction jobs.
+The extraction pipeline produces immutable audio artifacts and a batch manifest.
+This module layers replayable semantic state on top of that manifest: hits,
+clusters, constraints, events, suggestions, confidence, and undo snapshots.
+The first implementation intentionally avoids rewriting audio artifacts. It makes
+supervised edits cheap, explicit, inspectable, and reproducible, then leaves
+artifact re-export as a later step.
+"""
+from __future__ import annotations
+import copy
+import json
+import math
+import time
+import uuid
+from pathlib import Path
+from typing import Any, Callable
+STATE_VERSION = "interactive-state-v1"
+STATE_FILENAME = "supervision_state.json"
+MAX_UNDO = 30
+def now() -> float:
+    return round(time.time(), 6)
+def state_path(output_dir: str | Path) -> Path:
+    return Path(output_dir) / STATE_FILENAME
+def manifest_path(output_dir: str | Path) -> Path:
+    return Path(output_dir) / "manifest.json"
+def load_manifest(output_dir: str | Path) -> dict[str, Any]:
+    path = manifest_path(output_dir)
+    if not path.exists():
+        raise FileNotFoundError(f"manifest.json not found in {Path(output_dir)}")
+    return json.loads(path.read_text(encoding="utf-8"))
+def _hit_id(hit: dict[str, Any]) -> str:
+    return f"hit:{int(hit.get('index', 0)):05d}"
+def _cluster_id(raw: Any) -> str:
+    return f"cluster:{raw}"
+def _base_label(label: str) -> str:
+    text = str(label or "other")
+    return text.rsplit("_", 1)[0] if "_" in text else text
+def _new_id(prefix: str) -> str:
+    return f"{prefix}:{uuid.uuid4().hex[:10]}"
+def _safe_float(value: Any, default: float = 0.0) -> float:
+    try:
+        out = float(value)
+        if math.isfinite(out):
+            return out
+    except Exception:
+        pass
+    return default
+def _snapshot(state: dict[str, Any]) -> dict[str, Any]:
+    snap = copy.deepcopy(state)
+    snap["undo_stack"] = []
+    return snap
+def _push_undo(state: dict[str, Any]) -> None:
+    stack = list(state.get("undo_stack") or [])
+    stack.append(_snapshot(state))
+    del stack[:-MAX_UNDO]
+    state["undo_stack"] = stack
+def _event(state: dict[str, Any], event_type: str, payload: dict[str, Any] | None = None, source: str = "system") -> dict[str, Any]:
+    event = {
+        "id": _new_id("event"),
+        "type": event_type,
+        "source": source,
+        "created_at": now(),
+        "payload": payload or {},
+    }
+    state.setdefault("events", []).append(event)
+    return event
+def _constraint(state: dict[str, Any], constraint_type: str, payload: dict[str, Any], source: str = "user") -> dict[str, Any]:
+    constraint = {
+        "id": _new_id("constraint"),
+        "type": constraint_type,
+        "source": source,
+        "created_at": now(),
+        **payload,
+    }
+    state.setdefault("constraints", []).append(constraint)
+    _event(state, "constraint.created", {"constraint_id": constraint["id"], "type": constraint_type}, source=source)
+    return constraint
+def _write_state(output_dir: str | Path, state: dict[str, Any]) -> dict[str, Any]:
+    state["updated_at"] = now()
+    path = state_path(output_dir)
+    path.write_text(json.dumps(state, indent=2, sort_keys=True), encoding="utf-8")
+    return state
+def _cluster_label_for_hit(hit: dict[str, Any]) -> str:
+    return str(hit.get("cluster_label") or f"{hit.get('label', 'other')}_{hit.get('cluster_id', '0')}")
+def build_initial_state(job_id: str, manifest: dict[str, Any]) -> dict[str, Any]:
+    hits_by_id: dict[str, dict[str, Any]] = {}
+    clusters: dict[str, dict[str, Any]] = {}
+    raw_hits = list(manifest.get("hits") or [])
+    # Older manifests may only contain samples; an empty hit list still produces a valid, empty state.
+    for hit in raw_hits:
+        hid = _hit_id(hit)
+        cid = _cluster_id(hit.get("cluster_id", "unclustered"))
+        cluster_label = _cluster_label_for_hit(hit)
+        hits_by_id[hid] = {
+            "id": hid,
+            "index": int(hit.get("index", len(hits_by_id))),
+            "label": str(hit.get("label") or "other"),
+            "cluster_id": cid,
+            "original_cluster_id": cid,
+            "cluster_label": cluster_label,
+            "onset_sec": _safe_float(hit.get("onset_sec")),
+            "duration_ms": _safe_float(hit.get("duration_ms")),
+            "rms_energy": _safe_float(hit.get("rms_energy")),
+            "spectral_centroid_hz": _safe_float(hit.get("spectral_centroid_hz")),
+            "file": hit.get("file"),
+            "is_representative": bool(hit.get("is_representative")),
+            "source": "detected",
+            "suppressed": False,
+            "favorite": False,
+            "review_status": "unreviewed",
+            "confidence": 0.0,
+            "confidence_reasons": [],
+            "explicit": False,
+        }
+        clusters.setdefault(
+            cid,
+            {
+                "id": cid,
+                "label": cluster_label,
+                "classification": _base_label(cluster_label),
+                "hit_ids": [],
+                "representative_hit_id": None,
+                "locked": False,
+                "user_named": False,
+                "confidence": 0.0,
+                "confidence_reasons": [],
+                "suppressed_count": 0,
+                "original_id": cid,
+            },
+        )["hit_ids"].append(hid)
+        if bool(hit.get("is_representative")):
+            clusters[cid]["representative_hit_id"] = hid
+    for cid, cluster in clusters.items():
+        if cluster["representative_hit_id"] is None and cluster["hit_ids"]:
+            cluster["representative_hit_id"] = cluster["hit_ids"][0]
+    state = {
+        "version": STATE_VERSION,
+        "job_id": job_id,
+        "created_at": now(),
+        "updated_at": now(),
+        "manifest_fingerprint": _manifest_fingerprint(manifest),
+        "hits": hits_by_id,
+        "clusters": clusters,
+        "constraints": [],
+        "events": [],
+        "suggestions": [],
+        "undo_stack": [],
+        "counters": {"user_clusters": 0},
+    }
+    recompute_scores(state)
+    _event(
+        state,
+        "job.state.created",
+        {
+            "hit_count": len(hits_by_id),
+            "cluster_count": len(clusters),
+            "manifest_fingerprint": state["manifest_fingerprint"],
+        },
+    )
+    return state
+def _manifest_fingerprint(manifest: dict[str, Any]) -> str:
+    import hashlib
+    payload = {
+        "params": manifest.get("params"),
+        "hit_count": manifest.get("hit_count"),
+        "cluster_count": manifest.get("cluster_count"),
+        "files": manifest.get("files"),
+        "hits": [
+            {
+                "index": h.get("index"),
+                "cluster_id": h.get("cluster_id"),
+                "file": h.get("file"),
+                "onset_sec": h.get("onset_sec"),
+            }
+            for h in manifest.get("hits", [])
+        ],
+    }
+    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()
+def load_or_create_state(job_id: str, output_dir: str | Path) -> dict[str, Any]:
+    path = state_path(output_dir)
+    if path.exists():
+        state = json.loads(path.read_text(encoding="utf-8"))
+        if state.get("version") != STATE_VERSION:
+            raise ValueError(f"Unsupported supervision state version: {state.get('version')}")
+        return state
+    manifest = load_manifest(output_dir)
+    state = build_initial_state(job_id, manifest)
+    return _write_state(output_dir, state)
+def _active_hits(state: dict[str, Any], cluster: dict[str, Any]) -> list[dict[str, Any]]:
+    hits = state.get("hits", {})
+    return [hits[hid] for hid in cluster.get("hit_ids", []) if hid in hits and not hits[hid].get("suppressed")]
+def recompute_scores(state: dict[str, Any]) -> None:
+    hits = state.get("hits", {})
+    clusters = state.get("clusters", {})
+    energies = sorted(_safe_float(hit.get("rms_energy")) for hit in hits.values())
+    def energy_rank(value: float) -> float:
+        if not energies:
+            return 0.5
+        at_or_below = sum(1 for item in energies if item <= value)
+        return at_or_below / max(1, len(energies))
+    for cluster in clusters.values():
+        members = [hits[hid] for hid in cluster.get("hit_ids", []) if hid in hits]
+        active = [hit for hit in members if not hit.get("suppressed")]
+        if not members:
+            confidence = 0.15
+            reasons = ["empty cluster"]
+        else:
+            labels: dict[str, int] = {}
+            for hit in active:
+                labels[hit.get("label", "other")] = labels.get(hit.get("label", "other"), 0) + 1
+            majority = max(labels.values()) if labels else 0
+            purity = majority / max(1, len(active))
+            size_score = min(1.0, math.log2(len(active) + 1) / 4.0)
+            representative_bonus = 0.12 if cluster.get("representative_hit_id") in cluster.get("hit_ids", []) else 0.0
+            lock_bonus = 0.12 if cluster.get("locked") else 0.0
+            confidence = (0.42 * purity) + (0.34 * size_score) + representative_bonus + lock_bonus
+            reasons = []
+            if len(active) <= 1:
+                reasons.append("singleton cluster")
+            if purity < 0.75:
+                reasons.append("mixed labels")
+            if cluster.get("locked"):
+                reasons.append("user locked")
+            if representative_bonus:
+                reasons.append("has representative")
+        cluster["confidence"] = round(max(0.0, min(1.0, confidence)), 4)
+        cluster["confidence_reasons"] = reasons or ["cohesive cluster"]
+        cluster["suppressed_count"] = sum(1 for hit in members if hit.get("suppressed"))
+    for hit in hits.values():
+        cluster = clusters.get(hit.get("cluster_id"), {})
+        active_count = len(_active_hits(state, cluster)) if cluster else 0
+        label_match = _base_label(str(cluster.get("label", ""))) == str(hit.get("label", ""))
+        energy = energy_rank(_safe_float(hit.get("rms_energy")))
+        duration_ms = _safe_float(hit.get("duration_ms"))
+        duration_score = 0.65 if duration_ms <= 0 else max(0.0, min(1.0, 1.0 - abs(duration_ms - 180.0) / 700.0))
+        cluster_conf = _safe_float(cluster.get("confidence"), 0.2)
+        confidence = (0.42 * cluster_conf) + (0.18 * min(1.0, active_count / 4.0)) + (0.18 if label_match else 0.0) + (0.12 * energy) + (0.10 * duration_score)
+        reasons = []
+        if active_count <= 1:
+            reasons.append("singleton")
+        if not label_match:
+            reasons.append("label differs from cluster")
+        if energy < 0.2:
+            reasons.append("low energy")
+        if hit.get("is_representative"):
+            confidence += 0.08
+            reasons.append("representative")
+        if hit.get("explicit"):
+            confidence += 0.10
+            reasons.append("explicit user assignment")
+        if hit.get("suppressed"):
+            confidence = min(confidence, 0.25)
+            reasons.append("suppressed")
+        hit["confidence"] = round(max(0.0, min(1.0, confidence)), 4)
+        hit["confidence_reasons"] = reasons or ["consistent assignment"]
+        hit["cluster_label"] = cluster.get("label", hit.get("cluster_label", "unclustered"))
+def review_queue(state: dict[str, Any], limit: int = 30) -> list[dict[str, Any]]:
+    rows = []
+    clusters = state.get("clusters", {})
+    for hit in state.get("hits", {}).values():
+        cluster = clusters.get(hit.get("cluster_id"), {})
+        score = 1.0 - _safe_float(hit.get("confidence"), 0.0)
+        if len(cluster.get("hit_ids", [])) <= 1:
+            score += 0.15
+        if hit.get("suppressed"):
+            score -= 0.35
+        if hit.get("review_status") == "accepted":
+            score -= 0.25
+        rows.append(
+            {
+                "hit_id": hit["id"],
+                "hit_index": hit.get("index"),
+                "label": hit.get("label"),
+                "cluster_id": hit.get("cluster_id"),
+                "cluster_label": cluster.get("label"),
+                "confidence": hit.get("confidence", 0.0),
+                "priority": round(max(0.0, score), 4),
+                "reasons": hit.get("confidence_reasons", []),
+                "suppressed": bool(hit.get("suppressed")),
+                "file": hit.get("file"),
+            }
+        )
+    rows.sort(key=lambda item: (-item["priority"], item["hit_index"] or 0))
+    return rows[: max(1, min(int(limit), 200))]
+def _find_similar_hits(state: dict[str, Any], hit_id: str, *, exclude_cluster: str | None = None, include_suppressed: bool = False, limit: int = 12) -> list[tuple[dict[str, Any], float]]:
+    hits = state.get("hits", {})
+    src = hits[hit_id]
+    src_centroid = _safe_float(src.get("spectral_centroid_hz"))
+    src_energy = _safe_float(src.get("rms_energy"))
+    scored: list[tuple[dict[str, Any], float]] = []
+    for candidate in hits.values():
+        if candidate["id"] == hit_id:
+            continue
+        if exclude_cluster and candidate.get("cluster_id") == exclude_cluster:
+            continue
+        if candidate.get("suppressed") and not include_suppressed:
+            continue
+        label_score = 1.0 if candidate.get("label") == src.get("label") else 0.35
+        centroid_delta = abs(_safe_float(candidate.get("spectral_centroid_hz")) - src_centroid)
+        centroid_score = max(0.0, 1.0 - centroid_delta / 6000.0)
+        energy_delta = abs(_safe_float(candidate.get("rms_energy")) - src_energy)
+        energy_score = max(0.0, 1.0 - energy_delta / max(src_energy, 1e-4, _safe_float(candidate.get("rms_energy"))))
+        score = (0.48 * label_score) + (0.34 * centroid_score) + (0.18 * energy_score)
+        if score >= 0.62:
+            scored.append((candidate, round(score, 4)))
+    scored.sort(key=lambda item: (-item[1], item[0].get("index", 0)))
+    return scored[:limit]
+def _add_suggestion(state: dict[str, Any], suggestion_type: str, payload: dict[str, Any], confidence: float, reason: str) -> dict[str, Any]:
+    suggestion = {
+        "id": _new_id("suggestion"),
+        "type": suggestion_type,
+        "status": "open",
+        "created_at": now(),
+        "confidence": round(max(0.0, min(1.0, confidence)), 4),
+        "reason": reason,
+        **payload,
+    }
+    state.setdefault("suggestions", []).append(suggestion)
+    _event(state, "suggestion.created", {"suggestion_id": suggestion["id"], "type": suggestion_type, "reason": reason})
+    return suggestion
+def _rebuild_cluster_labels(state: dict[str, Any]) -> None:
+    hits = state.get("hits", {})
+    for cluster in state.get("clusters", {}).values():
+        for hid in cluster.get("hit_ids", []):
+            if hid in hits:
+                hits[hid]["cluster_label"] = cluster.get("label", "unclustered")
+def move_hit(output_dir: str | Path, job_id: str, hit_id: str, target_cluster_id: str, source: str = "user") -> dict[str, Any]:
+    state = load_or_create_state(job_id, output_dir)
+    hits = state.get("hits", {})
+    clusters = state.get("clusters", {})
+    if hit_id not in hits:
+        raise KeyError(f"Unknown hit: {hit_id}")
+    if target_cluster_id not in clusters:
+        raise KeyError(f"Unknown cluster: {target_cluster_id}")
+    hit = hits[hit_id]
+    source_cluster_id = hit.get("cluster_id")
+    if source_cluster_id == target_cluster_id:
+        hit["review_status"] = "accepted"
+        recompute_scores(state)
+        return _write_state(output_dir, state)
+    _push_undo(state)
+    if source_cluster_id in clusters:
+        clusters[source_cluster_id]["hit_ids"] = [hid for hid in clusters[source_cluster_id].get("hit_ids", []) if hid != hit_id]
+    clusters[target_cluster_id].setdefault("hit_ids", [])
+    if hit_id not in clusters[target_cluster_id]["hit_ids"]:
+        clusters[target_cluster_id]["hit_ids"].append(hit_id)
+    hit["cluster_id"] = target_cluster_id
+    hit["cluster_label"] = clusters[target_cluster_id].get("label", target_cluster_id)
+    hit["explicit"] = True
+    hit["review_status"] = "accepted"
+    target_rep = clusters[target_cluster_id].get("representative_hit_id")
+    _constraint(state, "force-cluster", {"hit_id": hit_id, "cluster_id": target_cluster_id}, source=source)
+    if target_rep and target_rep != hit_id:
+        _constraint(state, "must-link", {"a": hit_id, "b": target_rep}, source=source)
+    _event(state, "hit.moved", {"hit_id": hit_id, "from_cluster_id": source_cluster_id, "to_cluster_id": target_cluster_id}, source=source)
+    similar = _find_similar_hits(state, hit_id, exclude_cluster=target_cluster_id, limit=10)
+    suggested_ids = [item[0]["id"] for item in similar if item[1] >= 0.72]
+    if suggested_ids:
+        avg = sum(score for candidate, score in similar if candidate["id"] in suggested_ids) / len(suggested_ids)
+        _add_suggestion(
+            state,
+            "move-hits",
+            {"hit_ids": suggested_ids, "target_cluster_id": target_cluster_id, "preview_count": len(suggested_ids)},
+            avg,
+            f"Similar label/spectral/energy profile to {hit_id}",
+        )
+    _rebuild_cluster_labels(state)
+    recompute_scores(state)
+    return _write_state(output_dir, state)
+def pull_hit_to_new_cluster(output_dir: str | Path, job_id: str, hit_id: str, label: str | None = None, source: str = "user") -> dict[str, Any]:
+    state = load_or_create_state(job_id, output_dir)
+    hits = state.get("hits", {})
+    clusters = state.get("clusters", {})
+    if hit_id not in hits:
+        raise KeyError(f"Unknown hit: {hit_id}")
+    hit = hits[hit_id]
+    source_cluster_id = hit.get("cluster_id")
+    source_rep = clusters.get(source_cluster_id, {}).get("representative_hit_id")
+    _push_undo(state)
+    counters = state.setdefault("counters", {})
+    counters["user_clusters"] = int(counters.get("user_clusters", 0)) + 1
+    base = label or f"{hit.get('label', 'hit')}_user_{counters['user_clusters']}"
+    new_cluster_id = _new_id("cluster:user")
+    if source_cluster_id in clusters:
+        clusters[source_cluster_id]["hit_ids"] = [hid for hid in clusters[source_cluster_id].get("hit_ids", []) if hid != hit_id]
+    clusters[new_cluster_id] = {
+        "id": new_cluster_id,
+        "label": base,
+        "classification": _base_label(base),
+        "hit_ids": [hit_id],
+        "representative_hit_id": hit_id,
+        "locked": False,
+        "user_named": bool(label),
+        "confidence": 0.0,
+        "confidence_reasons": [],
+        "suppressed_count": 0,
+        "original_id": None,
+    }
+    hit["cluster_id"] = new_cluster_id
+    hit["cluster_label"] = base
+    hit["explicit"] = True
+    hit["review_status"] = "accepted"
+    if source_rep and source_rep != hit_id:
+        _constraint(state, "cannot-link", {"a": hit_id, "b": source_rep}, source=source)
+    _constraint(state, "force-cluster", {"hit_id": hit_id, "cluster_id": new_cluster_id}, source=source)
+    _event(state, "hit.pulled_out", {"hit_id": hit_id, "from_cluster_id": source_cluster_id, "to_cluster_id": new_cluster_id}, source=source)
+    similar = _find_similar_hits(state, hit_id, exclude_cluster=new_cluster_id, limit=8)
+    split_ids = [item[0]["id"] for item in similar if item[0].get("cluster_id") == source_cluster_id and item[1] >= 0.70]
+    if split_ids:
+        _add_suggestion(
+            state,
+            "split-hits",
+            {"hit_ids": split_ids, "source_cluster_id": source_cluster_id, "target_cluster_id": new_cluster_id, "preview_count": len(split_ids)},
+            0.76,
+            f"Similar to pulled-out hit {hit_id}; preview split from original cluster",
+        )
+    _rebuild_cluster_labels(state)
+    recompute_scores(state)
+    return _write_state(output_dir, state)
+def lock_cluster(output_dir: str | Path, job_id: str, cluster_id: str, locked: bool = True, source: str = "user") -> dict[str, Any]:
+    state = load_or_create_state(job_id, output_dir)
+    clusters = state.get("clusters", {})
+    if cluster_id not in clusters:
+        raise KeyError(f"Unknown cluster: {cluster_id}")
+    _push_undo(state)
+    clusters[cluster_id]["locked"] = bool(locked)
+    _constraint(state, "lock-cluster", {"cluster_id": cluster_id, "locked": bool(locked)}, source=source)
+    _event(state, "cluster.locked" if locked else "cluster.unlocked", {"cluster_id": cluster_id}, source=source)
+    recompute_scores(state)
+    return _write_state(output_dir, state)
+def suppress_hit(output_dir: str | Path, job_id: str, hit_id: str, reason: str = "bleed", source: str = "user") -> dict[str, Any]:
+    state = load_or_create_state(job_id, output_dir)
+    hits = state.get("hits", {})
+    if hit_id not in hits:
+        raise KeyError(f"Unknown hit: {hit_id}")
+    _push_undo(state)
+    hit = hits[hit_id]
+    hit["suppressed"] = True
+    hit["review_status"] = "suppressed"
+    hit["explicit"] = True
+    _constraint(state, "suppress-pattern", {"example_hit_id": hit_id, "reason": reason}, source=source)
+    _event(state, "hit.suppressed", {"hit_id": hit_id, "reason": reason}, source=source)
+    similar = _find_similar_hits(state, hit_id, include_suppressed=False, limit=16)
+    suggested_ids = [item[0]["id"] for item in similar if item[1] >= 0.72 and _safe_float(item[0].get("rms_energy")) <= _safe_float(hit.get("rms_energy")) * 1.35]
+    if suggested_ids:
+        _add_suggestion(
+            state,
+            "suppress-hits",
+            {"hit_ids": suggested_ids, "reason_code": reason, "preview_count": len(suggested_ids)},
+            0.74,
+            f"Similar low-energy profile to suppressed {reason} example {hit_id}",
+        )
+    recompute_scores(state)
+    return _write_state(output_dir, state)
+def set_hit_review_status(output_dir: str | Path, job_id: str, hit_id: str, status: str = "accepted", source: str = "user") -> dict[str, Any]:
+    if status not in {"unreviewed", "accepted", "favorite"}:
+        raise ValueError("status must be unreviewed, accepted, or favorite")
+    state = load_or_create_state(job_id, output_dir)
+    if hit_id not in state.get("hits", {}):
+        raise KeyError(f"Unknown hit: {hit_id}")
+    _push_undo(state)
+    hit = state["hits"][hit_id]
+    hit["review_status"] = status
+    if status == "favorite":
+        hit["favorite"] = True
+        cid = hit.get("cluster_id")
+        if cid in state.get("clusters", {}):
+            state["clusters"][cid]["representative_hit_id"] = hit_id
+        _constraint(state, "pin-representative", {"hit_id": hit_id, "cluster_id": cid}, source=source)
+    _event(state, "hit.reviewed", {"hit_id": hit_id, "status": status}, source=source)
+    recompute_scores(state)
+    return _write_state(output_dir, state)
+def accept_suggestion(output_dir: str | Path, job_id: str, suggestion_id: str) -> dict[str, Any]:
+    state = load_or_create_state(job_id, output_dir)
+    suggestion = next((s for s in state.get("suggestions", []) if s.get("id") == suggestion_id), None)
+    if not suggestion:
+        raise KeyError(f"Unknown suggestion: {suggestion_id}")
+    if suggestion.get("status") != "open":
+        return state
+    _push_undo(state)
+    stype = suggestion.get("type")
+    if stype in {"move-hits", "split-hits"}:
+        target = suggestion.get("target_cluster_id")
+        for hid in suggestion.get("hit_ids", []):
+            if hid in state.get("hits", {}) and target in state.get("clusters", {}):
+                current = state["hits"][hid].get("cluster_id")
+                if current in state["clusters"]:
+                    state["clusters"][current]["hit_ids"] = [x for x in state["clusters"][current].get("hit_ids", []) if x != hid]
+                state["clusters"][target].setdefault("hit_ids", [])
+                if hid not in state["clusters"][target]["hit_ids"]:
+                    state["clusters"][target]["hit_ids"].append(hid)
+                state["hits"][hid]["cluster_id"] = target
+                state["hits"][hid]["explicit"] = True
+                _constraint(state, "force-cluster", {"hit_id": hid, "cluster_id": target}, source="accepted-suggestion")
+    elif stype == "suppress-hits":
+        for hid in suggestion.get("hit_ids", []):
+            if hid in state.get("hits", {}):
+                state["hits"][hid]["suppressed"] = True
+                state["hits"][hid]["review_status"] = "suppressed"
+                _constraint(state, "suppress-pattern", {"example_hit_id": hid, "reason": suggestion.get("reason_code", "bleed")}, source="accepted-suggestion")
+    else:
+        raise ValueError(f"Unsupported suggestion type: {stype}")
+    suggestion["status"] = "accepted"
+    suggestion["resolved_at"] = now()
+    _event(state, "suggestion.accepted", {"suggestion_id": suggestion_id, "type": stype}, source="user")
+    _rebuild_cluster_labels(state)
+    recompute_scores(state)
+    return _write_state(output_dir, state)
+def reject_suggestion(output_dir: str | Path, job_id: str, suggestion_id: str) -> dict[str, Any]:
+    state = load_or_create_state(job_id, output_dir)
+    suggestion = next((s for s in state.get("suggestions", []) if s.get("id") == suggestion_id), None)
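+    if not suggestion:
+        raise KeyError(f"Unknown suggestion: {suggestion_id}")
+    if suggestion.get("status") != "open":
+        return state  # Already resolved; mirror accept_suggestion and leave it untouched.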
+    _push_undo(state)
+    suggestion["status"] = "rejected"
+    suggestion["resolved_at"] = now()
+    _event(state, "suggestion.rejected", {"suggestion_id": suggestion_id, "type": suggestion.get("type")}, source="user")
+    return _write_state(output_dir, state)
+def undo_last(output_dir: str | Path, job_id: str) -> dict[str, Any]:
+    state = load_or_create_state(job_id, output_dir)
+    stack = list(state.get("undo_stack") or [])
+    if not stack:
+        return state
+    restored = stack.pop()
+    restored["undo_stack"] = stack
+    _event(restored, "state.undo", {"restored_for_job_id": job_id}, source="user")
+    recompute_scores(restored)
+    return _write_state(output_dir, restored)
+def explain_cluster(state: dict[str, Any], cluster_id: str) -> dict[str, Any]:
+    clusters = state.get("clusters", {})
+    hits = state.get("hits", {})
+    if cluster_id not in clusters:
+        raise KeyError(f"Unknown cluster: {cluster_id}")
+    cluster = clusters[cluster_id]
+    members = [hits[hid] for hid in cluster.get("hit_ids", []) if hid in hits]
+    active = [h for h in members if not h.get("suppressed")]
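+    member_ids = set(cluster.get("hit_ids", []))
+    # A constraint is relevant if it names this cluster or any member hit; "example_hit_id" covers suppress-pattern constraints.
+    constraints = [
+        c for c in state.get("constraints", [])
+        if c.get("cluster_id") == cluster_id or any(c.get(key) in member_ids for key in ("hit_id", "a", "b", "example_hit_id"))
+    ]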
+    outliers = sorted(active, key=lambda h: h.get("confidence", 0.0))[:8]
+    labels: dict[str, int] = {}
+    for hit in active:
+        labels[hit.get("label", "other")] = labels.get(hit.get("label", "other"), 0) + 1
+    return {
+        "cluster_id": cluster_id,
+        "label": cluster.get("label"),
+        "locked": bool(cluster.get("locked")),
+        "confidence": cluster.get("confidence"),
+        "confidence_reasons": cluster.get("confidence_reasons", []),
+        "representative_hit_id": cluster.get("representative_hit_id"),
+        "hit_count": len(members),
+        "active_hit_count": len(active),
+        "suppressed_count": sum(1 for hit in members if hit.get("suppressed")),
+        "label_distribution": labels,
+        "outliers": [{"hit_id": h["id"], "hit_index": h.get("index"), "confidence": h.get("confidence"), "reasons": h.get("confidence_reasons", [])} for h in outliers],
+        "constraints": constraints[-20:],
+        "summary": f"{cluster.get('label')} has {len(active)} active hits, confidence {cluster.get('confidence')}, and {len(constraints)} relevant constraints.",
+    }
+def public_state(state: dict[str, Any], url_for: Callable[[str], str] | None = None, review_limit: int = 30) -> dict[str, Any]:
+    recompute_scores(state)
+    hits = copy.deepcopy(list(state.get("hits", {}).values()))
+    clusters = copy.deepcopy(list(state.get("clusters", {}).values()))
+    for hit in hits:
+        if url_for and hit.get("file"):
+            hit["url"] = url_for(hit["file"])
+    clusters.sort(key=lambda c: (-len(c.get("hit_ids", [])), c.get("label", "")))
+    hits.sort(key=lambda h: h.get("index", 0))
+    open_suggestions = [s for s in state.get("suggestions", []) if s.get("status") == "open"]
+    open_suggestions.sort(key=lambda s: (-_safe_float(s.get("confidence")), s.get("created_at", 0)))
+    return {
+        "version": state.get("version"),
+        "job_id": state.get("job_id"),
+        "created_at": state.get("created_at"),
+        "updated_at": state.get("updated_at"),
+        "summary": {
+            "hit_count": len(hits),
+            "cluster_count": len(clusters),
+            "constraint_count": len(state.get("constraints", [])),
+            "event_count": len(state.get("events", [])),
+            "open_suggestion_count": len(open_suggestions),
+            "suppressed_hit_count": sum(1 for h in hits if h.get("suppressed")),
+            "locked_cluster_count": sum(1 for c in clusters if c.get("locked")),
+            "undo_available": bool(state.get("undo_stack")),
+        },
+        "hits": hits,
+        "clusters": clusters,
+        "constraints": state.get("constraints", [])[-100:],
+        "events": state.get("events", [])[-120:],
+        "suggestions": open_suggestions[:50],
+        "review_queue": review_queue(state, review_limit),
+    }
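
Review note on `undo_last`: the function pops the most recent snapshot and re-attaches the shortened stack, which implies each snapshot is stored without its own `undo_stack`. `_push_undo` is not shown in this hunk, so the following is only a minimal sketch of that snapshot pattern under that assumption. The names `push_undo`, `undo`, and `MAX_UNDO` are hypothetical and are not the module's actual helpers.

```python
from __future__ import annotations

import copy
from typing import Any

MAX_UNDO = 20  # hypothetical cap on retained snapshots


def push_undo(state: dict[str, Any]) -> None:
    """Snapshot everything except the undo stack itself, then append it."""
    snapshot = copy.deepcopy({k: v for k, v in state.items() if k != "undo_stack"})
    stack = state.setdefault("undo_stack", [])
    stack.append(snapshot)
    del stack[:-MAX_UNDO]  # keep only the newest MAX_UNDO snapshots


def undo(state: dict[str, Any]) -> dict[str, Any]:
    """Return the previous snapshot with the remaining stack re-attached."""
    stack = list(state.get("undo_stack") or [])
    if not stack:
        return state
    restored = stack.pop()
    restored["undo_stack"] = stack
    return restored
```

Keeping the stack out of each snapshot avoids nesting copies of earlier snapshots inside later ones, so the persisted state grows by roughly one state-sized entry per edit.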

web/app.js CHANGED Viewed

@@ -12,6 +12,8 @@ let selectedFile = null;
 let activePoll = null;
 let activeEvents = null;
 let lastResult = null;
 let selectedHitIndex = null;
 function esc(value) {
@@ -47,6 +49,60 @@ async function api(path, options = {}) {
   return response.json();
 }
 function setSelectOptions(select, values, labels = null) {
   select.innerHTML = "";
   for (const value of values) {
@@ -159,15 +215,19 @@ function playAudio(el, url) {
 function selectHit(index, shouldPlay = true) {
   if (!lastResult) return;
-  const hit = (lastResult.hits ?? []).find((item) => Number(item.index) === Number(index));
-  if (!hit) return;
   selectedHitIndex = hit.index;
-  $("selectedHitMeta").textContent = `#${hit.index} · ${hit.label} · ${hit.cluster_label} · ${hit.onset_sec}s · ${hit.duration_ms} ms${hit.is_representative ? " · representative" : ""}`;
   if (shouldPlay) playAudio($("hitAudio"), hit.url);
   for (const row of document.querySelectorAll("[data-hit-index]")) {
     row.classList.toggle("selected", Number(row.dataset.hitIndex) === Number(hit.index));
   }
   drawWaveform(lastResult.overview);
 }
 function auditionSample(sample) {
@@ -200,19 +260,26 @@ function renderSamples(result) {
 function renderHits(result) {
   const tbody = $("hitsTable").querySelector("tbody");
-  const hits = result.hits ?? [];
-  tbody.innerHTML = hits.map((hit) => `
-    <tr data-hit-index="${esc(hit.index)}" class="${Number(hit.index) === Number(selectedHitIndex) ? "selected" : ""}">
-      <td><button class="mini-button" type="button" data-hit-audition="${esc(hit.index)}">Audition</button></td>
-      <td>${esc(hit.index)}</td>
-      <td>${esc(hit.label)}${hit.is_representative ? " ★" : ""}</td>
-      <td>${esc(hit.cluster_label)}</td>
-      <td>${esc(hit.onset_sec)} s</td>
-      <td>${esc(hit.duration_ms)} ms</td>
-      <td>${esc(hit.rms_energy)}</td>
-      <td><a href="${esc(hit.url)}" download>WAV</a></td>
-    </tr>
-  `).join("");
   for (const row of tbody.querySelectorAll("[data-hit-index]")) {
     row.addEventListener("click", () => selectHit(row.dataset.hitIndex));
   }
@@ -225,9 +292,143 @@ function renderHits(result) {
   if (hits.length && selectedHitIndex === null) selectHit(hits[0].index, false);
 }
 function renderResult(job) {
   const result = job.result;
   if (!result) return;
   lastResult = result;
   if (!(result.hits ?? []).some((hit) => Number(hit.index) === Number(selectedHitIndex))) {
     selectedHitIndex = (result.hits ?? [])[0]?.index ?? null;
@@ -245,6 +446,9 @@ function renderResult(job) {
   renderSamples(result);
   renderHits(result);
   drawWaveform(result.overview);
 }
 function renderJob(job) {
@@ -345,6 +549,8 @@ async function runExtraction() {
   if (!selectedFile) return;
   selectedHitIndex = null;
   lastResult = null;
   $("runButton").disabled = true;
   $("jobPill").textContent = "uploading";
   $("logs").textContent = "Uploading source and starting extraction…";
@@ -426,6 +632,16 @@ $("clearCacheButton").addEventListener("click", async () => {
     $("logs").textContent = error.message;
   }
 });
 $("waveform").addEventListener("click", selectNearestWaveformHit);
 const dropzone = $("dropzone");

 let activePoll = null;
 let activeEvents = null;
 let lastResult = null;
+let lastSupervisionState = null;
+let activeJobId = null;
 let selectedHitIndex = null;
 function esc(value) {
   return response.json();
 }
+async function jsonApi(path, body = {}, method = "POST") {
+  return api(path, {
+    method,
+    headers: { "Content-Type": "application/json" },
+    body: JSON.stringify(body),
+  });
+}
+function hitIdFromIndex(index) {
+  if (index === null || index === undefined) return null;
+  return `hit:${String(Number(index)).padStart(5, "0")}`;
+}
+function stateHitByIndex(index) {
+  const id = hitIdFromIndex(index);
+  return (lastSupervisionState?.hits ?? []).find((hit) => hit.id === id) ?? null;
+}
+function decorateHit(hit) {
+  const stateHit = stateHitByIndex(hit.index);
+  return {
+    ...hit,
+    state_hit_id: stateHit?.id ?? hitIdFromIndex(hit.index),
+    cluster_ref: stateHit?.cluster_id ?? `cluster:${hit.cluster_id}`,
+    cluster_label: stateHit?.cluster_label ?? hit.cluster_label,
+    confidence: stateHit?.confidence,
+    confidence_reasons: stateHit?.confidence_reasons ?? [],
+    suppressed: Boolean(stateHit?.suppressed),
+    favorite: Boolean(stateHit?.favorite),
+    review_status: stateHit?.review_status ?? "unreviewed",
+  };
+}
+function currentTargetCluster() {
+  const id = $("targetClusterSelect")?.value;
+  return (lastSupervisionState?.clusters ?? []).find((cluster) => cluster.id === id) ?? null;
+}
+function setActionButtons() {
+  const hasState = Boolean(activeJobId && lastSupervisionState);
+  const hasHit = hasState && selectedHitIndex !== null;
+  for (const id of ["moveHitButton", "pullHitButton", "acceptHitButton", "favoriteHitButton", "suppressHitButton"]) {
+    const button = $(id);
+    if (button) button.disabled = !hasHit;
+  }
+  for (const id of ["refreshStateButton", "undoButton", "lockClusterButton", "explainClusterButton"]) {
+    const button = $(id);
+    if (button) button.disabled = !hasState;
+  }
+  const target = currentTargetCluster();
+  if ($("lockClusterButton")) $("lockClusterButton").textContent = target?.locked ? "Unlock target cluster" : "Lock target cluster";
+  if ($("undoButton") && lastSupervisionState) $("undoButton").disabled = !lastSupervisionState.summary?.undo_available;
+}
 function setSelectOptions(select, values, labels = null) {
   select.innerHTML = "";
   for (const value of values) {
 function selectHit(index, shouldPlay = true) {
   if (!lastResult) return;
+  const rawHit = (lastResult.hits ?? []).find((item) => Number(item.index) === Number(index));
+  if (!rawHit) return;
+  const hit = decorateHit(rawHit);
   selectedHitIndex = hit.index;
+  const confidence = hit.confidence === undefined ? "—" : `${Math.round(Number(hit.confidence) * 100)}%`;
+  const flags = [hit.is_representative ? "representative" : null, hit.favorite ? "favorite" : null, hit.suppressed ? "suppressed" : null, hit.review_status !== "unreviewed" ? hit.review_status : null].filter(Boolean).join(" · ");
+  $("selectedHitMeta").textContent = `#${hit.index} · ${hit.label} · ${hit.cluster_label} · ${hit.onset_sec}s · ${hit.duration_ms} ms · confidence ${confidence}${flags ? ` · ${flags}` : ""}`;
   if (shouldPlay) playAudio($("hitAudio"), hit.url);
   for (const row of document.querySelectorAll("[data-hit-index]")) {
     row.classList.toggle("selected", Number(row.dataset.hitIndex) === Number(hit.index));
   }
   drawWaveform(lastResult.overview);
+  setActionButtons();
 }
 function auditionSample(sample) {
 function renderHits(result) {
   const tbody = $("hitsTable").querySelector("tbody");
+  const hits = (result.hits ?? []).map(decorateHit);
+  tbody.innerHTML = hits.map((hit) => {
+    const confidence = hit.confidence === undefined ? "—" : `${Math.round(Number(hit.confidence) * 100)}%`;
+    const flags = [hit.is_representative ? "rep" : null, hit.favorite ? "fav" : null, hit.suppressed ? "suppressed" : null, hit.review_status !== "unreviewed" ? hit.review_status : null].filter(Boolean);
+    const classes = [Number(hit.index) === Number(selectedHitIndex) ? "selected" : "", hit.suppressed ? "suppressed" : "", Number(hit.confidence ?? 1) < 0.55 ? "low-confidence" : ""].filter(Boolean).join(" ");
+    return `
+      <tr data-hit-index="${esc(hit.index)}" class="${esc(classes)}">
+        <td><button class="mini-button" type="button" data-hit-audition="${esc(hit.index)}">Audition</button></td>
+        <td>${esc(hit.index)}</td>
+        <td>${esc(hit.label)}${hit.is_representative || hit.favorite ? " ★" : ""}</td>
+        <td>${esc(hit.cluster_label)}</td>
+        <td>${esc(confidence)}</td>
+        <td>${esc(flags.join(", ") || "—")}</td>
+        <td>${esc(hit.onset_sec)} s</td>
+        <td>${esc(hit.duration_ms)} ms</td>
+        <td>${esc(hit.rms_energy)}</td>
+        <td><a href="${esc(hit.url)}" download>WAV</a></td>
+      </tr>
+    `;
+  }).join("");
   for (const row of tbody.querySelectorAll("[data-hit-index]")) {
     row.addEventListener("click", () => selectHit(row.dataset.hitIndex));
   }
   if (hits.length && selectedHitIndex === null) selectHit(hits[0].index, false);
 }
+function renderSupervisionState(state) {
+  lastSupervisionState = state;
+  const summary = state.summary ?? {};
+  $("supervisionSummary").innerHTML = `
+    <span>${esc(summary.hit_count ?? 0)} hits</span>
+    <span>${esc(summary.cluster_count ?? 0)} clusters</span>
+    <span>${esc(summary.constraint_count ?? 0)} constraints</span>
+    <span>${esc(summary.open_suggestion_count ?? 0)} suggestions</span>
+    <span>${esc(summary.suppressed_hit_count ?? 0)} suppressed</span>
+    <span>${esc(summary.locked_cluster_count ?? 0)} locked</span>
+  `;
+  const currentTarget = $("targetClusterSelect").value;
+  $("targetClusterSelect").innerHTML = (state.clusters ?? []).map((cluster) => `
+    <option value="${esc(cluster.id)}">${cluster.locked ? "locked · " : ""}${esc(cluster.label)} · ${esc(cluster.hit_ids?.length ?? 0)} hits · ${Math.round(Number(cluster.confidence ?? 0) * 100)}%</option>
+  `).join("");
+  if ((state.clusters ?? []).some((cluster) => cluster.id === currentTarget)) $("targetClusterSelect").value = currentTarget;
+  $("reviewQueue").innerHTML = (state.review_queue ?? []).slice(0, 14).map((item) => `
+    <button class="compact-row ${item.suppressed ? "suppressed" : ""}" type="button" data-review-hit="${esc(item.hit_index)}">
+      <span><strong>#${esc(item.hit_index)} · ${esc(item.label)}</strong><small>${esc(item.cluster_label)} · ${Math.round(Number(item.confidence ?? 0) * 100)}% · ${esc((item.reasons ?? []).join(", "))}</small></span>
+      <span>${Math.round(Number(item.priority ?? 0) * 100)}</span>
+    </button>
+  `).join("") || `<p class="empty">No review items.</p>`;
+  for (const button of $("reviewQueue").querySelectorAll("[data-review-hit]")) {
+    button.addEventListener("click", () => selectHit(button.dataset.reviewHit));
+  }
+  $("clusterBoard").innerHTML = (state.clusters ?? []).map((cluster) => `
+    <button class="compact-row ${cluster.locked ? "locked" : ""}" type="button" data-cluster-select="${esc(cluster.id)}">
+      <span><strong>${cluster.locked ? "Locked · " : ""}${esc(cluster.label)}</strong><small>${esc(cluster.hit_ids?.length ?? 0)} hits · ${Math.round(Number(cluster.confidence ?? 0) * 100)}% · ${esc((cluster.confidence_reasons ?? []).join(", "))}</small></span>
+      <span>${esc(cluster.suppressed_count ?? 0)} suppr.</span>
+    </button>
+  `).join("") || `<p class="empty">No clusters.</p>`;
+  for (const button of $("clusterBoard").querySelectorAll("[data-cluster-select]")) {
+    button.addEventListener("click", () => {
+      $("targetClusterSelect").value = button.dataset.clusterSelect;
+      setActionButtons();
+      explainTargetCluster().catch((error) => { $("clusterExplanation").textContent = error.message; });
+    });
+  }
+  $("suggestionInbox").innerHTML = (state.suggestions ?? []).map((suggestion) => `
+    <div class="suggestion-row">
+      <div><strong>${esc(suggestion.type)}</strong><small>${esc(suggestion.reason)} · ${Math.round(Number(suggestion.confidence ?? 0) * 100)}% · ${esc(suggestion.preview_count ?? suggestion.hit_ids?.length ?? 0)} hits</small></div>
+      <div class="row-actions">
+        <button class="mini-button" type="button" data-accept-suggestion="${esc(suggestion.id)}">Accept</button>
+        <button class="mini-button" type="button" data-reject-suggestion="${esc(suggestion.id)}">Reject</button>
+      </div>
+    </div>
+  `).join("") || `<p class="empty">No open suggestions.</p>`;
+  for (const button of $("suggestionInbox").querySelectorAll("[data-accept-suggestion]")) {
+    button.addEventListener("click", () => acceptSuggestion(button.dataset.acceptSuggestion));
+  }
+  for (const button of $("suggestionInbox").querySelectorAll("[data-reject-suggestion]")) {
+    button.addEventListener("click", () => rejectSuggestion(button.dataset.rejectSuggestion));
+  }
+  const eventRows = (state.events ?? []).slice(-12).reverse().map((event) => `<div class="log-row"><strong>${esc(event.type)}</strong><small>${esc(event.source)} · ${fmtDate(event.created_at)}</small></div>`);
+  const constraintRows = (state.constraints ?? []).slice(-8).reverse().map((constraint) => `<div class="log-row constraint"><strong>${esc(constraint.type)}</strong><small>${esc(constraint.source)} · ${fmtDate(constraint.created_at)}</small></div>`);
+  $("stateLog").innerHTML = [...eventRows, ...constraintRows].join("") || `<p class="empty">No state events yet.</p>`;
+  setActionButtons();
+  if (lastResult) renderHits(lastResult);
+}
+async function fetchState() {
+  if (!activeJobId) return null;
+  const state = await api(`/api/jobs/${encodeURIComponent(activeJobId)}/state`);
+  renderSupervisionState(state);
+  return state;
+}
+async function applyStateAction(path, body = {}) {
+  if (!activeJobId) return;
+  const state = await jsonApi(path, body);
+  renderSupervisionState(state);
+}
+async function moveSelectedHit() {
+  const hitId = hitIdFromIndex(selectedHitIndex);
+  const target = $("targetClusterSelect").value;
+  if (!activeJobId || !hitId || !target) return;
+  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/hits/${encodeURIComponent(hitId)}/move`, { target_cluster_id: target });
+}
+async function pullSelectedHit() {
+  const hitId = hitIdFromIndex(selectedHitIndex);
+  if (!activeJobId || !hitId) return;
+  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/hits/${encodeURIComponent(hitId)}/pull-out`, {});
+}
+async function suppressSelectedHit() {
+  const hitId = hitIdFromIndex(selectedHitIndex);
+  if (!activeJobId || !hitId) return;
+  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/hits/${encodeURIComponent(hitId)}/suppress`, { reason: "bleed" });
+}
+async function reviewSelectedHit(status) {
+  const hitId = hitIdFromIndex(selectedHitIndex);
+  if (!activeJobId || !hitId) return;
+  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/hits/${encodeURIComponent(hitId)}/review`, { status });
+}
+async function toggleTargetClusterLock() {
+  const target = currentTargetCluster();
+  if (!activeJobId || !target) return;
+  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/clusters/${encodeURIComponent(target.id)}/lock`, { locked: !target.locked });
+}
+async function explainTargetCluster() {
+  const target = currentTargetCluster();
+  if (!activeJobId || !target) return;
+  const explanation = await api(`/api/jobs/${encodeURIComponent(activeJobId)}/explain/cluster/${encodeURIComponent(target.id)}`);
+  $("clusterExplanation").classList.remove("empty");
+  $("clusterExplanation").textContent = JSON.stringify(explanation, null, 2);
+}
+async function acceptSuggestion(id) {
+  if (!activeJobId) return;
+  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/suggestions/${encodeURIComponent(id)}/accept`, {});
+}
+async function rejectSuggestion(id) {
+  if (!activeJobId) return;
+  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/suggestions/${encodeURIComponent(id)}/reject`, {});
+}
+async function undoLastEdit() {
+  if (!activeJobId) return;
+  await applyStateAction(`/api/jobs/${encodeURIComponent(activeJobId)}/undo`, {});
+}
 function renderResult(job) {
   const result = job.result;
   if (!result) return;
+  activeJobId = job.id;
   lastResult = result;
   if (!(result.hits ?? []).some((hit) => Number(hit.index) === Number(selectedHitIndex))) {
     selectedHitIndex = (result.hits ?? [])[0]?.index ?? null;
   renderSamples(result);
   renderHits(result);
   drawWaveform(result.overview);
+  if (activeJobId) fetchState().catch((error) => {
+    $("supervisionSummary").textContent = error.message;
+  });
 }
 function renderJob(job) {
   if (!selectedFile) return;
   selectedHitIndex = null;
   lastResult = null;
+  lastSupervisionState = null;
+  activeJobId = null;
   $("runButton").disabled = true;
   $("jobPill").textContent = "uploading";
   $("logs").textContent = "Uploading source and starting extraction…";
     $("logs").textContent = error.message;
   }
 });
+$("refreshStateButton").addEventListener("click", () => fetchState().catch((error) => { $("supervisionSummary").textContent = error.message; }));
+$("undoButton").addEventListener("click", () => undoLastEdit().catch((error) => { $("clusterExplanation").textContent = error.message; }));
+$("moveHitButton").addEventListener("click", () => moveSelectedHit().catch((error) => { $("clusterExplanation").textContent = error.message; }));
+$("pullHitButton").addEventListener("click", () => pullSelectedHit().catch((error) => { $("clusterExplanation").textContent = error.message; }));
+$("acceptHitButton").addEventListener("click", () => reviewSelectedHit("accepted").catch((error) => { $("clusterExplanation").textContent = error.message; }));
+$("favoriteHitButton").addEventListener("click", () => reviewSelectedHit("favorite").catch((error) => { $("clusterExplanation").textContent = error.message; }));
+$("suppressHitButton").addEventListener("click", () => suppressSelectedHit().catch((error) => { $("clusterExplanation").textContent = error.message; }));
+$("lockClusterButton").addEventListener("click", () => toggleTargetClusterLock().catch((error) => { $("clusterExplanation").textContent = error.message; }));
+$("explainClusterButton").addEventListener("click", () => explainTargetCluster().catch((error) => { $("clusterExplanation").textContent = error.message; }));
+$("targetClusterSelect").addEventListener("change", setActionButtons);
 $("waveform").addEventListener("click", selectNearestWaveformHit);
 const dropzone = $("dropzone");
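
Review note on `hitIdFromIndex`: the client derives a state hit ID from a hit index as `hit:` plus the index zero-padded to five digits. For reference, here is a Python sketch of the same mapping. That the server builds its IDs the same way is an assumption, since that code is outside this hunk, and `hit_id_from_index` is a hypothetical name.

```python
from __future__ import annotations


def hit_id_from_index(index: int | str | None) -> str | None:
    """Mirror the client-side mapping for non-negative integer indices."""
    if index is None:
        return None
    # Zero-pad to at least five digits; longer indices are not truncated.
    return f"hit:{int(index):05d}"
```

If the two sides ever disagree on the padding width, `stateHitByIndex` would silently find no match and every hit would fall back to the raw pipeline values, so a shared constant or a server-provided ID field may be worth considering.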

web/index.html CHANGED Viewed

@@ -187,6 +187,51 @@
               <audio id="sampleAudio" controls></audio>
             </article>
           </div>
           <div class="result-columns">
             <section>
               <h3>Representative samples</h3>
@@ -208,7 +253,7 @@
                 <table id="hitsTable">
                   <thead>
                     <tr>
-                      <th>Audition</th><th>#</th><th>Label</th><th>Cluster</th><th>Onset</th><th>Duration</th><th>Energy</th><th>File</th>
                     </tr>
                   </thead>
                   <tbody></tbody>

               <audio id="sampleAudio" controls></audio>
             </article>
           </div>
+          <section class="supervision-panel" aria-live="polite">
+            <div class="supervision-header">
+              <div>
+                <h3>Interactive supervision</h3>
+                <p class="subtle">Moves, locks, suppressions, favorites, and accepted suggestions are saved as replayable semantic state next to the run manifest.</p>
+              </div>
+              <div class="supervision-actions">
+                <button id="refreshStateButton" class="ghost-button" type="button">Refresh state</button>
+                <button id="undoButton" class="ghost-button" type="button" disabled>Undo edit</button>
+              </div>
+            </div>
+            <div id="supervisionSummary" class="state-summary">No interactive state loaded.</div>
+            <div class="supervision-tools">
+              <label>Target cluster
+                <select id="targetClusterSelect"></select>
+              </label>
+              <button id="moveHitButton" class="secondary-button" type="button" disabled>Move selected hit</button>
+              <button id="pullHitButton" class="secondary-button" type="button" disabled>Pull into new cluster</button>
+              <button id="acceptHitButton" class="secondary-button" type="button" disabled>Accept hit</button>
+              <button id="favoriteHitButton" class="secondary-button" type="button" disabled>Favorite as representative</button>
+              <button id="suppressHitButton" class="secondary-button danger-button" type="button" disabled>Suppress as bleed</button>
+              <button id="lockClusterButton" class="secondary-button" type="button" disabled>Lock target cluster</button>
+              <button id="explainClusterButton" class="secondary-button" type="button" disabled>Explain target cluster</button>
+            </div>
+            <div class="supervision-grid">
+              <article>
+                <h4>Outlier-first review queue</h4>
+                <div id="reviewQueue" class="compact-list"></div>
+              </article>
+              <article>
+                <h4>Cluster board</h4>
+                <div id="clusterBoard" class="compact-list"></div>
+              </article>
+              <article>
+                <h4>Suggestion inbox</h4>
+                <div id="suggestionInbox" class="compact-list"></div>
+              </article>
+              <article>
+                <h4>Constraint / event log</h4>
+                <div id="stateLog" class="compact-list"></div>
+              </article>
+            </div>
+            <pre id="clusterExplanation" class="explanation empty">Select a cluster and click Explain.</pre>
+          </section>
           <div class="result-columns">
             <section>
               <h3>Representative samples</h3>
                 <table id="hitsTable">
                   <thead>
                     <tr>
+                      <th>Audition</th><th>#</th><th>Label</th><th>Cluster</th><th>Confidence</th><th>Flags</th><th>Onset</th><th>Duration</th><th>Energy</th><th>File</th>
                     </tr>
                   </thead>
                   <tbody></tbody>
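
Note on the panel copy above: it says edits are saved as "replayable semantic state". A minimal sketch of what replaying could look like is below, assuming an append-only list of edit events that is folded into derived state. The event shapes (`move`, `lock`, `suppress`, `favorite`) and the function names `replay` and `undo` are hypothetical illustrations, not taken from `supervised_state.py`, whose schema may differ.

```javascript
// Hypothetical event shapes, for illustration only; the real schema
// is defined in supervised_state.py and may differ.
function replay(baseAssignments, events) {
  const state = {
    assignments: { ...baseAssignments }, // hit index -> cluster id
    locked: new Set(),                   // locked cluster ids
    suppressed: new Set(),               // suppressed hit indices
    favorites: {},                       // cluster id -> favorite hit index
  };
  for (const ev of events) {
    switch (ev.type) {
      case "move":
        state.assignments[ev.hit] = ev.cluster;
        break;
      case "lock":
        state.locked.add(ev.cluster);
        break;
      case "suppress":
        state.suppressed.add(ev.hit);
        break;
      case "favorite":
        // The favorite is recorded against the hit's current cluster.
        state.favorites[state.assignments[ev.hit]] = ev.hit;
        break;
      default:
        throw new Error(`unknown event type: ${ev.type}`);
    }
  }
  return state;
}

// Undo is modeled as replaying the log without its most recent event.
function undo(events) {
  return events.slice(0, -1);
}

const base = { 0: "kick_0", 1: "kick_0", 2: "kick_0" };
const events = [
  { type: "move", hit: 2, cluster: "snare_1" },
  { type: "suppress", hit: 0 },
];
const state = replay(base, events);
const undone = replay(base, undo(events));
```

Because the derived state is always rebuilt from the original assignments plus the log, an undo under this model never has to reverse an individual edit; it only shortens the log.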

web/styles.css CHANGED

@@ -99,3 +99,24 @@ tr.selected td { background: rgba(139,211,255,.12); }
 tr[data-hit-index] { cursor: pointer; }
 tr[data-hit-index]:hover td { background: rgba(255,255,255,.045); }
 @media (max-width: 760px) { .review-grid { grid-template-columns: 1fr; } }

+.supervision-panel { border: 1px solid var(--line); border-radius: 24px; background: rgba(0,0,0,.14); padding: 16px; margin: 0 0 20px; }
+.supervision-header { display: flex; align-items: flex-start; justify-content: space-between; gap: 16px; margin-bottom: 14px; }
+.supervision-header h3, .supervision-grid h4 { margin: 0; }
+.supervision-actions, .row-actions { display: flex; gap: 8px; flex-wrap: wrap; justify-content: flex-end; }
+.state-summary { display: flex; flex-wrap: wrap; gap: 8px; margin-bottom: 14px; color: #dbe5f7; }
+.state-summary span { border: 1px solid var(--line); border-radius: 999px; background: rgba(255,255,255,.06); padding: 7px 10px; font-size: 12px; font-weight: 800; }
+.supervision-tools { display: grid; grid-template-columns: minmax(220px, 1fr) repeat(7, auto); gap: 10px; align-items: end; margin-bottom: 16px; }
+.danger-button { border-color: rgba(255,109,122,.35); color: #ffd4d8; }
+.supervision-grid { display: grid; grid-template-columns: repeat(4, minmax(0, 1fr)); gap: 12px; }
+.compact-list { display: grid; gap: 8px; max-height: 300px; overflow: auto; }
+.compact-row, .suggestion-row, .log-row { width: 100%; display: grid; grid-template-columns: minmax(0, 1fr) auto; gap: 10px; align-items: center; border: 1px solid var(--line); border-radius: 14px; padding: 10px; background: rgba(0,0,0,.14); color: var(--text); text-align: left; }
+.compact-row strong, .suggestion-row strong, .log-row strong { display: block; font-size: 13px; overflow: hidden; text-overflow: ellipsis; white-space: nowrap; }
+.compact-row small, .suggestion-row small, .log-row small { display: block; color: var(--muted); font-size: 11px; margin-top: 3px; line-height: 1.35; }
+.compact-row.locked { border-color: rgba(85,230,165,.45); background: rgba(85,230,165,.08); }
+.compact-row.suppressed, tr.suppressed td { opacity: .62; text-decoration: line-through; }
+.log-row.constraint { border-color: rgba(200,165,255,.26); }
+.explanation { min-height: 120px; max-height: 320px; overflow: auto; border: 1px solid var(--line); border-radius: 16px; background: #05070b; color: #b9d7e9; padding: 12px; margin: 14px 0 0; font-size: 12px; line-height: 1.45; }
+tr.low-confidence td { background: rgba(255,202,107,.06); }
+tr.low-confidence.selected td { background: rgba(139,211,255,.15); }
+@media (max-width: 1320px) { .supervision-tools { grid-template-columns: repeat(3, minmax(0, 1fr)); } .supervision-grid { grid-template-columns: repeat(2, minmax(0, 1fr)); } }
+@media (max-width: 760px) { .supervision-header { display: block; } .supervision-actions { justify-content: flex-start; margin-top: 10px; } .supervision-tools, .supervision-grid { grid-template-columns: 1fr; } }
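
Note on the row modifiers above: the stylesheet now styles `tr.suppressed`, `tr.low-confidence`, and the combined `tr.low-confidence.selected`, so a hit row can carry several of these classes at once. A minimal sketch of how a renderer might derive the class list for one row follows. The field names `index`, `confidence`, and `suppressed`, the helper name `hitRowClasses`, and the 0.5 cutoff are assumptions made for illustration; they are not taken from `web/app.js`.

```javascript
// Assumed cutoff; the real threshold (if any) is defined by the application.
const LOW_CONFIDENCE_THRESHOLD = 0.5;

// Build the class attribute for one hit row from hypothetical hit fields.
function hitRowClasses(hit, selectedIndex) {
  const classes = [];
  if (hit.index === selectedIndex) classes.push("selected");
  if (hit.suppressed) classes.push("suppressed");
  if (
    typeof hit.confidence === "number" &&
    hit.confidence < LOW_CONFIDENCE_THRESHOLD
  ) {
    classes.push("low-confidence");
  }
  return classes.join(" ");
}

const selectedLow = hitRowClasses({ index: 3, confidence: 0.2, suppressed: false }, 3);
const suppressedOnly = hitRowClasses({ index: 1, confidence: 0.9, suppressed: true }, 3);
const plain = hitRowClasses({ index: 0, confidence: 0.8, suppressed: false }, 3);
```

When a row is both selected and low-confidence, the `tr.low-confidence.selected td` rule has the higher specificity, so the selection tint takes precedence over the low-confidence tint.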