ChatGPT committed on
Commit
3703c4e
·
1 Parent(s): b8fa9bf

feat: add hit review and streaming progress

README.md CHANGED
@@ -29,13 +29,16 @@ Implemented in the current development pass:
29
  - `online_preview`: prototype-based incremental assignment intended for near-realtime preview.
30
  - Disk cache for decoded full-mix/stem outputs keyed by source digest and extraction settings.
31
  - Run history panel indexing `.runs/*/output/manifest.json`.
32
- - Documentation for features, progress, tasks, API, timing, realtime suitability, UI, and remaining work.
 
 
 
33
  - Legacy Gradio apps preserved in `legacy/` for reference only.
34
 
35
  Not fully complete yet:
36
 
37
  - No interactive waveform editing of onsets/clusters.
38
- - No server-sent event stream or websocket progress channel.
39
  - No frontend TypeScript build/test harness.
40
  - Demucs remains offline/batch by design.
41
 
@@ -44,6 +47,7 @@ See:
44
  - `docs/FEATURES.md`
45
  - `docs/TASKS.md`
46
  - `docs/PROGRESS.md`
 
47
  - `docs/REMAINING_WORK.md`
48
 
49
  ## Run locally
@@ -68,6 +72,7 @@ That bypasses Demucs and uses the near-realtime clustering path.
68
 
69
  ```bash
70
  python3 scripts/benchmark_subprocesses.py --runs 2 --bars 4 --output docs/benchmark-subprocesses.json
 
71
  ```
72
 
73
  The benchmark uses synthetic drum fixtures and `stem=all` so the DSP stages are measured without Demucs model download/runtime noise.
@@ -101,7 +106,7 @@ curl http://127.0.0.1:7860/api/jobs
101
  | `app.py` | FastAPI app, static UI serving, job API, run history, artifact downloads |
102
  | `pipeline_runner.py` | Timed extraction pipeline, disk stem/source cache, batch/online clustering routing |
103
  | `sample_extractor.py` | Core DSP/sample extraction implementation |
104
- | `web/` | Custom no-build browser frontend |
105
  | `scripts/benchmark_subprocesses.py` | Synthetic benchmark runner for stage timings |
106
  | `docs/` | Review, timing, API, UI, feature, task, progress, and remaining-work documentation |
107
  | `legacy/` | Previous Gradio apps retained for reference |
@@ -115,6 +120,7 @@ Each run is stored under `.runs/<job-id>/output/`:
115
  - `reconstruction.mid`
116
  - `sample-pack.zip`
117
  - `samples/*.wav`
 
118
  - `manifest.json`
119
 
120
  Generated runtime directories are ignored by git:
 
29
  - `online_preview`: prototype-based incremental assignment intended for near-realtime preview.
30
  - Disk cache for decoded full-mix/stem outputs keyed by source digest and extraction settings.
31
  - Run history panel indexing `.runs/*/output/manifest.json`.
32
+ - Individual review WAVs for every detected hit under `review/hits/`.
33
+ - Click-to-audition workflow for waveform onsets, detected hit rows, and representative sample rows.
34
+ - Server-sent-events progress endpoint with frontend `EventSource` support and polling fallback.
35
+ - Documentation for features, progress, tasks, API, timing, hit review, realtime suitability, UI, and remaining work.
36
  - Legacy Gradio apps preserved in `legacy/` for reference only.
37
 
38
  Not fully complete yet:
39
 
40
  - No interactive waveform editing of onsets/clusters.
41
+ - No interactive onset/cluster editing yet.
42
  - No frontend TypeScript build/test harness.
43
  - Demucs remains offline/batch by design.
44
 
 
47
  - `docs/FEATURES.md`
48
  - `docs/TASKS.md`
49
  - `docs/PROGRESS.md`
50
+ - `docs/HIT_REVIEW_AND_STREAMING.md`
51
  - `docs/REMAINING_WORK.md`
52
 
53
  ## Run locally
 
72
 
73
  ```bash
74
  python3 scripts/benchmark_subprocesses.py --runs 2 --bars 4 --output docs/benchmark-subprocesses.json
75
+ python3 scripts/test_sse_and_review_hits.py
76
  ```
77
 
78
  The benchmark uses synthetic drum fixtures and `stem=all` so the DSP stages are measured without Demucs model download/runtime noise.
 
106
  | `app.py` | FastAPI app, static UI serving, job API, run history, artifact downloads |
107
  | `pipeline_runner.py` | Timed extraction pipeline, disk stem/source cache, batch/online clustering routing |
108
  | `sample_extractor.py` | Core DSP/sample extraction implementation |
109
+ | `web/` | Custom no-build browser frontend with waveform, hit review, and sample audition |
110
  | `scripts/benchmark_subprocesses.py` | Synthetic benchmark runner for stage timings |
111
  | `docs/` | Review, timing, API, UI, feature, task, progress, and remaining-work documentation |
112
  | `legacy/` | Previous Gradio apps retained for reference |
 
120
  - `reconstruction.mid`
121
  - `sample-pack.zip`
122
  - `samples/*.wav`
123
+ - `review/hits/*.wav`
124
  - `manifest.json`
125
 
126
  Generated runtime directories are ignored by git:
app.py CHANGED
@@ -7,6 +7,7 @@ Run with:
7
 
8
  from __future__ import annotations
9
 
 
10
  import json
11
  import shutil
12
  import time
@@ -20,7 +21,7 @@ from typing import Any
20
 
21
  from fastapi import FastAPI, File, Form, HTTPException, UploadFile
22
  from fastapi.middleware.cors import CORSMiddleware
23
- from fastapi.responses import FileResponse, JSONResponse
24
  from fastapi.staticfiles import StaticFiles
25
 
26
  from pipeline_runner import PipelineParams, clear_disk_cache, initial_stages, run_extraction_pipeline
@@ -31,7 +32,7 @@ WEB_DIR = ROOT / "web"
31
  RUNS_DIR = ROOT / ".runs"
32
  RUNS_DIR.mkdir(exist_ok=True)
33
 
34
- app = FastAPI(title="Drum Sample Extractor", version="11.0.0")
35
  app.add_middleware(
36
  CORSMiddleware,
37
  allow_origins=["*"],
@@ -58,6 +59,10 @@ def _serialise_job(job: dict[str, Any]) -> dict[str, Any]:
58
  {**sample, "url": _job_url(job["id"], sample["file"])}
59
  for sample in result.get("samples", [])
60
  ]
 
 
 
 
61
  payload["result"] = result
62
  return payload
63
 
@@ -243,11 +248,53 @@ def get_job(job_id: str) -> dict[str, Any]:
243
  raise HTTPException(status_code=404, detail="Job not found")
244
 
245
 
246
  @app.get("/api/jobs/{job_id}/files/{relative_path:path}")
247
  def get_job_file(job_id: str, relative_path: str) -> FileResponse:
248
  root = (RUNS_DIR / job_id / "output").resolve()
249
  path = (root / relative_path).resolve()
250
- if not str(path).startswith(str(root)) or not path.exists() or not path.is_file():
 
 
 
 
251
  raise HTTPException(status_code=404, detail="File not found")
252
  return FileResponse(path)
253
 
 
7
 
8
  from __future__ import annotations
9
 
10
+ import asyncio
11
  import json
12
  import shutil
13
  import time
 
21
 
22
  from fastapi import FastAPI, File, Form, HTTPException, UploadFile
23
  from fastapi.middleware.cors import CORSMiddleware
24
+ from fastapi.responses import FileResponse, JSONResponse, StreamingResponse
25
  from fastapi.staticfiles import StaticFiles
26
 
27
  from pipeline_runner import PipelineParams, clear_disk_cache, initial_stages, run_extraction_pipeline
 
32
  RUNS_DIR = ROOT / ".runs"
33
  RUNS_DIR.mkdir(exist_ok=True)
34
 
35
+ app = FastAPI(title="Drum Sample Extractor", version="11.1.0")
36
  app.add_middleware(
37
  CORSMiddleware,
38
  allow_origins=["*"],
 
59
  {**sample, "url": _job_url(job["id"], sample["file"])}
60
  for sample in result.get("samples", [])
61
  ]
62
+ result["hits"] = [
63
+ {**hit, "url": _job_url(job["id"], hit["file"])}
64
+ for hit in result.get("hits", [])
65
+ ]
66
  payload["result"] = result
67
  return payload
68
 
 
248
  raise HTTPException(status_code=404, detail="Job not found")
249
 
250
 
251
+ @app.get("/api/jobs/{job_id}/events")
252
+ def get_job_events(job_id: str) -> StreamingResponse:
253
+     with jobs_lock:
254
+         exists_in_memory = job_id in jobs
255
+     exists_on_disk = _read_manifest_job(job_id) is not None
256
+     if not exists_in_memory and not exists_on_disk:
257
+         raise HTTPException(status_code=404, detail="Job not found")
258
+
259
+     async def event_stream():
260
+         last_payload: str | None = None
261
+         while True:
262
+             with jobs_lock:
263
+                 memory_job = jobs.get(job_id)
264
+                 job = dict(memory_job) if memory_job else None
265
+             if job is None:
266
+                 job = _read_manifest_job(job_id)
267
+             if job is None:
268
+                 payload = {"id": job_id, "status": "error", "error": "Job disappeared"}
269
+             else:
270
+                 payload = _serialise_job(job)
271
+             encoded = json.dumps(payload, sort_keys=True)
272
+             if encoded != last_payload:
273
+                 yield f"event: job\ndata: {encoded}\n\n"
274
+                 last_payload = encoded
275
+             if payload.get("status") in {"complete", "error"}:
276
+                 break
277
+             await asyncio.sleep(0.5)
278
+
279
+     return StreamingResponse(
280
+         event_stream(),
281
+         media_type="text/event-stream",
282
+         headers={
283
+             "Cache-Control": "no-cache",
284
+             "X-Accel-Buffering": "no",
285
+         },
286
+     )
287
+
288
+
289
  @app.get("/api/jobs/{job_id}/files/{relative_path:path}")
290
  def get_job_file(job_id: str, relative_path: str) -> FileResponse:
291
      root = (RUNS_DIR / job_id / "output").resolve()
292
      path = (root / relative_path).resolve()
293
+     try:
294
+         path.relative_to(root)
295
+     except ValueError as exc:
296
+         raise HTTPException(status_code=404, detail="File not found") from exc
297
+     if not path.exists() or not path.is_file():
298
          raise HTTPException(status_code=404, detail="File not found")
299
      return FileResponse(path)
300
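
The change to `get_job_file` swaps a string-prefix check for `Path.relative_to()`. The following is a minimal sketch of one case where the two checks disagree. It uses `PurePosixPath` so it runs without touching the filesystem; the directory names and helper function names are hypothetical, and the real handler also calls `.resolve()` before checking.

```python
from pathlib import PurePosixPath


def is_contained_prefix(root: PurePosixPath, path: PurePosixPath) -> bool:
    """Old-style check: plain string prefix comparison."""
    return str(path).startswith(str(root))


def is_contained_relative(root: PurePosixPath, path: PurePosixPath) -> bool:
    """New-style check: path must be root itself or a descendant of it."""
    try:
        path.relative_to(root)
    except ValueError:
        return False
    return True


# Hypothetical paths, for illustration only.
root = PurePosixPath("/srv/.runs/job1/output")
inside = root / "review/hits/hit_00000_kick.wav"
# A sibling directory whose name merely starts with "output".
sibling = PurePosixPath("/srv/.runs/job1/output-other/secret.wav")

for candidate in (inside, sibling):
    print(candidate, is_contained_prefix(root, candidate), is_contained_relative(root, candidate))
```

The prefix check accepts the sibling directory because it compares characters rather than path components, while `relative_to()` rejects it.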
 
docs/API.md CHANGED
@@ -131,10 +131,28 @@ Completed jobs contain:
131
  | `hit_count` | Number of accepted onsets/hits. |
132
  | `cluster_count` | Number of sample clusters. |
133
  | `stages` | Per-stage timing/status/detail list. |
134
- | `samples` | Sample rows with score, duration, first onset, and download URL. |
135
- | `overview` | Decimated envelope and onset markers for waveform display. |
 
136
  | `files` | Relative artifact paths. |
137
- | `file_urls` | Direct API URLs for artifacts. |
138
 
139
  ## `GET /api/jobs/{job_id}/files/{relative_path}`
140
 
@@ -146,9 +164,10 @@ Examples:
146
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/sample-pack.zip
147
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/reconstruction.mid
148
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/samples/hihat_open_0.wav
 
149
  ```
150
 
151
- The endpoint prevents path traversal by resolving downloads under `.runs/<job-id>/output/`.
152
 
153
  ## `POST /api/cache/clear`
154
 
 
131
  | `hit_count` | Number of accepted onsets/hits. |
132
  | `cluster_count` | Number of sample clusters. |
133
  | `stages` | Per-stage timing/status/detail list. |
134
+ | `samples` | Representative sample rows with score, duration, first onset, and playback/download URL. |
135
+ | `hits` | Per-detected-hit review rows with onset, duration, label, cluster, representative flag, and playback/download URL. |
136
+ | `overview` | Decimated envelope and clickable onset markers for waveform display. |
137
  | `files` | Relative artifact paths. |
138
+ | `file_urls` | Direct API URLs for top-level artifacts. |
139
+
140
+ ## `GET /api/jobs/{job_id}/events`
141
+
142
+ Streams job snapshots as server-sent events. This is the preferred progress channel for the frontend; polling remains supported via `GET /api/jobs/{job_id}`.
143
+
144
+ ```bash
145
+ curl -N http://127.0.0.1:7860/api/jobs/58ca0db4ac74/events
146
+ ```
147
+
148
+ Event shape:
149
+
150
+ ```text
151
+ event: job
152
+ data: {"id":"58ca0db4ac74","status":"running","stages":[...]}
153
+ ```
154
+
155
+ The stream closes after `complete` or `error`. Completed historical jobs emit one final `job` event and close.
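
For non-browser clients, the event shape above can be consumed with a small line parser. This is a sketch under assumptions rather than the project's client: `iter_job_events` is a hypothetical helper, and it handles only the `event:` and `data:` fields shown in this document.

```python
from __future__ import annotations

import json
from typing import Iterable, Iterator

TERMINAL_STATUSES = {"complete", "error"}


def iter_job_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded `job` payloads from an iterable of SSE text lines."""
    event_type = "message"  # SSE default when no `event:` field is sent
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates one event.
            if data_parts and event_type == "job":
                payload = json.loads("\n".join(data_parts))
                yield payload
                if payload.get("status") in TERMINAL_STATUSES:
                    return
            event_type, data_parts = "message", []
        elif line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Per the SSE format, a single leading space is dropped.
            data_parts.append(value[1:] if value.startswith(" ") else value)


sample_stream = [
    "event: job\n",
    'data: {"id":"58ca0db4ac74","status":"running"}\n',
    "\n",
    "event: job\n",
    'data: {"id":"58ca0db4ac74","status":"complete"}\n',
    "\n",
]
for job in iter_job_events(sample_stream):
    print(job["status"])
```

The parser takes any iterable of decoded lines, so it can be fed from an HTTP client's streaming line iterator or from a captured `curl -N` log.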
156
 
157
  ## `GET /api/jobs/{job_id}/files/{relative_path}`
158
 
 
164
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/sample-pack.zip
165
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/reconstruction.mid
166
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/samples/hihat_open_0.wav
167
+ curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/review/hits/hit_00000_kick.wav
168
  ```
169
 
170
+ The endpoint prevents path traversal by resolving downloads under `.runs/<job-id>/output/` and requiring the final path to remain relative to that output root.
171
 
172
  ## `POST /api/cache/clear`
173
 
docs/FEATURES.md CHANGED
@@ -14,12 +14,14 @@ Turn an input audio file into a practical drum sample pack: detected hits, group
14
  | UI | Drag/drop audio upload | Implemented | Uses multipart upload to `POST /api/jobs`. |
15
  | UI | Source preview | Implemented | Browser `<audio>` preview before extraction. |
16
  | UI | Pipeline controls | Implemented | Stem/model/onset/clustering/MIDI/synthesis/cache controls. |
17
- | UI | Live-ish progress | Implemented | Polls stage state and logs every 800 ms. |
18
- | UI | Waveform/onset overview | Implemented | Canvas envelope plus onset markers from `manifest.json`. |
19
- | UI | Result downloads | Implemented | ZIP, MIDI, stem WAV, reconstruction WAV, individual sample WAVs. |
20
  | UI | Run history browser | Implemented | Lists completed `.runs/*/output/manifest.json` entries and reloads results. |
 
21
  | API | Health/config | Implemented | `GET /api/health`, `GET /api/config`. |
22
- | API | Job creation/polling | Implemented | `POST /api/jobs`, `GET /api/jobs/{id}`. |
 
23
  | API | Run listing | Implemented | `GET /api/jobs` returns active and completed runs. |
24
  | API | Safe artifact serving | Implemented | Path traversal is blocked by resolved output-root checks. |
25
  | API | Cache clear | Implemented | Clears in-memory DSP cache and disk stem/source cache. |
@@ -34,21 +36,23 @@ Turn an input audio file into a practical drum sample pack: detected hits, group
34
  | Pipeline | Optional synthesis | Implemented | Weighted aligned average for multi-hit clusters. |
35
  | Pipeline | MIDI export | Implemented | Quantized or unquantized reconstruction MIDI. |
36
  | Pipeline | Reconstruction render | Implemented | Renders MIDI-like reconstruction using selected samples. |
 
37
  | Pipeline | Sample pack ZIP | Implemented | Includes WAVs, index JSON, MIDI, rendered reconstruction. |
38
  | Docs | Project review | Implemented | `docs/PROJECT_REVIEW.md`. |
39
  | Docs | Timing/realtime analysis | Implemented | `docs/PIPELINE_TIMING_AND_REALTIME.md`. |
40
  | Docs | API docs | Implemented | `docs/API.md`. |
41
  | Docs | UI replacement docs | Implemented | `docs/UI_REPLACEMENT.md`. |
42
  | Docs | Feature/task/progress tracking | Implemented | This file, `TASKS.md`, `PROGRESS.md`. |
 
43
 
44
  ## Partially implemented features
45
 
46
  | Area | Feature | Current state | Needed to call it complete |
47
  |---|---|---|---|
48
- | Progress | Stage progress | Shows stage boundaries and logs | Add lower-level progress inside Demucs and clustering. |
49
  | Realtime | Online clustering | Implemented as batch-invoked prototype assignment | Add streaming/incremental audio analysis API for true realtime preview. |
50
  | Run history | Manifest browser | Lists and reloads completed runs | Add side-by-side comparison and filtering/search. |
51
- | Editing | Review workflow | Displays waveform and samples | Add click-to-audition hits, onset editing, cluster merge/split, label reassignment. |
52
  | Frontend quality | No-build JavaScript UI | Good enough for local app | Convert to TypeScript once interaction model stabilizes. |
53
 
54
  ## Explicit non-goals for this pass
 
14
  | UI | Drag/drop audio upload | Implemented | Uses multipart upload to `POST /api/jobs`. |
15
  | UI | Source preview | Implemented | Browser `<audio>` preview before extraction. |
16
  | UI | Pipeline controls | Implemented | Stem/model/onset/clustering/MIDI/synthesis/cache controls. |
17
+ | UI | Streaming progress | Implemented | Uses `EventSource` over `GET /api/jobs/{id}/events`, with polling fallback. |
18
+ | UI | Waveform/onset overview | Implemented | Canvas envelope plus clickable onset markers from `manifest.json`. |
19
+ | UI | Result downloads | Implemented | ZIP, MIDI, stem WAV, reconstruction WAV, individual sample WAVs, and per-hit review WAVs. |
20
  | UI | Run history browser | Implemented | Lists completed `.runs/*/output/manifest.json` entries and reloads results. |
21
+ | UI | Hit and sample audition | Implemented | Dedicated players for selected hit slices and representative sample WAVs. |
22
  | API | Health/config | Implemented | `GET /api/health`, `GET /api/config`. |
23
+ | API | Job creation/status | Implemented | `POST /api/jobs`, `GET /api/jobs/{id}`. |
24
+ | API | SSE job events | Implemented | `GET /api/jobs/{id}/events` streams job snapshots until complete/error. |
25
  | API | Run listing | Implemented | `GET /api/jobs` returns active and completed runs. |
26
  | API | Safe artifact serving | Implemented | Path traversal is blocked by resolved output-root checks. |
27
  | API | Cache clear | Implemented | Clears in-memory DSP cache and disk stem/source cache. |
 
36
  | Pipeline | Optional synthesis | Implemented | Weighted aligned average for multi-hit clusters. |
37
  | Pipeline | MIDI export | Implemented | Quantized or unquantized reconstruction MIDI. |
38
  | Pipeline | Reconstruction render | Implemented | Renders MIDI-like reconstruction using selected samples. |
39
+ | Pipeline | Per-hit review export | Implemented | Writes every accepted detected hit to `review/hits/*.wav` and records rows in the manifest. |
40
  | Pipeline | Sample pack ZIP | Implemented | Includes WAVs, index JSON, MIDI, rendered reconstruction. |
41
  | Docs | Project review | Implemented | `docs/PROJECT_REVIEW.md`. |
42
  | Docs | Timing/realtime analysis | Implemented | `docs/PIPELINE_TIMING_AND_REALTIME.md`. |
43
  | Docs | API docs | Implemented | `docs/API.md`. |
44
  | Docs | UI replacement docs | Implemented | `docs/UI_REPLACEMENT.md`. |
45
  | Docs | Feature/task/progress tracking | Implemented | This file, `TASKS.md`, `PROGRESS.md`. |
46
+ | Docs | Hit review and streaming docs | Implemented | `docs/HIT_REVIEW_AND_STREAMING.md`. |
47
 
48
  ## Partially implemented features
49
 
50
  | Area | Feature | Current state | Needed to call it complete |
51
  |---|---|---|---|
52
+ | Progress | Stage progress | SSE streams stage boundaries and logs | Add lower-level progress inside Demucs and clustering. |
53
  | Realtime | Online clustering | Implemented as batch-invoked prototype assignment | Add streaming/incremental audio analysis API for true realtime preview. |
54
  | Run history | Manifest browser | Lists and reloads completed runs | Add side-by-side comparison and filtering/search. |
55
+ | Editing | Review workflow | Click-to-audition for hits and samples is implemented | Add onset editing, cluster merge/split, label reassignment. |
56
  | Frontend quality | No-build JavaScript UI | Good enough for local app | Convert to TypeScript once interaction model stabilizes. |
57
 
58
  ## Explicit non-goals for this pass
docs/HIT_REVIEW_AND_STREAMING.md ADDED
@@ -0,0 +1,85 @@
1
+ # Hit review and progress streaming
2
+
3
+ Last updated: 2026-05-12
4
+
5
+ ## Purpose
6
+
7
+ This pass moves the app closer to a review workstation by making detected hits individually inspectable and by making a server-sent-events channel the primary progress mechanism, with frontend polling kept as a fallback.
8
+
9
+ ## Implemented behavior
10
+
11
+ | Area | Implementation | Files |
12
+ |---|---|---|
13
+ | Review hit artifacts | Every accepted detected hit is written as an individual WAV under `review/hits/`. | `pipeline_runner.py` |
14
+ | Manifest hit rows | `manifest.json` now includes a top-level `hits` array with onset, duration, label, cluster, representative flag, and relative file path. | `pipeline_runner.py` |
15
+ | Hit URLs | API serialization adds direct download/playback URLs to every hit row. | `app.py` |
16
+ | Waveform selection | Clicking the waveform selects the nearest detected onset marker. | `web/app.js` |
17
+ | Hit audition | Clicking a hit row or waveform marker loads that hit into the selected-hit audio player. | `web/index.html`, `web/app.js` |
18
+ | Sample audition | Representative sample rows now have explicit Audition buttons and a dedicated selected-sample player. | `web/index.html`, `web/app.js` |
19
+ | SSE progress | `GET /api/jobs/{job_id}/events` streams job snapshots whenever state changes. | `app.py`, `web/app.js` |
20
+ | Poll fallback | The frontend falls back to polling if `EventSource` is unavailable or errors. | `web/app.js` |
21
+ | Artifact serving hardening | File downloads now use `Path.relative_to()` against the resolved run output directory. | `app.py` |
22
+
23
+ ## Manifest shape additions
24
+
25
+ Completed results now include:
26
+
27
+ ```json
28
+ {
29
+ "hits": [
30
+ {
31
+ "index": 0,
32
+ "label": "kick",
33
+ "cluster_id": 3,
34
+ "cluster_label": "kick_0",
35
+ "is_representative": true,
36
+ "onset_sec": 0.002993,
37
+ "duration_ms": 255.0,
38
+ "rms_energy": 0.141768,
39
+ "spectral_centroid_hz": 773.4,
40
+ "file": "review/hits/hit_00000_kick.wav"
41
+ }
42
+ ]
43
+ }
44
+ ```
45
+
46
+ API responses add `url` to each hit row, for example:
47
+
48
+ ```json
49
+ {
50
+ "file": "review/hits/hit_00000_kick.wav",
51
+ "url": "/api/jobs/<job-id>/files/review/hits/hit_00000_kick.wav"
52
+ }
53
+ ```
54
+
55
+ The `overview.onsets` entries now also carry `index` and `duration_sec`, allowing the waveform to map markers back to review hit rows.
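
The marker-to-hit mapping can be illustrated with a nearest-onset lookup. The frontend itself is JavaScript (`web/app.js`) and its actual logic is not shown here; this Python sketch only demonstrates the idea. It reads `onset_sec` from the `hits` rows documented above, and `click_to_seconds`, `nearest_hit`, and the canvas parameters are hypothetical names.

```python
from __future__ import annotations

from typing import Any, Optional


def click_to_seconds(click_x: float, canvas_width: float, duration_sec: float) -> float:
    """Convert a horizontal click position on the overview canvas to a time."""
    return (click_x / canvas_width) * duration_sec


def nearest_hit(hits: list[dict[str, Any]], click_sec: float) -> Optional[dict[str, Any]]:
    """Return the hit row whose onset is closest to the clicked time."""
    if not hits:
        return None
    return min(hits, key=lambda hit: abs(hit["onset_sec"] - click_sec))


# Trimmed hit rows; real rows carry more fields (label, cluster_id, file, ...).
hits = [
    {"index": 0, "onset_sec": 0.002993},
    {"index": 1, "onset_sec": 0.5},
    {"index": 2, "onset_sec": 1.0},
]
clicked = click_to_seconds(click_x=300, canvas_width=1000, duration_sec=2.0)
selected = nearest_hit(hits, clicked)
print(selected["index"] if selected else None)
```

A linear scan is enough for an overview of a few thousand hits; a sorted list with binary search would be the natural next step if that ever became a bottleneck.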
56
+
57
+ ## Streaming endpoint
58
+
59
+ `GET /api/jobs/{job_id}/events` returns `text/event-stream`.
60
+
61
+ Each emitted event has type `job` and contains the same serialized shape as `GET /api/jobs/{job_id}`:
62
+
63
+ ```text
64
+ event: job
65
+ data: {"id":"...","status":"running",...}
66
+ ```
67
+
68
+ The stream ends after `complete` or `error`. Completed historical jobs stream one final event and then close.
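
The snapshot-on-change behavior can be sketched as a plain synchronous generator. It mirrors the comparison of sorted-key JSON used by the `app.py` endpoint in this commit, but it takes the snapshots as an input sequence instead of polling job state; `snapshots_to_sse` is a hypothetical name.

```python
from __future__ import annotations

import json
from typing import Any, Iterable, Iterator


def snapshots_to_sse(snapshots: Iterable[dict[str, Any]]) -> Iterator[str]:
    """Emit one SSE frame per changed snapshot and stop at a terminal status."""
    last_encoded: str | None = None
    for snapshot in snapshots:
        # sort_keys makes the encoding stable, so equal dicts compare equal.
        encoded = json.dumps(snapshot, sort_keys=True)
        if encoded != last_encoded:
            yield f"event: job\ndata: {encoded}\n\n"
            last_encoded = encoded
        if snapshot.get("status") in {"complete", "error"}:
            break


frames = list(
    snapshots_to_sse(
        [
            {"id": "a", "status": "running"},
            {"status": "running", "id": "a"},  # same state, different key order
            {"id": "a", "status": "complete"},
        ]
    )
)
print(len(frames))
```

Unchanged snapshots produce no frame, which is why an idle job costs the client nothing between state changes.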
69
+
70
+ ## Current limitations
71
+
72
+ - Hit review is read-only. It does not yet support delete/shift/relabel actions.
73
+ - Every accepted hit is exported as a WAV. This is correct for review UX, but large files with thousands of hits may produce many small artifacts.
74
+ - SSE streams job snapshots, not fine-grained internal Demucs progress.
75
+ - The waveform is an overview canvas, not an editable detailed waveform yet.
76
+
77
+ ## Next editor step
78
+
79
+ Add an edit state layer on top of the hit manifest:
80
+
81
+ 1. Mark hit deleted/restored.
82
+ 2. Shift onset and duration bounds.
83
+ 3. Reassign cluster label.
84
+ 4. Merge/split clusters.
85
+ 5. Re-render/repack from edited manifest without rerunning Demucs or onset detection.
docs/PIPELINE_TIMING_AND_REALTIME.md CHANGED
@@ -36,16 +36,16 @@ The checked-in benchmark files were refreshed on 2026-05-12 with synthetic 2-bar
36
 
37
  | Stage | Batch quality mean | Online preview mean |
38
  |---|---:|---:|
39
- | source load | 0.011 s | 0.012 s |
40
- | BPM detection | 0.185 s | 0.163 s |
41
- | onset detection + slicing | 1.943 s | 1.834 s |
42
- | classification | 0.019 s | 0.017 s |
43
- | clustering | 0.148 s | 0.045 s |
44
- | representative selection | 0.204 s | 0.115 s |
45
  | synthesis | 0.001 s | 0.001 s |
46
- | export/package | 0.156 s | 0.221 s |
47
 
48
- On these small fixtures, `online_preview` reduced clustering time by about 3× compared with `batch_quality`. The total run is still dominated by onset detection, so the next realtime optimization target is streaming/incremental onset analysis rather than only clustering.
49
 
50
  First cold runs can be much slower because imports and library initialization are paid up front.
51
 
@@ -126,6 +126,5 @@ The current `online_preview` mode is invoked by the batch job API after onset de
126
  1. A streaming/ranged audio analysis API.
127
  2. Incremental onset detector state.
128
  3. Incremental hit artifact writing.
129
- 4. SSE progress/results stream.
130
- 5. UI that appends hits/clusters as they arrive.
131
- 6. Optional final `batch_quality` consolidation pass.
 
36
 
37
  | Stage | Batch quality mean | Online preview mean |
38
  |---|---:|---:|
39
+ | source load | 0.010 s | 0.010 s |
40
+ | BPM detection | 0.155 s | 0.126 s |
41
+ | onset detection + slicing | 1.964 s | 1.763 s |
42
+ | classification | 0.042 s | 0.041 s |
43
+ | clustering | 0.046 s | 0.037 s |
44
+ | representative selection | 0.177 s | 0.158 s |
45
  | synthesis | 0.001 s | 0.001 s |
46
+ | export/package | 0.158 s | 0.291 s |
47
 
48
+ On these small fixtures, `online_preview` reduced clustering time compared with `batch_quality`, while export time increased because this pass now writes every accepted hit as a review WAV under `review/hits/`. The total run is still dominated by onset detection, so the next realtime optimization target is streaming/incremental onset analysis rather than only clustering.
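
The claim that onset detection dominates can be checked from the updated table. The sketch below sums only the stages listed in the table, so any stage not shown there is outside this estimate; `stage_share` is a hypothetical helper name.

```python
# Mean stage timings in seconds, copied from the updated table.
BATCH_QUALITY = {
    "source load": 0.010,
    "BPM detection": 0.155,
    "onset detection + slicing": 1.964,
    "classification": 0.042,
    "clustering": 0.046,
    "representative selection": 0.177,
    "synthesis": 0.001,
    "export/package": 0.158,
}
ONLINE_PREVIEW = {
    "source load": 0.010,
    "BPM detection": 0.126,
    "onset detection + slicing": 1.763,
    "classification": 0.041,
    "clustering": 0.037,
    "representative selection": 0.158,
    "synthesis": 0.001,
    "export/package": 0.291,
}


def stage_share(timings: dict, stage: str) -> float:
    """Fraction of the summed stage time spent in one stage."""
    return timings[stage] / sum(timings.values())


for name, timings in (("batch_quality", BATCH_QUALITY), ("online_preview", ONLINE_PREVIEW)):
    share = stage_share(timings, "onset detection + slicing")
    print(f"{name}: total {sum(timings.values()):.3f} s, onset share {share:.1%}")
```

With these figures, onset detection plus slicing accounts for roughly 77% of the listed batch time and roughly 73% of the listed online-preview time.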
49
 
50
  First cold runs can be much slower because imports and library initialization are paid up front.
51
 
 
126
  1. A streaming/ranged audio analysis API.
127
  2. Incremental onset detector state.
128
  3. Incremental hit artifact writing.
129
+ 4. UI that appends hits/clusters as they arrive instead of waiting for the completed manifest.
130
+ 5. Optional final `batch_quality` consolidation pass.
 
docs/PROGRESS.md CHANGED
@@ -40,17 +40,17 @@ The project now has a clearer product surface: final-quality batch extraction, f
40
 
41
  ## Current assessment
42
 
43
- The application is not “fully complete” as an editing workstation, but it is substantially implemented as an extraction workstation. The remaining gaps are concentrated around interactive correction/editing, richer progress streaming, run comparison, and frontend engineering hardening.
44
 
45
  ## Next recommended pass
46
 
47
  Implement the editing loop:
48
 
49
- 1. Click waveform onset marker or sample table row to audition.
50
- 2. Show selected hit metadata and audio snippet.
51
- 3. Allow onset shift, label change, cluster reassignment, merge, and split.
52
- 4. Re-export without rerunning Demucs/onset detection when only grouping changes.
53
- 5. Save edit decisions into the manifest.
54
 
55
  ## Validation performed in this pass
56
 
@@ -58,6 +58,26 @@ Implement the editing loop:
58
  - Ran FastAPI smoke job through `scripts/test_api_job.py`.
59
  - Ran an online-preview API smoke job with synthetic audio.
60
  - Verified `GET /api/jobs` history output and `POST /api/cache/clear` behavior.
 
61
  - Refreshed batch and online benchmark JSON files:
62
  - `docs/benchmark-subprocesses.json`
63
  - `docs/benchmark-online-preview.json`
 
40
 
41
  ## Current assessment
42
 
43
+ The application is not “fully complete” as an editing workstation, but it is substantially implemented as an extraction and review workstation. The remaining gaps are concentrated around mutating corrections/editing, run comparison, and frontend engineering hardening.
44
 
45
  ## Next recommended pass
46
 
47
  Implement the editing loop:
48
 
49
+ 1. Add edit state for deleted/restored hits and shifted onsets.
50
+ 2. Add label change, cluster reassignment, merge, and split.
51
+ 3. Re-export without rerunning Demucs/onset detection when only grouping changes.
52
+ 4. Save edit decisions into the manifest.
53
+ 5. Add side-by-side run comparison for parameter tuning.
54
 
55
  ## Validation performed in this pass
56
 
 
58
  - Ran FastAPI smoke job through `scripts/test_api_job.py`.
59
  - Ran an online-preview API smoke job with synthetic audio.
60
  - Verified `GET /api/jobs` history output and `POST /api/cache/clear` behavior.
61
+ - Verified SSE completion and review-hit artifact serving.
62
  - Refreshed batch and online benchmark JSON files:
63
  - `docs/benchmark-subprocesses.json`
64
  - `docs/benchmark-online-preview.json`
65
+
66
+ ## Pass 3: hit review and streaming progress
67
+
68
+ Completed in this pass:
69
+
70
+ 1. Added `GET /api/jobs/{job_id}/events` as a server-sent-events progress stream.
71
+ 2. Updated the frontend to consume SSE via `EventSource`, with the existing polling loop retained as fallback.
72
+ 3. Added per-hit review artifact export under `review/hits/`.
73
+ 4. Added a top-level `hits` array to each run manifest with onset, duration, classification, cluster label, representative flag, and file path.
74
+ 5. Added API serialization for hit playback/download URLs.
75
+ 6. Added selected-hit and selected-sample audio players.
76
+ 7. Made waveform onset markers clickable by selecting the nearest detected hit.
77
+ 8. Added hit table and sample-table audition controls.
78
+ 9. Hardened artifact file serving by using resolved path containment via `Path.relative_to()`.
79
+ 10. Refreshed batch and online benchmark JSON files after the review-hit export change.
80
+
81
+ Outcome:
82
+
83
+ The app now supports a real review loop for inspecting what the onset detector and clustering produced. Users can audition individual detected slices, representative samples, stem audio, and reconstruction audio from one screen. Progress updates are lower-latency and less wasteful via SSE while still remaining robust in browsers that need polling fallback.
docs/REMAINING_WORK.md CHANGED
@@ -8,11 +8,11 @@ The project is now a usable extraction workstation, not a complete interactive s
8
 
9
  ## Highest-priority remaining gaps
10
 
11
- 1. **Hit audition and selection**: clicking an onset marker or sample row should audition that exact hit/sample.
12
- 2. **Waveform editing**: add onset adjustment, delete/add hit, and rerun-from-edited-onsets without redoing Demucs.
13
- 3. **Cluster editing**: allow merge, split, relabel, and manual reassignment of hits.
14
  4. **Run comparison**: compare two manifests side-by-side for parameter tuning.
15
- 5. **Progress streaming**: replace polling or supplement it with SSE for lower-latency logs/progress.
16
  6. **Frontend engineering hardening**: migrate the frontend to TypeScript after the UX stabilizes and add browser-level tests.
17
  7. **Benchmark panel**: add an in-app benchmark view that can run synthetic fixtures and compare parameter profiles.
18
 
@@ -26,10 +26,10 @@ The project is now a usable extraction workstation, not a complete interactive s
26
 
27
  ## Suggested implementation order
28
 
29
- 1. Add click-to-audition for sample table rows and waveform onsets.
30
- 2. Store detected hit snippets as individual review artifacts or expose ranged audio endpoints.
31
- 3. Add edit state to manifests: deleted hits, shifted onsets, labels, cluster overrides.
32
- 4. Add rerender/repack endpoint that starts from edited hit/cluster state.
33
- 5. Add run comparison view.
34
- 6. Add SSE progress streaming.
35
- 7. Convert frontend to TypeScript and add UI tests.
 
8
 
9
  ## Highest-priority remaining gaps
10
 
11
+ 1. **Waveform editing**: add onset adjustment, delete/add hit, and rerun-from-edited-onsets without redoing Demucs.
12
+ 2. **Cluster editing**: allow merge, split, relabel, and manual reassignment of hits.
13
+ 3. **Edited re-export**: regenerate samples/MIDI/ZIP from edited hit/cluster state without rerunning Demucs or onset detection.
14
  4. **Run comparison**: compare two manifests side-by-side for parameter tuning.
15
+ 5. **Lower-level progress**: expose internal Demucs/clustering progress where libraries make that possible.
16
  6. **Frontend engineering hardening**: migrate the frontend to TypeScript after the UX stabilizes and add browser-level tests.
17
  7. **Benchmark panel**: add an in-app benchmark view that can run synthetic fixtures and compare parameter profiles.
18
 
 
26
 
27
  ## Suggested implementation order
28
 
29
+ 1. Add edit state to manifests: deleted hits, shifted onsets, labels, cluster overrides.
30
+ 2. Add rerender/repack endpoint that starts from edited hit/cluster state.
31
+ 3. Add cluster merge/split/relabel actions in the UI.
32
+ 4. Add run comparison view.
33
+ 5. Add lower-level progress hooks inside expensive stages where practical.
34
+ 6. Convert frontend to TypeScript and add UI tests.
35
+ 7. Add an in-app benchmark/parameter profile panel.
docs/TASKS.md CHANGED
@@ -12,7 +12,7 @@ Last updated: 2026-05-12
12
  | Add documentation to project | Done | `docs/*.md`, updated `README.md`. |
13
  | Replace Gradio UI | Done | Active app is FastAPI + custom web UI; Gradio moved to `legacy/`. |
14
  | Document features, tasks, and progress | Done | `docs/FEATURES.md`, this file, `docs/PROGRESS.md`. |
15
- | Continue development while keeping docs up-to-date | In progress | This pass adds run history, disk cache, online clustering mode, and docs updates. |
16
 
17
  ## Completed implementation tasks
18
 
@@ -33,6 +33,13 @@ Last updated: 2026-05-12
33
  - [x] Add UI controls for clustering mode and disk cache.
34
  - [x] Fix duplicate sample writes in `build_archive`.
35
  - [x] Add feature, task, and progress docs.
 
 
 
 
 
 
 
36
 
37
  ## Validation tasks
38
 
@@ -40,15 +47,14 @@ Last updated: 2026-05-12
40
  - [x] FastAPI smoke test for health/config/job flow.
41
  - [x] Pipeline smoke test on synthetic audio.
42
  - [x] API history/cache smoke test.
 
43
  - [x] Git status reviewed before packaging.
44
  - [x] Project archive excludes `.runs/`, `.cache/`, and dependency folders.
45
 
46
  ## Remaining high-value tasks
47
 
48
- - [ ] Add click-to-audition onset markers and table rows.
49
  - [ ] Add onset adjustment and rerun-from-onsets flow.
50
  - [ ] Add cluster merge/split/relabel workflow.
51
  - [ ] Add side-by-side run comparison.
52
- - [ ] Add SSE progress stream for lower-latency updates.
53
  - [ ] Convert frontend to TypeScript with a small Vite build once UX stabilizes.
54
  - [ ] Add automated browser-level UI tests.
 
12
  | Add documentation to project | Done | `docs/*.md`, updated `README.md`. |
13
  | Replace Gradio UI | Done | Active app is FastAPI + custom web UI; Gradio moved to `legacy/`. |
14
  | Document features, tasks, and progress | Done | `docs/FEATURES.md`, this file, `docs/PROGRESS.md`. |
15
+ | Continue development while keeping docs up-to-date | In progress | Latest pass adds SSE progress, per-hit review artifacts, hit/sample audition, hardened artifact serving, and docs updates. |
16
 
17
  ## Completed implementation tasks
18
 
 
33
  - [x] Add UI controls for clustering mode and disk cache.
34
  - [x] Fix duplicate sample writes in `build_archive`.
35
  - [x] Add feature, task, and progress docs.
36
+ - [x] Add `GET /api/jobs/{id}/events` SSE progress stream.
37
+ - [x] Add per-hit review WAV export under `review/hits/`.
38
+ - [x] Add manifest `hits` rows with onset, duration, cluster, representative flag, and artifact path.
39
+ - [x] Add click-to-audition for waveform onset markers and detected hit rows.
40
+ - [x] Add sample-row audition controls.
41
+ - [x] Harden artifact path containment with `Path.relative_to()`.
42
+ - [x] Add hit review/streaming documentation.
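
The `Path.relative_to()` containment check listed above can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the helper name `resolve_artifact` and the example file name `hit_000.wav` are assumptions made for the example.

```python
from pathlib import Path


def resolve_artifact(root: Path, requested: str) -> Path:
    """Resolve `requested` under `root`; raise ValueError if it escapes `root`.

    Hypothetical helper: the real artifact route may be structured differently.
    """
    root = root.resolve()
    # Resolving collapses ".." segments and symlinks before the check.
    candidate = (root / requested).resolve()
    # relative_to() raises ValueError when candidate is not located under root.
    candidate.relative_to(root)
    return candidate
```

A request such as `review/hits/hit_000.wav` resolves inside the run directory, while `../outside.wav` or an absolute path raises `ValueError`, which a route handler could map to a 404 response.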
43
 
44
  ## Validation tasks
45
 
 
47
  - [x] FastAPI smoke test for health/config/job flow.
48
  - [x] Pipeline smoke test on synthetic audio.
49
  - [x] API history/cache smoke test.
50
+ - [x] SSE and review-hit artifact smoke test via `scripts/test_sse_and_review_hits.py`.
51
  - [x] Git status reviewed before packaging.
52
  - [x] Project archive excludes `.runs/`, `.cache/`, and dependency folders.
53
 
54
  ## Remaining high-value tasks
55
 
 
56
  - [ ] Add onset adjustment and rerun-from-onsets flow.
57
  - [ ] Add cluster merge/split/relabel workflow.
58
  - [ ] Add side-by-side run comparison.
 
59
  - [ ] Convert frontend to TypeScript with a small Vite build once UX stabilizes.
60
  - [ ] Add automated browser-level UI tests.
docs/UI_REPLACEMENT.md CHANGED
@@ -65,7 +65,7 @@ Two modes are exposed:
65
  | `batch_quality` | Slower, final-quality clustering using all-pairs similarity plus agglomerative clustering. |
66
  | `online_preview` | Faster near-realtime-style clustering using prototype assignment. Best for quick iteration after bypassing Demucs. |
67
 
68
- ## Why polling instead of websockets/SSE
69
 
70
  Polling is the simplest robust option here because the current pipeline is CPU-heavy and mostly stage-based. The UI polls every 800 ms, which is enough to show stage transitions and logs without introducing websocket lifecycle complexity.
71
 
@@ -79,3 +79,15 @@ Future improvement: use Server-Sent Events for lower-latency log streaming once
79
  - Add downloadable timing report per job.
80
  - Add filters/search to the run history browser.
81
  - Convert the frontend to TypeScript when the UX stops moving quickly.
 
 
 
 
 
 
 
 
 
 
 
 
 
65
  | `batch_quality` | Slower, final-quality clustering using all-pairs similarity plus agglomerative clustering. |
66
  | `online_preview` | Faster near-realtime-style clustering using prototype assignment. Best for quick iteration after bypassing Demucs. |
67
 
68
+ ## Why SSE progress with polling fallback instead of websockets
69
 
70
  Polling is the simplest robust option here because the current pipeline is CPU-heavy and mostly stage-based. The UI polls every 800 ms, which is enough to show stage transitions and logs without introducing websocket lifecycle complexity.
71
 
 
79
  - Add downloadable timing report per job.
80
  - Add filters/search to the run history browser.
81
  - Convert the frontend to TypeScript when the UX stops moving quickly.
82
+
83
+ ## Latest review UI additions
84
+
85
+ The current UI now includes:
86
+
87
+ - Dedicated selected-hit and selected-sample audio players.
88
+ - Clickable waveform onset markers that select the nearest detected hit.
89
+ - A detected-hit review table backed by `review/hits/*.wav` artifacts.
90
+ - Audition buttons for representative sample rows.
91
+ - Server-sent-events job progress via `GET /api/jobs/{job_id}/events`, with polling fallback.
92
+
93
+ This still stops short of destructive editing. The next UI layer should store edits as manifest overlays, then call a re-export endpoint that reuses cached hit audio instead of rerunning Demucs/onset detection.
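
For readers unfamiliar with the wire format behind `GET /api/jobs/{job_id}/events`, the sketch below parses a `text/event-stream` body into events. It is an illustration only: the function name `iter_sse_events`, the event names, and the JSON payload fields are assumptions, since the actual payload shape is not shown in this diff.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (event_name, data) pairs from text/event-stream lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends one event; dispatch only if data was seen.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, commonly used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # one optional space after the colon
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


# Hypothetical stream content; real event names and fields may differ.
sample = [
    "event: progress\n",
    'data: {"stage": "onsets", "status": "running"}\n',
    "\n",
    ": keep-alive\n",
    "event: done\n",
    'data: {"status": "done"}\n',
    "\n",
]
events = [(name, json.loads(payload)) for name, payload in iter_sse_events(sample)]
```

A client built this way can fall back to polling the job status endpoint whenever the stream cannot be opened or closes before a terminal event arrives.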
docs/benchmark-online-preview.json CHANGED
@@ -8,66 +8,66 @@
8
  "run_index": 0,
9
  "clustering_mode": "online_preview",
10
  "audio_duration_sec": 4.75,
11
- "total_duration_sec": 2.394493,
12
- "realtime_factor": 0.504104,
13
- "hit_count": 14,
14
  "cluster_count": 10,
15
  "stages": [
16
  {
17
  "key": "stem",
18
  "label": "Stem extraction / source load",
19
- "duration_sec": 0.01333964500008733,
20
  "status": "done",
21
  "detail": "loaded full mix \u00b7 cached"
22
  },
23
  {
24
  "key": "bpm",
25
  "label": "Tempo detection",
26
- "duration_sec": 0.18073730900005103,
27
  "status": "done",
28
  "detail": "120.2 BPM"
29
  },
30
  {
31
  "key": "onsets",
32
  "label": "Onset detection + slicing",
33
- "duration_sec": 1.8083914959997855,
34
  "status": "done",
35
- "detail": "14 hits"
36
  },
37
  {
38
  "key": "classification",
39
  "label": "Spectral rule classification",
40
- "duration_sec": 0.015553790000012668,
41
  "status": "done",
42
- "detail": "bright:5, hihat_open:8, kick:1"
43
  },
44
  {
45
  "key": "clustering",
46
  "label": "Mel fingerprint + transient NCC clustering",
47
- "duration_sec": 0.01717499700021108,
48
  "status": "done",
49
  "detail": "10 clusters \u00b7 online preview"
50
  },
51
  {
52
  "key": "selection",
53
  "label": "Best representative scoring",
54
- "duration_sec": 0.06853683399981492,
55
  "status": "done",
56
  "detail": "quality-scored representatives"
57
  },
58
  {
59
  "key": "synthesis",
60
  "label": "Optional sample synthesis",
61
- "duration_sec": 0.0004338460000781197,
62
  "status": "done",
63
  "detail": "2 synthesized alternates"
64
  },
65
  {
66
  "key": "export",
67
  "label": "MIDI, reconstruction, WAV, ZIP export",
68
- "duration_sec": 0.2898033520000354,
69
  "status": "done",
70
- "detail": "10 WAVs + MIDI + ZIP"
71
  }
72
  ]
73
  },
@@ -78,66 +78,66 @@
78
  "run_index": 0,
79
  "clustering_mode": "online_preview",
80
  "audio_duration_sec": 4.874989,
81
- "total_duration_sec": 2.422223,
82
- "realtime_factor": 0.496867,
83
- "hit_count": 30,
84
  "cluster_count": 12,
85
  "stages": [
86
  {
87
  "key": "stem",
88
  "label": "Stem extraction / source load",
89
- "duration_sec": 0.012654803000032189,
90
  "status": "done",
91
  "detail": "loaded full mix \u00b7 cached"
92
  },
93
  {
94
  "key": "bpm",
95
  "label": "Tempo detection",
96
- "duration_sec": 0.10868702200014013,
97
  "status": "done",
98
- "detail": "120.2 BPM"
99
  },
100
  {
101
  "key": "onsets",
102
  "label": "Onset detection + slicing",
103
- "duration_sec": 1.7981390029999602,
104
  "status": "done",
105
- "detail": "30 hits"
106
  },
107
  {
108
  "key": "classification",
109
  "label": "Spectral rule classification",
110
- "duration_sec": 0.020911717999979373,
111
  "status": "done",
112
- "detail": "bright:12, cymbal:2, hihat_closed:9, hihat_open:3, kick:1, mid:3"
113
  },
114
  {
115
  "key": "clustering",
116
  "label": "Mel fingerprint + transient NCC clustering",
117
- "duration_sec": 0.08173960800013447,
118
  "status": "done",
119
  "detail": "12 clusters \u00b7 online preview"
120
  },
121
  {
122
  "key": "selection",
123
  "label": "Best representative scoring",
124
- "duration_sec": 0.18588780100003532,
125
  "status": "done",
126
  "detail": "quality-scored representatives"
127
  },
128
  {
129
  "key": "synthesis",
130
  "label": "Optional sample synthesis",
131
- "duration_sec": 0.001146163000157685,
132
  "status": "done",
133
- "detail": "6 synthesized alternates"
134
  },
135
  {
136
  "key": "export",
137
  "label": "MIDI, reconstruction, WAV, ZIP export",
138
- "duration_sec": 0.21253995300003226,
139
  "status": "done",
140
- "detail": "12 WAVs + MIDI + ZIP"
141
  }
142
  ]
143
  },
@@ -148,66 +148,66 @@
148
  "run_index": 0,
149
  "clustering_mode": "online_preview",
150
  "audio_duration_sec": 4.874989,
151
- "total_duration_sec": 2.406563,
152
- "realtime_factor": 0.493655,
153
- "hit_count": 28,
154
  "cluster_count": 12,
155
  "stages": [
156
  {
157
  "key": "stem",
158
  "label": "Stem extraction / source load",
159
- "duration_sec": 0.009107656999958635,
160
  "status": "done",
161
  "detail": "loaded full mix \u00b7 cached"
162
  },
163
  {
164
  "key": "bpm",
165
  "label": "Tempo detection",
166
- "duration_sec": 0.19882379599994238,
167
  "status": "done",
168
- "detail": "118.8 BPM"
169
  },
170
  {
171
  "key": "onsets",
172
  "label": "Onset detection + slicing",
173
- "duration_sec": 1.8942657120001059,
174
  "status": "done",
175
- "detail": "28 hits"
176
  },
177
  {
178
  "key": "classification",
179
  "label": "Spectral rule classification",
180
- "duration_sec": 0.015083428000025378,
181
  "status": "done",
182
- "detail": "bright:5, cymbal:2, hihat_closed:19, hihat_open:2"
183
  },
184
  {
185
  "key": "clustering",
186
  "label": "Mel fingerprint + transient NCC clustering",
187
- "duration_sec": 0.036892447000127504,
188
  "status": "done",
189
  "detail": "12 clusters \u00b7 online preview"
190
  },
191
  {
192
  "key": "selection",
193
  "label": "Best representative scoring",
194
- "duration_sec": 0.0908485570000721,
195
  "status": "done",
196
  "detail": "quality-scored representatives"
197
  },
198
  {
199
  "key": "synthesis",
200
  "label": "Optional sample synthesis",
201
- "duration_sec": 0.0007993310000529164,
202
  "status": "done",
203
- "detail": "4 synthesized alternates"
204
  },
205
  {
206
  "key": "export",
207
  "label": "MIDI, reconstruction, WAV, ZIP export",
208
- "duration_sec": 0.1602465889998257,
209
  "status": "done",
210
- "detail": "12 WAVs + MIDI + ZIP"
211
  }
212
  ]
213
  }
@@ -215,59 +215,59 @@
215
  "summary": [
216
  {
217
  "stage": "stem",
218
- "mean_sec": 0.011701,
219
- "median_sec": 0.012655,
220
- "min_sec": 0.009108,
221
- "max_sec": 0.01334
222
  },
223
  {
224
  "stage": "bpm",
225
- "mean_sec": 0.162749,
226
- "median_sec": 0.180737,
227
- "min_sec": 0.108687,
228
- "max_sec": 0.198824
229
  },
230
  {
231
  "stage": "onsets",
232
- "mean_sec": 1.833599,
233
- "median_sec": 1.808391,
234
- "min_sec": 1.798139,
235
- "max_sec": 1.894266
236
  },
237
  {
238
  "stage": "classification",
239
- "mean_sec": 0.017183,
240
- "median_sec": 0.015554,
241
- "min_sec": 0.015083,
242
- "max_sec": 0.020912
243
  },
244
  {
245
  "stage": "clustering",
246
- "mean_sec": 0.045269,
247
- "median_sec": 0.036892,
248
- "min_sec": 0.017175,
249
- "max_sec": 0.08174
250
  },
251
  {
252
  "stage": "selection",
253
- "mean_sec": 0.115091,
254
- "median_sec": 0.090849,
255
- "min_sec": 0.068537,
256
- "max_sec": 0.185888
257
  },
258
  {
259
  "stage": "synthesis",
260
- "mean_sec": 0.000793,
261
- "median_sec": 0.000799,
262
- "min_sec": 0.000434,
263
- "max_sec": 0.001146
264
  },
265
  {
266
  "stage": "export",
267
- "mean_sec": 0.220863,
268
- "median_sec": 0.21254,
269
- "min_sec": 0.160247,
270
- "max_sec": 0.289803
271
  }
272
  ]
273
  }
 
8
  "run_index": 0,
9
  "clustering_mode": "online_preview",
10
  "audio_duration_sec": 4.75,
11
+ "total_duration_sec": 1.88646,
12
+ "realtime_factor": 0.397149,
13
+ "hit_count": 13,
14
  "cluster_count": 10,
15
  "stages": [
16
  {
17
  "key": "stem",
18
  "label": "Stem extraction / source load",
19
+ "duration_sec": 0.011189419999936945,
20
  "status": "done",
21
  "detail": "loaded full mix \u00b7 cached"
22
  },
23
  {
24
  "key": "bpm",
25
  "label": "Tempo detection",
26
+ "duration_sec": 0.09853705299974536,
27
  "status": "done",
28
  "detail": "120.2 BPM"
29
  },
30
  {
31
  "key": "onsets",
32
  "label": "Onset detection + slicing",
33
+ "duration_sec": 1.3858792310002173,
34
  "status": "done",
35
+ "detail": "13 hits"
36
  },
37
  {
38
  "key": "classification",
39
  "label": "Spectral rule classification",
40
+ "duration_sec": 0.014456886000061786,
41
  "status": "done",
42
+ "detail": "bright:5, hihat_open:7, kick:1"
43
  },
44
  {
45
  "key": "clustering",
46
  "label": "Mel fingerprint + transient NCC clustering",
47
+ "duration_sec": 0.016802669999833597,
48
  "status": "done",
49
  "detail": "10 clusters \u00b7 online preview"
50
  },
51
  {
52
  "key": "selection",
53
  "label": "Best representative scoring",
54
+ "duration_sec": 0.07535981499995614,
55
  "status": "done",
56
  "detail": "quality-scored representatives"
57
  },
58
  {
59
  "key": "synthesis",
60
  "label": "Optional sample synthesis",
61
+ "duration_sec": 0.00036268399981054245,
62
  "status": "done",
63
  "detail": "2 synthesized alternates"
64
  },
65
  {
66
  "key": "export",
67
  "label": "MIDI, reconstruction, WAV, ZIP export",
68
+ "duration_sec": 0.28339249200007544,
69
  "status": "done",
70
+ "detail": "10 samples + 13 review hits + MIDI + ZIP"
71
  }
72
  ]
73
  },
 
78
  "run_index": 0,
79
  "clustering_mode": "online_preview",
80
  "audio_duration_sec": 4.874989,
81
+ "total_duration_sec": 2.914241,
82
+ "realtime_factor": 0.597794,
83
+ "hit_count": 28,
84
  "cluster_count": 12,
85
  "stages": [
86
  {
87
  "key": "stem",
88
  "label": "Stem extraction / source load",
89
+ "duration_sec": 0.00999813099997482,
90
  "status": "done",
91
  "detail": "loaded full mix \u00b7 cached"
92
  },
93
  {
94
  "key": "bpm",
95
  "label": "Tempo detection",
96
+ "duration_sec": 0.10688103099982982,
97
  "status": "done",
98
+ "detail": "161.5 BPM"
99
  },
100
  {
101
  "key": "onsets",
102
  "label": "Onset detection + slicing",
103
+ "duration_sec": 2.1018096600000717,
104
  "status": "done",
105
+ "detail": "28 hits"
106
  },
107
  {
108
  "key": "classification",
109
  "label": "Spectral rule classification",
110
+ "duration_sec": 0.09064649800029656,
111
  "status": "done",
112
+ "detail": "bright:12, cymbal:1, hihat_closed:9, hihat_open:3, mid:3"
113
  },
114
  {
115
  "key": "clustering",
116
  "label": "Mel fingerprint + transient NCC clustering",
117
+ "duration_sec": 0.049414074000196706,
118
  "status": "done",
119
  "detail": "12 clusters \u00b7 online preview"
120
  },
121
  {
122
  "key": "selection",
123
  "label": "Best representative scoring",
124
+ "duration_sec": 0.23301379500026087,
125
  "status": "done",
126
  "detail": "quality-scored representatives"
127
  },
128
  {
129
  "key": "synthesis",
130
  "label": "Optional sample synthesis",
131
+ "duration_sec": 0.0012726520003525366,
132
  "status": "done",
133
+ "detail": "5 synthesized alternates"
134
  },
135
  {
136
  "key": "export",
137
  "label": "MIDI, reconstruction, WAV, ZIP export",
138
+ "duration_sec": 0.32063418000007005,
139
  "status": "done",
140
+ "detail": "12 samples + 28 review hits + MIDI + ZIP"
141
  }
142
  ]
143
  },
 
148
  "run_index": 0,
149
  "clustering_mode": "online_preview",
150
  "audio_duration_sec": 4.874989,
151
+ "total_duration_sec": 2.480844,
152
+ "realtime_factor": 0.508892,
153
+ "hit_count": 29,
154
  "cluster_count": 12,
155
  "stages": [
156
  {
157
  "key": "stem",
158
  "label": "Stem extraction / source load",
159
+ "duration_sec": 0.010305768999842257,
160
  "status": "done",
161
  "detail": "loaded full mix \u00b7 cached"
162
  },
163
  {
164
  "key": "bpm",
165
  "label": "Tempo detection",
166
+ "duration_sec": 0.1724793140001566,
167
  "status": "done",
168
+ "detail": "120.2 BPM"
169
  },
170
  {
171
  "key": "onsets",
172
  "label": "Onset detection + slicing",
173
+ "duration_sec": 1.8014776340000935,
174
  "status": "done",
175
+ "detail": "29 hits"
176
  },
177
  {
178
  "key": "classification",
179
  "label": "Spectral rule classification",
180
+ "duration_sec": 0.017559420999987196,
181
  "status": "done",
182
+ "detail": "bright:5, cymbal:1, hihat_closed:20, hihat_open:3"
183
  },
184
  {
185
  "key": "clustering",
186
  "label": "Mel fingerprint + transient NCC clustering",
187
+ "duration_sec": 0.043723993000185146,
188
  "status": "done",
189
  "detail": "12 clusters \u00b7 online preview"
190
  },
191
  {
192
  "key": "selection",
193
  "label": "Best representative scoring",
194
+ "duration_sec": 0.16425892699999167,
195
  "status": "done",
196
  "detail": "quality-scored representatives"
197
  },
198
  {
199
  "key": "synthesis",
200
  "label": "Optional sample synthesis",
201
+ "duration_sec": 0.0012976000002709043,
202
  "status": "done",
203
+ "detail": "8 synthesized alternates"
204
  },
205
  {
206
  "key": "export",
207
  "label": "MIDI, reconstruction, WAV, ZIP export",
208
+ "duration_sec": 0.2692134119997718,
209
  "status": "done",
210
+ "detail": "12 samples + 29 review hits + MIDI + ZIP"
211
  }
212
  ]
213
  }
 
215
  "summary": [
216
  {
217
  "stage": "stem",
218
+ "mean_sec": 0.010498,
219
+ "median_sec": 0.010306,
220
+ "min_sec": 0.009998,
221
+ "max_sec": 0.011189
222
  },
223
  {
224
  "stage": "bpm",
225
+ "mean_sec": 0.125966,
226
+ "median_sec": 0.106881,
227
+ "min_sec": 0.098537,
228
+ "max_sec": 0.172479
229
  },
230
  {
231
  "stage": "onsets",
232
+ "mean_sec": 1.763056,
233
+ "median_sec": 1.801478,
234
+ "min_sec": 1.385879,
235
+ "max_sec": 2.10181
236
  },
237
  {
238
  "stage": "classification",
239
+ "mean_sec": 0.040888,
240
+ "median_sec": 0.017559,
241
+ "min_sec": 0.014457,
242
+ "max_sec": 0.090646
243
  },
244
  {
245
  "stage": "clustering",
246
+ "mean_sec": 0.036647,
247
+ "median_sec": 0.043724,
248
+ "min_sec": 0.016803,
249
+ "max_sec": 0.049414
250
  },
251
  {
252
  "stage": "selection",
253
+ "mean_sec": 0.157544,
254
+ "median_sec": 0.164259,
255
+ "min_sec": 0.07536,
256
+ "max_sec": 0.233014
257
  },
258
  {
259
  "stage": "synthesis",
260
+ "mean_sec": 0.000978,
261
+ "median_sec": 0.001273,
262
+ "min_sec": 0.000363,
263
+ "max_sec": 0.001298
264
  },
265
  {
266
  "stage": "export",
267
+ "mean_sec": 0.29108,
268
+ "median_sec": 0.283392,
269
+ "min_sec": 0.269213,
270
+ "max_sec": 0.320634
271
  }
272
  ]
273
  }
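
The derived fields in this benchmark file can be cross-checked by hand. The sketch below assumes `realtime_factor` is `total_duration_sec / audio_duration_sec` and that each `summary` row aggregates one stage across the three runs, rounded to six decimals; the benchmark script itself is not shown here, but the updated numbers above are consistent with that reading.

```python
from statistics import mean, median

# Values copied from the first updated online_preview run above.
audio_duration_sec = 4.75
total_duration_sec = 1.88646
realtime_factor = round(total_duration_sec / audio_duration_sec, 6)

# "stem" stage durations from the three updated online_preview runs.
stem_durations = [
    0.011189419999936945,
    0.00999813099997482,
    0.010305768999842257,
]
summary = {
    "stage": "stem",
    "mean_sec": round(mean(stem_durations), 6),
    "median_sec": round(median(stem_durations), 6),
    "min_sec": round(min(stem_durations), 6),
    "max_sec": round(max(stem_durations), 6),
}
```

A factor below 1.0 means the run finished faster than the audio's own duration, which is how the values in these files read.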
docs/benchmark-subprocesses.json CHANGED
@@ -8,66 +8,66 @@
8
  "run_index": 0,
9
  "clustering_mode": "batch_quality",
10
  "audio_duration_sec": 4.75,
11
- "total_duration_sec": 2.416794,
12
- "realtime_factor": 0.508799,
13
- "hit_count": 14,
14
  "cluster_count": 7,
15
  "stages": [
16
  {
17
  "key": "stem",
18
  "label": "Stem extraction / source load",
19
- "duration_sec": 0.011517213000161064,
20
  "status": "done",
21
  "detail": "loaded full mix \u00b7 cached"
22
  },
23
  {
24
  "key": "bpm",
25
  "label": "Tempo detection",
26
- "duration_sec": 0.19438482000009571,
27
  "status": "done",
28
  "detail": "120.2 BPM"
29
  },
30
  {
31
  "key": "onsets",
32
  "label": "Onset detection + slicing",
33
- "duration_sec": 1.8062190609998652,
34
  "status": "done",
35
- "detail": "14 hits"
36
  },
37
  {
38
  "key": "classification",
39
  "label": "Spectral rule classification",
40
- "duration_sec": 0.016392102000054365,
41
  "status": "done",
42
- "detail": "bright:5, hihat_closed:1, hihat_open:7, kick:1"
43
  },
44
  {
45
  "key": "clustering",
46
  "label": "Mel fingerprint + transient NCC clustering",
47
- "duration_sec": 0.07352871200009758,
48
  "status": "done",
49
  "detail": "7 clusters \u00b7 batch quality"
50
  },
51
  {
52
  "key": "selection",
53
  "label": "Best representative scoring",
54
- "duration_sec": 0.096273950000068,
55
  "status": "done",
56
  "detail": "quality-scored representatives"
57
  },
58
  {
59
  "key": "synthesis",
60
  "label": "Optional sample synthesis",
61
- "duration_sec": 0.0006992359999458131,
62
  "status": "done",
63
  "detail": "2 synthesized alternates"
64
  },
65
  {
66
  "key": "export",
67
  "label": "MIDI, reconstruction, WAV, ZIP export",
68
- "duration_sec": 0.2172303219999776,
69
  "status": "done",
70
- "detail": "7 WAVs + MIDI + ZIP"
71
  }
72
  ]
73
  },
@@ -78,66 +78,66 @@
78
  "run_index": 0,
79
  "clustering_mode": "batch_quality",
80
  "audio_duration_sec": 4.874989,
81
- "total_duration_sec": 2.99188,
82
- "realtime_factor": 0.61372,
83
- "hit_count": 35,
84
- "cluster_count": 2,
85
  "stages": [
86
  {
87
  "key": "stem",
88
  "label": "Stem extraction / source load",
89
- "duration_sec": 0.010077079999973648,
90
  "status": "done",
91
  "detail": "loaded full mix \u00b7 cached"
92
  },
93
  {
94
  "key": "bpm",
95
  "label": "Tempo detection",
96
- "duration_sec": 0.17334403699987888,
97
  "status": "done",
98
  "detail": "161.5 BPM"
99
  },
100
  {
101
  "key": "onsets",
102
  "label": "Onset detection + slicing",
103
- "duration_sec": 2.1082552409998243,
104
  "status": "done",
105
- "detail": "35 hits"
106
  },
107
  {
108
  "key": "classification",
109
  "label": "Spectral rule classification",
110
- "duration_sec": 0.021269321000090713,
111
  "status": "done",
112
- "detail": "bright:14, cymbal:1, hihat_closed:14, hihat_open:3, kick:1, mid:2"
113
  },
114
  {
115
  "key": "clustering",
116
  "label": "Mel fingerprint + transient NCC clustering",
117
- "duration_sec": 0.26927052900009585,
118
  "status": "done",
119
- "detail": "2 clusters \u00b7 batch quality"
120
  },
121
  {
122
  "key": "selection",
123
  "label": "Best representative scoring",
124
- "duration_sec": 0.31629775500005053,
125
  "status": "done",
126
  "detail": "quality-scored representatives"
127
  },
128
  {
129
  "key": "synthesis",
130
  "label": "Optional sample synthesis",
131
- "duration_sec": 0.0011716779999915161,
132
  "status": "done",
133
- "detail": "2 synthesized alternates"
134
  },
135
  {
136
  "key": "export",
137
  "label": "MIDI, reconstruction, WAV, ZIP export",
138
- "duration_sec": 0.09167172899992693,
139
  "status": "done",
140
- "detail": "2 WAVs + MIDI + ZIP"
141
  }
142
  ]
143
  },
@@ -148,66 +148,66 @@
148
  "run_index": 0,
149
  "clustering_mode": "batch_quality",
150
  "audio_duration_sec": 4.874989,
151
- "total_duration_sec": 2.597859,
152
- "realtime_factor": 0.532895,
153
- "hit_count": 23,
154
- "cluster_count": 3,
155
  "stages": [
156
  {
157
  "key": "stem",
158
  "label": "Stem extraction / source load",
159
- "duration_sec": 0.012474630000042453,
160
  "status": "done",
161
  "detail": "loaded full mix \u00b7 cached"
162
  },
163
  {
164
  "key": "bpm",
165
  "label": "Tempo detection",
166
- "duration_sec": 0.18858063699985905,
167
  "status": "done",
168
  "detail": "120.2 BPM"
169
  },
170
  {
171
  "key": "onsets",
172
  "label": "Onset detection + slicing",
173
- "duration_sec": 1.9154837959999895,
174
  "status": "done",
175
- "detail": "23 hits"
176
  },
177
  {
178
  "key": "classification",
179
  "label": "Spectral rule classification",
180
- "duration_sec": 0.0188920179998604,
181
  "status": "done",
182
- "detail": "bright:3, hihat_closed:17, hihat_open:3"
183
  },
184
  {
185
  "key": "clustering",
186
  "label": "Mel fingerprint + transient NCC clustering",
187
- "duration_sec": 0.10195718500017392,
188
  "status": "done",
189
- "detail": "3 clusters \u00b7 batch quality"
190
  },
191
  {
192
  "key": "selection",
193
  "label": "Best representative scoring",
194
- "duration_sec": 0.19837312200002089,
195
  "status": "done",
196
  "detail": "quality-scored representatives"
197
  },
198
  {
199
  "key": "synthesis",
200
  "label": "Optional sample synthesis",
201
- "duration_sec": 0.0011928339999940363,
202
  "status": "done",
203
- "detail": "3 synthesized alternates"
204
  },
205
  {
206
  "key": "export",
207
  "label": "MIDI, reconstruction, WAV, ZIP export",
208
- "duration_sec": 0.1603816869999264,
209
  "status": "done",
210
- "detail": "3 WAVs + MIDI + ZIP"
211
  }
212
  ]
213
  }
@@ -215,59 +215,59 @@
215
  "summary": [
216
  {
217
  "stage": "stem",
218
- "mean_sec": 0.011356,
219
- "median_sec": 0.011517,
220
- "min_sec": 0.010077,
221
- "max_sec": 0.012475
222
  },
223
  {
224
  "stage": "bpm",
225
- "mean_sec": 0.185436,
226
- "median_sec": 0.188581,
227
- "min_sec": 0.173344,
228
- "max_sec": 0.194385
229
  },
230
  {
231
  "stage": "onsets",
232
- "mean_sec": 1.943319,
233
- "median_sec": 1.915484,
234
- "min_sec": 1.806219,
235
- "max_sec": 2.108255
236
  },
237
  {
238
  "stage": "classification",
239
- "mean_sec": 0.018851,
240
- "median_sec": 0.018892,
241
- "min_sec": 0.016392,
242
- "max_sec": 0.021269
243
  },
244
  {
245
  "stage": "clustering",
246
- "mean_sec": 0.148252,
247
- "median_sec": 0.101957,
248
- "min_sec": 0.073529,
249
- "max_sec": 0.269271
250
  },
251
  {
252
  "stage": "selection",
253
- "mean_sec": 0.203648,
254
- "median_sec": 0.198373,
255
- "min_sec": 0.096274,
256
- "max_sec": 0.316298
257
  },
258
  {
259
  "stage": "synthesis",
260
- "mean_sec": 0.001021,
261
- "median_sec": 0.001172,
262
- "min_sec": 0.000699,
263
- "max_sec": 0.001193
264
  },
265
  {
266
  "stage": "export",
267
- "mean_sec": 0.156428,
268
- "median_sec": 0.160382,
269
- "min_sec": 0.091672,
270
- "max_sec": 0.21723
271
  }
272
  ]
273
  }
 
8
  "run_index": 0,
9
  "clustering_mode": "batch_quality",
10
  "audio_duration_sec": 4.75,
11
+ "total_duration_sec": 2.508936,
12
+ "realtime_factor": 0.528197,
13
+ "hit_count": 13,
14
  "cluster_count": 7,
15
  "stages": [
16
  {
17
  "key": "stem",
18
  "label": "Stem extraction / source load",
19
+ "duration_sec": 0.010515291000047,
20
  "status": "done",
21
  "detail": "loaded full mix \u00b7 cached"
22
  },
23
  {
24
  "key": "bpm",
25
  "label": "Tempo detection",
26
+ "duration_sec": 0.11277726900016205,
27
  "status": "done",
28
  "detail": "120.2 BPM"
29
  },
30
  {
31
  "key": "onsets",
32
  "label": "Onset detection + slicing",
33
+ "duration_sec": 1.9893157869996685,
34
  "status": "done",
35
+ "detail": "13 hits"
36
  },
37
  {
38
  "key": "classification",
39
  "label": "Spectral rule classification",
40
+ "duration_sec": 0.013427571999727661,
41
  "status": "done",
42
+ "detail": "bright:5, hihat_closed:1, hihat_open:6, kick:1"
43
  },
44
  {
45
  "key": "clustering",
46
  "label": "Mel fingerprint + transient NCC clustering",
47
+ "duration_sec": 0.013959215999875596,
48
  "status": "done",
49
  "detail": "7 clusters \u00b7 batch quality"
50
  },
51
  {
52
  "key": "selection",
53
  "label": "Best representative scoring",
54
+ "duration_sec": 0.09699052199994185,
55
  "status": "done",
56
  "detail": "quality-scored representatives"
57
  },
58
  {
59
  "key": "synthesis",
60
  "label": "Optional sample synthesis",
61
+ "duration_sec": 0.000661541999761539,
62
  "status": "done",
63
  "detail": "2 synthesized alternates"
64
  },
65
  {
66
  "key": "export",
67
  "label": "MIDI, reconstruction, WAV, ZIP export",
68
+ "duration_sec": 0.2707521170000291,
69
  "status": "done",
70
+ "detail": "7 samples + 13 review hits + MIDI + ZIP"
71
  }
72
  ]
73
  },
 
78
  "run_index": 0,
79
  "clustering_mode": "batch_quality",
80
  "audio_duration_sec": 4.874989,
81
+ "total_duration_sec": 2.562433,
82
+ "realtime_factor": 0.525628,
83
+ "hit_count": 30,
84
+ "cluster_count": 1,
85
  "stages": [
86
  {
87
  "key": "stem",
88
  "label": "Stem extraction / source load",
89
+ "duration_sec": 0.009733310000228812,
90
  "status": "done",
91
  "detail": "loaded full mix \u00b7 cached"
92
  },
93
  {
94
  "key": "bpm",
95
  "label": "Tempo detection",
96
+ "duration_sec": 0.18278188500016768,
97
  "status": "done",
98
  "detail": "161.5 BPM"
99
  },
100
  {
101
  "key": "onsets",
102
  "label": "Onset detection + slicing",
103
+ "duration_sec": 1.8905766069997298,
104
  "status": "done",
105
+ "detail": "30 hits"
106
  },
107
  {
108
  "key": "classification",
109
  "label": "Spectral rule classification",
110
+ "duration_sec": 0.016936135000378272,
111
  "status": "done",
112
+ "detail": "bright:15, cymbal:1, hihat_closed:10, hihat_open:3, mid:1"
113
  },
114
  {
115
  "key": "clustering",
116
  "label": "Mel fingerprint + transient NCC clustering",
117
+ "duration_sec": 0.09508980800001154,
118
  "status": "done",
119
+ "detail": "1 clusters \u00b7 batch quality"
120
  },
121
  {
122
  "key": "selection",
123
  "label": "Best representative scoring",
124
+ "duration_sec": 0.271814092999648,
125
  "status": "done",
126
  "detail": "quality-scored representatives"
127
  },
128
  {
129
  "key": "synthesis",
130
  "label": "Optional sample synthesis",
131
+ "duration_sec": 0.0009019099998113234,
132
  "status": "done",
133
+          "detail": "1 synthesized alternates"
         },
         {
           "key": "export",
           "label": "MIDI, reconstruction, WAV, ZIP export",
+          "duration_sec": 0.09411494899995887,
           "status": "done",
+          "detail": "1 samples + 30 review hits + MIDI + ZIP"
         }
       ]
     },

       "run_index": 0,
       "clustering_mode": "batch_quality",
       "audio_duration_sec": 4.874989,
+      "total_duration_sec": 2.587342,
+      "realtime_factor": 0.530738,
+      "hit_count": 20,
+      "cluster_count": 4,
       "stages": [
         {
           "key": "stem",
           "label": "Stem extraction / source load",
+          "duration_sec": 0.008843839000292064,
           "status": "done",
           "detail": "loaded full mix \u00b7 cached"
         },
         {
           "key": "bpm",
           "label": "Tempo detection",
+          "duration_sec": 0.16997624899977382,
           "status": "done",
           "detail": "120.2 BPM"
         },
         {
           "key": "onsets",
           "label": "Onset detection + slicing",
+          "duration_sec": 2.0115367889998197,
           "status": "done",
+          "detail": "20 hits"
         },
         {
           "key": "classification",
           "label": "Spectral rule classification",
+          "duration_sec": 0.0954397410000638,
           "status": "done",
+          "detail": "bright:3, hihat_closed:14, hihat_open:3"
         },
         {
           "key": "clustering",
           "label": "Mel fingerprint + transient NCC clustering",
+          "duration_sec": 0.02929340799983038,
           "status": "done",
+          "detail": "4 clusters \u00b7 batch quality"
         },
         {
           "key": "selection",
           "label": "Best representative scoring",
+          "duration_sec": 0.1620299520000117,
           "status": "done",
           "detail": "quality-scored representatives"
         },
         {
           "key": "synthesis",
           "label": "Optional sample synthesis",
+          "duration_sec": 0.0010316440002497984,
           "status": "done",
+          "detail": "2 synthesized alternates"
         },
         {
           "key": "export",
           "label": "MIDI, reconstruction, WAV, ZIP export",
+          "duration_sec": 0.108677784000065,
           "status": "done",
+          "detail": "4 samples + 20 review hits + MIDI + ZIP"
         }
       ]
     }

   "summary": [
     {
       "stage": "stem",
+      "mean_sec": 0.009697,
+      "median_sec": 0.009733,
+      "min_sec": 0.008844,
+      "max_sec": 0.010515
     },
     {
       "stage": "bpm",
+      "mean_sec": 0.155178,
+      "median_sec": 0.169976,
+      "min_sec": 0.112777,
+      "max_sec": 0.182782
     },
     {
       "stage": "onsets",
+      "mean_sec": 1.96381,
+      "median_sec": 1.989316,
+      "min_sec": 1.890577,
+      "max_sec": 2.011537
     },
     {
       "stage": "classification",
+      "mean_sec": 0.041934,
+      "median_sec": 0.016936,
+      "min_sec": 0.013428,
+      "max_sec": 0.09544
     },
     {
       "stage": "clustering",
+      "mean_sec": 0.046114,
+      "median_sec": 0.029293,
+      "min_sec": 0.013959,
+      "max_sec": 0.09509
     },
     {
       "stage": "selection",
+      "mean_sec": 0.176945,
+      "median_sec": 0.16203,
+      "min_sec": 0.096991,
+      "max_sec": 0.271814
     },
     {
       "stage": "synthesis",
+      "mean_sec": 0.000865,
+      "median_sec": 0.000902,
+      "min_sec": 0.000662,
+      "max_sec": 0.001032
     },
     {
       "stage": "export",
+      "mean_sec": 0.157848,
+      "median_sec": 0.108678,
+      "min_sec": 0.094115,
+      "max_sec": 0.270752
     }
   ]
 }
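The derived fields in the benchmark JSON can be checked by hand: for run 0, 2.587342 / 4.874989 ≈ 0.530738, and the `stem` summary is consistent with three per-run durations. The sketch below reproduces that arithmetic. It assumes `realtime_factor` is processing time divided by audio duration and that the summary uses a plain mean/median/min/max; the helper names `realtime_factor` and `summarize` are hypothetical and are not taken from `scripts/benchmark_subprocesses.py`.

```python
import statistics


def realtime_factor(total_duration_sec: float, audio_duration_sec: float) -> float:
    """Processing time divided by audio length (assumed definition).

    A value below 1.0 means the run finished faster than the audio plays.
    """
    return round(total_duration_sec / audio_duration_sec, 6)


def summarize(durations: list[float]) -> dict[str, float]:
    """Aggregate per-run stage durations the way the summary block appears to."""
    return {
        "mean_sec": round(statistics.mean(durations), 6),
        "median_sec": round(statistics.median(durations), 6),
        "min_sec": round(min(durations), 6),
        "max_sec": round(max(durations), 6),
    }


# Values copied from the JSON; the three stem durations are the reported
# min, median, and max, which is only valid if there were exactly three runs.
rtf = realtime_factor(2.587342, 4.874989)
stem_summary = summarize([0.008844, 0.009733, 0.010515])
print(rtf, stem_summary)
```

If the benchmark script aggregates differently (for example per clustering mode), the same helpers would need to be applied per group.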
pipeline_runner.py CHANGED
@@ -136,6 +136,7 @@ class PipelineResult:
     cluster_count: int
     stages: list[dict[str, Any]]
     samples: list[dict[str, Any]]
+    hits: list[dict[str, Any]]
     overview: dict[str, Any]
     files: dict[str, str]
 
@@ -267,7 +268,9 @@ def _make_overview(audio: np.ndarray, sr: int, hits: list[Any], max_points: int
         "envelope": [round(float(x), 6) for x in envelope],
         "onsets": [
             {
+                "index": int(getattr(h, "index", -1)),
                 "time_sec": round(float(h.onset_time), 6),
+                "duration_sec": round(float(h.duration), 6),
                 "label": h.label,
                 "energy": round(float(h.rms_energy), 6),
                 "cluster_id": int(getattr(h, "cluster_id", -1)),
@@ -283,6 +286,13 @@ def _copy_temp_file(src: str | os.PathLike[str], dst: Path) -> str:
     return str(dst)
 
 
+def _safe_file_component(value: str) -> str:
+    cleaned = "".join(ch if ch.isalnum() or ch in {"-", "_"} else "_" for ch in value.lower())
+    while "__" in cleaned:
+        cleaned = cleaned.replace("__", "_")
+    return cleaned.strip("_") or "item"
+
+
 def run_extraction_pipeline(
     audio_path: str | os.PathLike[str],
     output_dir: str | os.PathLike[str],
@@ -400,6 +410,7 @@ def run_extraction_pipeline(
         stage.detail = detail
 
     sample_rows: list[dict[str, Any]] = []
+    hit_rows: list[dict[str, Any]] = []
     files: dict[str, str] = {"stem": "stem.wav"}
 
     with _timed_stage(stages, "export", progress_cb) as stage:
@@ -421,6 +432,32 @@ def run_extraction_pipeline(
         files["reconstruction"] = "reconstruction.wav"
         files["midi"] = "reconstruction.mid"
 
+        cluster_labels = {int(cluster.cluster_id): cluster.label for cluster in clusters}
+        representative_ids = {id(cluster.best_hit) for cluster in clusters}
+        review_hits_dir = out / "review" / "hits"
+        if hits:
+            review_hits_dir.mkdir(parents=True, exist_ok=True)
+        for hit in sorted(hits, key=lambda item: item.index):
+            safe_label = _safe_file_component(hit.label or "hit")
+            file_name = f"hit_{int(hit.index):05d}_{safe_label}.wav"
+            rel_file = f"review/hits/{file_name}"
+            hit.save(str(out / rel_file))
+            cluster_id = int(getattr(hit, "cluster_id", -1))
+            hit_rows.append(
+                {
+                    "index": int(hit.index),
+                    "label": hit.label,
+                    "cluster_id": cluster_id,
+                    "cluster_label": cluster_labels.get(cluster_id, "unclustered"),
+                    "is_representative": id(hit) in representative_ids,
+                    "onset_sec": round(float(hit.onset_time), 6),
+                    "duration_ms": round(float(hit.duration * 1000), 1),
+                    "rms_energy": round(float(hit.rms_energy), 6),
+                    "spectral_centroid_hz": round(float(hit.spectral_centroid), 1),
+                    "file": rel_file,
+                }
+            )
+
         for cluster in sorted(clusters, key=lambda item: item.count, reverse=True):
             best = cluster.best_hit
             sample_path = samples_dir / f"{cluster.label}.wav"
@@ -451,7 +488,7 @@ def run_extraction_pipeline(
             os.unlink(archive_tmp)
         except OSError:
             pass
-        stage.detail = f"{len(sample_rows)} WAVs + MIDI + ZIP"
+        stage.detail = f"{len(sample_rows)} samples + {len(hit_rows)} review hits + MIDI + ZIP"
 
     duration_sec = time.perf_counter() - started_total
     result = PipelineResult(
@@ -465,6 +502,7 @@ def run_extraction_pipeline(
         cluster_count=len(clusters),
         stages=[asdict(stage) for stage in stages],
         samples=sample_rows,
+        hits=hit_rows,
         overview=_make_overview(stem_audio, stem_sr, hits),
         files=files,
     )
scripts/test_sse_and_review_hits.py ADDED
@@ -0,0 +1,70 @@
+#!/usr/bin/env python3
+"""Smoke-test SSE progress plus per-hit review artifacts."""
+
+from __future__ import annotations
+
+import io
+import json
+import sys
+from pathlib import Path
+
+import soundfile as sf
+from fastapi.testclient import TestClient
+
+sys.path.insert(0, str(Path(__file__).resolve().parents[1]))
+
+from app import app  # noqa: E402
+from synth_generator import generate_test_song  # noqa: E402
+
+
+def main() -> int:
+    song = generate_test_song(pattern_name="funk", bars=1, bpm=120, add_bass=False)
+    buf = io.BytesIO()
+    sf.write(buf, song.drums_only, song.sr, format="WAV")
+    buf.seek(0)
+
+    client = TestClient(app)
+    response = client.post(
+        "/api/jobs",
+        files={"file": ("funk.wav", buf, "audio/wav")},
+        data={"params": json.dumps({"stem": "all", "clustering_mode": "online_preview", "target_min": 2, "target_max": 8})},
+    )
+    response.raise_for_status()
+    job_id = response.json()["id"]
+
+    final = None
+    with client.stream("GET", f"/api/jobs/{job_id}/events") as stream:
+        stream.raise_for_status()
+        for line in stream.iter_lines():
+            if not line or not line.startswith("data: "):
+                continue
+            payload = json.loads(line[6:])
+            if payload["status"] == "error":
+                raise RuntimeError(payload.get("error"))
+            if payload["status"] == "complete":
+                final = payload
+                break
+
+    assert final is not None, "SSE stream ended without complete event"
+    hits = final["result"]["hits"]
+    samples = final["result"]["samples"]
+    assert hits, "expected review hit rows"
+    assert samples, "expected representative sample rows"
+
+    first_hit_url = hits[0]["url"]
+    file_response = client.get(first_hit_url)
+    assert file_response.status_code == 200, first_hit_url
+    assert file_response.content[:4] == b"RIFF", "review hit should be a WAV file"
+
+    print(json.dumps({
+        "status": final["status"],
+        "job_id": job_id,
+        "hit_count": len(hits),
+        "sample_count": len(samples),
+        "first_hit_url": first_hit_url,
+    }, indent=2))
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
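The smoke test only looks at `data: ` lines, which is enough when every frame carries one JSON payload. For reference, here is a slightly fuller sketch of server-sent-event framing that also tracks the `event:` field (the frontend listens for an event named `job`). The function name `iter_sse_events` and the sample frames are assumptions for illustration; the exact frames emitted by `/api/jobs/{id}/events` are not shown in this diff.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[tuple[str, dict]]:
    """Group SSE lines into (event_name, json_payload) pairs.

    A blank line terminates an event; lines starting with ":" are comments.
    """
    event_name = "message"  # SSE default when no "event:" field is present
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_parts:
                yield event_name, json.loads("\n".join(data_parts))
            event_name, data_parts = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # Only a single leading space after the colon is part of the framing.
            data_parts.append(line[len("data:"):].removeprefix(" "))


# Hypothetical frames shaped like the payloads the test inspects.
frames = [
    "event: job",
    'data: {"status": "running"}',
    "",
    ": keep-alive",
    "event: job",
    'data: {"status": "complete", "result": {"hits": []}}',
    "",
]
events = list(iter_sse_events(frames))
print(events[-1][1]["status"])  # complete
```

An event that is still being accumulated when the stream ends is dropped, which matches how browsers treat an unterminated final event.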
web/app.js CHANGED
@@ -10,6 +10,9 @@ const fields = [
 let config = null;
 let selectedFile = null;
 let activePoll = null;
+let activeEvents = null;
+let lastResult = null;
+let selectedHitIndex = null;
 
 function esc(value) {
   return String(value ?? "").replace(/[&<>'"]/g, (c) => ({ "&": "&amp;", "<": "&lt;", ">": "&gt;", "'": "&#39;", '"': "&quot;" }[c]));
@@ -134,24 +137,104 @@ function drawWaveform(overview) {
   ctx.fill();
   ctx.stroke();
 
-  ctx.strokeStyle = "rgba(200,165,255,.55)";
-  ctx.lineWidth = 1;
   for (const onset of overview.onsets ?? []) {
     const x = (onset.time_sec / Math.max(overview.duration_sec, 0.001)) * w;
+    const selected = Number(onset.index) === Number(selectedHitIndex);
+    ctx.strokeStyle = selected ? "rgba(255,255,255,.95)" : "rgba(200,165,255,.55)";
+    ctx.lineWidth = selected ? 2.4 : 1;
     ctx.beginPath();
-    ctx.moveTo(x, 10);
-    ctx.lineTo(x, h - 10);
+    ctx.moveTo(x, selected ? 3 : 10);
+    ctx.lineTo(x, selected ? h - 3 : h - 10);
     ctx.stroke();
   }
 }
 
+function playAudio(el, url) {
+  if (!url) return;
+  el.src = url;
+  el.currentTime = 0;
+  const promise = el.play();
+  if (promise && typeof promise.catch === "function") promise.catch(() => {});
+}
+
+function selectHit(index, shouldPlay = true) {
+  if (!lastResult) return;
+  const hit = (lastResult.hits ?? []).find((item) => Number(item.index) === Number(index));
+  if (!hit) return;
+  selectedHitIndex = hit.index;
+  $("selectedHitMeta").textContent = `#${hit.index} · ${hit.label} · ${hit.cluster_label} · ${hit.onset_sec}s · ${hit.duration_ms} ms${hit.is_representative ? " · representative" : ""}`;
+  if (shouldPlay) playAudio($("hitAudio"), hit.url);
+  for (const row of document.querySelectorAll("[data-hit-index]")) {
+    row.classList.toggle("selected", Number(row.dataset.hitIndex) === Number(hit.index));
+  }
+  drawWaveform(lastResult.overview);
+}
+
+function auditionSample(sample) {
+  $("selectedSampleMeta").textContent = `${sample.label} · ${sample.classification} · ${sample.hits} hits · score ${sample.score}`;
+  playAudio($("sampleAudio"), sample.url);
+}
+
+function renderSamples(result) {
+  const tbody = $("samplesTable").querySelector("tbody");
+  tbody.innerHTML = (result.samples ?? []).map((sample, i) => `
+    <tr data-sample-index="${i}">
+      <td><button class="mini-button" type="button" data-sample-audition="${i}">Audition</button></td>
+      <td>${esc(sample.label)}</td>
+      <td>${esc(sample.classification)}</td>
+      <td>${esc(sample.hits)}</td>
+      <td>${esc(sample.score)}</td>
+      <td>${esc(sample.duration_ms)} ms</td>
+      <td>${esc(sample.first_onset_sec)} s</td>
+      <td><a href="${esc(sample.url)}" download>WAV</a></td>
+    </tr>
+  `).join("");
+  for (const button of tbody.querySelectorAll("[data-sample-audition]")) {
+    button.addEventListener("click", (event) => {
+      event.stopPropagation();
+      const sample = result.samples[Number(button.dataset.sampleAudition)];
+      auditionSample(sample);
+    });
+  }
+}
+
+function renderHits(result) {
+  const tbody = $("hitsTable").querySelector("tbody");
+  const hits = result.hits ?? [];
+  tbody.innerHTML = hits.map((hit) => `
+    <tr data-hit-index="${esc(hit.index)}" class="${Number(hit.index) === Number(selectedHitIndex) ? "selected" : ""}">
+      <td><button class="mini-button" type="button" data-hit-audition="${esc(hit.index)}">Audition</button></td>
+      <td>${esc(hit.index)}</td>
+      <td>${esc(hit.label)}${hit.is_representative ? " ★" : ""}</td>
+      <td>${esc(hit.cluster_label)}</td>
+      <td>${esc(hit.onset_sec)} s</td>
+      <td>${esc(hit.duration_ms)} ms</td>
+      <td>${esc(hit.rms_energy)}</td>
+      <td><a href="${esc(hit.url)}" download>WAV</a></td>
+    </tr>
+  `).join("");
+  for (const row of tbody.querySelectorAll("[data-hit-index]")) {
+    row.addEventListener("click", () => selectHit(row.dataset.hitIndex));
+  }
+  for (const button of tbody.querySelectorAll("[data-hit-audition]")) {
+    button.addEventListener("click", (event) => {
+      event.stopPropagation();
+      selectHit(button.dataset.hitAudition);
+    });
+  }
+  if (hits.length && selectedHitIndex === null) selectHit(hits[0].index, false);
+}
+
 function renderResult(job) {
   const result = job.result;
   if (!result) return;
+  lastResult = result;
+  if (!(result.hits ?? []).some((hit) => Number(hit.index) === Number(selectedHitIndex))) {
+    selectedHitIndex = (result.hits ?? [])[0]?.index ?? null;
+  }
   const rtf = Number(result.realtime_factor).toFixed(2);
   const mode = result.params?.clustering_mode ?? "—";
   $("resultSummary").textContent = `${result.hit_count} hits → ${result.cluster_count} samples · BPM ${result.bpm ?? "—"} · ${fmtSec(result.duration_sec)} total · ${rtf}× realtime · ${mode}`;
-  drawWaveform(result.overview);
 
   const fileUrls = result.file_urls ?? {};
   const labels = { archive: "Sample pack ZIP", midi: "MIDI", stem: "Stem WAV", reconstruction: "Reconstruction WAV" };
@@ -159,18 +242,9 @@ function renderResult(job) {
   $("stemAudio").src = fileUrls.stem ?? "";
   $("reconAudio").src = fileUrls.reconstruction ?? "";
 
-  const tbody = $("samplesTable").querySelector("tbody");
-  tbody.innerHTML = (result.samples ?? []).map((sample) => `
-    <tr>
-      <td>${esc(sample.label)}</td>
-      <td>${esc(sample.classification)}</td>
-      <td>${esc(sample.hits)}</td>
-      <td>${esc(sample.score)}</td>
-      <td>${esc(sample.duration_ms)} ms</td>
-      <td>${esc(sample.first_onset_sec)} s</td>
-      <td><a href="${esc(sample.url)}" download>WAV</a></td>
-    </tr>
-  `).join("");
+  renderSamples(result);
+  renderHits(result);
+  drawWaveform(result.overview);
 }
 
 function renderJob(job) {
@@ -201,6 +275,7 @@ function renderHistory(payload) {
   for (const button of $("historyList").querySelectorAll(".history-row")) {
     button.addEventListener("click", async () => {
       const job = await api(`/api/jobs/${button.dataset.jobId}`);
+      selectedHitIndex = null;
       renderJob(job);
       window.scrollTo({ top: document.body.scrollHeight, behavior: "smooth" });
     });
@@ -216,21 +291,26 @@ async function refreshHistory() {
   }
 }
 
-async function pollJob(id) {
+function stopWatchers() {
   if (activePoll) clearInterval(activePoll);
+  activePoll = null;
+  if (activeEvents) activeEvents.close();
+  activeEvents = null;
+}
+
+async function pollJob(id) {
+  stopWatchers();
   const tick = async () => {
     try {
       const job = await api(`/api/jobs/${id}`);
       renderJob(job);
       if (["complete", "error"].includes(job.status)) {
-        clearInterval(activePoll);
-        activePoll = null;
+        stopWatchers();
         $("runButton").disabled = !selectedFile;
         await refreshHistory();
       }
     } catch (error) {
-      clearInterval(activePoll);
-      activePoll = null;
+      stopWatchers();
       $("runButton").disabled = !selectedFile;
       $("resultSummary").textContent = error.message;
     }
@@ -239,8 +319,32 @@ async function pollJob(id) {
   activePoll = setInterval(tick, 800);
 }
 
+async function watchJob(id) {
+  if (!("EventSource" in window)) return pollJob(id);
+  stopWatchers();
+  return new Promise((resolve) => {
+    activeEvents = new EventSource(`/api/jobs/${id}/events`);
+    activeEvents.addEventListener("job", async (event) => {
+      const job = JSON.parse(event.data);
+      renderJob(job);
+      if (["complete", "error"].includes(job.status)) {
+        stopWatchers();
+        $("runButton").disabled = !selectedFile;
+        await refreshHistory();
+        resolve();
+      }
+    });
+    activeEvents.onerror = () => {
+      stopWatchers();
+      pollJob(id).then(resolve);
+    };
+  });
+}
+
 async function runExtraction() {
   if (!selectedFile) return;
+  selectedHitIndex = null;
+  lastResult = null;
   $("runButton").disabled = true;
   $("jobPill").textContent = "uploading";
   $("logs").textContent = "Uploading source and starting extraction…";
@@ -250,7 +354,7 @@ async function runExtraction() {
   try {
     const job = await api("/api/jobs", { method: "POST", body: form });
     renderJob(job);
-    await pollJob(job.id);
+    await watchJob(job.id);
     await refreshHistory();
   } catch (error) {
     $("runButton").disabled = false;
@@ -269,6 +373,20 @@ function setFile(file) {
   }
 }
 
+function selectNearestWaveformHit(event) {
+  if (!lastResult?.overview?.onsets?.length) return;
+  const rect = $("waveform").getBoundingClientRect();
+  const ratio = Math.min(1, Math.max(0, (event.clientX - rect.left) / Math.max(1, rect.width)));
+  const time = ratio * Math.max(lastResult.overview.duration_sec, 0.001);
+  let best = null;
+  let bestDelta = Infinity;
+  for (const onset of lastResult.overview.onsets) {
+    const delta = Math.abs(Number(onset.time_sec) - time);
+    if (delta < bestDelta) { best = onset; bestDelta = delta; }
+  }
+  if (best) selectHit(best.index);
+}
+
 async function boot() {
   try {
     await api("/api/health");
@@ -308,6 +426,7 @@ $("clearCacheButton").addEventListener("click", async () => {
     $("logs").textContent = error.message;
   }
 });
+$("waveform").addEventListener("click", selectNearestWaveformHit);
 
 const dropzone = $("dropzone");
 for (const eventName of ["dragenter", "dragover"]) {
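The click-to-select logic in `selectNearestWaveformHit` reduces to two steps: map the click's horizontal position to a time, then pick the onset with the smallest absolute time difference. The Python port below is for illustration only; the function name `nearest_onset` and its parameters are hypothetical, and the onset dictionaries only mimic the `overview.onsets` rows from the pipeline.

```python
from __future__ import annotations


def nearest_onset(
    onsets: list[dict],
    click_x: float,
    left: float,
    width: float,
    duration_sec: float,
) -> dict | None:
    """Return the onset closest in time to a click on the waveform canvas."""
    if not onsets:
        return None
    # Clamp the click to the canvas, as the JS version does with Math.min/Math.max.
    ratio = min(1.0, max(0.0, (click_x - left) / max(1.0, width)))
    time_sec = ratio * max(duration_sec, 0.001)
    # min() keeps the first of equally close onsets, matching the strict "<" in the JS loop.
    return min(onsets, key=lambda onset: abs(float(onset["time_sec"]) - time_sec))


onsets = [
    {"index": 0, "time_sec": 0.0},
    {"index": 1, "time_sec": 0.5},
    {"index": 2, "time_sec": 1.0},
]
# A click 60 px into a 200 px wide canvas maps to 0.3 s of a 1 s clip.
picked = nearest_onset(onsets, click_x=160, left=100, width=200, duration_sec=1.0)
print(picked["index"])  # 1
```

A linear scan is adequate for the hit counts in the benchmark (tens of onsets); a bisect over sorted onset times would only matter for much longer recordings.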
web/index.html CHANGED
@@ -175,15 +175,46 @@
           <label>Stem audio<audio id="stemAudio" controls></audio></label>
           <label>Reconstruction<audio id="reconAudio" controls></audio></label>
         </div>
-        <div class="table-wrap">
-          <table id="samplesTable">
-            <thead>
-              <tr>
-                <th>Sample</th><th>Class</th><th>Hits</th><th>Score</th><th>Duration</th><th>First hit</th><th>File</th>
-              </tr>
-            </thead>
-            <tbody></tbody>
-          </table>
+        <div class="review-grid">
+          <article class="review-card">
+            <strong>Selected hit</strong>
+            <span id="selectedHitMeta">Click an onset marker or hit row to audition the detected slice.</span>
+            <audio id="hitAudio" controls></audio>
+          </article>
+          <article class="review-card">
+            <strong>Selected sample</strong>
+            <span id="selectedSampleMeta">Click Audition in the sample table to hear the representative sample.</span>
+            <audio id="sampleAudio" controls></audio>
+          </article>
+        </div>
+        <div class="result-columns">
+          <section>
+            <h3>Representative samples</h3>
+            <div class="table-wrap">
+              <table id="samplesTable">
+                <thead>
+                  <tr>
+                    <th>Audition</th><th>Sample</th><th>Class</th><th>Hits</th><th>Score</th><th>Duration</th><th>First hit</th><th>File</th>
+                  </tr>
+                </thead>
+                <tbody></tbody>
+              </table>
+            </div>
+          </section>
+          <section>
+            <h3>Detected hit review</h3>
+            <p class="subtle">Every detected slice is exported under <code>review/hits/</code>. Click rows or waveform markers to audition.</p>
+            <div class="table-wrap hit-table-wrap">
+              <table id="hitsTable">
+                <thead>
+                  <tr>
+                    <th>Audition</th><th>#</th><th>Label</th><th>Cluster</th><th>Onset</th><th>Duration</th><th>Energy</th><th>File</th>
+                  </tr>
+                </thead>
+                <tbody></tbody>
+              </table>
+            </div>
+          </section>
         </div>
       </section>
     </main>
web/styles.css CHANGED
@@ -86,3 +86,16 @@ tr:last-child td { border-bottom: 0; }
 .history-row span:not(:first-child) { color: #dbe5f7; font-size: 12px; font-variant-numeric: tabular-nums; }
 .empty { color: var(--muted); margin: 0; }
 @media (max-width: 680px) { .history-row { grid-template-columns: 1fr 1fr; } }
+h3 { margin: 0 0 10px; font-size: 16px; letter-spacing: -.015em; }
+.subtle { margin: -4px 0 12px; color: var(--muted); font-size: 13px; }
+.review-grid { display: grid; grid-template-columns: repeat(2, minmax(0, 1fr)); gap: 16px; margin: 0 0 18px; }
+.review-card { border: 1px solid var(--line); border-radius: 20px; background: rgba(0,0,0,.16); padding: 14px; }
+.review-card strong, .review-card span { display: block; }
+.review-card span { color: var(--muted); font-size: 13px; margin-top: 5px; line-height: 1.4; }
+.result-columns { display: grid; grid-template-columns: minmax(0, 1fr); gap: 20px; }
+.hit-table-wrap { max-height: 420px; }
+.mini-button { padding: 7px 10px; border-radius: 999px; background: rgba(255,255,255,.08); border: 1px solid var(--line); color: var(--text); font-size: 12px; font-weight: 800; }
+tr.selected td { background: rgba(139,211,255,.12); }
+tr[data-hit-index] { cursor: pointer; }
+tr[data-hit-index]:hover td { background: rgba(255,255,255,.045); }
+@media (max-width: 760px) { .review-grid { grid-template-columns: 1fr; } }