ChatGPT committed on
Commit
0b5f0f0
·
1 Parent(s): e33cc90

feat: add full-context reproduction and clearer controls

README.md CHANGED
@@ -12,7 +12,7 @@ pinned: false
12
 
13
  A custom FastAPI + browser workstation for extracting, reviewing, and now semantically supervising reusable drum samples from an audio file.
14
 
15
- The pipeline can isolate a stem with Demucs, detect onsets, classify hits, cluster similar transients, choose representative samples, optionally synthesize alternate samples, and export WAVs, MIDI, reconstruction audio, manifests, and a complete ZIP sample pack. The interactive layer stores user corrections as replayable semantic state beside each run manifest.
16
 
17
  ## Current status
18
 
@@ -73,6 +73,7 @@ See:
73
  - `docs/REMAINING_WORK.md`
74
  - `docs/SUPERVISED_EXPORT_AND_FORCE_ONSET.md`
75
  - `docs/FIXED_WORKSTATION_UI.md`
 
76
 
77
  ## Run locally
78
 
@@ -155,7 +156,7 @@ curl http://127.0.0.1:7860/api/jobs
155
  | `sample_extractor.py` | Core DSP/sample extraction implementation |
156
  | `supervised_state.py` | Persistent semantic state, confidence, constraints, events, suggestions, force-onset, restore, undo |
157
  | `supervised_export.py` | Renders edited semantic state into supervised WAV/MIDI/reconstruction/ZIP artifacts |
158
- | `web/` | Custom no-build browser frontend with fixed non-scrolling workstation layout, explicit upload/whole-page drag-drop, sidebars, bottom dock, sample-card grid, hidden-audio audition, add-onset mode, and edited export |
159
  | `scripts/benchmark_subprocesses.py` | Synthetic benchmark runner for stage timings |
160
  | `scripts/test_interactive_supervision.py` | Smoke test for supervised state endpoints |
161
  | `scripts/test_supervised_export_and_force_onset.py` | Smoke test for force-onset, restore, suggestion diffs, and edited exports |
@@ -180,6 +181,7 @@ Each run is stored under `.runs/<job-id>/output/`:
180
  - `supervised/samples/*.wav` after edited export
181
  - `supervised/reconstruction.mid` after edited export
182
  - `supervised/reconstruction.wav` after edited export
 
183
 
184
  Generated runtime directories are ignored by git:
185
 
 
12
 
13
  A custom FastAPI + browser workstation for extracting, reviewing, and now semantically supervising reusable drum samples from an audio file.
14
 
15
+ The pipeline can isolate a stem with Demucs, detect onsets, classify hits, cluster similar transients, choose representative samples, optionally synthesize alternate samples, and export WAVs, MIDI, target-stem reconstruction, full-context reproduced audio, manifests, and a complete ZIP sample pack. The interactive layer stores user corrections as replayable semantic state beside each run manifest.
16
 
17
  ## Current status
18
 
 
73
  - `docs/REMAINING_WORK.md`
74
  - `docs/SUPERVISED_EXPORT_AND_FORCE_ONSET.md`
75
  - `docs/FIXED_WORKSTATION_UI.md`
76
+ - `docs/REPRODUCED_AUDIO_AND_PARAMETERS.md`
77
 
78
  ## Run locally
79
 
 
156
  | `sample_extractor.py` | Core DSP/sample extraction implementation |
157
  | `supervised_state.py` | Persistent semantic state, confidence, constraints, events, suggestions, force-onset, restore, undo |
158
  | `supervised_export.py` | Renders edited semantic state into supervised WAV/MIDI/reconstruction/ZIP artifacts |
159
+ | `web/` | Custom no-build browser frontend with fixed non-scrolling workstation layout, explicit upload/whole-page drag-drop, source/stem/reproduced preview transport, common/advanced parameter separation, sidebars, bottom dock, sample-card grid, hidden-audio audition, add-onset mode, and edited export |
160
  | `scripts/benchmark_subprocesses.py` | Synthetic benchmark runner for stage timings |
161
  | `scripts/test_interactive_supervision.py` | Smoke test for supervised state endpoints |
162
  | `scripts/test_supervised_export_and_force_onset.py` | Smoke test for force-onset, restore, suggestion diffs, and edited exports |
 
181
  - `supervised/samples/*.wav` after edited export
182
  - `supervised/reconstruction.mid` after edited export
183
  - `supervised/reconstruction.wav` after edited export
184
+ - `source.wav`, `context_bed.wav`, and `target_reconstruction.wav` for source/stem/reproduced A/B previews
185
 
186
  Generated runtime directories are ignored by git:
187
 
app.py CHANGED
@@ -48,7 +48,7 @@ WEB_DIR = ROOT / "web"
48
  RUNS_DIR = ROOT / ".runs"
49
  RUNS_DIR.mkdir(exist_ok=True)
50
 
51
- app = FastAPI(title="Drum Sample Extractor", version="12.0.0")
52
  app.add_middleware(
53
  CORSMiddleware,
54
  allow_origins=["*"],
 
48
  RUNS_DIR = ROOT / ".runs"
49
  RUNS_DIR.mkdir(exist_ok=True)
50
 
51
+ app = FastAPI(title="Drum Sample Extractor", version="13.0.0")
52
  app.add_middleware(
53
  CORSMiddleware,
54
  allow_origins=["*"],
docs/API.md CHANGED
@@ -134,7 +134,7 @@ Completed jobs contain:
134
  | `samples` | Representative sample rows with score, duration, first onset, and playback/download URL. |
135
  | `hits` | Per-detected-hit review rows with onset, duration, label, cluster, representative flag, and playback/download URL. |
136
  | `overview` | Decimated envelope and clickable onset markers for waveform display. |
137
- | `files` | Relative artifact paths. |
138
  | `file_urls` | Direct API URLs for top-level artifacts. |
139
 
140
  ## `GET /api/jobs/{job_id}/events`
@@ -154,6 +154,18 @@ data: {"id":"58ca0db4ac74","status":"running","stages":[...]}
154
 
155
  The stream closes after `complete` or `error`. Completed historical jobs emit one final `job` event and close.
156
 
157
  ## `GET /api/jobs/{job_id}/files/{relative_path}`
158
 
159
  Downloads an artifact from a completed job.
@@ -163,6 +175,8 @@ Examples:
163
  ```bash
164
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/sample-pack.zip
165
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/reconstruction.mid
 
 
166
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/samples/hihat_open_0.wav
167
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/review/hits/hit_00000_kick.wav
168
  ```
@@ -313,6 +327,7 @@ Response shape:
313
  "files": {
314
  "archive": "supervised/sample-pack.zip",
315
  "midi": "supervised/reconstruction.mid",
 
316
  "reconstruction": "supervised/reconstruction.wav"
317
  },
318
  "file_urls": {}
 
134
  | `samples` | Representative sample rows with score, duration, first onset, and playback/download URL. |
135
  | `hits` | Per-detected-hit review rows with onset, duration, label, cluster, representative flag, and playback/download URL. |
136
  | `overview` | Decimated envelope and clickable onset markers for waveform display. |
137
+ | `files` | Relative artifact paths. Includes `source`, `stem`, `context_bed`, `target_reconstruction`, `reconstruction`, `midi`, and `archive` when available. |
138
  | `file_urls` | Direct API URLs for top-level artifacts. |
139
 
140
  ## `GET /api/jobs/{job_id}/events`
 
154
 
155
  The stream closes after `complete` or `error`. Completed historical jobs emit one final `job` event and close.
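A client can stop reading as soon as a terminal status arrives. The following is a minimal sketch of that loop, assuming each event carries a JSON `data:` payload with a `status` field like the example above, and that the terminal values are literally `complete` and `error`. The helper name `iter_job_events` is hypothetical and the sample stream is made up for illustration.

```python
import json
from typing import Iterable, Iterator

# Assumption: these are the status strings that end the stream.
TERMINAL_STATUSES = {"complete", "error"}


def iter_job_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded `data:` payloads and stop after a terminal status."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip `event:` lines, comments, and blank separators
        payload = json.loads(line[len("data:"):].strip())
        yield payload
        if payload.get("status") in TERMINAL_STATUSES:
            break


# Illustrative stream; a real client would iterate over the HTTP response lines.
stream = [
    "event: job",
    'data: {"id": "58ca0db4ac74", "status": "running"}',
    "",
    "event: job",
    'data: {"id": "58ca0db4ac74", "status": "complete"}',
    "",
    'data: {"id": "58ca0db4ac74", "status": "never-read"}',
]
statuses = [event["status"] for event in iter_job_events(stream)]
```

The generator works on any iterable of text lines, so the same function can be fed from a test fixture or from a streaming HTTP response.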
156
 
157
+ ## Top-level artifact meanings
158
+
159
+ | Key | Path | Meaning |
160
+ |---|---|---|
161
+ | `source` | `source.wav` | Normalized source mix used for source preview. |
162
+ | `stem` | `stem.wav` | Target stem being sampled. |
163
+ | `context_bed` | `context_bed.wav` | Non-target stems/context bed; silent for `stem=all`. |
164
+ | `target_reconstruction` | `target_reconstruction.wav` | Sample-triggered reconstruction of only the target stem. |
165
+ | `reconstruction` | `reconstruction.wav` | Full-context reproduced mix: context bed plus target reconstruction. |
166
+ | `midi` | `reconstruction.mid` | MIDI trigger reconstruction. |
167
+ | `archive` | `sample-pack.zip` | Complete sample pack and reproduction artifacts. |
168
+
169
  ## `GET /api/jobs/{job_id}/files/{relative_path}`
170
 
171
  Downloads an artifact from a completed job.
 
175
  ```bash
176
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/sample-pack.zip
177
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/reconstruction.mid
178
+ curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/reconstruction.wav
179
+ curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/target_reconstruction.wav
180
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/samples/hihat_open_0.wav
181
  curl -O http://127.0.0.1:7860/api/jobs/58ca0db4ac74/files/review/hits/hit_00000_kick.wav
182
  ```
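The same downloads can be scripted by building URLs from the `files` mapping of a completed job. This is a small sketch, assuming the host, port, and job id from the curl examples above; `artifact_url` is a hypothetical helper, not part of the API.

```python
from urllib.parse import quote


def artifact_url(base_url: str, job_id: str, relative_path: str) -> str:
    """Build a download URL for GET /api/jobs/{job_id}/files/{relative_path}."""
    # Keep "/" unescaped so nested paths such as "supervised/..." stay intact.
    safe_path = quote(relative_path, safe="/")
    return f"{base_url.rstrip('/')}/api/jobs/{quote(job_id, safe='')}/files/{safe_path}"


# Illustrative subset of the `files` mapping described in the job response.
files = {
    "reconstruction": "reconstruction.wav",
    "target_reconstruction": "target_reconstruction.wav",
    "midi": "reconstruction.mid",
}
urls = {
    key: artifact_url("http://127.0.0.1:7860", "58ca0db4ac74", path)
    for key, path in files.items()
}
```

Each value in `urls` can then be passed to any HTTP client, for example `urllib.request.urlretrieve`.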
 
327
  "files": {
328
  "archive": "supervised/sample-pack.zip",
329
  "midi": "supervised/reconstruction.mid",
330
+ "target_reconstruction": "supervised/target_reconstruction.wav",
331
  "reconstruction": "supervised/reconstruction.wav"
332
  },
333
  "file_urls": {}
docs/FEATURES.md CHANGED
@@ -4,7 +4,7 @@ Last updated: 2026-05-12
4
 
5
  ## Product goal
6
 
7
- Turn an input audio file into a practical drum sample pack: detected hits, grouped sample classes, representative WAVs, optional synthesized alternates, MIDI reconstruction, rendered reconstruction audio, and an inspectable manifest.
8
 
9
  ## Implemented features
10
 
@@ -14,8 +14,8 @@ Turn an input audio file into a practical drum sample pack: detected hits, group
14
  | UI | Explicit upload button | Implemented | Top bar contains a visible `Upload audio` control. |
15
  | UI | Whole-app drag/drop audio upload | Implemented | Dropping files anywhere on the app selects the file and shows a drop overlay during drag. |
16
  | UI | Fixed non-scrolling workstation layout | Implemented | Body is viewport-locked; tools live in left/right sidebars and a bottom dock; long content scrolls inside panels only. |
17
- | UI | Minimal custom transport | Implemented | One image-aligned play/time/progress row drives source preview before extraction and stem preview after extraction. |
18
- | UI | Pipeline controls | Implemented | Stem/model/onset/clustering/MIDI/synthesis/cache controls. |
19
  | UI | Streaming progress | Implemented | Uses `EventSource` over `GET /api/jobs/{id}/events`, with polling fallback. |
20
  | UI | Waveform/onset overview | Implemented | Canvas envelope plus clickable onset markers from `manifest.json`. |
21
  | UI | Result downloads | Implemented | ZIP, MIDI, stem WAV, reconstruction WAV, individual sample WAVs, and per-hit review WAVs. |
@@ -37,9 +37,10 @@ Turn an input audio file into a practical drum sample pack: detected hits, group
37
  | Pipeline | Representative selection | Implemented | Quality score picks best hit per cluster. |
38
  | Pipeline | Optional synthesis | Implemented | Weighted aligned average for multi-hit clusters. |
39
  | Pipeline | MIDI export | Implemented | Quantized or unquantized reconstruction MIDI. |
40
- | Pipeline | Reconstruction render | Implemented | Renders MIDI-like reconstruction using selected samples. |
 
41
  | Pipeline | Per-hit review export | Implemented | Writes every accepted detected hit to `review/hits/*.wav` and records rows in the manifest. |
42
- | Pipeline | Sample pack ZIP | Implemented | Includes WAVs, index JSON, MIDI, rendered reconstruction. |
43
  | Supervision | Edited artifact re-export | Implemented | `supervised_export.py` writes edited samples, MIDI, reconstruction, ZIP, and `supervised/manifest.json`. |
44
  | Supervision | Force-onset from waveform | Implemented | Adds user-forced hit slices from cached `stem.wav`; UI add-onset mode posts to `/hits/force-onset`. |
45
  | Supervision | Suppressed-hit restore | Implemented | Restore endpoint and UI button reverse suppression without undoing unrelated edits. |
 
4
 
5
  ## Product goal
6
 
7
+ Turn an input audio file into a practical drum sample pack: detected hits, grouped sample classes, representative WAVs, optional synthesized alternates, MIDI reconstruction, target-stem reconstruction, full-context reproduced audio, and an inspectable manifest.
8
 
9
  ## Implemented features
10
 
 
14
  | UI | Explicit upload button | Implemented | Top bar contains a visible `Upload audio` control. |
15
  | UI | Whole-app drag/drop audio upload | Implemented | Dropping files anywhere on the app selects the file and shows a drop overlay during drag. |
16
  | UI | Fixed non-scrolling workstation layout | Implemented | Body is viewport-locked; tools live in left/right sidebars and a bottom dock; long content scrolls inside panels only. |
17
+ | UI | Minimal custom transport | Implemented | One play/time/progress row can audition Source, Stem, or Reproduced previews; completed runs default to Reproduced. |
18
+ | UI | Common vs advanced parameters | Implemented | Common controls show stem, sensitivity, group count, and two presets; advanced model/DSP/export controls are grouped by pipeline stage. |
19
  | UI | Streaming progress | Implemented | Uses `EventSource` over `GET /api/jobs/{id}/events`, with polling fallback. |
20
  | UI | Waveform/onset overview | Implemented | Canvas envelope plus clickable onset markers from `manifest.json`. |
21
  | UI | Result downloads | Implemented | ZIP, MIDI, stem WAV, reconstruction WAV, individual sample WAVs, and per-hit review WAVs. |
 
37
  | Pipeline | Representative selection | Implemented | Quality score picks best hit per cluster. |
38
  | Pipeline | Optional synthesis | Implemented | Weighted aligned average for multi-hit clusters. |
39
  | Pipeline | MIDI export | Implemented | Quantized or unquantized reconstruction MIDI. |
40
+ | Pipeline | Target reconstruction render | Implemented | Renders the selected sample representatives from MIDI/onset timing and matches RMS to the target stem. |
41
+ | Pipeline | Full-context reproduced mix | Implemented | Writes `reconstruction.wav` as non-target context bed plus target reconstruction, so separated-stem runs incorporate all other stems. |
42
  | Pipeline | Per-hit review export | Implemented | Writes every accepted detected hit to `review/hits/*.wav` and records rows in the manifest. |
43
+ | Pipeline | Sample pack ZIP | Implemented | Includes WAVs, index JSON, MIDI, full-context reproduced mix, and target-stem reconstruction. |
44
  | Supervision | Edited artifact re-export | Implemented | `supervised_export.py` writes edited samples, MIDI, reconstruction, ZIP, and `supervised/manifest.json`. |
45
  | Supervision | Force-onset from waveform | Implemented | Adds user-forced hit slices from cached `stem.wav`; UI add-onset mode posts to `/hits/force-onset`. |
46
  | Supervision | Suppressed-hit restore | Implemented | Restore endpoint and UI button reverse suppression without undoing unrelated edits. |
docs/FIXED_WORKSTATION_UI.md CHANGED
@@ -12,8 +12,8 @@ The web app should behave like a compact workstation rather than a long document
12
  |---|---|---|
13
  | Top bar | App title, explicit `Upload audio` button, selected-file metadata, backend status, primary `Extract Samples` action | Fixed height; no scroll |
14
  | Left sidebar | Source/drop guidance, selected-hit/sample context, pipeline stages/logs, run history | Sidebar-internal scroll only |
15
- | Center workspace | Large waveform/transport and representative sample cards | Sample grid scrolls internally when needed |
16
- | Right sidebar | Core extraction controls, exports, advanced model/DSP settings | Sidebar-internal scroll only |
17
  | Bottom bar | Interactive review/edit tools and raw tables | Panel-internal scroll only |
18
 
19
  The document itself is locked with `overflow: hidden`; long content is constrained to the relevant tool panel.
@@ -45,3 +45,11 @@ Static checks added/performed for this pass:
45
  - Duplicate id check for `web/index.html`
46
  - Python compile check for active Python files
47
  - FastAPI extraction smoke test
 
12
  |---|---|---|
13
  | Top bar | App title, explicit `Upload audio` button, selected-file metadata, backend status, primary `Extract Samples` action | Fixed height; no scroll |
14
  | Left sidebar | Source/drop guidance, selected-hit/sample context, pipeline stages/logs, run history | Sidebar-internal scroll only |
15
+ | Center workspace | Large waveform, Source/Stem/Reproduced transport, and representative sample cards | Sample grid scrolls internally when needed |
16
+ | Right sidebar | Common extraction controls, exports, and advanced parameters grouped by stage | Sidebar-internal scroll only |
17
  | Bottom bar | Interactive review/edit tools and raw tables | Panel-internal scroll only |
18
 
19
  The document itself is locked with `overflow: hidden`; long content is constrained to the relevant tool panel.
 
45
  - Duplicate id check for `web/index.html`
46
  - Python compile check for active Python files
47
  - FastAPI extraction smoke test
48
+
49
+
50
+ ## Pass 9 additions
51
+
52
+ - The center transport now exposes explicit `Source`, `Stem`, and `Reproduced` preview modes.
53
+ - The right sidebar now separates the normal workflow from advanced tuning.
54
+ - Common controls are limited to stem choice, hit sensitivity, sample group count, and two presets.
55
+ - Advanced parameters are grouped into four sections: stem separation, hit detection, grouping, and export/cache.
docs/PROGRESS.md CHANGED
@@ -259,3 +259,27 @@ Completed in this pass:
259
  Outcome:
260
 
261
  The UI no longer behaves like a scrollable webpage. It now behaves like a compact desktop-style sample extraction workstation with simple expandable tool panels around a central waveform/sample workspace.
 
259
  Outcome:
260
 
261
  The UI no longer behaves like a scrollable webpage. It now behaves like a compact desktop-style sample extraction workstation with simple expandable tool panels around a central waveform/sample workspace.
262
+
263
+ ## Pass 9: full-context reproduced audio and clearer parameters
264
+
265
+ Completed in this pass:
266
+
267
+ 1. Added explicit source/context/reconstruction layers to the pipeline export:
268
+ - `source.wav`
269
+ - `stem.wav`
270
+ - `context_bed.wav`
271
+ - `target_reconstruction.wav`
272
+ - `reconstruction.wav`
273
+ 2. Changed `reconstruction.wav` to a full-context reproduced mix: non-target context bed plus sample-triggered target reconstruction.
274
+ 3. Kept `target_reconstruction.wav` as the isolated sample-only target layer for debugging and focused listening.
275
+ 4. Matched the target reconstruction RMS to the target stem before mixing it back into context.
276
+ 5. Updated `sample-pack.zip` to include both the full-context reproduced mix and target-stem reconstruction.
277
+ 6. Updated supervised edited export so edited packs follow the same audio-layer model under `supervised/`.
278
+ 7. Added Source / Stem / Reproduced preview modes to the single transport row; completed jobs default to Reproduced.
279
+ 8. Reworked the right sidebar into Common controls vs Advanced parameters.
280
+ 9. Grouped advanced parameters by pipeline stage: stem separation, hit detection, grouping, export/cache.
281
+ 10. Added `docs/REPRODUCED_AUDIO_AND_PARAMETERS.md`.
282
+
283
+ Outcome:
284
+
285
+ The app is easier to understand for normal use: the main right-side controls are now only the few controls users are likely to touch repeatedly, while lower-level DSP/model controls stay available but grouped by stage. Reproduced audio is now useful for musical judgment because separated-stem runs are previewed inside the rest of the mix rather than as an isolated sample-triggered stem only.
docs/REMAINING_WORK.md CHANGED
@@ -8,12 +8,15 @@ The project is now a usable extraction workstation, not a complete interactive s
8
 
9
  ## Highest-priority remaining gaps
10
 
 
 
11
  1. **Cluster editing**: allow merge, split, relabel, and manual reassignment of groups from the `Review & edit` workbench.
12
  2. **Waveform editing depth**: add onset drag/shift, hit trim boundaries, and rerun-from-edited-onsets without redoing Demucs.
13
  3. **Run comparison**: compare two manifests side-by-side for parameter tuning.
14
  4. **Lower-level progress**: expose internal Demucs/clustering progress where libraries make that possible.
15
  5. **Frontend engineering hardening**: migrate the frontend to TypeScript after the UX stabilizes and add browser-level tests.
16
  6. **Benchmark panel**: add an in-app benchmark view that can run synthetic fixtures and compare parameter profiles.
 
17
 
18
  ## Known constraints
19
 
 
8
 
9
  ## Highest-priority remaining gaps
10
 
11
+ Completed since the previous snapshot: reproduced audio now incorporates non-target context stems, and common/advanced parameters are separated in the right sidebar.
12
+
13
  1. **Cluster editing**: allow merge, split, relabel, and manual reassignment of groups from the `Review & edit` workbench.
14
  2. **Waveform editing depth**: add onset drag/shift, hit trim boundaries, and rerun-from-edited-onsets without redoing Demucs.
15
  3. **Run comparison**: compare two manifests side-by-side for parameter tuning.
16
  4. **Lower-level progress**: expose internal Demucs/clustering progress where libraries make that possible.
17
  5. **Frontend engineering hardening**: migrate the frontend to TypeScript after the UX stabilizes and add browser-level tests.
18
  6. **Benchmark panel**: add an in-app benchmark view that can run synthetic fixtures and compare parameter profiles.
19
+ 7. **Reproduction diagnostics**: add source-vs-reproduced A/B error visualization and region ranking.
20
 
21
  ## Known constraints
22
 
docs/REPRODUCED_AUDIO_AND_PARAMETERS.md ADDED
@@ -0,0 +1,82 @@
 
1
+ # Reproduced audio and parameter hierarchy
2
+
3
+ Last updated: 2026-05-12
4
+
5
+ ## Goal
6
+
7
+ The preview called “reproduced audio” should be understandable and useful for judging whether the extracted samples reproduce the source. When the user extracts a separated stem, it should not play only the target layer in isolation.
8
+
9
+ ## Implemented audio model
10
+
11
+ The pipeline now writes five audio layers per run:
12
+
13
+ | File | Meaning | UI label |
14
+ |---|---|---|
15
+ | `source.wav` | The normalized source mix used as preview context. | Source mix WAV |
16
+ | `stem.wav` | The target audio being sampled, for example the separated drums stem. | Target stem WAV |
17
+ | `context_bed.wav` | The non-target context bed: source mix minus target stem. For `stem=all`, this is silence. | Non-target stems WAV |
18
+ | `target_reconstruction.wav` | The selected sample representatives rendered from MIDI/onset timing, matched to the target stem’s RMS. | Target reconstruction WAV |
19
+ | `reconstruction.wav` | Full-context reproduced mix: `context_bed.wav + target_reconstruction.wav`, soft-limited. | Reproduced mix WAV |
20
+
21
+ For separated-stem runs, `reconstruction.wav` therefore incorporates the other stems instead of only replaying the drum/sample layer. This makes A/B listening more useful: the user hears the extracted sample layer inside the rest of the track.
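The layer arithmetic in the table can be sketched on plain sample lists. This is an illustration only, not the code in `sample_extractor.py`: the function names are hypothetical, the signals are assumed to be mono, equal length, and already aligned, and `tanh` stands in for the soft limiter, whose actual curve is not specified here.

```python
import math
from typing import List, Sequence


def rms(signal: Sequence[float]) -> float:
    """Root-mean-square level of a signal (0.0 for an empty signal)."""
    if not signal:
        return 0.0
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def context_bed(source: Sequence[float], stem: Sequence[float]) -> List[float]:
    """Non-target context: source mix minus target stem."""
    return [s - t for s, t in zip(source, stem)]


def match_rms(signal: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Scale `signal` so its RMS equals the RMS of `reference`."""
    current = rms(signal)
    if current == 0.0:
        return list(signal)  # silent input: nothing to scale
    gain = rms(reference) / current
    return [v * gain for v in signal]


def reproduce(source, stem, rendered) -> List[float]:
    """Context bed plus RMS-matched target reconstruction, soft-limited."""
    bed = context_bed(source, stem)
    target = match_rms(rendered, stem)
    # Assumption: tanh as the soft limiter.
    return [math.tanh(b + t) for b, t in zip(bed, target)]


source = [0.5, -0.2, 0.1, 0.4]
stem = [0.3, 0.0, -0.1, 0.2]
rendered = [0.6, 0.0, -0.2, 0.4]  # same shape as the stem, twice as loud
mix = reproduce(source, stem, rendered)
silent_bed = context_bed(source, source)  # the `stem=all` case
```

In this toy case the rendered layer is an exact scaled copy of the stem, so after RMS matching the mix equals the soft-limited source. Real reconstructions differ from the stem, and that difference is what the Reproduced preview lets the user hear in context.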
22
+
23
+ ## Archive behavior
24
+
25
+ `sample-pack.zip` now includes:
26
+
27
+ - `rendered_reproduction_full_mix.wav` — the full-context reproduced mix.
28
+ - `rendered_reconstruction_target_stem.wav` — the sample-only target reconstruction.
29
+ - `rendered_reconstruction.wav` — backwards-compatible alias for the full-context reproduced mix.
30
+ - `reconstruction.mid` — the MIDI trigger reconstruction.
31
+ - `samples/*.wav` — representative samples.
32
+ - `index.json` — sample metadata and reproduction file references.
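A downstream script could verify that an archive contains the members listed above. The sketch below assumes those names sit at the archive root, which the list suggests but does not state; `missing_members` is a hypothetical helper, and the archive is built in memory purely for demonstration.

```python
import io
import zipfile
from typing import List

# Member names taken from the list above; root-level placement is an assumption.
EXPECTED_MEMBERS = (
    "rendered_reproduction_full_mix.wav",
    "rendered_reconstruction_target_stem.wav",
    "rendered_reconstruction.wav",
    "reconstruction.mid",
    "index.json",
)


def missing_members(archive_bytes: bytes) -> List[str]:
    """Return expected entries that are absent from a sample-pack archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        names = set(archive.namelist())
    missing = [name for name in EXPECTED_MEMBERS if name not in names]
    has_sample = any(n.startswith("samples/") and n.endswith(".wav") for n in names)
    if not has_sample:
        missing.append("samples/*.wav")
    return missing


def build_demo_archive(member_names) -> bytes:
    """Create a tiny in-memory ZIP with empty placeholder members."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in member_names:
            archive.writestr(name, b"")
    return buffer.getvalue()


complete = build_demo_archive(EXPECTED_MEMBERS + ("samples/kick_0.wav",))
legacy = build_demo_archive(("rendered_reconstruction.wav", "reconstruction.mid", "index.json"))
complete_missing = missing_members(complete)
legacy_missing = missing_members(legacy)
```

An archive written before this change would be expected to lack the two new reproduction files, which is what the `legacy` example simulates.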
33
+
34
+ ## Supervised edited export behavior
35
+
36
+ Edited exports under `supervised/` follow the same model:
37
+
38
+ - `supervised/target_reconstruction.wav`
39
+ - `supervised/reconstruction.wav`
40
+ - `supervised/sample-pack.zip`
41
+
42
+ The edited full-context reproduction reuses `context_bed.wav`, so semantic edits affect the sampled target layer while the non-target stems remain available for context.
43
+
44
+ ## UI parameter hierarchy
45
+
46
+ The right sidebar now separates controls by frequency of use.
47
+
48
+ ### Common controls
49
+
50
+ These are visible by default:
51
+
52
+ 1. **Stem to sample** — choose the target source, usually `drums` or `all`.
53
+ 2. **Hit sensitivity** — tune onset density.
54
+ 3. **Sample groups** — choose the approximate maximum number of sample cards.
55
+ 4. **Fast preview** preset — full mix + online clustering + no Demucs shifts.
56
+ 5. **Best quality** preset — separated drums + batch clustering + conservative thresholds.
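One way to picture the two presets is as partial overrides applied on top of the current parameters. Every key name and default value below is hypothetical; only the preset descriptions (full mix, online clustering, and no Demucs shifts versus separated drums and batch clustering) come from the list above, and the "conservative thresholds" are left out because their values are not given.

```python
from typing import Any, Dict

# Hypothetical parameter names and defaults, for illustration only.
DEFAULTS: Dict[str, Any] = {
    "stem": "drums",
    "clustering_mode": "batch",
    "demucs_shifts": 1,
    "sensitivity": 0.5,
}

PRESETS: Dict[str, Dict[str, Any]] = {
    # Fast preview: full mix + online clustering + no Demucs shifts.
    "fast_preview": {"stem": "all", "clustering_mode": "online", "demucs_shifts": 0},
    # Best quality: separated drums + batch clustering.
    # The conservative thresholds are omitted; their values are not documented here.
    "best_quality": {"stem": "drums", "clustering_mode": "batch"},
}


def apply_preset(current: Dict[str, Any], name: str) -> Dict[str, Any]:
    """Return a copy of `current` with the preset's overrides applied."""
    merged = dict(current)
    merged.update(PRESETS[name])
    return merged


tuned = dict(DEFAULTS, sensitivity=0.8)  # user already adjusted sensitivity
fast = apply_preset(tuned, "fast_preview")
best = apply_preset(fast, "best_quality")
```

Treating presets as overrides rather than full replacements keeps values the preset does not mention, such as the user's sensitivity setting.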
57
+
58
+ ### Advanced parameters
59
+
60
+ Advanced controls are hidden in a collapsed panel and grouped by stage:
61
+
62
+ - Stem separation: model, shifts, overlap.
63
+ - Hit detection: onset mode, energy floor, gap, padding, min/max duration.
64
+ - Grouping: clustering mode, minimum groups, transient/mel/linkage thresholds.
65
+ - Export and cache: MIDI grid, synthesis, quantization, disk cache, cache clear.
66
+
67
+ ## Preview transport
68
+
69
+ The single transport row now has explicit preview modes:
70
+
71
+ - **Source** — original source mix.
72
+ - **Stem** — target stem being sampled.
73
+ - **Reproduced** — full-context reproduced mix after extraction.
74
+
75
+ After a successful extraction, the transport switches to **Reproduced** by default because that is the most relevant quality check.
76
+
77
+ ## Remaining audio-quality work
78
+
79
+ - Add a true A/B difference view between `source.wav` and `reconstruction.wav`.
80
+ - Compute per-region reconstruction error and expose the worst mismatch regions.
81
+ - Avoid residual subtraction artifacts by optionally caching explicit Demucs non-target stem sums when the full source separation bundle is available.
82
+ - Add loudness matching using LUFS instead of RMS for longer previews.
docs/SUPERVISED_EXPORT_AND_FORCE_ONSET.md CHANGED
@@ -11,6 +11,7 @@ manifest.json + review/hits/*.wav + supervision_state.json
11
  → supervised/manifest.json
12
  → supervised/samples/*.wav
13
  → supervised/reconstruction.mid
 
14
  → supervised/reconstruction.wav
15
  → supervised/sample-pack.zip
16
  ```
@@ -27,7 +28,7 @@ manifest.json + review/hits/*.wav + supervision_state.json
27
  | Suppressed-hit restore | Implemented | `POST /api/jobs/{job_id}/hits/{hit_id}/restore` |
28
  | Exact suggestion diff preview | Implemented | `suggestion.diff` in state responses and UI diff button. |
29
  | UI add-onset mode | Implemented | Toggle in supervision header; waveform clicks add forced hits. |
30
- | UI edited export downloads | Implemented | Edited ZIP/MIDI/reconstruction links render after export. |
31
 
32
  ## Export behavior
33
 
@@ -39,7 +40,7 @@ The supervised export builds clusters from current semantic state:
39
  4. Preserve forced hits and moved/pulled hits through current cluster membership.
40
  5. Pick representatives from semantic `representative_hit_id` or favorite hits first.
41
  6. Quality-score representatives only for unpinned clusters.
42
- 7. Write edited samples, MIDI, reconstruction WAV, ZIP, and `supervised/manifest.json`.
43
  8. Append a `supervised.exported` event and `latest_export` entry to `supervision_state.json`.
44
 
45
  The original `manifest.json`, original `sample-pack.zip`, and original `samples/*.wav` are not modified.
@@ -108,3 +109,14 @@ This test verifies:
108
  - supervised export creation,
109
  - artifact download URLs for edited ZIP/MIDI/reconstruction,
110
  - latest export state metadata.
 
11
  → supervised/manifest.json
12
  → supervised/samples/*.wav
13
  → supervised/reconstruction.mid
14
+ → supervised/target_reconstruction.wav
15
  → supervised/reconstruction.wav
16
  → supervised/sample-pack.zip
17
  ```
 
28
  | Suppressed-hit restore | Implemented | `POST /api/jobs/{job_id}/hits/{hit_id}/restore` |
29
  | Exact suggestion diff preview | Implemented | `suggestion.diff` in state responses and UI diff button. |
30
  | UI add-onset mode | Implemented | Toggle in supervision header; waveform clicks add forced hits. |
31
+ | UI edited export downloads | Implemented | Edited ZIP/MIDI/target-reconstruction/full-context-reproduction links render after export. |
32
 
33
  ## Export behavior
34
 
 
40
  4. Preserve forced hits and moved/pulled hits through current cluster membership.
41
  5. Pick representatives from semantic `representative_hit_id` or favorite hits first.
42
  6. Quality-score representatives only for unpinned clusters.
43
+ 7. Write edited samples, MIDI, target reconstruction WAV, full-context reproduction WAV, ZIP, and `supervised/manifest.json`.
44
  8. Append a `supervised.exported` event and `latest_export` entry to `supervision_state.json`.
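Steps 5 and 6 amount to a priority order when choosing a representative. The sketch below is one reading of that order, not the code in `supervised_export.py`: the `Hit` and `Cluster` classes and their fields are hypothetical, skipping suppressed hits is an assumption, and breaking ties among several favorites by quality score is also an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    hit_id: str
    quality: float
    favorite: bool = False
    suppressed: bool = False


@dataclass
class Cluster:
    cluster_id: str
    hits: List[Hit]
    representative_hit_id: Optional[str] = None  # set when the user pins a hit


def pick_representative(cluster: Cluster) -> Optional[Hit]:
    """Pinned hit first, then favorites, then the best quality score."""
    active = [hit for hit in cluster.hits if not hit.suppressed]
    if not active:
        return None
    by_id = {hit.hit_id: hit for hit in active}
    if cluster.representative_hit_id in by_id:
        return by_id[cluster.representative_hit_id]
    favorites = [hit for hit in active if hit.favorite]
    pool = favorites or active  # quality scoring only decides unpinned clusters
    return max(pool, key=lambda hit: hit.quality)


hits = [
    Hit("hit_00000", quality=0.9),
    Hit("hit_00001", quality=0.4, favorite=True),
    Hit("hit_00002", quality=0.7),
]
pinned = pick_representative(Cluster("kick", hits, representative_hit_id="hit_00002"))
favored = pick_representative(Cluster("kick", hits))
scored = pick_representative(Cluster("kick", [hits[0], hits[2]]))
```

Here the pinned cluster returns `hit_00002`, the unpinned cluster with a favorite returns `hit_00001` despite its lower score, and the cluster with neither falls back to the highest-quality hit.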
45
 
46
  The original `manifest.json`, original `sample-pack.zip`, and original `samples/*.wav` are not modified.
 
109
  - supervised export creation,
110
  - artifact download URLs for edited ZIP/MIDI/reconstruction,
111
  - latest export state metadata.
112
+
113
+
114
+ ## Reproduced-audio update
115
+
116
+ Supervised export now mirrors the batch export audio model:
117
+
118
+ - `supervised/target_reconstruction.wav` is the edited sample-triggered target layer.
119
+ - `supervised/reconstruction.wav` is the edited full-context reproduced mix.
120
+ - The full-context mix reuses the original run’s `context_bed.wav`, then adds the edited target reconstruction.
121
+
122
+ This keeps the original batch artifacts immutable while making edited exports useful for listening inside the whole track context.
docs/TASKS.md CHANGED
@@ -72,7 +72,7 @@ Last updated: 2026-05-12
72
  | Add pull hit into new cluster | Done | `POST /api/jobs/{job_id}/hits/{hit_id}/pull-out`. |
73
  | Add cluster lock/unlock | Done | `POST /api/jobs/{job_id}/clusters/{cluster_id}/lock`. |
74
  | Add suppress hit as bleed/noise | Done | `POST /api/jobs/{job_id}/hits/{hit_id}/suppress`. |
75
- | Add accept/favorite hit action | Done/Partial | `POST /api/jobs/{job_id}/hits/{hit_id}/review`; artifact re-export still open. |
76
  | Add suggestion inbox | Done/Partial | UI/API supports accept/reject; exact diff preview still open. |
77
  | Add cluster explanation drawer | Done | `GET /api/jobs/{job_id}/explain/cluster/{cluster_id}` plus UI drawer. |
78
  | Add semantic undo | Done | `POST /api/jobs/{job_id}/undo`. |
@@ -106,3 +106,24 @@ Last updated: 2026-05-12
106
  | Add explicit upload button | Done | Top bar now has a visible `Upload audio` control. |
107
  | Make whole-app file dropping work | Done | Window-level drag/drop handlers select dropped files and prevent browser navigation. |
108
  | Add drag overlay | Done | `globalDropOverlay` appears while dragging files over the app. |
 
72
  | Add pull hit into new cluster | Done | `POST /api/jobs/{job_id}/hits/{hit_id}/pull-out`. |
73
  | Add cluster lock/unlock | Done | `POST /api/jobs/{job_id}/clusters/{cluster_id}/lock`. |
74
  | Add suppress hit as bleed/noise | Done | `POST /api/jobs/{job_id}/hits/{hit_id}/suppress`. |
75
+ | Add accept/favorite hit action | Done | `POST /api/jobs/{job_id}/hits/{hit_id}/review`; supervised re-export honors pinned/favorite representatives. |
76
  | Add suggestion inbox | Done/Partial | UI/API supports accept/reject; exact diff preview still open. |
77
  | Add cluster explanation drawer | Done | `GET /api/jobs/{job_id}/explain/cluster/{cluster_id}` plus UI drawer. |
78
  | Add semantic undo | Done | `POST /api/jobs/{job_id}/undo`. |
 
106
  | Add explicit upload button | Done | Top bar now has a visible `Upload audio` control. |
107
  | Make whole-app file dropping work | Done | Window-level drag/drop handlers select dropped files and prevent browser navigation. |
108
  | Add drag overlay | Done | `globalDropOverlay` appears while dragging files over the app. |
109
+
110
+ ## Pass 9 tasks: reproduced audio and parameter hierarchy
111
+
112
+ | Task | Status | Notes |
113
+ |---|---:|---|
114
+ | Export normalized source mix | Done | `source.wav` written per run. |
115
+ | Export non-target context bed | Done | `context_bed.wav` is source minus target stem; silent for `stem=all`. |
116
+ | Keep isolated target reconstruction | Done | `target_reconstruction.wav` written per run and per supervised export. |
117
+ | Make reproduced audio incorporate context | Done | `reconstruction.wav` is context bed plus target reconstruction. |
118
+ | Add full-context reproduction to ZIP | Done | `rendered_reproduction_full_mix.wav` plus compatibility alias. |
119
+ | Add target-stem reconstruction to ZIP | Done | `rendered_reconstruction_target_stem.wav`. |
120
+ | Update supervised export audio model | Done | Edited export writes full-context and target-only previews. |
121
+ | Add Source/Stem/Reproduced transport modes | Done | Transport buttons added in `web/index.html` and wired in `web/app.js`. |
122
+ | Separate common controls from advanced parameters | Done | Common controls: stem, sensitivity, sample groups, presets. |
123
+ | Group advanced parameters by pipeline stage | Done | Stem separation, hit detection, grouping, export/cache. |
124
+
125
+ Remaining follow-up tasks:
126
+
127
+ - [ ] Add source-vs-reproduced waveform/error comparison.
128
+ - [ ] Add LUFS loudness matching for long previews.
129
+ - [ ] Optionally cache explicit Demucs non-target stem sums instead of residual subtraction.
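The layer model in the Pass 9 table comes down to two sample-wise operations: the context bed is the source minus the target stem, and the reproduced mix is the context bed plus the target reconstruction. The sketch below illustrates that arithmetic in plain Python, assuming equal-length mono float buffers. The function names `context_bed` and `reproduced_mix` are hypothetical and are not taken from `sample_extractor.py`; the real pipeline may resample, normalize, or handle channels differently.

```python
from typing import List, Sequence


def context_bed(source: Sequence[float], target_stem: Sequence[float]) -> List[float]:
    """Residual bed: source minus target stem, sample by sample."""
    if len(source) != len(target_stem):
        raise ValueError("source and target stem must have the same length")
    return [s - t for s, t in zip(source, target_stem)]


def reproduced_mix(bed: Sequence[float], target_reconstruction: Sequence[float]) -> List[float]:
    """Full-context reproduction: context bed plus target reconstruction."""
    if len(bed) != len(target_reconstruction):
        raise ValueError("bed and target reconstruction must have the same length")
    return [b + r for b, r in zip(bed, target_reconstruction)]


# Hypothetical three-sample buffers, chosen only to make the arithmetic visible.
source = [0.5, -0.25, 0.75]
stem = [0.25, -0.25, 0.5]
bed = context_bed(source, stem)

# With stem=all the target stem is the full mix, so the bed is silent.
silent_bed = context_bed(source, source)
```

If the target reconstruction matched the stem exactly, `reproduced_mix(bed, stem)` would give back the source. The difference between the two is the kind of error the open "source-vs-reproduced comparison" follow-up would visualize.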
docs/UI_REPLACEMENT.md CHANGED
@@ -33,8 +33,8 @@ The UI was first restyled to the supplied minimal reference direction:
  This pass closed the visual fidelity gaps from the previous approximation:
 
  - removed the visible waveform header so the canvas is quiet like the reference image;
- - replaced separate native stem/reconstruction audio controls with one minimal transport row: play button, time, and progress line;
- - moved fast-mode buttons into `Advanced` so the right card now exposes only `Stem`, `Sensitivity`, `Cluster Count`, and `Export Samples`;
+ - replaced separate native stem/reconstruction audio controls with one minimal transport row: play button, time, progress line, and Source/Stem/Reproduced preview modes;
+ - renamed the right card to `Common controls` and limited it to stem, hit sensitivity, sample groups, plus fast-preview/best-quality presets;
  - collapsed pipeline/history/supervision/tables into one `Review & edit` workbench below the sample cards;
  - hid selected-hit/sample audio elements from the default layout while preserving click-to-audition behavior;
  - tightened card spacing, border radii, font scale, waveform height, and sample-card proportions to better match the supplied image.
@@ -61,8 +61,8 @@ This pass responds to the no-scroll workstation requirement and the missing-uplo
  |---|---|
  | Top bar | App identity, explicit upload button, selected-file metadata, backend status, and one primary purple `Extract Samples` action. |
  | Left sidebar | Source/drop guidance, selected-hit/sample context, pipeline logs, and run history. |
- | Center workspace | Quiet waveform canvas, one custom transport row, and representative sample cards. |
- | Right sidebar | Stem, sensitivity, cluster count, exports, and collapsed advanced DSP/model controls. |
+ | Center workspace | Quiet waveform canvas, Source/Stem/Reproduced transport row, and representative sample cards. |
+ | Right sidebar | Common controls, exports, and collapsed advanced parameters grouped by stem separation, hit detection, grouping, export, and cache. |
  | Bottom dock | Review/edit semantic supervision tools and raw tables in expandable panels. |
 
  ## Frontend implementation
@@ -89,8 +89,11 @@ The frontend creates a job with `POST /api/jobs`, then polls `GET /api/jobs/{id}
 
  - sample pack ZIP
  - MIDI reconstruction
- - stem WAV
- - reconstruction WAV
+ - source mix WAV
+ - target stem WAV
+ - non-target context bed WAV
+ - target reconstruction WAV
+ - full-context reproduced mix WAV
  - individual sample WAVs
 
  The run history panel calls `GET /api/jobs` and can reload any completed manifest still present under `.runs/`.
@@ -128,3 +131,17 @@ The current UI now includes:
  - Server-sent-events job progress via `GET /api/jobs/{job_id}/events`, with polling fallback.
 
  This still stops short of destructive editing. The next UI layer should store edits as manifest overlays, then call a re-export endpoint that reuses cached hit audio instead of rerunning Demucs/onset detection.
+
+
+ ## Pass 9 reproduced audio and parameter hierarchy
+
+ This pass made the audio preview and control model more explicit:
+
+ - `reconstruction.wav` is now a full-context reproduced mix, not just the sample-triggered target layer.
+ - `target_reconstruction.wav` preserves the sample-only target reconstruction for focused inspection.
+ - `source.wav`, `stem.wav`, and `context_bed.wav` are exported as explicit layers.
+ - The transport has Source, Stem, and Reproduced preview buttons and switches to Reproduced after extraction.
+ - The right sidebar now separates Common controls from Advanced parameters.
+ - Advanced parameters are grouped by pipeline stage: stem separation, hit detection, grouping, export/cache.
+
+ See `docs/REPRODUCED_AUDIO_AND_PARAMETERS.md`.
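The three transport modes described in the Pass 9 notes each need to resolve to one of the exported layers. The actual wiring lives in `web/app.js` and is not shown in this diff, so the following is only a Python sketch of the lookup. The file names come from the notes above; the mode-to-file mapping, the `PREVIEW_FILES` table, and the `preview_path` helper are assumptions made for illustration.

```python
from pathlib import Path

# File names are taken from the Pass 9 notes; pairing each transport mode
# with exactly one file is an assumption, not confirmed by the diff.
PREVIEW_FILES = {
    "source": "source.wav",
    "stem": "stem.wav",
    "reproduced": "reconstruction.wav",
}

# The notes say the transport switches to Reproduced after extraction.
MODE_AFTER_EXTRACTION = "reproduced"


def preview_path(output_dir: Path, mode: str) -> Path:
    """Resolve a transport mode name to the WAV file it would audition."""
    key = mode.strip().lower()
    if key not in PREVIEW_FILES:
        raise ValueError(f"unknown preview mode: {mode!r}")
    return Path(output_dir) / PREVIEW_FILES[key]


# Example with a made-up job id under the documented run layout.
example = preview_path(Path(".runs/demo-job/output"), MODE_AFTER_EXTRACTION)
```

A table like this keeps the button labels separate from the artifact names, so renaming a file would only touch one place.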
docs/benchmark-online-preview.json CHANGED
@@ -8,66 +8,66 @@
8
  "run_index": 0,
9
  "clustering_mode": "online_preview",
10
  "audio_duration_sec": 4.75,
11
- "total_duration_sec": 1.88646,
12
- "realtime_factor": 0.397149,
13
- "hit_count": 13,
14
- "cluster_count": 10,
15
  "stages": [
16
  {
17
  "key": "stem",
18
  "label": "Stem extraction / source load",
19
- "duration_sec": 0.011189419999936945,
20
  "status": "done",
21
- "detail": "loaded full mix \u00b7 cached"
22
  },
23
  {
24
  "key": "bpm",
25
  "label": "Tempo detection",
26
- "duration_sec": 0.09853705299974536,
27
  "status": "done",
28
  "detail": "120.2 BPM"
29
  },
30
  {
31
  "key": "onsets",
32
  "label": "Onset detection + slicing",
33
- "duration_sec": 1.3858792310002173,
34
  "status": "done",
35
- "detail": "13 hits"
36
  },
37
  {
38
  "key": "classification",
39
  "label": "Spectral rule classification",
40
- "duration_sec": 0.014456886000061786,
41
  "status": "done",
42
- "detail": "bright:5, hihat_open:7, kick:1"
43
  },
44
  {
45
  "key": "clustering",
46
  "label": "Mel fingerprint + transient NCC clustering",
47
- "duration_sec": 0.016802669999833597,
48
  "status": "done",
49
- "detail": "10 clusters \u00b7 online preview"
50
  },
51
  {
52
  "key": "selection",
53
  "label": "Best representative scoring",
54
- "duration_sec": 0.07535981499995614,
55
  "status": "done",
56
  "detail": "quality-scored representatives"
57
  },
58
  {
59
  "key": "synthesis",
60
  "label": "Optional sample synthesis",
61
- "duration_sec": 0.00036268399981054245,
62
  "status": "done",
63
  "detail": "2 synthesized alternates"
64
  },
65
  {
66
  "key": "export",
67
- "label": "MIDI, reconstruction, WAV, ZIP export",
68
- "duration_sec": 0.28339249200007544,
69
  "status": "done",
70
- "detail": "10 samples + 13 review hits + MIDI + ZIP"
71
  }
72
  ]
73
  },
@@ -78,66 +78,66 @@
78
  "run_index": 0,
79
  "clustering_mode": "online_preview",
80
  "audio_duration_sec": 4.874989,
81
- "total_duration_sec": 2.914241,
82
- "realtime_factor": 0.597794,
83
- "hit_count": 28,
84
  "cluster_count": 12,
85
  "stages": [
86
  {
87
  "key": "stem",
88
  "label": "Stem extraction / source load",
89
- "duration_sec": 0.00999813099997482,
90
  "status": "done",
91
- "detail": "loaded full mix \u00b7 cached"
92
  },
93
  {
94
  "key": "bpm",
95
  "label": "Tempo detection",
96
- "duration_sec": 0.10688103099982982,
97
  "status": "done",
98
  "detail": "161.5 BPM"
99
  },
100
  {
101
  "key": "onsets",
102
  "label": "Onset detection + slicing",
103
- "duration_sec": 2.1018096600000717,
104
  "status": "done",
105
- "detail": "28 hits"
106
  },
107
  {
108
  "key": "classification",
109
  "label": "Spectral rule classification",
110
- "duration_sec": 0.09064649800029656,
111
  "status": "done",
112
- "detail": "bright:12, cymbal:1, hihat_closed:9, hihat_open:3, mid:3"
113
  },
114
  {
115
  "key": "clustering",
116
  "label": "Mel fingerprint + transient NCC clustering",
117
- "duration_sec": 0.049414074000196706,
118
  "status": "done",
119
  "detail": "12 clusters \u00b7 online preview"
120
  },
121
  {
122
  "key": "selection",
123
  "label": "Best representative scoring",
124
- "duration_sec": 0.23301379500026087,
125
  "status": "done",
126
  "detail": "quality-scored representatives"
127
  },
128
  {
129
  "key": "synthesis",
130
  "label": "Optional sample synthesis",
131
- "duration_sec": 0.0012726520003525366,
132
  "status": "done",
133
  "detail": "5 synthesized alternates"
134
  },
135
  {
136
  "key": "export",
137
- "label": "MIDI, reconstruction, WAV, ZIP export",
138
- "duration_sec": 0.32063418000007005,
139
  "status": "done",
140
- "detail": "12 samples + 28 review hits + MIDI + ZIP"
141
  }
142
  ]
143
  },
@@ -148,66 +148,66 @@
148
  "run_index": 0,
149
  "clustering_mode": "online_preview",
150
  "audio_duration_sec": 4.874989,
151
- "total_duration_sec": 2.480844,
152
- "realtime_factor": 0.508892,
153
  "hit_count": 29,
154
  "cluster_count": 12,
155
  "stages": [
156
  {
157
  "key": "stem",
158
  "label": "Stem extraction / source load",
159
- "duration_sec": 0.010305768999842257,
160
  "status": "done",
161
- "detail": "loaded full mix \u00b7 cached"
162
  },
163
  {
164
  "key": "bpm",
165
  "label": "Tempo detection",
166
- "duration_sec": 0.1724793140001566,
167
  "status": "done",
168
  "detail": "120.2 BPM"
169
  },
170
  {
171
  "key": "onsets",
172
  "label": "Onset detection + slicing",
173
- "duration_sec": 1.8014776340000935,
174
  "status": "done",
175
  "detail": "29 hits"
176
  },
177
  {
178
  "key": "classification",
179
  "label": "Spectral rule classification",
180
- "duration_sec": 0.017559420999987196,
181
  "status": "done",
182
- "detail": "bright:5, cymbal:1, hihat_closed:20, hihat_open:3"
183
  },
184
  {
185
  "key": "clustering",
186
  "label": "Mel fingerprint + transient NCC clustering",
187
- "duration_sec": 0.043723993000185146,
188
  "status": "done",
189
  "detail": "12 clusters \u00b7 online preview"
190
  },
191
  {
192
  "key": "selection",
193
  "label": "Best representative scoring",
194
- "duration_sec": 0.16425892699999167,
195
  "status": "done",
196
  "detail": "quality-scored representatives"
197
  },
198
  {
199
  "key": "synthesis",
200
  "label": "Optional sample synthesis",
201
- "duration_sec": 0.0012976000002709043,
202
  "status": "done",
203
- "detail": "8 synthesized alternates"
204
  },
205
  {
206
  "key": "export",
207
- "label": "MIDI, reconstruction, WAV, ZIP export",
208
- "duration_sec": 0.2692134119997718,
209
  "status": "done",
210
- "detail": "12 samples + 29 review hits + MIDI + ZIP"
211
  }
212
  ]
213
  }
@@ -215,59 +215,59 @@
215
  "summary": [
216
  {
217
  "stage": "stem",
218
- "mean_sec": 0.010498,
219
- "median_sec": 0.010306,
220
- "min_sec": 0.009998,
221
- "max_sec": 0.011189
222
  },
223
  {
224
  "stage": "bpm",
225
- "mean_sec": 0.125966,
226
- "median_sec": 0.106881,
227
- "min_sec": 0.098537,
228
- "max_sec": 0.172479
229
  },
230
  {
231
  "stage": "onsets",
232
- "mean_sec": 1.763056,
233
- "median_sec": 1.801478,
234
- "min_sec": 1.385879,
235
- "max_sec": 2.10181
236
  },
237
  {
238
  "stage": "classification",
239
- "mean_sec": 0.040888,
240
- "median_sec": 0.017559,
241
- "min_sec": 0.014457,
242
- "max_sec": 0.090646
243
  },
244
  {
245
  "stage": "clustering",
246
- "mean_sec": 0.036647,
247
- "median_sec": 0.043724,
248
- "min_sec": 0.016803,
249
- "max_sec": 0.049414
250
  },
251
  {
252
  "stage": "selection",
253
- "mean_sec": 0.157544,
254
- "median_sec": 0.164259,
255
- "min_sec": 0.07536,
256
- "max_sec": 0.233014
257
  },
258
  {
259
  "stage": "synthesis",
260
- "mean_sec": 0.000978,
261
- "median_sec": 0.001273,
262
- "min_sec": 0.000363,
263
- "max_sec": 0.001298
264
  },
265
  {
266
  "stage": "export",
267
- "mean_sec": 0.29108,
268
- "median_sec": 0.283392,
269
- "min_sec": 0.269213,
270
- "max_sec": 0.320634
271
  }
272
  ]
273
  }
 
8
  "run_index": 0,
9
  "clustering_mode": "online_preview",
10
  "audio_duration_sec": 4.75,
11
+ "total_duration_sec": 2.619948,
12
+ "realtime_factor": 0.551568,
13
+ "hit_count": 14,
14
+ "cluster_count": 11,
15
  "stages": [
16
  {
17
  "key": "stem",
18
  "label": "Stem extraction / source load",
19
+ "duration_sec": 0.025709866000397597,
20
  "status": "done",
21
+ "detail": "loaded full mix \u00b7 cached \u00b7 reproduction uses full mix"
22
  },
23
  {
24
  "key": "bpm",
25
  "label": "Tempo detection",
26
+ "duration_sec": 0.17244149500038475,
27
  "status": "done",
28
  "detail": "120.2 BPM"
29
  },
30
  {
31
  "key": "onsets",
32
  "label": "Onset detection + slicing",
33
+ "duration_sec": 2.01123834900136,
34
  "status": "done",
35
+ "detail": "14 hits"
36
  },
37
  {
38
  "key": "classification",
39
  "label": "Spectral rule classification",
40
+ "duration_sec": 0.015600401000483544,
41
  "status": "done",
42
+ "detail": "bright:4, hihat_closed:1, hihat_open:7, kick:1, mid:1"
43
  },
44
  {
45
  "key": "clustering",
46
  "label": "Mel fingerprint + transient NCC clustering",
47
+ "duration_sec": 0.08166359800088685,
48
  "status": "done",
49
+ "detail": "11 clusters \u00b7 online preview"
50
  },
51
  {
52
  "key": "selection",
53
  "label": "Best representative scoring",
54
+ "duration_sec": 0.02976710500115587,
55
  "status": "done",
56
  "detail": "quality-scored representatives"
57
  },
58
  {
59
  "key": "synthesis",
60
  "label": "Optional sample synthesis",
61
+ "duration_sec": 0.0003646059994935058,
62
  "status": "done",
63
  "detail": "2 synthesized alternates"
64
  },
65
  {
66
  "key": "export",
67
+ "label": "MIDI, reproduced mix, WAV, ZIP export",
68
+ "duration_sec": 0.28268050299993774,
69
  "status": "done",
70
+ "detail": "11 samples + 14 review hits + MIDI + full-context reproduction + ZIP"
71
  }
72
  ]
73
  },
 
78
  "run_index": 0,
79
  "clustering_mode": "online_preview",
80
  "audio_duration_sec": 4.874989,
81
+ "total_duration_sec": 3.096466,
82
+ "realtime_factor": 0.635174,
83
+ "hit_count": 29,
84
  "cluster_count": 12,
85
  "stages": [
86
  {
87
  "key": "stem",
88
  "label": "Stem extraction / source load",
89
+ "duration_sec": 0.02813513600085571,
90
  "status": "done",
91
+ "detail": "loaded full mix \u00b7 cached \u00b7 reproduction uses full mix"
92
  },
93
  {
94
  "key": "bpm",
95
  "label": "Tempo detection",
96
+ "duration_sec": 0.0898798819998774,
97
  "status": "done",
98
  "detail": "161.5 BPM"
99
  },
100
  {
101
  "key": "onsets",
102
  "label": "Onset detection + slicing",
103
+ "duration_sec": 2.39328171499983,
104
  "status": "done",
105
+ "detail": "29 hits"
106
  },
107
  {
108
  "key": "classification",
109
  "label": "Spectral rule classification",
110
+ "duration_sec": 0.01869549100047152,
111
  "status": "done",
112
+ "detail": "bright:12, cymbal:1, hihat_closed:9, hihat_open:3, mid:4"
113
  },
114
  {
115
  "key": "clustering",
116
  "label": "Mel fingerprint + transient NCC clustering",
117
+ "duration_sec": 0.03839712700028031,
118
  "status": "done",
119
  "detail": "12 clusters \u00b7 online preview"
120
  },
121
  {
122
  "key": "selection",
123
  "label": "Best representative scoring",
124
+ "duration_sec": 0.2286190050017467,
125
  "status": "done",
126
  "detail": "quality-scored representatives"
127
  },
128
  {
129
  "key": "synthesis",
130
  "label": "Optional sample synthesis",
131
+ "duration_sec": 0.0011234290013817372,
132
  "status": "done",
133
  "detail": "5 synthesized alternates"
134
  },
135
  {
136
  "key": "export",
137
+ "label": "MIDI, reproduced mix, WAV, ZIP export",
138
+ "duration_sec": 0.29779865499949665,
139
  "status": "done",
140
+ "detail": "12 samples + 29 review hits + MIDI + full-context reproduction + ZIP"
141
  }
142
  ]
143
  },
 
148
  "run_index": 0,
149
  "clustering_mode": "online_preview",
150
  "audio_duration_sec": 4.874989,
151
+ "total_duration_sec": 2.58942,
152
+ "realtime_factor": 0.531164,
153
  "hit_count": 29,
154
  "cluster_count": 12,
155
  "stages": [
156
  {
157
  "key": "stem",
158
  "label": "Stem extraction / source load",
159
+ "duration_sec": 0.0707627699994191,
160
  "status": "done",
161
+ "detail": "loaded full mix \u00b7 cached \u00b7 reproduction uses full mix"
162
  },
163
  {
164
  "key": "bpm",
165
  "label": "Tempo detection",
166
+ "duration_sec": 0.1129706889987574,
167
  "status": "done",
168
  "detail": "120.2 BPM"
169
  },
170
  {
171
  "key": "onsets",
172
  "label": "Onset detection + slicing",
173
+ "duration_sec": 1.902288953999232,
174
  "status": "done",
175
  "detail": "29 hits"
176
  },
177
  {
178
  "key": "classification",
179
  "label": "Spectral rule classification",
180
+ "duration_sec": 0.018896421999670565,
181
  "status": "done",
182
+ "detail": "bright:4, hihat_closed:23, hihat_open:2"
183
  },
184
  {
185
  "key": "clustering",
186
  "label": "Mel fingerprint + transient NCC clustering",
187
+ "duration_sec": 0.10450396400119644,
188
  "status": "done",
189
  "detail": "12 clusters \u00b7 online preview"
190
  },
191
  {
192
  "key": "selection",
193
  "label": "Best representative scoring",
194
+ "duration_sec": 0.11157589499998721,
195
  "status": "done",
196
  "detail": "quality-scored representatives"
197
  },
198
  {
199
  "key": "synthesis",
200
  "label": "Optional sample synthesis",
201
+ "duration_sec": 0.0010697859997890191,
202
  "status": "done",
203
+ "detail": "6 synthesized alternates"
204
  },
205
  {
206
  "key": "export",
207
+ "label": "MIDI, reproduced mix, WAV, ZIP export",
208
+ "duration_sec": 0.2668189519990847,
209
  "status": "done",
210
+ "detail": "12 samples + 29 review hits + MIDI + full-context reproduction + ZIP"
211
  }
212
  ]
213
  }
 
215
  "summary": [
216
  {
217
  "stage": "stem",
218
+ "mean_sec": 0.041536,
219
+ "median_sec": 0.028135,
220
+ "min_sec": 0.02571,
221
+ "max_sec": 0.070763
222
  },
223
  {
224
  "stage": "bpm",
225
+ "mean_sec": 0.125097,
226
+ "median_sec": 0.112971,
227
+ "min_sec": 0.08988,
228
+ "max_sec": 0.172441
229
  },
230
  {
231
  "stage": "onsets",
232
+ "mean_sec": 2.10227,
233
+ "median_sec": 2.011238,
234
+ "min_sec": 1.902289,
235
+ "max_sec": 2.393282
236
  },
237
  {
238
  "stage": "classification",
239
+ "mean_sec": 0.017731,
240
+ "median_sec": 0.018695,
241
+ "min_sec": 0.0156,
242
+ "max_sec": 0.018896
243
  },
244
  {
245
  "stage": "clustering",
246
+ "mean_sec": 0.074855,
247
+ "median_sec": 0.081664,
248
+ "min_sec": 0.038397,
249
+ "max_sec": 0.104504
250
  },
251
  {
252
  "stage": "selection",
253
+ "mean_sec": 0.123321,
254
+ "median_sec": 0.111576,
255
+ "min_sec": 0.029767,
256
+ "max_sec": 0.228619
257
  },
258
  {
259
  "stage": "synthesis",
260
+ "mean_sec": 0.000853,
261
+ "median_sec": 0.00107,
262
+ "min_sec": 0.000365,
263
+ "max_sec": 0.001123
264
  },
265
  {
266
  "stage": "export",
267
+ "mean_sec": 0.282433,
268
+ "median_sec": 0.282681,
269
+ "min_sec": 0.266819,
270
+ "max_sec": 0.297799
271
  }
272
  ]
273
  }
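The derived fields in this benchmark file can be cross-checked from the raw numbers. The figures are consistent with `realtime_factor` being `total_duration_sec / audio_duration_sec`, and with each `summary` entry being the mean, median, min, and max of that stage's `duration_sec` across the three runs, rounded to six decimals. The sketch below recomputes both using only the standard library. The helper names are hypothetical, and the rounding rule is inferred from the values rather than read from `scripts/benchmark_subprocesses.py`.

```python
from statistics import mean, median
from typing import Dict, List, Union


def realtime_factor(total_duration_sec: float, audio_duration_sec: float) -> float:
    """Processing time per second of audio; below 1.0 is faster than realtime."""
    return round(total_duration_sec / audio_duration_sec, 6)


def summarize_stage(stage: str, durations: List[float]) -> Dict[str, Union[str, float]]:
    """Aggregate one stage's per-run durations the way the summary block appears to."""
    return {
        "stage": stage,
        "mean_sec": round(mean(durations), 6),
        "median_sec": round(median(durations), 6),
        "min_sec": round(min(durations), 6),
        "max_sec": round(max(durations), 6),
    }


# New-side values for the first run and for the "stem" stage of all three runs.
first_run_rtf = realtime_factor(2.619948, 4.75)
stem_summary = summarize_stage(
    "stem",
    [0.025709866000397597, 0.02813513600085571, 0.0707627699994191],
)
```

Both results match the new-side values recorded in the diff (`0.551568` for the first run, and the `stem` summary entry).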
docs/benchmark-subprocesses.json CHANGED
@@ -8,66 +8,66 @@
8
  "run_index": 0,
9
  "clustering_mode": "batch_quality",
10
  "audio_duration_sec": 4.75,
11
- "total_duration_sec": 2.508936,
12
- "realtime_factor": 0.528197,
13
- "hit_count": 13,
14
  "cluster_count": 7,
15
  "stages": [
16
  {
17
  "key": "stem",
18
  "label": "Stem extraction / source load",
19
- "duration_sec": 0.010515291000047,
20
  "status": "done",
21
- "detail": "loaded full mix \u00b7 cached"
22
  },
23
  {
24
  "key": "bpm",
25
  "label": "Tempo detection",
26
- "duration_sec": 0.11277726900016205,
27
  "status": "done",
28
  "detail": "120.2 BPM"
29
  },
30
  {
31
  "key": "onsets",
32
  "label": "Onset detection + slicing",
33
- "duration_sec": 1.9893157869996685,
34
  "status": "done",
35
- "detail": "13 hits"
36
  },
37
  {
38
  "key": "classification",
39
  "label": "Spectral rule classification",
40
- "duration_sec": 0.013427571999727661,
41
  "status": "done",
42
- "detail": "bright:5, hihat_closed:1, hihat_open:6, kick:1"
43
  },
44
  {
45
  "key": "clustering",
46
  "label": "Mel fingerprint + transient NCC clustering",
47
- "duration_sec": 0.013959215999875596,
48
  "status": "done",
49
  "detail": "7 clusters \u00b7 batch quality"
50
  },
51
  {
52
  "key": "selection",
53
  "label": "Best representative scoring",
54
- "duration_sec": 0.09699052199994185,
55
  "status": "done",
56
  "detail": "quality-scored representatives"
57
  },
58
  {
59
  "key": "synthesis",
60
  "label": "Optional sample synthesis",
61
- "duration_sec": 0.000661541999761539,
62
  "status": "done",
63
  "detail": "2 synthesized alternates"
64
  },
65
  {
66
  "key": "export",
67
- "label": "MIDI, reconstruction, WAV, ZIP export",
68
- "duration_sec": 0.2707521170000291,
69
  "status": "done",
70
- "detail": "7 samples + 13 review hits + MIDI + ZIP"
71
  }
72
  ]
73
  },
@@ -78,66 +78,66 @@
78
  "run_index": 0,
79
  "clustering_mode": "batch_quality",
80
  "audio_duration_sec": 4.874989,
81
- "total_duration_sec": 2.562433,
82
- "realtime_factor": 0.525628,
83
- "hit_count": 30,
84
- "cluster_count": 1,
85
  "stages": [
86
  {
87
  "key": "stem",
88
  "label": "Stem extraction / source load",
89
- "duration_sec": 0.009733310000228812,
90
  "status": "done",
91
- "detail": "loaded full mix \u00b7 cached"
92
  },
93
  {
94
  "key": "bpm",
95
  "label": "Tempo detection",
96
- "duration_sec": 0.18278188500016768,
97
  "status": "done",
98
  "detail": "161.5 BPM"
99
  },
100
  {
101
  "key": "onsets",
102
  "label": "Onset detection + slicing",
103
- "duration_sec": 1.8905766069997298,
104
  "status": "done",
105
- "detail": "30 hits"
106
  },
107
  {
108
  "key": "classification",
109
  "label": "Spectral rule classification",
110
- "duration_sec": 0.016936135000378272,
111
  "status": "done",
112
- "detail": "bright:15, cymbal:1, hihat_closed:10, hihat_open:3, mid:1"
113
  },
114
  {
115
  "key": "clustering",
116
  "label": "Mel fingerprint + transient NCC clustering",
117
- "duration_sec": 0.09508980800001154,
118
  "status": "done",
119
- "detail": "1 clusters \u00b7 batch quality"
120
  },
121
  {
122
  "key": "selection",
123
  "label": "Best representative scoring",
124
- "duration_sec": 0.271814092999648,
125
  "status": "done",
126
  "detail": "quality-scored representatives"
127
  },
128
  {
129
  "key": "synthesis",
130
  "label": "Optional sample synthesis",
131
- "duration_sec": 0.0009019099998113234,
132
  "status": "done",
133
- "detail": "1 synthesized alternates"
134
  },
135
  {
136
  "key": "export",
137
- "label": "MIDI, reconstruction, WAV, ZIP export",
138
- "duration_sec": 0.09411494899995887,
139
  "status": "done",
140
- "detail": "1 samples + 30 review hits + MIDI + ZIP"
141
  }
142
  ]
143
  },
@@ -148,66 +148,66 @@
148
  "run_index": 0,
149
  "clustering_mode": "batch_quality",
150
  "audio_duration_sec": 4.874989,
151
- "total_duration_sec": 2.587342,
152
- "realtime_factor": 0.530738,
153
- "hit_count": 20,
154
- "cluster_count": 4,
155
  "stages": [
156
  {
157
  "key": "stem",
158
  "label": "Stem extraction / source load",
159
- "duration_sec": 0.008843839000292064,
160
  "status": "done",
161
- "detail": "loaded full mix \u00b7 cached"
162
  },
163
  {
164
  "key": "bpm",
165
  "label": "Tempo detection",
166
- "duration_sec": 0.16997624899977382,
167
  "status": "done",
168
  "detail": "120.2 BPM"
169
  },
170
  {
171
  "key": "onsets",
172
  "label": "Onset detection + slicing",
173
- "duration_sec": 2.0115367889998197,
174
  "status": "done",
175
- "detail": "20 hits"
176
  },
177
  {
178
  "key": "classification",
179
  "label": "Spectral rule classification",
180
- "duration_sec": 0.0954397410000638,
181
  "status": "done",
182
- "detail": "bright:3, hihat_closed:14, hihat_open:3"
183
  },
184
  {
185
  "key": "clustering",
186
  "label": "Mel fingerprint + transient NCC clustering",
187
- "duration_sec": 0.02929340799983038,
188
  "status": "done",
189
- "detail": "4 clusters \u00b7 batch quality"
190
  },
191
  {
192
  "key": "selection",
193
  "label": "Best representative scoring",
194
- "duration_sec": 0.1620299520000117,
195
  "status": "done",
196
  "detail": "quality-scored representatives"
197
  },
198
  {
199
  "key": "synthesis",
200
  "label": "Optional sample synthesis",
201
- "duration_sec": 0.0010316440002497984,
202
  "status": "done",
203
- "detail": "2 synthesized alternates"
204
  },
205
  {
206
  "key": "export",
207
- "label": "MIDI, reconstruction, WAV, ZIP export",
208
- "duration_sec": 0.108677784000065,
209
  "status": "done",
210
- "detail": "4 samples + 20 review hits + MIDI + ZIP"
211
  }
212
  ]
213
  }
@@ -215,59 +215,59 @@
215
  "summary": [
216
  {
217
  "stage": "stem",
218
- "mean_sec": 0.009697,
219
- "median_sec": 0.009733,
220
- "min_sec": 0.008844,
221
- "max_sec": 0.010515
222
  },
223
  {
224
  "stage": "bpm",
225
- "mean_sec": 0.155178,
226
- "median_sec": 0.169976,
227
- "min_sec": 0.112777,
228
- "max_sec": 0.182782
229
  },
230
  {
231
  "stage": "onsets",
232
- "mean_sec": 1.96381,
233
- "median_sec": 1.989316,
234
- "min_sec": 1.890577,
235
- "max_sec": 2.011537
236
  },
237
  {
238
  "stage": "classification",
239
- "mean_sec": 0.041934,
240
- "median_sec": 0.016936,
241
- "min_sec": 0.013428,
242
- "max_sec": 0.09544
243
  },
244
  {
245
  "stage": "clustering",
246
- "mean_sec": 0.046114,
247
- "median_sec": 0.029293,
248
- "min_sec": 0.013959,
249
- "max_sec": 0.09509
250
  },
251
  {
252
  "stage": "selection",
253
- "mean_sec": 0.176945,
254
- "median_sec": 0.16203,
255
- "min_sec": 0.096991,
256
- "max_sec": 0.271814
257
  },
258
  {
259
  "stage": "synthesis",
260
- "mean_sec": 0.000865,
261
- "median_sec": 0.000902,
262
- "min_sec": 0.000662,
263
- "max_sec": 0.001032
264
  },
265
  {
266
  "stage": "export",
267
- "mean_sec": 0.157848,
268
- "median_sec": 0.108678,
269
- "min_sec": 0.094115,
270
- "max_sec": 0.270752
271
  }
272
  ]
273
  }
 
8
  "run_index": 0,
9
  "clustering_mode": "batch_quality",
10
  "audio_duration_sec": 4.75,
11
+ "total_duration_sec": 2.994395,
12
+ "realtime_factor": 0.630399,
13
+ "hit_count": 12,
14
  "cluster_count": 7,
15
  "stages": [
16
  {
17
  "key": "stem",
18
  "label": "Stem extraction / source load",
19
+ "duration_sec": 0.02592192699921725,
20
  "status": "done",
21
+ "detail": "loaded full mix \u00b7 cached \u00b7 reproduction uses full mix"
22
  },
23
  {
24
  "key": "bpm",
25
  "label": "Tempo detection",
26
+ "duration_sec": 0.1751658609991864,
27
  "status": "done",
28
  "detail": "120.2 BPM"
29
  },
30
  {
31
  "key": "onsets",
32
  "label": "Onset detection + slicing",
33
+ "duration_sec": 2.1905335589999595,
34
  "status": "done",
35
+ "detail": "12 hits"
36
  },
37
  {
38
  "key": "classification",
39
  "label": "Spectral rule classification",
40
+ "duration_sec": 0.09557517999928677,
41
  "status": "done",
42
+ "detail": "bright:5, hihat_open:6, kick:1"
43
  },
44
  {
45
  "key": "clustering",
46
  "label": "Mel fingerprint + transient NCC clustering",
47
+ "duration_sec": 0.014000580998981604,
48
  "status": "done",
49
  "detail": "7 clusters \u00b7 batch quality"
50
  },
51
  {
52
  "key": "selection",
53
  "label": "Best representative scoring",
54
+ "duration_sec": 0.08321280500058492,
55
  "status": "done",
56
  "detail": "quality-scored representatives"
57
  },
58
  {
59
  "key": "synthesis",
60
  "label": "Optional sample synthesis",
61
+ "duration_sec": 0.0006027010003890609,
62
  "status": "done",
63
  "detail": "2 synthesized alternates"
64
  },
65
  {
66
  "key": "export",
67
+ "label": "MIDI, reproduced mix, WAV, ZIP export",
68
+ "duration_sec": 0.40873212500082445,
69
  "status": "done",
70
+ "detail": "7 samples + 12 review hits + MIDI + full-context reproduction + ZIP"
71
  }
72
  ]
73
  },
 
78
  "run_index": 0,
79
  "clustering_mode": "batch_quality",
80
  "audio_duration_sec": 4.874989,
81
+ "total_duration_sec": 2.802354,
82
+ "realtime_factor": 0.574843,
83
+ "hit_count": 23,
84
+ "cluster_count": 2,
85
  "stages": [
86
  {
87
  "key": "stem",
88
  "label": "Stem extraction / source load",
89
+ "duration_sec": 0.02253025699974387,
90
  "status": "done",
91
+ "detail": "loaded full mix \u00b7 cached \u00b7 reproduction uses full mix"
92
  },
93
  {
94
  "key": "bpm",
95
  "label": "Tempo detection",
96
+ "duration_sec": 0.09380031600085204,
97
  "status": "done",
98
  "detail": "161.5 BPM"
99
  },
100
  {
101
  "key": "onsets",
102
  "label": "Onset detection + slicing",
103
+ "duration_sec": 2.1897132599988254,
104
  "status": "done",
105
+ "detail": "23 hits"
106
  },
107
  {
108
  "key": "classification",
109
  "label": "Spectral rule classification",
110
+ "duration_sec": 0.017409414000212564,
111
  "status": "done",
112
+ "detail": "bright:13, hihat_closed:4, hihat_open:4, kick:1, mid:1"
113
  },
114
  {
115
  "key": "clustering",
116
  "label": "Mel fingerprint + transient NCC clustering",
117
+ "duration_sec": 0.03413462400021672,
118
  "status": "done",
119
+ "detail": "2 clusters \u00b7 batch quality"
120
  },
121
  {
122
  "key": "selection",
123
  "label": "Best representative scoring",
124
+ "duration_sec": 0.26413379800033,
125
  "status": "done",
126
  "detail": "quality-scored representatives"
127
  },
128
  {
129
  "key": "synthesis",
130
  "label": "Optional sample synthesis",
131
+ "duration_sec": 0.0011682919994200347,
132
  "status": "done",
133
+ "detail": "2 synthesized alternates"
134
  },
135
  {
136
  "key": "export",
137
+ "label": "MIDI, reproduced mix, WAV, ZIP export",
138
+ "duration_sec": 0.17886992200146778,
139
  "status": "done",
140
+ "detail": "2 samples + 23 review hits + MIDI + full-context reproduction + ZIP"
141
  }
142
  ]
143
  },
 
148
  "run_index": 0,
149
  "clustering_mode": "batch_quality",
150
  "audio_duration_sec": 4.874989,
151
+ "total_duration_sec": 2.514399,
152
+ "realtime_factor": 0.515775,
153
+ "hit_count": 31,
154
+ "cluster_count": 3,
155
  "stages": [
156
  {
157
  "key": "stem",
158
  "label": "Stem extraction / source load",
159
+ "duration_sec": 0.0255989449997287,
160
  "status": "done",
161
+ "detail": "loaded full mix \u00b7 cached \u00b7 reproduction uses full mix"
162
  },
163
  {
164
  "key": "bpm",
165
  "label": "Tempo detection",
166
+ "duration_sec": 0.08472461699966516,
167
  "status": "done",
168
  "detail": "120.2 BPM"
169
  },
170
  {
171
  "key": "onsets",
172
  "label": "Onset detection + slicing",
173
+ "duration_sec": 1.905502139001328,
174
  "status": "done",
175
+ "detail": "31 hits"
176
  },
177
  {
178
  "key": "classification",
179
  "label": "Spectral rule classification",
180
+ "duration_sec": 0.018339307000132976,
181
  "status": "done",
182
+ "detail": "bright:4, hihat_closed:25, hihat_open:2"
183
  },
184
  {
185
  "key": "clustering",
186
  "label": "Mel fingerprint + transient NCC clustering",
187
+ "duration_sec": 0.05202830600137531,
188
  "status": "done",
189
+ "detail": "3 clusters \u00b7 batch quality"
190
  },
191
  {
192
  "key": "selection",
193
  "label": "Best representative scoring",
194
+ "duration_sec": 0.24863046999962535,
195
  "status": "done",
196
  "detail": "quality-scored representatives"
197
  },
198
  {
199
  "key": "synthesis",
200
  "label": "Optional sample synthesis",
201
+ "duration_sec": 0.0012351730001682881,
202
  "status": "done",
203
+ "detail": "3 synthesized alternates"
204
  },
205
  {
206
  "key": "export",
207
+ "label": "MIDI, reproduced mix, WAV, ZIP export",
208
+ "duration_sec": 0.17784613499861734,
209
  "status": "done",
210
+ "detail": "3 samples + 31 review hits + MIDI + full-context reproduction + ZIP"
211
  }
212
  ]
213
  }
 
215
  "summary": [
216
  {
217
  "stage": "stem",
218
+ "mean_sec": 0.024684,
219
+ "median_sec": 0.025599,
220
+ "min_sec": 0.02253,
221
+ "max_sec": 0.025922
222
  },
223
  {
224
  "stage": "bpm",
225
+ "mean_sec": 0.117897,
226
+ "median_sec": 0.0938,
227
+ "min_sec": 0.084725,
228
+ "max_sec": 0.175166
229
  },
230
  {
231
  "stage": "onsets",
232
+ "mean_sec": 2.09525,
233
+ "median_sec": 2.189713,
234
+ "min_sec": 1.905502,
235
+ "max_sec": 2.190534
236
  },
237
  {
238
  "stage": "classification",
239
+ "mean_sec": 0.043775,
240
+ "median_sec": 0.018339,
241
+ "min_sec": 0.017409,
242
+ "max_sec": 0.095575
243
  },
244
  {
245
  "stage": "clustering",
246
+ "mean_sec": 0.033388,
247
+ "median_sec": 0.034135,
248
+ "min_sec": 0.014001,
249
+ "max_sec": 0.052028
250
  },
251
  {
252
  "stage": "selection",
253
+ "mean_sec": 0.198659,
254
+ "median_sec": 0.24863,
255
+ "min_sec": 0.083213,
256
+ "max_sec": 0.264134
257
  },
258
  {
259
  "stage": "synthesis",
260
+ "mean_sec": 0.001002,
261
+ "median_sec": 0.001168,
262
+ "min_sec": 0.000603,
263
+ "max_sec": 0.001235
264
  },
265
  {
266
  "stage": "export",
267
+ "mean_sec": 0.255149,
268
+ "median_sec": 0.17887,
269
+ "min_sec": 0.177846,
270
+ "max_sec": 0.408732
271
  }
272
  ]
273
  }
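The benchmark fields above can be cross-checked by hand. The sketch below assumes that `realtime_factor` is total processing time divided by audio duration and that each `summary` entry aggregates one stage's durations across runs; that reading matches the numbers shown but is not confirmed by the benchmark script itself. The helper names `realtime_factor` and `summarize_stage` are hypothetical.

```python
from statistics import mean, median


def realtime_factor(total_duration_sec: float, audio_duration_sec: float) -> float:
    """Processing seconds per second of audio; below 1.0 means faster than realtime."""
    if audio_duration_sec <= 0:
        return 0.0
    return round(total_duration_sec / audio_duration_sec, 6)


def summarize_stage(stage: str, durations: list[float]) -> dict:
    """Aggregate one stage's per-run durations into a summary row (assumed layout)."""
    return {
        "stage": stage,
        "mean_sec": round(mean(durations), 6),
        "median_sec": round(median(durations), 6),
        "min_sec": round(min(durations), 6),
        "max_sec": round(max(durations), 6),
    }


# Values taken from the run and summary entries shown in the benchmark JSON.
rtf = realtime_factor(2.514399, 4.874989)
onsets = summarize_stage("onsets", [2.189713, 1.905502, 2.190534])
```

A factor near 0.52 means the run took roughly half the audio's duration, with onset detection dominating the total.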
docs/interactive-ux/README.md CHANGED
@@ -12,7 +12,7 @@ The project now has a first supervised-editing foundation layered on top of the
12
  - The state contains hits, clusters, confidence scores, review queue entries, constraints, events, suggestions, and undo snapshots.
13
  - The FastAPI backend exposes state, move, pull-out, lock, suppress, review/favorite, suggestion, explanation, and undo endpoints.
14
  - The browser UI includes an interactive supervision panel with a review queue, cluster board, suggestion inbox, constraint/event log, and cluster explanation drawer.
15
- - The current supervised layer updates semantic state only. It does not yet rewrite sample WAVs, MIDI, reconstruction audio, or the ZIP after edits.
16
 
17
  ## Documents
18
 
 
12
  - The state contains hits, clusters, confidence scores, review queue entries, constraints, events, suggestions, and undo snapshots.
13
  - The FastAPI backend exposes state, move, pull-out, lock, suppress, review/favorite, suggestion, explanation, and undo endpoints.
14
  - The browser UI includes an interactive supervision panel with a review queue, cluster board, suggestion inbox, constraint/event log, and cluster explanation drawer.
15
+ - The supervised layer now rewrites edited sample WAVs, MIDI, target reconstruction, full-context reproduced audio, and ZIP artifacts under `supervised/` while preserving original batch outputs.
16
 
17
  ## Documents
18
 
pipeline_runner.py CHANGED
@@ -149,7 +149,7 @@ STAGE_DEFS = [
149
  ("clustering", "Mel fingerprint + transient NCC clustering"),
150
  ("selection", "Best representative scoring"),
151
  ("synthesis", "Optional sample synthesis"),
152
- ("export", "MIDI, reconstruction, WAV, ZIP export"),
153
  ]
154
 
155
 
@@ -192,6 +192,62 @@ def _normalise_audio(audio: np.ndarray) -> np.ndarray:
192
  return audio.astype(np.float32)
193
 
194
 
195
  MODULE_ROOT = Path(__file__).resolve().parent
196
  CACHE_DIR = Path(os.environ["DSE_CACHE_DIR"]) if os.environ.get("DSE_CACHE_DIR") else MODULE_ROOT / ".cache"
197
  STEM_CACHE_DIR = CACHE_DIR / "stems"
@@ -313,20 +369,32 @@ def run_extraction_pipeline(
313
 
314
  bpm: float | None = None
315
  stem_audio: np.ndarray
 
 
316
  stem_sr: int
317
  hits: list[Any] = []
318
  clusters: list[Any] = []
319
- rendered: np.ndarray | None = None
 
320
 
321
  _notify(progress_cb, {"type": "start", "stages": [asdict(s) for s in stages]})
322
 
323
  with _timed_stage(stages, "stem", progress_cb) as stage:
324
- stem_audio, stem_sr, stem_detail = _load_or_extract_stem(audio_path, params)
325
- stem_audio = _normalise_audio(stem_audio)
326
- stage.detail = stem_detail
327
- _write_audio(out / "stem.wav", stem_audio, stem_sr, subtype="PCM_16")
328
-
329
- audio_duration_sec = len(stem_audio) / stem_sr if stem_sr else 0.0
 
 
 
 
 
 
 
 
 
330
 
331
  with _timed_stage(stages, "bpm", progress_cb) as stage:
332
  bpm = detect_bpm(stem_audio, stem_sr)
@@ -411,7 +479,7 @@ def run_extraction_pipeline(
411
 
412
  sample_rows: list[dict[str, Any]] = []
413
  hit_rows: list[dict[str, Any]] = []
414
- files: dict[str, str] = {"stem": "stem.wav"}
415
 
416
  with _timed_stage(stages, "export", progress_cb) as stage:
417
  midi_path = out / "reconstruction.mid"
@@ -423,12 +491,16 @@ def run_extraction_pipeline(
423
  quantize=bool(params.quantize_midi),
424
  subdivision=int(params.subdivision),
425
  )
426
- rendered = render_midi_with_samples(clusters, sr=stem_sr)
 
427
  else:
428
- rendered = np.zeros_like(stem_audio)
429
  midi_path.write_bytes(b"")
430
 
431
- _write_audio(out / "reconstruction.wav", rendered, stem_sr, subtype="PCM_16")
 
 
 
432
  files["reconstruction"] = "reconstruction.wav"
433
  files["midi"] = "reconstruction.mid"
434
 
@@ -481,14 +553,21 @@ def run_extraction_pipeline(
481
  synth_path = samples_dir / f"{cluster.label}__synth.wav"
482
  _write_audio(synth_path, cluster.synthesized, stem_sr)
483
 
484
- archive_tmp = build_archive(clusters, bpm or 120.0, stem_sr, midi_path=str(midi_path), rendered_audio=rendered)
 
 
 
 
 
 
 
485
  files["archive"] = _copy_temp_file(archive_tmp, out / "sample-pack.zip")
486
  files["archive"] = "sample-pack.zip"
487
  try:
488
  os.unlink(archive_tmp)
489
  except OSError:
490
  pass
491
- stage.detail = f"{len(sample_rows)} samples + {len(hit_rows)} review hits + MIDI + ZIP"
492
 
493
  duration_sec = time.perf_counter() - started_total
494
  result = PipelineResult(
 
149
  ("clustering", "Mel fingerprint + transient NCC clustering"),
150
  ("selection", "Best representative scoring"),
151
  ("synthesis", "Optional sample synthesis"),
152
+ ("export", "MIDI, reproduced mix, WAV, ZIP export"),
153
  ]
154
 
155
 
 
192
  return audio.astype(np.float32)
193
 
194
 
+ def _mono(audio: np.ndarray) -> np.ndarray:
+     audio = np.asarray(audio, dtype=np.float32)
+     if audio.ndim > 1:
+         audio = audio.mean(axis=1)
+     return audio.astype(np.float32)
+
+
+ def _pad_or_trim(audio: np.ndarray, length: int) -> np.ndarray:
+     audio = _mono(audio)
+     if len(audio) == length:
+         return audio
+     if len(audio) > length:
+         return audio[:length]
+     return np.pad(audio, (0, max(0, length - len(audio)))).astype(np.float32)
+
+
+ def _load_source_mix(audio_path: str | os.PathLike[str], sr: int) -> np.ndarray:
+     audio, _ = librosa.load(audio_path, sr=sr, mono=True)
+     return _mono(audio)
+
+
+ def _common_gain(reference: np.ndarray, fallback: np.ndarray) -> float:
+     peak = float(np.max(np.abs(reference))) if reference.size else 0.0
+     if peak <= 1e-8:
+         peak = float(np.max(np.abs(fallback))) if fallback.size else 0.0
+     return peak if peak > 1e-8 else 1.0
+
+
+ def _rms(audio: np.ndarray) -> float:
+     audio = _mono(audio)
+     return float(np.sqrt(np.mean(np.square(audio, dtype=np.float64)))) if audio.size else 0.0
+
+
+ def _match_rms(rendered: np.ndarray, reference: np.ndarray, *, min_gain: float = 0.05, max_gain: float = 8.0) -> np.ndarray:
+     rendered_rms = _rms(rendered)
+     reference_rms = _rms(reference)
+     if rendered_rms <= 1e-10 or reference_rms <= 1e-10:
+         return _mono(rendered)
+     gain = float(np.clip(reference_rms / rendered_rms, min_gain, max_gain))
+     return (_mono(rendered) * gain).astype(np.float32)
+
+
+ def _soft_limit(audio: np.ndarray, ceiling: float = 0.98) -> np.ndarray:
+     audio = _mono(audio).astype(np.float32)
+     peak = float(np.max(np.abs(audio))) if audio.size else 0.0
+     if peak > ceiling > 0:
+         audio = audio * (ceiling / peak)
+     return audio.astype(np.float32)
+
+
+ def _make_reproduction_mix(target_reconstruction: np.ndarray, context_bed: np.ndarray, length: int) -> np.ndarray:
+     target = _pad_or_trim(target_reconstruction, length)
+     context = _pad_or_trim(context_bed, length)
+     return _soft_limit(context + target)
+
+
251
  MODULE_ROOT = Path(__file__).resolve().parent
252
  CACHE_DIR = Path(os.environ["DSE_CACHE_DIR"]) if os.environ.get("DSE_CACHE_DIR") else MODULE_ROOT / ".cache"
253
  STEM_CACHE_DIR = CACHE_DIR / "stems"
 
369
 
370
  bpm: float | None = None
371
  stem_audio: np.ndarray
372
+ source_audio: np.ndarray
373
+ context_bed: np.ndarray
374
  stem_sr: int
375
  hits: list[Any] = []
376
  clusters: list[Any] = []
377
+ target_rendered: np.ndarray | None = None
378
+ reproduced: np.ndarray | None = None
379
 
380
  _notify(progress_cb, {"type": "start", "stages": [asdict(s) for s in stages]})
381
 
      with _timed_stage(stages, "stem", progress_cb) as stage:
+         raw_stem_audio, stem_sr, stem_detail = _load_or_extract_stem(audio_path, params)
+         source_raw = _load_source_mix(audio_path, stem_sr)
+         length = max(len(raw_stem_audio), len(source_raw))
+         raw_stem_audio = _pad_or_trim(raw_stem_audio, length)
+         source_raw = _pad_or_trim(source_raw, length)
+         gain = _common_gain(raw_stem_audio if params.stem != "all" else source_raw, source_raw)
+         stem_audio = (raw_stem_audio / gain).astype(np.float32)
+         source_audio = (source_raw / gain).astype(np.float32)
+         context_bed = np.zeros_like(source_audio) if params.stem == "all" else (source_audio - stem_audio).astype(np.float32)
+         stage.detail = stem_detail + (" · reproduction uses full mix" if params.stem == "all" else " · reproduction uses residual non-target stems")
+         _write_audio(out / "source.wav", _soft_limit(source_audio), stem_sr, subtype="PCM_16")
+         _write_audio(out / "stem.wav", _soft_limit(stem_audio), stem_sr, subtype="PCM_16")
+         _write_audio(out / "context_bed.wav", _soft_limit(context_bed), stem_sr, subtype="PCM_16")
+
+     audio_duration_sec = len(source_audio) / stem_sr if stem_sr else 0.0
398
 
399
  with _timed_stage(stages, "bpm", progress_cb) as stage:
400
  bpm = detect_bpm(stem_audio, stem_sr)
 
479
 
480
  sample_rows: list[dict[str, Any]] = []
481
  hit_rows: list[dict[str, Any]] = []
482
+ files: dict[str, str] = {"source": "source.wav", "stem": "stem.wav", "context_bed": "context_bed.wav"}
483
 
484
  with _timed_stage(stages, "export", progress_cb) as stage:
485
  midi_path = out / "reconstruction.mid"
 
491
  quantize=bool(params.quantize_midi),
492
  subdivision=int(params.subdivision),
493
  )
494
+ target_rendered = render_midi_with_samples(clusters, sr=stem_sr)
495
+ target_rendered = _match_rms(target_rendered, stem_audio)
496
  else:
497
+ target_rendered = np.zeros_like(stem_audio)
498
  midi_path.write_bytes(b"")
499
 
500
+ reproduced = _make_reproduction_mix(target_rendered, context_bed, max(len(source_audio), len(target_rendered)))
501
+ _write_audio(out / "target_reconstruction.wav", _soft_limit(target_rendered), stem_sr, subtype="PCM_16")
502
+ _write_audio(out / "reconstruction.wav", reproduced, stem_sr, subtype="PCM_16")
503
+ files["target_reconstruction"] = "target_reconstruction.wav"
504
  files["reconstruction"] = "reconstruction.wav"
505
  files["midi"] = "reconstruction.mid"
506
 
 
553
  synth_path = samples_dir / f"{cluster.label}__synth.wav"
554
  _write_audio(synth_path, cluster.synthesized, stem_sr)
555
 
556
+ archive_tmp = build_archive(
557
+ clusters,
558
+ bpm or 120.0,
559
+ stem_sr,
560
+ midi_path=str(midi_path),
561
+ rendered_audio=reproduced,
562
+ target_rendered_audio=target_rendered,
563
+ )
564
  files["archive"] = _copy_temp_file(archive_tmp, out / "sample-pack.zip")
565
  files["archive"] = "sample-pack.zip"
566
  try:
567
  os.unlink(archive_tmp)
568
  except OSError:
569
  pass
570
+ stage.detail = f"{len(sample_rows)} samples + {len(hit_rows)} review hits + MIDI + full-context reproduction + ZIP"
571
 
572
  duration_sec = time.perf_counter() - started_total
573
  result = PipelineResult(
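The `pipeline_runner.py` changes above build the full-context reproduction in three steps: align the source mix and target stem to a common length, subtract the stem from the source to get the context bed, then add the re-rendered target back and scale the sum down if its peak exceeds the ceiling. The following is a dependency-free sketch of that flow using plain Python lists instead of NumPy arrays. The function names are illustrative stand-ins for `_pad_or_trim`, `_soft_limit`, and `_make_reproduction_mix`, and the single shared length is a simplification of the two separate length computations in the diff.

```python
def pad_or_trim(samples, length):
    """Cut to `length` samples, or right-pad with silence."""
    samples = list(samples)[:length]
    return samples + [0.0] * (length - len(samples))


def peak_limit(samples, ceiling=0.98):
    """Scale the whole signal down only when its peak exceeds the ceiling."""
    peak = max((abs(x) for x in samples), default=0.0)
    if peak > ceiling > 0:
        scale = ceiling / peak
        return [x * scale for x in samples]
    return list(samples)


def reproduction_mix(source, stem, target, ceiling=0.98):
    """Return (context_bed, reproduced) for mono sample lists."""
    length = max(len(source), len(stem), len(target))
    source = pad_or_trim(source, length)
    stem = pad_or_trim(stem, length)
    target = pad_or_trim(target, length)
    # Context bed: everything in the source mix that is not the target stem.
    context_bed = [s - t for s, t in zip(source, stem)]
    # Reproduction: context bed plus the target re-rendered from samples.
    mixed = [c + t for c, t in zip(context_bed, target)]
    return context_bed, peak_limit(mixed, ceiling)


source = [0.5, -0.2, 0.9, 0.1]
stem = [0.4, 0.0, 0.6]            # one sample shorter than the source
target = [0.4, 0.0, 0.6, 0.0]     # a perfect re-render of the padded stem
context_bed, reproduced = reproduction_mix(source, stem, target)
limited = peak_limit([2.0, -1.0])
```

When the re-rendered target matches the stem exactly, the reproduction equals the source mix, which is why the result can be auditioned against the original. Note that the limiter here is linear peak scaling, as in the diff, not a compressor.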
sample_extractor.py CHANGED
@@ -525,9 +525,13 @@ def render_midi_with_samples(clusters,sr=44100):
525
  pk=np.abs(buf).max(); return (buf/pk*0.9).astype(np.float32) if pk>1e-8 else buf.astype(np.float32)
526
  def build_sample_map(clusters):
527
  return {c.midi_note:{'label':c.label,'count':c.count,'duration_ms':int(c.best_hit.duration*1000)} for c in clusters}
528
- def build_archive(clusters,bpm,sr,midi_path=None,rendered_audio=None):
529
  import zipfile,tempfile,io; zp=tempfile.mktemp(suffix='.zip')
530
  idx={'bpm':round(bpm,1),'sample_rate':sr,'total_clusters':len(clusters),'total_hits':sum(c.count for c in clusters),'samples':{}}
 
 
 
 
531
  with zipfile.ZipFile(zp,'w',compression=zipfile.ZIP_STORED) as zf:
532
  for c in clusters:
533
  b=c.best_hit; fn=f"samples/{c.label}.wav"; buf=io.BytesIO()
@@ -544,7 +548,11 @@ def build_archive(clusters,bpm,sr,midi_path=None,rendered_audio=None):
544
  if midi_path and os.path.exists(midi_path): zf.write(midi_path,'reconstruction.mid')
545
  if rendered_audio is not None:
546
  rb=io.BytesIO(); sf.write(rb,rendered_audio,sr,format='WAV',subtype='PCM_16')
 
547
  zf.writestr('rendered_reconstruction.wav',rb.getvalue())
 
 
 
548
  return zp
549
 
550
  # ─── Auto-tuner with locking ─────────────────────────────────────────────────
 
525
  pk=np.abs(buf).max(); return (buf/pk*0.9).astype(np.float32) if pk>1e-8 else buf.astype(np.float32)
526
  def build_sample_map(clusters):
527
  return {c.midi_note:{'label':c.label,'count':c.count,'duration_ms':int(c.best_hit.duration*1000)} for c in clusters}
528
+ def build_archive(clusters,bpm,sr,midi_path=None,rendered_audio=None,target_rendered_audio=None):
529
  import zipfile,tempfile,io; zp=tempfile.mktemp(suffix='.zip')
530
  idx={'bpm':round(bpm,1),'sample_rate':sr,'total_clusters':len(clusters),'total_hits':sum(c.count for c in clusters),'samples':{}}
531
+ if rendered_audio is not None:
532
+ idx['reproduction_file']='rendered_reproduction_full_mix.wav'
533
+ if target_rendered_audio is not None:
534
+ idx['target_reconstruction_file']='rendered_reconstruction_target_stem.wav'
535
  with zipfile.ZipFile(zp,'w',compression=zipfile.ZIP_STORED) as zf:
536
  for c in clusters:
537
  b=c.best_hit; fn=f"samples/{c.label}.wav"; buf=io.BytesIO()
 
548
  if midi_path and os.path.exists(midi_path): zf.write(midi_path,'reconstruction.mid')
549
  if rendered_audio is not None:
550
  rb=io.BytesIO(); sf.write(rb,rendered_audio,sr,format='WAV',subtype='PCM_16')
551
+ zf.writestr('rendered_reproduction_full_mix.wav',rb.getvalue())
552
  zf.writestr('rendered_reconstruction.wav',rb.getvalue())
553
+ if target_rendered_audio is not None:
554
+ tb=io.BytesIO(); sf.write(tb,target_rendered_audio,sr,format='WAV',subtype='PCM_16')
555
+ zf.writestr('rendered_reconstruction_target_stem.wav',tb.getvalue())
556
  return zp
557
 
558
  # ─── Auto-tuner with locking ─────────────────────────────────────────────────
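The updated `build_archive` writes the reproduced mix under the new name `rendered_reproduction_full_mix.wav` while still writing `rendered_reconstruction.wav`, presumably so that existing consumers of the ZIP keep working, and adds the target-only render as a third entry. A standard-library sketch of that layout follows. It is not the project's implementation: it encodes WAV data with `wave` rather than `soundfile`, uses `tempfile.mkstemp` where the source uses `tempfile.mktemp`, and `wav_bytes` and `build_demo_archive` are hypothetical names.

```python
import io
import os
import struct
import tempfile
import wave
import zipfile


def wav_bytes(samples, sr):
    """Encode mono float samples in [-1, 1] as 16-bit PCM WAV bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(sr)
        frames = b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, x)) * 32767)) for x in samples
        )
        wf.writeframes(frames)
    return buf.getvalue()


def build_demo_archive(reproduced, target, sr):
    """Write the three rendered-audio entries and return the ZIP path."""
    fd, zip_path = tempfile.mkstemp(suffix=".zip")
    os.close(fd)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        if reproduced is not None:
            data = wav_bytes(reproduced, sr)
            zf.writestr("rendered_reproduction_full_mix.wav", data)
            # Same bytes under the older entry name.
            zf.writestr("rendered_reconstruction.wav", data)
        if target is not None:
            zf.writestr("rendered_reconstruction_target_stem.wav", wav_bytes(target, sr))
    return zip_path


zip_path = build_demo_archive([0.0, 0.5, -0.5], [0.0, 0.25], 44100)
with zipfile.ZipFile(zip_path) as zf:
    names = sorted(zf.namelist())
    same_bytes = zf.read("rendered_reconstruction.wav") == zf.read(
        "rendered_reproduction_full_mix.wav"
    )
os.unlink(zip_path)
```

Writing the same bytes twice grows the archive, which matters more here because the source uses `ZIP_STORED` (no compression).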
scripts/test_api_job.py CHANGED
@@ -23,3 +23,5 @@ for _ in range(60):
23
  print(json.dumps({'status':job['status'], 'error':job.get('error'), 'hit_count': job.get('result',{}).get('hit_count'), 'files': job.get('result',{}).get('file_urls')}, indent=2))
24
  assert job['status']=='complete', job.get('error')
25
  assert job['result']['hit_count'] > 0
 
 
 
23
  print(json.dumps({'status':job['status'], 'error':job.get('error'), 'hit_count': job.get('result',{}).get('hit_count'), 'files': job.get('result',{}).get('file_urls')}, indent=2))
24
  assert job['status']=='complete', job.get('error')
25
  assert job['result']['hit_count'] > 0
26
+ for key in ['source', 'stem', 'context_bed', 'target_reconstruction', 'reconstruction', 'midi', 'archive']:
27
+ assert key in job['result']['file_urls'], key
scripts/test_supervised_export_and_force_onset.py CHANGED
@@ -89,7 +89,7 @@ def main() -> int:
89
  assert export["kind"] == "supervised-export"
90
  assert export["hit_count"] == state["summary"]["hit_count"] - state["summary"].get("suppressed_hit_count", 0)
91
  assert export["cluster_count"] >= 1
92
- for key in ["archive", "midi", "reconstruction"]:
93
  url = export["file_urls"][key]
94
  file_response = client.get(url)
95
  file_response.raise_for_status()
 
89
  assert export["kind"] == "supervised-export"
90
  assert export["hit_count"] == state["summary"]["hit_count"] - state["summary"].get("suppressed_hit_count", 0)
91
  assert export["cluster_count"] >= 1
92
+ for key in ["archive", "midi", "target_reconstruction", "reconstruction"]:
93
  url = export["file_urls"][key]
94
  file_response = client.get(url)
95
  file_response.raise_for_status()
supervised_export.py CHANGED
@@ -123,6 +123,54 @@ def _write_audio(path: Path, audio: np.ndarray, sr: int, subtype: str = "PCM_16"
123
  sf.write(path, audio, sr, subtype=subtype)
124
 
125
 
126
  def export_supervised_state(
127
  output_dir: str | os.PathLike[str],
128
  job_id: str,
@@ -159,20 +207,39 @@ def export_supervised_state(
159
  files: dict[str, str] = {}
160
  samples: list[dict[str, Any]] = []
161
 
162
  midi_path = export_dir / "reconstruction.mid"
163
  if clusters:
164
  export_midi(clusters, str(midi_path), bpm=bpm, quantize=quantize, subdivision=int(subdivision))
165
- rendered = render_midi_with_samples(clusters, sr=sr)
 
166
  if synthesize:
167
  for cluster in clusters:
168
  if cluster.count >= 2:
169
  cluster.synthesized = synthesize_from_cluster(cluster)
170
  else:
171
  midi_path.write_bytes(b"")
172
- rendered = np.zeros(sr, dtype=np.float32)
173
 
 
 
174
  _write_audio(export_dir / "reconstruction.wav", rendered, sr, subtype="PCM_16")
175
  files["midi"] = "supervised/reconstruction.mid"
 
176
  files["reconstruction"] = "supervised/reconstruction.wav"
177
 
178
  for cluster in sorted(clusters, key=lambda item: item.count, reverse=True):
@@ -197,7 +264,14 @@ def export_supervised_state(
197
  if cluster.synthesized is not None:
198
  _write_audio(out / f"supervised/samples/{cluster.label}__synth.wav", cluster.synthesized, sr, subtype="PCM_24")
199
 
200
- archive_tmp = build_archive(clusters, bpm, sr, midi_path=str(midi_path), rendered_audio=rendered)
 
 
 
 
 
 
 
201
  archive_rel = "supervised/sample-pack.zip"
202
  shutil.copyfile(archive_tmp, out / archive_rel)
203
  try:
 
123
  sf.write(path, audio, sr, subtype=subtype)
124
 
125
 
+ def _mono(audio: np.ndarray) -> np.ndarray:
+     audio = np.asarray(audio, dtype=np.float32)
+     if audio.ndim > 1:
+         audio = audio.mean(axis=1)
+     return audio.astype(np.float32)
+
+
+ def _pad_or_trim(audio: np.ndarray, length: int) -> np.ndarray:
+     audio = _mono(audio)
+     if len(audio) == length:
+         return audio
+     if len(audio) > length:
+         return audio[:length]
+     return np.pad(audio, (0, max(0, length - len(audio)))).astype(np.float32)
+
+
+ def _rms(audio: np.ndarray) -> float:
+     audio = _mono(audio)
+     return float(np.sqrt(np.mean(np.square(audio, dtype=np.float64)))) if audio.size else 0.0
+
+
+ def _match_rms(rendered: np.ndarray, reference: np.ndarray, *, min_gain: float = 0.05, max_gain: float = 8.0) -> np.ndarray:
+     rendered_rms = _rms(rendered)
+     reference_rms = _rms(reference)
+     if rendered_rms <= 1e-10 or reference_rms <= 1e-10:
+         return _mono(rendered)
+     return (_mono(rendered) * float(np.clip(reference_rms / rendered_rms, min_gain, max_gain))).astype(np.float32)
+
+
+ def _soft_limit(audio: np.ndarray, ceiling: float = 0.98) -> np.ndarray:
+     audio = _mono(audio).astype(np.float32)
+     peak = float(np.max(np.abs(audio))) if audio.size else 0.0
+     if peak > ceiling > 0:
+         audio = audio * (ceiling / peak)
+     return audio.astype(np.float32)
+
+
+ def _read_optional_audio(path: Path) -> tuple[np.ndarray | None, int | None]:
+     if not path.exists():
+         return None, None
+     audio, sr = sf.read(path, dtype="float32", always_2d=False)
+     return _mono(audio), int(sr)
+
+
+ def _make_reproduction_mix(target_reconstruction: np.ndarray, context_bed: np.ndarray, length: int) -> np.ndarray:
+     return _soft_limit(_pad_or_trim(context_bed, length) + _pad_or_trim(target_reconstruction, length))
+
+
174
  def export_supervised_state(
175
  output_dir: str | os.PathLike[str],
176
  job_id: str,
 
207
  files: dict[str, str] = {}
208
  samples: list[dict[str, Any]] = []
209
 
210
+ context_bed, context_sr = _read_optional_audio(out / "context_bed.wav")
211
+ source_audio, source_sr = _read_optional_audio(out / "source.wav")
212
+ stem_audio, stem_file_sr = _read_optional_audio(out / "stem.wav")
213
+ if context_sr:
214
+ sr = int(context_sr)
215
+ elif source_sr:
216
+ sr = int(source_sr)
217
+ elif stem_file_sr:
218
+ sr = int(stem_file_sr)
219
+ source_length = len(source_audio) if source_audio is not None else max((len(context_bed) if context_bed is not None else 0), sr)
220
+ if context_bed is None:
221
+ context_bed = np.zeros(source_length, dtype=np.float32)
222
+ if stem_audio is None:
223
+ stem_audio = np.zeros(source_length, dtype=np.float32)
224
+
225
  midi_path = export_dir / "reconstruction.mid"
226
  if clusters:
227
  export_midi(clusters, str(midi_path), bpm=bpm, quantize=quantize, subdivision=int(subdivision))
228
+ target_rendered = render_midi_with_samples(clusters, sr=sr)
229
+ target_rendered = _match_rms(target_rendered, stem_audio)
230
  if synthesize:
231
  for cluster in clusters:
232
  if cluster.count >= 2:
233
  cluster.synthesized = synthesize_from_cluster(cluster)
234
  else:
235
  midi_path.write_bytes(b"")
236
+ target_rendered = np.zeros(source_length, dtype=np.float32)
237
 
238
+ rendered = _make_reproduction_mix(target_rendered, context_bed, max(source_length, len(target_rendered)))
239
+ _write_audio(export_dir / "target_reconstruction.wav", _soft_limit(target_rendered), sr, subtype="PCM_16")
240
  _write_audio(export_dir / "reconstruction.wav", rendered, sr, subtype="PCM_16")
241
  files["midi"] = "supervised/reconstruction.mid"
242
+ files["target_reconstruction"] = "supervised/target_reconstruction.wav"
243
  files["reconstruction"] = "supervised/reconstruction.wav"
244
 
245
  for cluster in sorted(clusters, key=lambda item: item.count, reverse=True):
 
264
  if cluster.synthesized is not None:
265
  _write_audio(out / f"supervised/samples/{cluster.label}__synth.wav", cluster.synthesized, sr, subtype="PCM_24")
266
 
267
+ archive_tmp = build_archive(
268
+ clusters,
269
+ bpm,
270
+ sr,
271
+ midi_path=str(midi_path),
272
+ rendered_audio=rendered,
273
+ target_rendered_audio=target_rendered,
274
+ )
275
  archive_rel = "supervised/sample-pack.zip"
276
  shutil.copyfile(archive_tmp, out / archive_rel)
277
  try:
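Both the batch and supervised exports level-match the re-rendered target to the original stem before mixing it with the context bed. `_match_rms` does this by scaling the render so its RMS equals the reference RMS, with the gain clamped to `[min_gain, max_gain]` so a near-silent render is not boosted without limit. Below is a plain-Python sketch of the same idea on lists; the thresholds and defaults are copied from the diff, while the list-based helpers `rms` and `match_rms` are illustrative rewrites rather than the project's NumPy code.

```python
import math


def rms(samples):
    """Root-mean-square level of a mono sample list (0.0 for empty input)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def match_rms(rendered, reference, min_gain=0.05, max_gain=8.0):
    """Scale `rendered` toward the RMS of `reference`, with a clamped gain."""
    rendered_rms = rms(rendered)
    reference_rms = rms(reference)
    # Leave the signal untouched when either side is effectively silent.
    if rendered_rms <= 1e-10 or reference_rms <= 1e-10:
        return list(rendered)
    gain = min(max(reference_rms / rendered_rms, min_gain), max_gain)
    return [x * gain for x in rendered]


reference = [0.4, -0.4, 0.4, -0.4]
matched = match_rms([0.1, -0.1, 0.1, -0.1], reference)   # needs 4x, inside the clamp
clamped = match_rms([0.001, -0.001], [0.5, -0.5])        # needs 500x, clamped to 8x
silent = match_rms([0.0, 0.0], reference)                # returned unchanged
```

The clamp trades exact level matching for safety: when only a few hits survive editing, the render stays quieter than the stem instead of being amplified into noise.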
web/app.js CHANGED
@@ -17,6 +17,7 @@ let activeJobId = null;
17
  let selectedHitIndex = null;
18
  let selectedSampleIndex = null;
19
  let forceOnsetMode = false;
 
20
  let audioContext = null;
21
 
22
  const palette = ["#9b72ef", "#4f7df2", "#42b8b4", "#ef9343", "#ea5ca9", "#6d9be8", "#8abc59", "#805fe6"];
@@ -58,9 +59,24 @@ function fmtClock(seconds) {
58
  }
59
 
60
  function transportAudio() {
61
- const stem = $("stemAudio");
62
- const source = $("sourcePreview");
63
- return stem?.src ? stem : source;
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
64
  }
65
 
66
  function updateTransport() {
@@ -78,9 +94,10 @@ function updateTransport() {
78
  }
79
 
80
  function pauseNonTransportAudio() {
81
- for (const id of ["hitAudio", "sampleAudio", "reconAudio"]) {
 
82
  const el = $(id);
83
- if (el && !el.paused) el.pause();
84
  }
85
  }
86
 
@@ -683,7 +700,7 @@ async function undoLastEdit() {
683
 
684
  function renderEditedExport(exportPayload) {
685
  const fileUrls = exportPayload?.file_urls ?? {};
686
- const labels = { archive: "Edited sample pack ZIP", midi: "Edited MIDI", reconstruction: "Edited reconstruction WAV" };
687
  $("editedDownloads").innerHTML = Object.entries(fileUrls)
688
  .map(([key, url]) => `<a href="${esc(url)}" download>${esc(labels[key] ?? key)}</a>`)
689
  .join("");
@@ -729,11 +746,25 @@ function renderResult(job) {
729
  $("resultSummary").textContent = `${result.hit_count} hits → ${result.cluster_count} samples · BPM ${result.bpm ?? "—"} · ${fmtSec(result.duration_sec)} total · ${rtf}× realtime · ${mode}`;
730
 
731
  const fileUrls = result.file_urls ?? {};
732
- const labels = { archive: "Sample pack ZIP", midi: "MIDI", stem: "Stem WAV", reconstruction: "Reconstruction WAV" };
733
- $("downloads").innerHTML = Object.entries(fileUrls).map(([key, url]) => `<a href="${esc(url)}" download>${esc(labels[key] ?? key)}</a>`).join("");
 
 
 
 
 
 
 
 
 
 
 
 
 
734
  $("stemAudio").src = fileUrls.stem ?? "";
735
  $("reconAudio").src = fileUrls.reconstruction ?? "";
736
- updateTransport();
 
737
 
738
  renderSamples(result);
739
  renderHits(result);
@@ -870,8 +901,9 @@ function setFile(file) {
870
  }
871
  if (file) {
872
  $("stemAudio").removeAttribute("src");
 
873
  $("sourcePreview").src = URL.createObjectURL(file);
874
- updateTransport();
875
  }
876
  }
877
 
@@ -910,12 +942,6 @@ async function boot() {
910
  $("demucs_model").addEventListener("change", updateStemOptions);
911
  $("fileInput").addEventListener("change", (event) => setFile(event.target.files?.[0] ?? null));
912
  $("runButton").addEventListener("click", runExtraction);
913
- $("useFastButton").addEventListener("click", () => {
914
- $("stem").value = "all";
915
- $("demucs_shifts").value = 0;
916
- $("target_min").value = 4;
917
- $("target_max").value = 16;
918
- });
919
  $("usePreviewButton").addEventListener("click", () => {
920
  $("stem").value = "all";
921
  $("clustering_mode").value = "online_preview";
@@ -924,6 +950,18 @@ $("usePreviewButton").addEventListener("click", () => {
924
  $("target_max").value = 16;
925
  $("mel_threshold").value = 0.62;
926
  $("ncc_threshold").value = 0.72;
 
 
 
 
 
 
 
 
 
 
 
 
927
  });
928
 
929
  for (const [id, delta] of [["clusterMinusButton", -1], ["clusterPlusButton", 1]]) {
@@ -963,7 +1001,7 @@ $("targetClusterSelect").addEventListener("change", setActionButtons);
963
  $("waveform").addEventListener("click", selectNearestWaveformHit);
964
  $("transportPlayButton").addEventListener("click", () => { toggleTransportPlayback().catch(() => {}); });
965
  $("transportSeek").addEventListener("input", (event) => seekTransport(event.target.value));
966
- for (const id of ["sourcePreview", "stemAudio"]) {
967
  const audio = $(id);
968
  audio.addEventListener("timeupdate", updateTransport);
969
  audio.addEventListener("durationchange", updateTransport);
@@ -971,6 +1009,9 @@ for (const id of ["sourcePreview", "stemAudio"]) {
971
  audio.addEventListener("pause", updateTransport);
972
  audio.addEventListener("ended", updateTransport);
973
  }
 
 
 
974
 
975
  const dropzone = $("dropzone");
976
  const globalDropOverlay = $("globalDropOverlay");
 
17
  let selectedHitIndex = null;
18
  let selectedSampleIndex = null;
19
  let forceOnsetMode = false;
20
+ let previewMode = "source";
21
  let audioContext = null;
22
 
23
  const palette = ["#9b72ef", "#4f7df2", "#42b8b4", "#ef9343", "#ea5ca9", "#6d9be8", "#8abc59", "#805fe6"];
 
59
  }
60
 
61
  function transportAudio() {
62
+ const candidates = {
63
+ source: $("sourcePreview"),
64
+ stem: $("stemAudio"),
65
+ reproduction: $("reconAudio"),
66
+ };
67
+ const selected = candidates[previewMode];
68
+ if (selected?.src) return selected;
69
+ return [candidates.reproduction, candidates.source].find((el) => el?.src) ?? candidates.stem;
70
+ }
71
+
72
+ function setPreviewMode(mode) {
73
+ const previous = transportAudio();
74
+ if (previous && !previous.paused) previous.pause();
75
+ previewMode = mode;
76
+ for (const button of document.querySelectorAll("[data-preview-mode]")) {
77
+ button.classList.toggle("active", button.dataset.previewMode === previewMode);
78
+ }
79
+ updateTransport();
80
  }
81
 
82
  function updateTransport() {
 
94
  }
95
 
96
  function pauseNonTransportAudio() {
97
+ const keep = transportAudio();
98
+ for (const id of ["sourcePreview", "stemAudio", "reconAudio", "hitAudio", "sampleAudio"]) {
99
  const el = $(id);
100
+ if (el && el !== keep && !el.paused) el.pause();
101
  }
102
  }
103
 
 
700
 
701
  function renderEditedExport(exportPayload) {
702
  const fileUrls = exportPayload?.file_urls ?? {};
703
+ const labels = { archive: "Edited sample pack ZIP", midi: "Edited MIDI", reconstruction: "Edited reproduced mix WAV", target_reconstruction: "Edited target reconstruction WAV" };
704
  $("editedDownloads").innerHTML = Object.entries(fileUrls)
705
  .map(([key, url]) => `<a href="${esc(url)}" download>${esc(labels[key] ?? key)}</a>`)
706
  .join("");
 
746
  $("resultSummary").textContent = `${result.hit_count} hits → ${result.cluster_count} samples · BPM ${result.bpm ?? "—"} · ${fmtSec(result.duration_sec)} total · ${rtf}× realtime · ${mode}`;
747
 
748
  const fileUrls = result.file_urls ?? {};
749
+ const labels = {
750
+ archive: "Sample pack ZIP",
751
+ midi: "MIDI",
752
+ source: "Source mix WAV",
753
+ stem: "Target stem WAV",
754
+ context_bed: "Non-target stems WAV",
755
+ target_reconstruction: "Target reconstruction WAV",
756
+ reconstruction: "Reproduced mix WAV",
757
+ };
758
+ const downloadOrder = ["archive", "reconstruction", "target_reconstruction", "midi", "source", "stem", "context_bed"];
759
+ $("downloads").innerHTML = downloadOrder
760
+ .filter((key) => fileUrls[key])
761
+ .map((key) => `<a href="${esc(fileUrls[key])}" download>${esc(labels[key] ?? key)}</a>`)
762
+ .join("");
763
+ $("sourcePreview").src = fileUrls.source ?? $("sourcePreview").src ?? "";
764
  $("stemAudio").src = fileUrls.stem ?? "";
765
  $("reconAudio").src = fileUrls.reconstruction ?? "";
766
+ if (fileUrls.reconstruction) setPreviewMode("reproduction");
767
+ else updateTransport();
768
 
769
  renderSamples(result);
770
  renderHits(result);
 
901
  }
902
  if (file) {
903
  $("stemAudio").removeAttribute("src");
904
+ $("reconAudio").removeAttribute("src");
905
  $("sourcePreview").src = URL.createObjectURL(file);
906
+ setPreviewMode("source");
907
  }
908
  }
909
 
 
942
  $("demucs_model").addEventListener("change", updateStemOptions);
943
  $("fileInput").addEventListener("change", (event) => setFile(event.target.files?.[0] ?? null));
944
  $("runButton").addEventListener("click", runExtraction);
945
  $("usePreviewButton").addEventListener("click", () => {
946
  $("stem").value = "all";
947
  $("clustering_mode").value = "online_preview";
 
950
  $("target_max").value = 16;
951
  $("mel_threshold").value = 0.62;
952
  $("ncc_threshold").value = 0.72;
953
+ $("resultSummary").textContent = "Fast preview preset applied: full mix, online grouping, no Demucs shifts.";
954
+ });
955
+ $("useQualityButton").addEventListener("click", () => {
956
+ if (($("stem").value || "") === "all") $("stem").value = "drums";
957
+ $("clustering_mode").value = "batch_quality";
958
+ $("demucs_shifts").value = 1;
959
+ $("demucs_overlap").value = 0.25;
960
+ $("target_min").value = 5;
961
+ $("target_max").value = 20;
962
+ $("mel_threshold").value = 0.75;
963
+ $("ncc_threshold").value = 0.80;
964
+ $("resultSummary").textContent = "Best quality preset applied: separated stem, batch clustering, conservative grouping.";
965
  });
966
 
967
  for (const [id, delta] of [["clusterMinusButton", -1], ["clusterPlusButton", 1]]) {
 
1001
  $("waveform").addEventListener("click", selectNearestWaveformHit);
1002
  $("transportPlayButton").addEventListener("click", () => { toggleTransportPlayback().catch(() => {}); });
1003
  $("transportSeek").addEventListener("input", (event) => seekTransport(event.target.value));
1004
+ for (const id of ["sourcePreview", "stemAudio", "reconAudio"]) {
1005
  const audio = $(id);
1006
  audio.addEventListener("timeupdate", updateTransport);
1007
  audio.addEventListener("durationchange", updateTransport);
 
1009
  audio.addEventListener("pause", updateTransport);
1010
  audio.addEventListener("ended", updateTransport);
1011
  }
1012
+ for (const button of document.querySelectorAll("[data-preview-mode]")) {
1013
+ button.addEventListener("click", () => setPreviewMode(button.dataset.previewMode));
1014
+ }
1015
 
1016
  const dropzone = $("dropzone");
1017
  const globalDropOverlay = $("globalDropOverlay");
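Reviewer note: the ordered download list added around line 758 can be exercised outside the browser. The sketch below is a minimal standalone version of that filter/map/join chain, not the code in `web/app.js`. `escapeHtml` and `buildDownloadLinks` are hypothetical names: the app's own `esc` helper is not shown in this diff, so a plain entity-escaping stand-in is assumed.

```javascript
// Hypothetical stand-in for the app's `esc` helper (not shown in this diff).
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Labels and ordering copied from the diff above.
const LABELS = {
  archive: "Sample pack ZIP",
  midi: "MIDI",
  source: "Source mix WAV",
  stem: "Target stem WAV",
  context_bed: "Non-target stems WAV",
  target_reconstruction: "Target reconstruction WAV",
  reconstruction: "Reproduced mix WAV",
};
const DOWNLOAD_ORDER = [
  "archive", "reconstruction", "target_reconstruction",
  "midi", "source", "stem", "context_bed",
];

// Builds the anchor markup in a fixed order, skipping keys with no URL.
function buildDownloadLinks(fileUrls, labels = LABELS, order = DOWNLOAD_ORDER) {
  return order
    .filter((key) => fileUrls[key])
    .map((key) => `<a href="${escapeHtml(fileUrls[key])}" download>${escapeHtml(labels[key] ?? key)}</a>`)
    .join("");
}
```

Because the list is driven by `DOWNLOAD_ORDER`, any key the server returns that is missing from that array is silently dropped. `renderEditedExport` behaves differently: it iterates `Object.entries(fileUrls)` and falls back to the raw key as the label.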
web/index.html CHANGED
@@ -88,6 +88,11 @@
88
  <button id="transportPlayButton" class="round-play" type="button" aria-label="Play preview">▶</button>
89
  <span id="transportTime" class="transport-time">0:00 / 0:00</span>
90
  <input id="transportSeek" class="transport-seek" type="range" min="0" max="1000" value="0" step="1" aria-label="Seek preview" />
91
  </div>
92
  <div class="hidden-audio-bank" aria-hidden="true">
93
  <audio id="sourcePreview"></audio>
@@ -109,26 +114,35 @@
109
 
110
  <aside class="sidebar right-sidebar" aria-label="Right tool sidebar">
111
  <details class="tool-panel control-card" open>
112
- <summary><span>Extract</span><small>Core controls</small></summary>
 
113
  <div class="control-group">
114
- <label>Stem
115
  <select id="stem"></select>
 
116
  </label>
117
  </div>
118
 
119
  <div class="control-group sensitivity-group">
120
- <label for="onset_delta">Sensitivity</label>
121
  <input id="onset_delta" type="range" min="0.01" max="0.35" step="0.005" />
122
- <div class="range-caption"><span>Low</span><span>High</span></div>
 
123
  </div>
124
 
125
  <div class="control-group">
126
- <label>Cluster Count</label>
127
- <div class="stepper">
128
- <button id="clusterMinusButton" type="button" class="step-button" aria-label="Decrease cluster count">−</button>
129
- <input id="target_max" type="number" min="0" max="256" step="1" aria-label="Cluster count" />
130
- <button id="clusterPlusButton" type="button" class="step-button" aria-label="Increase cluster count">+</button>
131
- </div>
132
  </div>
133
  </details>
134
 
@@ -140,84 +154,99 @@
140
  </details>
141
 
142
  <details class="tool-panel advanced-controls">
143
- <summary><span>Advanced</span><small>Model and DSP settings</small></summary>
144
- <div class="preset-row">
145
- <button id="usePreviewButton" class="ghost-button" type="button">Online preview mode</button>
146
- <button id="useFastButton" class="ghost-button" type="button">Fast full-mix mode</button>
147
- </div>
148
- <div class="control-grid compact-controls">
149
- <label>Demucs model
150
- <select id="demucs_model"></select>
151
- </label>
152
- <label>Clustering mode
153
- <select id="clustering_mode">
154
- <option value="batch_quality">batch quality</option>
155
- <option value="online_preview">online preview</option>
156
- </select>
157
- </label>
158
- <label>Shifts
159
- <input id="demucs_shifts" type="number" min="0" max="8" step="1" />
160
- </label>
161
- <label>Overlap
162
- <input id="demucs_overlap" type="number" min="0" max="0.9" step="0.05" />
163
- </label>
164
- <label>Onset mode
165
- <select id="onset_mode">
166
- <option value="auto">auto / multiband</option>
167
- <option value="percussive">percussive</option>
168
- <option value="harmonic">harmonic</option>
169
- <option value="broadband">broadband</option>
170
- </select>
171
- </label>
172
- <label>Energy threshold dB
173
- <input id="energy_threshold_db" type="number" min="-100" max="0" step="1" />
174
- </label>
175
- <label>Minimum gap seconds
176
- <input id="min_gap" type="number" min="0.001" max="1" step="0.005" />
177
- </label>
178
- <label>Pre-pad seconds
179
- <input id="pre_pad" type="number" min="0" max="0.25" step="0.001" />
180
- </label>
181
- <label>Min duration seconds
182
- <input id="min_dur" type="number" min="0.001" max="10" step="0.005" />
183
- </label>
184
- <label>Max duration seconds
185
- <input id="max_dur" type="number" min="0.01" max="10" step="0.1" />
186
- </label>
187
- <label>NCC threshold
188
- <input id="ncc_threshold" type="number" min="0" max="1" step="0.01" />
189
- </label>
190
- <label>Attack window ms
191
- <input id="attack_ms" type="number" min="1" max="250" step="1" />
192
- </label>
193
- <label>Mel prefilter
194
- <input id="mel_threshold" type="number" min="0" max="1" step="0.01" />
195
- </label>
196
- <label>Linkage
197
- <select id="linkage">
198
- <option value="average">average</option>
199
- <option value="complete">complete</option>
200
- <option value="single">single</option>
201
- </select>
202
- </label>
203
- <label>Target min clusters
204
- <input id="target_min" type="number" min="0" max="256" step="1" />
205
- </label>
206
- <label>MIDI grid
207
- <select id="subdivision">
208
- <option value="8">8th</option>
209
- <option value="16">16th</option>
210
- <option value="32">32nd</option>
211
- <option value="64">64th</option>
212
- </select>
213
- </label>
214
- </div>
215
- <div class="toggles">
216
- <label><input id="synthesize" type="checkbox" /> synthesize alternates</label>
217
- <label><input id="quantize_midi" type="checkbox" /> quantize MIDI</label>
218
- <label><input id="use_disk_cache" type="checkbox" /> disk cache stems/source loads</label>
219
- </div>
220
- <button id="clearCacheButton" class="ghost-button full-width" type="button">Clear cache</button>
221
  </details>
222
  </aside>
223
  </main>
 
88
  <button id="transportPlayButton" class="round-play" type="button" aria-label="Play preview">▶</button>
89
  <span id="transportTime" class="transport-time">0:00 / 0:00</span>
90
  <input id="transportSeek" class="transport-seek" type="range" min="0" max="1000" value="0" step="1" aria-label="Seek preview" />
91
+ <div class="preview-tabs" aria-label="Preview source">
92
+ <button id="previewSourceButton" class="preview-tab active" type="button" data-preview-mode="source">Source</button>
93
+ <button id="previewStemButton" class="preview-tab" type="button" data-preview-mode="stem">Stem</button>
94
+ <button id="previewReproductionButton" class="preview-tab" type="button" data-preview-mode="reproduction">Reproduced</button>
95
+ </div>
96
  </div>
97
  <div class="hidden-audio-bank" aria-hidden="true">
98
  <audio id="sourcePreview"></audio>
 
114
 
115
  <aside class="sidebar right-sidebar" aria-label="Right tool sidebar">
116
  <details class="tool-panel control-card" open>
117
+ <summary><span>Common controls</span><small>The 3 knobs to try first</small></summary>
118
+ <p class="panel-help">Start here. These controls decide what gets detected, how many groups are produced, and which stem is sampled.</p>
119
  <div class="control-group">
120
+ <label>Stem to sample
121
  <select id="stem"></select>
122
+ <small class="field-hint">Use <strong>drums</strong> for separated drum hits, or <strong>all</strong> for fast full-mix prototyping.</small>
123
  </label>
124
  </div>
125
 
126
  <div class="control-group sensitivity-group">
127
+ <label for="onset_delta">Hit sensitivity</label>
128
  <input id="onset_delta" type="range" min="0.01" max="0.35" step="0.005" />
129
+ <div class="range-caption"><span>Fewer hits</span><span>More hits</span></div>
130
+ <small class="field-hint">Increase when quiet hits are missed; decrease when bleed or ghost transients are over-detected.</small>
131
  </div>
132
 
133
  <div class="control-group">
134
+ <label for="target_max">Sample groups
135
+ <div class="stepper">
136
+ <button id="clusterMinusButton" type="button" class="step-button" aria-label="Decrease cluster count">−</button>
137
+ <input id="target_max" type="number" min="0" max="256" step="1" aria-label="Sample group count" />
138
+ <button id="clusterPlusButton" type="button" class="step-button" aria-label="Increase cluster count">+</button>
139
+ </div>
140
+ <small class="field-hint">Approximate maximum number of sample cards to produce.</small>
141
+ </label>
142
+ </div>
143
+ <div class="preset-row common-presets">
144
+ <button id="usePreviewButton" class="ghost-button" type="button">Fast preview</button>
145
+ <button id="useQualityButton" class="ghost-button" type="button">Best quality</button>
146
  </div>
147
  </details>
148
 
 
154
  </details>
155
 
156
  <details class="tool-panel advanced-controls">
157
+ <summary><span>Advanced parameters</span><small>Only adjust when the common controls are not enough</small></summary>
158
+ <p class="panel-help">Advanced controls are grouped by pipeline stage. They are intentionally hidden from the normal extraction loop.</p>
159
+ <section class="advanced-section">
160
+ <h4>Stem separation</h4>
161
+ <div class="control-grid compact-controls">
162
+ <label>Demucs model
163
+ <select id="demucs_model"></select>
164
+ </label>
165
+ <label>Shifts
166
+ <input id="demucs_shifts" type="number" min="0" max="8" step="1" />
167
+ </label>
168
+ <label>Overlap
169
+ <input id="demucs_overlap" type="number" min="0" max="0.9" step="0.05" />
170
+ </label>
171
+ </div>
172
+ </section>
173
+ <section class="advanced-section">
174
+ <h4>Hit detection</h4>
175
+ <div class="control-grid compact-controls">
176
+ <label>Onset mode
177
+ <select id="onset_mode">
178
+ <option value="auto">auto / multiband</option>
179
+ <option value="percussive">percussive</option>
180
+ <option value="harmonic">harmonic</option>
181
+ <option value="broadband">broadband</option>
182
+ </select>
183
+ </label>
184
+ <label>Energy threshold dB
185
+ <input id="energy_threshold_db" type="number" min="-100" max="0" step="1" />
186
+ </label>
187
+ <label>Minimum gap seconds
188
+ <input id="min_gap" type="number" min="0.001" max="1" step="0.005" />
189
+ </label>
190
+ <label>Pre-pad seconds
191
+ <input id="pre_pad" type="number" min="0" max="0.25" step="0.001" />
192
+ </label>
193
+ <label>Min duration seconds
194
+ <input id="min_dur" type="number" min="0.001" max="10" step="0.005" />
195
+ </label>
196
+ <label>Max duration seconds
197
+ <input id="max_dur" type="number" min="0.01" max="10" step="0.1" />
198
+ </label>
199
+ </div>
200
+ </section>
201
+ <section class="advanced-section">
202
+ <h4>Grouping</h4>
203
+ <div class="control-grid compact-controls">
204
+ <label>Clustering mode
205
+ <select id="clustering_mode">
206
+ <option value="batch_quality">batch quality</option>
207
+ <option value="online_preview">online preview</option>
208
+ </select>
209
+ </label>
210
+ <label>Target min clusters
211
+ <input id="target_min" type="number" min="0" max="256" step="1" />
212
+ </label>
213
+ <label>NCC threshold
214
+ <input id="ncc_threshold" type="number" min="0" max="1" step="0.01" />
215
+ </label>
216
+ <label>Attack window ms
217
+ <input id="attack_ms" type="number" min="1" max="250" step="1" />
218
+ </label>
219
+ <label>Mel prefilter
220
+ <input id="mel_threshold" type="number" min="0" max="1" step="0.01" />
221
+ </label>
222
+ <label>Linkage
223
+ <select id="linkage">
224
+ <option value="average">average</option>
225
+ <option value="complete">complete</option>
226
+ <option value="single">single</option>
227
+ </select>
228
+ </label>
229
+ </div>
230
+ </section>
231
+ <section class="advanced-section">
232
+ <h4>Export and cache</h4>
233
+ <div class="control-grid compact-controls">
234
+ <label>MIDI grid
235
+ <select id="subdivision">
236
+ <option value="8">8th</option>
237
+ <option value="16">16th</option>
238
+ <option value="32">32nd</option>
239
+ <option value="64">64th</option>
240
+ </select>
241
+ </label>
242
+ </div>
243
+ <div class="toggles">
244
+ <label><input id="synthesize" type="checkbox" /> synthesize alternates</label>
245
+ <label><input id="quantize_midi" type="checkbox" /> quantize MIDI</label>
246
+ <label><input id="use_disk_cache" type="checkbox" /> disk cache stems/source loads</label>
247
+ </div>
248
+ <button id="clearCacheButton" class="ghost-button full-width" type="button">Clear cache</button>
249
+ </section>
250
  </details>
251
  </aside>
252
  </main>
web/styles.css CHANGED
@@ -857,3 +857,75 @@ tr:last-child td { border-bottom: 0; }
857
  box-shadow: none;
858
  }
859
  .transport-row .transport-seek:hover::-moz-range-thumb { opacity: 1; }
857
  box-shadow: none;
858
  }
859
  .transport-row .transport-seek:hover::-moz-range-thumb { opacity: 1; }
860
+
861
+ /* Pass 9: clearer preview and parameter hierarchy. */
862
+ .transport-row {
863
+ grid-template-columns: 48px 100px minmax(120px, 1fr) auto;
864
+ gap: 14px;
865
+ }
866
+ .preview-tabs {
867
+ display: inline-flex;
868
+ align-items: center;
869
+ gap: 4px;
870
+ padding: 4px;
871
+ border: 1px solid var(--line);
872
+ border-radius: 999px;
873
+ background: #f7f7fa;
874
+ white-space: nowrap;
875
+ }
876
+ .preview-tab {
877
+ border: 0;
878
+ border-radius: 999px;
879
+ padding: 7px 10px;
880
+ background: transparent;
881
+ color: var(--muted);
882
+ font-size: 11px;
883
+ font-weight: 800;
884
+ cursor: pointer;
885
+ }
886
+ .preview-tab.active {
887
+ background: #fff;
888
+ color: var(--accent-strong);
889
+ box-shadow: 0 2px 8px rgba(18, 21, 30, .06);
890
+ }
891
+ .panel-help,
892
+ .field-hint {
893
+ display: block;
894
+ color: var(--muted);
895
+ font-size: 11px;
896
+ line-height: 1.4;
897
+ font-weight: 560;
898
+ }
899
+ .panel-help {
900
+ margin: 10px 0 4px;
901
+ }
902
+ .field-hint {
903
+ margin-top: 7px;
904
+ }
905
+ .common-presets {
906
+ grid-template-columns: 1fr 1fr;
907
+ }
908
+ .advanced-section {
909
+ margin-top: 14px;
910
+ padding-top: 12px;
911
+ border-top: 1px solid rgba(228, 229, 233, .75);
912
+ }
913
+ .advanced-section:first-of-type {
914
+ border-top: 0;
915
+ padding-top: 0;
916
+ }
917
+ .advanced-section h4 {
918
+ margin: 0 0 8px;
919
+ color: var(--text);
920
+ font-size: 12px;
921
+ letter-spacing: -.01em;
922
+ }
923
+ @media (max-width: 920px) {
924
+ .transport-row {
925
+ grid-template-columns: 44px 86px minmax(0, 1fr);
926
+ }
927
+ .preview-tabs {
928
+ grid-column: 1 / -1;
929
+ justify-content: center;
930
+ }
931
+ }