ChatGPT committed
Commit fa35534 · 1 Parent(s): f026127

feat: add spleeter and selected card exports
README.md CHANGED
@@ -12,7 +12,7 @@ pinned: false
 
 A custom FastAPI + browser workstation for extracting, reviewing, and now semantically supervising reusable drum samples from an audio file.
 
-The pipeline can isolate a stem with Demucs, detect onsets, classify hits, cluster similar transients, choose representative samples, optionally synthesize alternate samples, and export WAVs, MIDI, target-stem reconstruction, full-context reproduced audio, manifests, and a complete ZIP sample pack. The interactive layer stores user corrections as replayable semantic state beside each run manifest.
+The pipeline defaults to Spleeter for lightweight source separation, can fall back to Demucs for quality, can bypass separation entirely for fast full-mix previews, detects onsets, classifies hits, clusters similar transients, chooses representative samples, optionally synthesizes alternate samples, and exports WAVs, MIDI, target-stem reconstruction, full-context reproduced audio, manifests, selected-only packs, and complete ZIP sample packs. The interactive layer stores user corrections as replayable semantic state beside each run manifest.
 
 ## Current status
 
@@ -56,6 +56,12 @@ Implemented:
 - restore suppressed hits,
 - edited sample-pack export,
 - constraint/event log.
+
+- Spleeter source-separation backend selected by default, with `spleeter:4stems`, `spleeter:2stems`, and `spleeter:5stems` support.
+- Optional Demucs backend and automatic Spleeter→Demucs fallback when enabled.
+- True per-card checkbox selection and selected-only export under `selected/`.
+- Persisted `draw another` card action that pins the next representative hit for the cluster.
+- Immediate trim/extend card edits that rewrite preview WAVs under `overrides/hits/` and persist to supervised state.
 - Documentation for features, progress, tasks, API, timing, hit review, realtime suitability, UI, remaining work, and interactive UX.
 - Legacy Gradio apps preserved in `legacy/` for reference only.
 
@@ -64,7 +70,8 @@ Not fully complete yet:
 - No true cached feature-vector local reclustering yet.
 - No cluster merge/split/relabel workflow beyond move/pull-to-new-cluster.
 - No frontend TypeScript build/test harness yet.
-- Demucs remains offline/batch by design.
+- Spleeter progress is coarse-grained; Demucs progress exposes chunk-level work where available.
+- Demucs remains offline/batch by design and is now treated as the higher-cost quality/fallback backend.
 
 See:
 
@@ -92,12 +99,13 @@ uvicorn app:app --host 0.0.0.0 --port 7860
 
 Open `http://127.0.0.1:7860`.
 
-For fast iteration, open `Advanced`, then use `Fast full-mix mode` or set:
+For fast iteration, use the default automatic flow. To bypass source separation entirely, open `Advanced`, use `Fast preview`, or set:
 
+- `Separation engine = none`
 - `Stem = all`
 - `Clustering mode = online_preview`
 
-That bypasses Demucs and uses the near-realtime clustering path.
+That uses the full mix and the near-realtime clustering path. The default engine is Spleeter. Install it separately with `pip install -r requirements-spleeter.txt` in an environment compatible with Spleeter/TensorFlow. If Spleeter is unavailable and fallback is enabled, the app falls back to Demucs.
 
 ## Run checks
 
@@ -109,6 +117,7 @@ python3 scripts/test_interactive_supervision.py
 python3 scripts/test_supervised_export_and_force_onset.py
 python3 scripts/test_progress_contract.py
 python3 scripts/test_param_validation_and_api_errors.py
+python3 scripts/test_selected_export_card_actions.py
 ```
 
 ## Run benchmarks
@@ -125,7 +134,7 @@ The benchmark uses synthetic drum fixtures and `stem=all` so the DSP stages are
 curl http://127.0.0.1:7860/api/config
 
 curl -F 'file=@song.wav' \
-  -F 'params={"stem":"all","clustering_mode":"online_preview","target_min":4,"target_max":12}' \
+  -F 'params={"separation_backend":"spleeter","spleeter_model":"spleeter:4stems","stem":"drums","clustering_mode":"online_preview","target_min":4,"target_max":12}' \
   http://127.0.0.1:7860/api/jobs
 ```
 
@@ -149,6 +158,29 @@ curl -X POST http://127.0.0.1:7860/api/jobs/<job-id>/hits/hit%3A00003/move \
   -d '{"target_cluster_id":"cluster:0"}'
 ```
 
+
+Export selected cards only:
+
+```bash
+curl -X POST http://127.0.0.1:7860/api/jobs/<job-id>/export-selected \
+  -H 'Content-Type: application/json' \
+  -d '{"labels":["kick_0","snare_0"],"synthesize":true}'
+```
+
+Draw another representative for a card:
+
+```bash
+curl -X POST http://127.0.0.1:7860/api/jobs/<job-id>/samples/kick_0/draw
+```
+
+Trim/extend the current representative preview:
+
+```bash
+curl -X POST http://127.0.0.1:7860/api/jobs/<job-id>/samples/kick_0/edit \
+  -H 'Content-Type: application/json' \
+  -d '{"start_offset_ms":-8,"tail_offset_ms":24}'
+```
+
 List active/completed runs:
 
 ```bash
@@ -160,19 +192,30 @@ curl http://127.0.0.1:7860/api/jobs
 | Path | Purpose |
 |---|---|
 | `app.py` | FastAPI app, static UI serving, job API, run history, artifact downloads, supervised editing endpoints |
-| `pipeline_runner.py` | Timed extraction pipeline, real progress contract, disk stem/source cache, batch/online clustering routing |
+| `pipeline_runner.py` | Timed extraction pipeline, Spleeter/Demucs/none separation backends, real progress contract, disk source/stem/context cache, batch/online clustering routing |
 | `sample_extractor.py` | Core DSP/sample extraction implementation, including chunk-progress callback support for Demucs stem extraction |
 | `supervised_state.py` | Persistent semantic state, confidence, constraints, events, suggestions, force-onset, restore, undo |
-| `supervised_export.py` | Renders edited semantic state into supervised WAV/MIDI/reconstruction/ZIP artifacts |
+| `supervised_export.py` | Renders edited semantic state into supervised and selected-only WAV/MIDI/reconstruction/ZIP artifacts |
 | `web/` | Custom no-build browser frontend with clean fixed non-scrolling workstation layout, explicit upload/whole-page drag-drop, immediate uploaded waveform rendering, real-progress waveform tinting, source/stem/reproduced preview transport, common/advanced parameter separation, collapsed sidebars/bottom dock, sample-card grid, hidden-audio audition, add-onset mode, and edited export |
 | `scripts/benchmark_subprocesses.py` | Synthetic benchmark runner for stage timings |
 | `scripts/test_interactive_supervision.py` | Smoke test for supervised state endpoints |
 | `scripts/test_supervised_export_and_force_onset.py` | Smoke test for force-onset, restore, suggestion diffs, and edited exports |
 | `scripts/test_param_validation_and_api_errors.py` | Regression test for browser-style parameter coercion and visible API error details |
+| `scripts/test_selected_export_card_actions.py` | Smoke test for selected-only export, draw-next persistence, and immediate preview timing edits |
 | `docs/interactive-ux/` | Supplied interactive UX docs aligned to current implementation |
 | `docs/` | Review, timing, API, UI, feature, task, progress, and remaining-work documentation |
 | `legacy/` | Previous Gradio apps retained for reference |
 
+## Optional Spleeter backend
+
+Spleeter is the default selected backend because it is much lighter than Demucs for the common path. It is not pinned into `requirements.txt` because TensorFlow/Spleeter compatibility depends on the Python environment. Use:
+
+```bash
+pip install -r requirements-spleeter.txt
+```
+
+Leave `allow_backend_fallback=true` for normal use so missing or failing Spleeter installs automatically fall back to Demucs. Disable fallback only when debugging Spleeter itself.
+
 ## Output per run
 
 Each run is stored under `.runs/<job-id>/output/`:
@@ -187,6 +230,8 @@ Each run is stored under `.runs/<job-id>/output/`:
 - `supervision_state.json`
 - `supervised/manifest.json` after edited export
 - `supervised/sample-pack.zip` after edited export
+- `selected/sample-pack.zip` after selected-card export
+- `overrides/hits/*.wav` after immediate card trim/extend edits
 - `supervised/samples/*.wav` after edited export
 - `supervised/reconstruction.mid` after edited export
 - `supervised/reconstruction.wav` after edited export
@@ -206,7 +251,7 @@ The default UI is now intentionally simple:
 3. Upload and extraction start automatically.
 4. Automatic tuning chooses practical onset sensitivity and sample-group bounds after the source/stem is available.
 5. Sample cards appear in grouped columns as soon as their WAVs are written.
-6. The user can audition, dismiss, draw another candidate, or trim/extend a card and save that edit as a forced hit.
+6. The user can audition, dismiss, draw another candidate, or trim/extend a card. Draw and timing choices are persisted as semantic overrides and affect selected/edited exports.
 
 Advanced parameters, run history, raw tables, and supervised semantic editing remain available in collapsed panels, but they are no longer required for the common path.
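The README above describes an automatic Spleeter→Demucs fallback controlled by `allow_backend_fallback`. As a minimal sketch of that rule — the function name and signature are hypothetical, not the real `pipeline_runner` internals — the routing could look like:

```python
def resolve_backend(requested: str, spleeter_available: bool,
                    allow_backend_fallback: bool = True) -> str:
    """Return the separation engine a run would actually use.

    Hypothetical helper mirroring the documented behaviour: `none` bypasses
    separation, a missing Spleeter install falls back to Demucs when allowed.
    """
    if requested == "none":
        # Bypass separation entirely and analyse the full mix.
        return "none"
    if requested == "spleeter" and not spleeter_available:
        if allow_backend_fallback:
            # Missing/failing Spleeter install silently routes to Demucs.
            return "demucs"
        raise RuntimeError("Spleeter unavailable and fallback disabled")
    return requested
```

Disabling fallback would then only make sense while debugging a Spleeter install, since it turns a degraded run into a failed one.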
app.py CHANGED
@@ -24,7 +24,7 @@ from fastapi.middleware.cors import CORSMiddleware
 from fastapi.responses import FileResponse, JSONResponse, StreamingResponse
 from fastapi.staticfiles import StaticFiles
 
-from pipeline_runner import PipelineParams, clear_disk_cache, initial_stages, run_extraction_pipeline
+from pipeline_runner import PipelineParams, SPLEETER_MODELS, SPLEETER_STEMS, SEPARATION_BACKENDS, clear_disk_cache, initial_stages, run_extraction_pipeline
 from sample_extractor import DEMUCS_MODELS, DEMUCS_STEMS, cache_clear
 from supervised_state import (
     accept_suggestion,
@@ -39,9 +39,11 @@ from supervised_state import (
     restore_hit as apply_hit_restore,
     set_hit_review_status,
     suppress_hit as apply_hit_suppression,
+    draw_next_representative as apply_draw_next_representative,
+    edit_hit_timing as apply_hit_timing_edit,
     undo_last as apply_undo,
 )
-from supervised_export import export_supervised_state
+from supervised_export import export_selected_samples, export_supervised_state
 
 ROOT = Path(__file__).resolve().parent
 WEB_DIR = ROOT / "web"
@@ -249,6 +251,9 @@ def health() -> dict[str, str]:
 @app.get("/api/config")
 def config() -> dict[str, Any]:
     return {
+        "separation_backends": SEPARATION_BACKENDS,
+        "spleeter_models": SPLEETER_MODELS,
+        "spleeter_stems": {key: value + ["all"] for key, value in SPLEETER_STEMS.items()},
         "demucs_models": DEMUCS_MODELS,
         "demucs_stems": {key: value + ["all"] for key, value in DEMUCS_STEMS.items()},
         "defaults": asdict(PipelineParams()),
@@ -361,6 +366,62 @@ def _state_payload(job_id: str) -> dict[str, Any]:
 def _json_patch(payload: dict[str, Any] | None) -> dict[str, Any]:
     return dict(payload or {})
 
+
+def _state_for_mutation(job_id: str) -> tuple[Path, dict[str, Any]]:
+    out = _job_output_dir(job_id)
+    try:
+        return out, load_or_create_state(job_id, out)
+    except FileNotFoundError as exc:
+        raise HTTPException(status_code=409, detail="Job has no manifest yet; wait until extraction completes") from exc
+
+
+def _cluster_id_for_sample_label(state: dict[str, Any], sample_label: str) -> str:
+    clusters = state.get("clusters", {})
+    exact = [cid for cid, cluster in clusters.items() if str(cluster.get("label")) == str(sample_label)]
+    if exact:
+        return exact[0]
+    # Fall back to a classification/base-name match for labels that have been renamed by user edits.
+    base = str(sample_label).rsplit("_", 1)[0]
+    fuzzy = [cid for cid, cluster in clusters.items() if str(cluster.get("classification") or "") == base]
+    if len(fuzzy) == 1:
+        return fuzzy[0]
+    raise HTTPException(status_code=404, detail=f"Sample label not found in current state: {sample_label}")
+
+
+def _public_sample_from_cluster(job_id: str, state: dict[str, Any], cluster_id: str, label_override: str | None = None) -> dict[str, Any]:
+    clusters = state.get("clusters", {})
+    hits = state.get("hits", {})
+    if cluster_id not in clusters:
+        raise HTTPException(status_code=404, detail=f"Unknown cluster: {cluster_id}")
+    cluster = clusters[cluster_id]
+    active_ids = [hid for hid in cluster.get("hit_ids", []) if hid in hits and not hits[hid].get("suppressed")]
+    if not active_ids:
+        raise HTTPException(status_code=409, detail=f"Cluster {cluster.get('label', cluster_id)} has no active hits")
+    rep_id = cluster.get("representative_hit_id") if cluster.get("representative_hit_id") in active_ids else active_ids[0]
+    hit = hits[rep_id]
+    raw_cluster_id = cluster_id.split(":", 1)[1] if ":" in cluster_id else cluster_id
+    try:
+        raw_cluster_id_value: int | str = int(raw_cluster_id)
+    except ValueError:
+        raw_cluster_id_value = raw_cluster_id
+    file_path = str(hit.get("file") or "")
+    first_onset = min(float(hits[hid].get("onset_sec") or 0.0) for hid in active_ids)
+    return {
+        "label": label_override or cluster.get("label") or str(cluster_id),
+        "classification": cluster.get("classification") or str(cluster.get("label") or "other").rsplit("_", 1)[0],
+        "hits": len(active_ids),
+        "midi_note": cluster.get("midi_note", 60),
+        "score": "edited",
+        "duration_ms": round(float(hit.get("duration_ms") or 0.0), 1),
+        "first_onset_sec": round(first_onset, 4),
+        "representative_hit_index": int(hit.get("index") or 0),
+        "state_hit_id": rep_id,
+        "cluster_id": raw_cluster_id_value,
+        "state_cluster_id": cluster_id,
+        "file": file_path,
+        "url": _job_url(job_id, file_path) if file_path else None,
+    }
+
 @app.get("/api/jobs/{job_id}/events")
 def get_job_events(job_id: str) -> StreamingResponse:
     with jobs_lock:
@@ -422,6 +483,75 @@ def post_supervised_export(job_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
     return {"export": _serialise_export(job_id, export_manifest), "state": _state_payload(job_id)}
 
 
+@app.post("/api/jobs/{job_id}/export-selected")
+def post_selected_export(job_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
+    patch = _json_patch(payload)
+    labels = [str(item) for item in patch.get("labels", []) if str(item).strip()]
+    if not labels:
+        raise HTTPException(status_code=400, detail="labels must contain at least one selected sample label")
+    try:
+        export_manifest = export_selected_samples(
+            _job_output_dir(job_id),
+            job_id,
+            selected_labels=labels,
+            synthesize=bool(patch.get("synthesize", True)),
+            quantize=patch.get("quantize"),
+            subdivision=patch.get("subdivision"),
+        )
+    except ValueError as exc:
+        raise HTTPException(status_code=400, detail=str(exc)) from exc
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+    return {"export": _serialise_export(job_id, export_manifest), "state": _state_payload(job_id)}
+
+
+@app.post("/api/jobs/{job_id}/samples/{sample_label:path}/draw")
+def post_draw_sample(job_id: str, sample_label: str) -> dict[str, Any]:
+    try:
+        out, state = _state_for_mutation(job_id)
+        cluster_id = _cluster_id_for_sample_label(state, sample_label)
+        state = apply_draw_next_representative(out, job_id, cluster_id, source="sample-card")
+        return {"sample": _public_sample_from_cluster(job_id, state, cluster_id, label_override=sample_label), "state": public_state(state, url_for=lambda rel: _job_url(job_id, rel))}
+    except KeyError as exc:
+        raise HTTPException(status_code=404, detail=str(exc)) from exc
+    except ValueError as exc:
+        raise HTTPException(status_code=409, detail=str(exc)) from exc
+    except HTTPException:
+        raise
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+
+
+@app.post("/api/jobs/{job_id}/samples/{sample_label:path}/edit")
+def post_edit_sample(job_id: str, sample_label: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
+    patch = _json_patch(payload)
+    try:
+        out, state = _state_for_mutation(job_id)
+        cluster_id = _cluster_id_for_sample_label(state, sample_label)
+        cluster = state.get("clusters", {}).get(cluster_id) or {}
+        active_ids = [hid for hid in cluster.get("hit_ids", []) if hid in state.get("hits", {}) and not state["hits"][hid].get("suppressed")]
+        rep_id = cluster.get("representative_hit_id") if cluster.get("representative_hit_id") in active_ids else (active_ids[0] if active_ids else None)
+        if not rep_id:
+            raise HTTPException(status_code=409, detail=f"Sample {sample_label} has no active representative hit")
+        state = apply_hit_timing_edit(
+            out,
+            job_id,
+            rep_id,
+            start_offset_ms=float(patch.get("start_offset_ms", 0.0)),
+            tail_offset_ms=float(patch.get("tail_offset_ms", 0.0)),
+            source="sample-card",
+        )
+        return {"sample": _public_sample_from_cluster(job_id, state, cluster_id, label_override=sample_label), "state": public_state(state, url_for=lambda rel: _job_url(job_id, rel))}
+    except KeyError as exc:
+        raise HTTPException(status_code=404, detail=str(exc)) from exc
+    except ValueError as exc:
+        raise HTTPException(status_code=400, detail=str(exc)) from exc
+    except HTTPException:
+        raise
+    except Exception as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+
+
 @app.post("/api/jobs/{job_id}/hits/force-onset")
 def post_force_onset(job_id: str, payload: dict[str, Any] = Body(default_factory=dict)) -> dict[str, Any]:
     patch = _json_patch(payload)
docs/API.md CHANGED
@@ -30,8 +30,11 @@ Important response keys:
 
 | Key | Meaning |
 |---|---|
+| `separation_backends` | Supported separation engines: `spleeter`, `demucs`, and `none`. |
+| `spleeter_models` | Supported Spleeter model profiles. |
+| `spleeter_stems` | Valid stems per Spleeter model, plus `all`. |
 | `demucs_models` | Supported Demucs model names. |
-| `demucs_stems` | Valid stems per model, plus `all` for bypassing Demucs. |
+| `demucs_stems` | Valid stems per Demucs model, plus `all`. |
 | `defaults` | Default `PipelineParams`. |
 | `stages` | Pipeline stage definitions. |
 | `clustering_modes` | Human-readable labels for batch and online clustering modes. |
@@ -90,7 +93,7 @@ Example:
 
 ```bash
 curl -F 'file=@song.wav' \
-  -F 'params={"stem":"all","clustering_mode":"online_preview","target_min":4,"target_max":12,"synthesize":true}' \
+  -F 'params={"separation_backend":"spleeter","spleeter_model":"spleeter:4stems","stem":"drums","clustering_mode":"online_preview","target_min":4,"target_max":12,"synthesize":true}' \
   http://127.0.0.1:7860/api/jobs
 ```
 
@@ -211,8 +214,10 @@ Defined in `pipeline_runner.PipelineParams`.
 
 | Parameter | Default | Meaning |
 |---|---:|---|
-| `stem` | `drums` | Demucs source to extract, or `all` to bypass Demucs. |
-| `demucs_model` | `htdemucs_ft` | Demucs model. |
+| `stem` | `drums` | Source/stem to extract, or `all` to bypass source separation. Valid values depend on the selected backend/model. |
+| `separation_backend` | `spleeter` | Source-separation engine: `spleeter`, `demucs`, or `none`. |
+| `spleeter_model` | `spleeter:4stems` | Spleeter model profile used by the default backend. |
+| `demucs_model` | `htdemucs_ft` | Demucs model used when `separation_backend=demucs` or fallback is needed. |
 | `demucs_shifts` | `1` | Test-time shifts for Demucs quality/speed tradeoff. |
 | `demucs_overlap` | `0.25` | Demucs chunk overlap. |
 | `onset_mode` | `auto` | `auto`, `percussive`, `harmonic`, or `broadband`. |
@@ -234,6 +239,65 @@ Defined in `pipeline_runner.PipelineParams`.
 | `subdivision` | `16` | MIDI grid subdivision. |
 | `device` | `cpu` | Torch device for Demucs. |
 | `use_disk_cache` | `true` | Cache decoded full mix/stems by source digest and extraction settings. |
+| `allow_backend_fallback` | `true` | If Spleeter is selected but unavailable/fails, fall back to Demucs instead of failing the job. |
+
+
+## Sample-card action API
+
+These endpoints back the simplified card workflow in the reference-style UI. They mutate `supervision_state.json` and preserve the original batch manifest.
+
+### `POST /api/jobs/{job_id}/export-selected`
+
+Exports only the currently selected representative sample labels into `selected/` artifacts.
+
+Body:
+
+```json
+{"labels":["kick_0","snare_0"],"synthesize":true}
+```
+
+Response shape:
+
+```json
+{
+  "export": {
+    "kind": "selected-sample-export",
+    "files": {"archive": "selected/sample-pack.zip", "midi": "selected/reconstruction.mid"},
+    "file_urls": {}
+  },
+  "state": {}
+}
+```
+
+Rules:
+
+- `labels` must contain at least one visible sample label.
+- Only selected semantic clusters are rendered.
+- Suppressed hits remain excluded.
+- Pinned/drawn representatives are honored.
+- The export is written under `.runs/<job-id>/output/selected/` and does not mutate the original pack.
+
+### `POST /api/jobs/{job_id}/samples/{sample_label}/draw`
+
+Cycles a card to the next active representative hit in that semantic cluster. The chosen hit is persisted as a representative override, so later selected/all edited exports use the same choice.
+
+Response:
+
+```json
+{"sample": {"label": "kick_0", "url": "..."}, "state": {}}
+```
+
+### `POST /api/jobs/{job_id}/samples/{sample_label}/edit`
+
+Applies a timing edit to the current representative and rewrites its preview WAV immediately.
+
+Body:
+
+```json
+{"start_offset_ms":-8,"tail_offset_ms":24}
+```
+
+The backend slices from `stem.wav`, writes `overrides/hits/*_edited.wav`, updates the representative hit in semantic state, and returns a refreshed card row.
 
 ## Interactive supervision API
 
@@ -504,11 +568,11 @@ Example:
   "completed_units": 12.0,
   "total_units": 64.0,
   "stage_key": "stem",
-  "stage_label": "Stem extraction / source load",
+  "stage_label": "Stem separation / source load",
   "stage_fraction": 0.5,
   "stage_work_done": 4,
   "stage_work_total": 8,
-  "basis": "exact completed work units: Demucs chunks when available, otherwise stage boundary units; no time-based estimates"
+  "basis": "exact completed work units: Demucs chunks when available; Spleeter and non-instrumented stages advance only at real stage boundaries; no time-based estimates"
 }
 ```
 
@@ -518,6 +582,7 @@ Semantics:
 - `stage_fraction` is the current stage-local progress when known.
 - `stage_work_done` and `stage_work_total` are exact work-unit counters when a stage exposes work units.
 - Demucs separated-stem extraction exposes exact completed split chunks.
+- Spleeter reports coarse start/complete boundaries because the backend does not expose reliable chunk callbacks here.
 - Non-instrumented stages update at exact stage boundaries only.
 - The API does not provide guessed ETA or interpolated time progress.
 
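The `export-selected` endpoint above rejects empty label lists with HTTP 400 and drops blank entries. A client can mirror that rule before posting; this is an illustrative sketch with a hypothetical helper name, not part of the documented API surface:

```python
def build_selected_export_payload(labels, synthesize=True):
    """Build a request body for POST /api/jobs/{job_id}/export-selected.

    Mirrors the documented server-side rule: blank labels are dropped and an
    empty selection is rejected (the server answers HTTP 400 in that case).
    """
    cleaned = [str(item) for item in labels if str(item).strip()]
    if not cleaned:
        raise ValueError("labels must contain at least one selected sample label")
    return {"labels": cleaned, "synthesize": bool(synthesize)}
```

Validating locally just gives a clearer error before the round trip; the server remains the authority.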
docs/CARD_SELECTION_EXPORT_AND_EDITING.md ADDED
@@ -0,0 +1,72 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Card selection, selected export, draw-next, and immediate timing edits
+
+ Last updated: 2026-05-12
+
+ ## Goal
+
+ The default workflow should feel like reviewing cards rather than configuring a batch pipeline:
+
+ ```text
+ drop audio → cards appear → keep/dismiss/draw/trim → export selected
+ ```
+
+ ## Implemented
+
+ ### Per-card selection
+
+ Each visible sample card now has a real checkbox. Newly produced cards are selected by default until the user changes the selection manually; from then on, the UI respects that manual state.
+
+ ### Selected-only export
+
+ `POST /api/jobs/{job_id}/export-selected` renders only the selected card labels into a separate `selected/` export directory:
+
+ - `selected/sample-pack.zip`,
+ - `selected/samples/*.wav`,
+ - `selected/reconstruction.mid`,
+ - `selected/target_reconstruction.wav`,
+ - `selected/reconstruction.wav`,
+ - `selected/manifest.json`.
+
+ The original batch export is left untouched.
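
A minimal sketch of the selected-only filtering step. The flat manifest shape and the `select_samples` helper name are illustrative, not the app's real manifest schema:

```python
# Keep only the sample entries whose label the user left checked; the
# selected-only export then renders just these into selected/.
def select_samples(manifest: dict, selected_labels: set[str]) -> list[dict]:
    return [s for s in manifest.get("samples", []) if s["label"] in selected_labels]

manifest = {"samples": [
    {"label": "kick_01", "path": "samples/kick_01.wav"},
    {"label": "snare_01", "path": "samples/snare_01.wav"},
    {"label": "hat_01", "path": "samples/hat_01.wav"},
]}
kept = select_samples(manifest, {"kick_01", "hat_01"})
print([s["label"] for s in kept])  # → ['kick_01', 'hat_01']
```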
31
+
32
+ ### Persisted draw-next
33
+
34
+ The card “draw another” action now calls:
35
+
36
+ ```text
37
+ POST /api/jobs/{job_id}/samples/{sample_label}/draw
38
+ ```
39
+
40
+ The backend cycles the semantic cluster representative to the next active hit and records that as a pinned representative override. Later selected/edited exports honor this choice.
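
The rotation itself is a simple wrap-around over the cluster's active hits. A sketch with illustrative names (a cluster here is just a list of hit ids), not the app's real data model:

```python
# Cycle the pinned representative to the next active hit, wrapping around
# to the first hit after the last one.
def draw_next(active_hits: list[int], current_rep: int) -> int:
    i = active_hits.index(current_rep)
    return active_hits[(i + 1) % len(active_hits)]

hits = [3, 7, 12]
print(draw_next(hits, 7))   # → 12
print(draw_next(hits, 12))  # → 3 (wraps around)
```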
+
+ ### Immediate trim/extend preview
+
+ Trim/extend actions now call:
+
+ ```text
+ POST /api/jobs/{job_id}/samples/{sample_label}/edit
+ ```
+
+ The backend slices from `stem.wav`, writes an edited preview under `overrides/hits/`, updates the representative hit audio path, and returns a refreshed sample card. The user hears the edited clip immediately.
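
The slicing step reduces to converting the edited start/end times into sample indices and cutting that window out of the stem. A sketch with plain lists standing in for the real audio arrays (`slice_hit` is an illustrative name):

```python
# Cut [start_s, end_s) out of the stem, clamped to the stem length, so a
# trim/extend edit always yields a playable window.
def slice_hit(stem: list[float], sr: int, start_s: float, end_s: float) -> list[float]:
    start = max(0, int(start_s * sr))
    end = min(len(stem), int(end_s * sr))
    return stem[start:end]

stem = [0.0] * 44100  # one second of silence at 44.1 kHz
clip = slice_hit(stem, 44100, 0.25, 0.75)
print(len(clip))  # → 22050
```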
+
+ ## Validation
+
+ Covered by:
+
+ ```bash
+ python3 scripts/test_selected_export_card_actions.py
+ ```
+
+ The test verifies:
+
+ 1. extraction succeeds,
+ 2. selected-only export writes a selected pack,
+ 3. draw-next returns a playable representative WAV,
+ 4. trim/extend writes a playable edited override WAV.
+
+ ## Remaining work
+
+ - Add true cluster relabel/merge/split from the card columns.
+ - Add batch restore and bulk card operations.
+ - Add browser-level tests for checkbox selection and selected export.
+ - Add a visual diff between the original and the edited representative.
docs/FEATURES.md CHANGED
@@ -150,3 +150,19 @@ Status: implemented.
  - The default web UI is now a reference-style sample extractor workspace: compact top bar, large waveform, persistent settings panel, grouped sample columns, and bottom selection bar.
  - Users can still just drop audio anywhere; waveform rendering and extraction begin automatically.
  - Expert parameters and semantic editing tools are available without cluttering the default path.
+
+ ## Selected cards and backend simplification update
+
+ Implemented after the reference-image UI pass:
+
+ | Area | Feature | Status | Notes |
+ |---|---|---:|---|
+ | Separation | Spleeter backend | Implemented | Now the default backend; `spleeter:4stems` is preselected. |
+ | Separation | Demucs backend | Implemented | Explicit higher-cost backend and automatic fallback when enabled. |
+ | Separation | No-separation backend | Implemented | Full-mix preview path for the fastest iteration. |
+ | Export | Per-card selection | Implemented | Cards carry real checkbox state; the selected count drives `Export Selected`. |
+ | Export | Selected-only export | Implemented | `POST /api/jobs/{job_id}/export-selected` writes `selected/sample-pack.zip`. |
+ | Cards | Draw another | Implemented | Persists the next representative hit as a semantic override. |
+ | Cards | Trim/extend preview | Implemented | Immediately rewrites a playable preview WAV under `overrides/hits/`. |
+ | Docs | Separation backend docs | Implemented | See `docs/SPLEETER_AND_SEPARATION_BACKENDS.md`. |
+ | Docs | Card action docs | Implemented | See `docs/CARD_SELECTION_EXPORT_AND_EDITING.md`. |
@@ -370,3 +370,21 @@ Validation performed:
370
  - Added centered file picker/current filename, right-aligned export actions, persistent right settings panel, waveform-first canvas, grouped sample columns, and compact bottom selection bar.
371
  - Kept automatic drop-to-process behavior and progressive sample-card rendering.
372
  - Moved secondary pipeline/history/supervision/tables into a compact tools drawer.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
370
  - Added centered file picker/current filename, right-aligned export actions, persistent right settings panel, waveform-first canvas, grouped sample columns, and compact bottom selection bar.
371
  - Kept automatic drop-to-process behavior and progressive sample-card rendering.
372
  - Moved secondary pipeline/history/supervision/tables into a compact tools drawer.
373
+
374
+
375
+ ## Pass 14: selected cards and Spleeter backend
376
+
377
+ Completed in this pass:
378
+
379
+ 1. Added `spleeter` as the default separation backend, with selectable `spleeter:2stems`, `spleeter:4stems`, and `spleeter:5stems` profiles.
380
+ 2. Kept `demucs` as a quality/fallback backend and `none` as the full-mix preview backend.
381
+ 3. Added optional `requirements-spleeter.txt` instead of forcing TensorFlow/Spleeter into the base install.
382
+ 4. Added per-card checkbox state with manual select-all/clear behavior.
383
+ 5. Added selected-only backend export via `POST /api/jobs/{job_id}/export-selected`.
384
+ 6. Made `draw another` persist the chosen representative in semantic state.
385
+ 7. Made trim/extend rewrite playable preview audio immediately under `overrides/hits/`.
386
+ 8. Added `scripts/test_selected_export_card_actions.py`.
387
+
388
+ Outcome:
389
+
390
+ The default app now behaves more like a card review tool: drop audio, let Spleeter/fallback separation run, review grouped cards, select/dismiss/draw/trim, and export only the selected pack.
docs/REMAINING_WORK.md CHANGED
@@ -103,3 +103,20 @@ The default UI is now a cleaner fixed, non-scrolling workstation layout with col
  - Add selected-only backend export so `Export Selected` creates an artifact containing only selected representatives.
  - Replace Unicode icons with a small icon system if exact visual parity is required.
  - Validate with a real browser screenshot comparison against the supplied reference image.
+
+ ## Closed after selected-card/Spleeter pass
+
+ - `Export Selected` now renders a selected-only backend artifact instead of downloading the full generated pack.
+ - Sample card checkboxes are real per-card state.
+ - Draw-next is persisted as a representative override in `supervision_state.json`.
+ - Trim/extend rewrites preview audio immediately and persists the edited representative hit.
+ - Spleeter is now the default backend, with Demucs fallback and full-mix preview still available.
+
+ ## Remaining after selected-card/Spleeter pass
+
+ 1. Add cluster column merge/split/relabel directly in the card UI.
+ 2. Add localized high-quality separation refinement: run Demucs or another backend on short candidate regions instead of the entire file.
+ 3. Investigate AudioSep-like query-guided separation for overlapping drum events as an optional refinement path.
+ 4. Investigate inpainting for cleaning hit tails/bleed after localization, not for first-pass discovery.
+ 5. Add browser-level regression tests for drop-to-process, waveform zoom/pan, card selection, selected export, draw-next, and trim/extend.
+ 6. Add source-vs-stem-vs-reproduced diagnostics for cards where overlaps remain audible.
docs/SPLEETER_AND_SEPARATION_BACKENDS.md ADDED
@@ -0,0 +1,96 @@
+ # Spleeter and separation backends
+
+ Last updated: 2026-05-12
+
+ ## Decision
+
+ The application now defaults to Spleeter:
+
+ ```json
+ {
+   "separation_backend": "spleeter",
+   "spleeter_model": "spleeter:4stems",
+   "stem": "drums",
+   "allow_backend_fallback": true
+ }
+ ```
+
+ Spleeter is treated as the normal first-pass separation backend because it is much lighter for the common UX: drop a track, get drum-card candidates quickly, and only escalate to heavier processing when necessary.
+
+ Demucs remains available as a higher-cost quality/fallback backend:
+
+ ```json
+ {"separation_backend": "demucs", "demucs_model": "htdemucs_ft"}
+ ```
+
+ Full-mix preview remains available for the fastest possible iteration:
+
+ ```json
+ {"separation_backend": "none", "stem": "all", "clustering_mode": "online_preview"}
+ ```
+
+ ## Supported engines
+
+ | Backend | Status | Use |
+ |---|---:|---|
+ | `spleeter` | Default | Lightweight drum/source separation for the common automatic workflow. |
+ | `demucs` | Supported | Higher-cost quality backend and fallback when Spleeter is unavailable or insufficient. |
+ | `none` | Supported | Bypass source separation and process the full mix. Best for quick UI/debug iteration. |
+
+ ## Spleeter models
+
+ | Model | Stems exposed |
+ |---|---|
+ | `spleeter:2stems` | `vocals`, `accompaniment`, `all` |
+ | `spleeter:4stems` | `vocals`, `drums`, `bass`, `other`, `all` |
+ | `spleeter:5stems` | `vocals`, `drums`, `bass`, `piano`, `other`, `all` |
+
+ ## Installation
+
+ Spleeter is optional and intentionally not installed by the main `requirements.txt`, because TensorFlow/Spleeter compatibility can be environment-sensitive.
+
+ Install it only when needed:
+
+ ```bash
+ pip install -r requirements-spleeter.txt
+ ```
+
+ For the normal local app, leave `allow_backend_fallback=true`. If Spleeter is unavailable or fails, the job falls back to Demucs and logs that fallback in the stage details. Disable fallback only when actively debugging Spleeter.
+
+ ## Caching
+
+ The disk cache key includes:
+
+ - source digest,
+ - selected stem,
+ - separation backend,
+ - Spleeter model,
+ - Demucs model,
+ - Demucs shifts/overlap.
+
+ This avoids accidentally reusing stems from a different engine or model.
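
A minimal sketch of how such a backend-aware key can be derived, mirroring the fields listed above (the `stem_cache_key` helper and exact key layout are illustrative):

```python
# Hash the source digest plus every separation-relevant parameter, so any
# change of backend or model yields a different cache entry.
import hashlib
import json

def stem_cache_key(source_sha256: str, params: dict) -> str:
    payload = {
        "source_sha256": source_sha256,
        "stem": params["stem"],
        "separation_backend": params["separation_backend"],
        "spleeter_model": params["spleeter_model"],
        "demucs_model": params["demucs_model"],
        "demucs_shifts": params["demucs_shifts"],
        "demucs_overlap": params["demucs_overlap"],
    }
    blob = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

base = {"stem": "drums", "spleeter_model": "spleeter:4stems",
        "demucs_model": "htdemucs_ft", "demucs_shifts": 1, "demucs_overlap": 0.25}
a = stem_cache_key("abc", {**base, "separation_backend": "spleeter"})
b = stem_cache_key("abc", {**base, "separation_backend": "demucs"})
# Different backend → different key, so cached stems never cross engines.
```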
+
+ ## Progress behavior
+
+ Demucs exposes chunk progress through the existing extraction callback, so the waveform can advance during stem separation when chunk data is available.
+
+ Spleeter does not expose reliable per-chunk progress through the current backend path. The app therefore reports only real start/completion boundaries for Spleeter. It does not interpolate fake progress.
+
+ ## Future research: localized separation
+
+ The recommended next architecture is not “run Demucs on the whole track after Spleeter.” The better path is:
+
+ 1. Use Spleeter or full-mix onset detection to find candidate hit regions.
+ 2. Expand each candidate region with context padding.
+ 3. Run heavier separation only on those short windows.
+ 4. Stitch or use the refined region only for the card/export preview.
+
+ That could make Demucs feasible as a local refinement step instead of an expensive full-track prerequisite.
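
Steps 1–3 can be sketched as padding each candidate hit and merging overlapping windows, so the heavy backend runs once per region rather than once per hit. Times are in seconds; the function name and default paddings are illustrative:

```python
# Turn onset times into padded, overlap-merged (start, end) refinement windows.
def refinement_windows(onsets_s: list[float], pad_s: float = 0.5,
                       hit_len_s: float = 0.25) -> list[tuple[float, float]]:
    windows = sorted((max(0.0, t - pad_s), t + hit_len_s + pad_s) for t in onsets_s)
    merged: list[tuple[float, float]] = []
    for start, end in windows:
        if merged and start <= merged[-1][1]:  # overlaps the previous window
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

print(refinement_windows([1.0, 1.25, 5.0]))
# → [(0.5, 2.0), (4.5, 5.75)]
```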
+
+ ## Future research: overlapping samples
+
+ AudioSep-like text/query-guided separation may be useful for overlaps where source classes matter, for example “kick drum”, “closed hi-hat”, or “snare transient”. It should be investigated as an optional refinement tool, not as the default first-pass extractor.
+
+ USEF-TSE and target-speaker-extraction style systems are mostly speech-targeted. They are not a good near-term default for drum sample extraction, but the conditioning pattern is relevant if the app later supports “extract more sounds like this selected example.”
+
+ Audio inpainting is more promising for cleaning card tails/gaps and removing overlap residue after a hit has already been localized than for first-pass sample discovery.
docs/TASKS.md CHANGED
@@ -202,3 +202,25 @@ Next:
  - [x] Preserve existing review/edit tools in a secondary drawer.
  - [ ] Implement true per-card selection and selected-only export artifacts.
  - [ ] Run browser screenshot comparison in an environment that allows localhost rendering.
+
+ ## Selected cards and separation backends
+
+ Completed:
+
+ - [x] Add Spleeter backend option.
+ - [x] Default new jobs to Spleeter `spleeter:4stems` + `drums`.
+ - [x] Keep Demucs as explicit quality/fallback backend.
+ - [x] Add `none` backend for full-mix preview.
+ - [x] Add optional `requirements-spleeter.txt`.
+ - [x] Add per-card checkbox selection.
+ - [x] Add selected-only backend export.
+ - [x] Persist draw-next representative overrides.
+ - [x] Rewrite preview audio immediately for trim/extend edits.
+ - [x] Add regression smoke test for selected export/card actions.
+
+ Next:
+
+ - [ ] Add card-column relabel/merge/split actions.
+ - [ ] Add browser-level tests for card selection/export/edit flows.
+ - [ ] Add localized high-quality separation refinement on short candidate windows.
docs/interactive-ux/PROGRESS.md CHANGED
@@ -107,3 +107,21 @@ The default UX now follows the interactive-doc direction more closely by hiding
  - waveform zoom/pan supports close inspection without leaving the main flow.

  The remaining mismatch is that drawn candidate cards are still frontend candidate previews, not persisted representative-selection constraints. That should be promoted into the semantic state model next.
+
+ ## Pass 14: selected cards and Spleeter backend
+
+ Completed in this pass:
+
+ 1. Added `spleeter` as the default separation backend, with selectable `spleeter:2stems`, `spleeter:4stems`, and `spleeter:5stems` profiles.
+ 2. Kept `demucs` as a quality/fallback backend and `none` as the full-mix preview backend.
+ 3. Added optional `requirements-spleeter.txt` instead of forcing TensorFlow/Spleeter into the base install.
+ 4. Added per-card checkbox state with manual select-all/clear behavior.
+ 5. Added selected-only backend export via `POST /api/jobs/{job_id}/export-selected`.
+ 6. Made `draw another` persist the chosen representative in semantic state.
+ 7. Made trim/extend rewrite playable preview audio immediately under `overrides/hits/`.
+ 8. Added `scripts/test_selected_export_card_actions.py`.
+
+ Outcome:
+
+ The default app now behaves more like a card review tool: drop audio, let Spleeter/fallback separation run, review grouped cards, select/dismiss/draw/trim, and export only the selected pack.
docs/interactive-ux/TASKS.md CHANGED
@@ -131,3 +131,25 @@ The project now has a replayable state/events/constraints foundation plus superv
  - [ ] Persist drawn candidate cards as representative overrides.
  - [ ] Recluster locally after card-level decisions.
  - [ ] Add browser-level tests for the card flow.
+
+ ## Selected cards and separation backends
+
+ Completed:
+
+ - [x] Add Spleeter backend option.
+ - [x] Default new jobs to Spleeter `spleeter:4stems` + `drums`.
+ - [x] Keep Demucs as explicit quality/fallback backend.
+ - [x] Add `none` backend for full-mix preview.
+ - [x] Add optional `requirements-spleeter.txt`.
+ - [x] Add per-card checkbox selection.
+ - [x] Add selected-only backend export.
+ - [x] Persist draw-next representative overrides.
+ - [x] Rewrite preview audio immediately for trim/extend edits.
+ - [x] Add regression smoke test for selected export/card actions.
+
+ Next:
+
+ - [ ] Add card-column relabel/merge/split actions.
+ - [ ] Add browser-level tests for card selection/export/edit flows.
+ - [ ] Add localized high-quality separation refinement on short candidate windows.
pipeline_runner.py CHANGED
@@ -7,6 +7,8 @@ import hashlib
  import json
  import os
  import shutil
+ import subprocess
+ import sys
  import tempfile
  import time
  from contextlib import contextmanager
@@ -37,10 +39,20 @@ from sample_extractor import (

  ProgressCallback = Callable[[dict[str, Any]], None]

+ SPLEETER_MODELS = ["spleeter:4stems", "spleeter:2stems", "spleeter:5stems"]
+ SPLEETER_STEMS = {
+     "spleeter:2stems": ["vocals", "accompaniment"],
+     "spleeter:4stems": ["vocals", "drums", "bass", "other"],
+     "spleeter:5stems": ["vocals", "drums", "bass", "piano", "other"],
+ }
+ SEPARATION_BACKENDS = ["spleeter", "demucs", "none"]
+

  @dataclass
  class PipelineParams:
      stem: str = "drums"
+     separation_backend: str = "spleeter"
+     spleeter_model: str = "spleeter:4stems"
      demucs_model: str = "htdemucs_ft"
      demucs_shifts: int = 1
      demucs_overlap: float = 0.25
@@ -64,6 +76,7 @@
      device: str = "cpu"
      auto_tune: bool = True
      use_disk_cache: bool = True
+     allow_backend_fallback: bool = True

      @classmethod
      def from_mapping(cls, data: dict[str, Any] | None) -> "PipelineParams":
@@ -86,7 +99,7 @@
          "attack_ms",
          "mel_threshold",
      }
-     bool_fields = {"synthesize", "quantize_midi", "auto_tune", "use_disk_cache"}
+     bool_fields = {"synthesize", "quantize_midi", "auto_tune", "use_disk_cache", "allow_backend_fallback"}

      def coerce_bool(name: str, value: Any) -> bool:
          if isinstance(value, bool):
@@ -128,11 +141,23 @@
          return params

      def validate(self) -> None:
+         if self.separation_backend not in set(SEPARATION_BACKENDS):
+             raise ValueError(f"Unsupported separation backend: {self.separation_backend}")
+         if self.spleeter_model not in SPLEETER_MODELS:
+             raise ValueError(f"Unsupported Spleeter model: {self.spleeter_model}")
          if self.demucs_model not in DEMUCS_MODELS:
              raise ValueError(f"Unsupported Demucs model: {self.demucs_model}")
-         allowed_stems = set(DEMUCS_STEMS.get(self.demucs_model, [])) | {"all"}
+         if self.separation_backend == "demucs":
+             allowed_stems = set(DEMUCS_STEMS.get(self.demucs_model, [])) | {"all"}
+             backend_label = self.demucs_model
+         elif self.separation_backend == "spleeter":
+             allowed_stems = set(SPLEETER_STEMS.get(self.spleeter_model, [])) | {"all"}
+             backend_label = self.spleeter_model
+         else:
+             allowed_stems = {"all"}
+             backend_label = "full mix"
          if self.stem not in allowed_stems:
-             raise ValueError(f"Stem '{self.stem}' is not available for {self.demucs_model}")
+             raise ValueError(f"Stem '{self.stem}' is not available for {backend_label}")
          if self.onset_mode not in {"auto", "percussive", "harmonic", "broadband"}:
              raise ValueError(f"Unsupported onset mode: {self.onset_mode}")
          if self.linkage not in {"average", "complete", "single"}:
@@ -197,7 +222,7 @@


  STAGE_DEFS = [
-     ("stem", "Stem extraction / source load"),
+     ("stem", "Stem separation / source load"),
      ("auto_tune", "Automatic parameter tuning"),
      ("bpm", "Tempo detection"),
      ("onsets", "Onset detection + slicing"),
@@ -244,7 +269,7 @@ def _progress_payload(stages: list[StageTiming]) -> dict[str, Any]:
          "stage_fraction": round(running_progress, 6),
          "stage_work_done": running.work_done if running else None,
          "stage_work_total": running.work_total if running else None,
-         "basis": "exact completed work units: Demucs chunks when available, otherwise stage boundary units; no time-based estimates",
+         "basis": "exact completed work units: Demucs chunks when available; Spleeter and non-instrumented stages advance only at real stage boundaries; no time-based estimates",
      }
@@ -371,7 +396,7 @@ def _make_reproduction_mix(target_reconstruction: np.ndarray, context_bed: np.nd
  MODULE_ROOT = Path(__file__).resolve().parent
  CACHE_DIR = Path(os.environ["DSE_CACHE_DIR"]) if os.environ.get("DSE_CACHE_DIR") else MODULE_ROOT / ".cache"
  STEM_CACHE_DIR = CACHE_DIR / "stems"
- CACHE_VERSION = "dse-cache-v2"
+ CACHE_VERSION = "dse-cache-v3-separation-backends"


  def _write_audio(path: Path, audio: np.ndarray, sr: int, subtype: str = "PCM_24") -> None:
@@ -392,6 +417,8 @@ def _stem_cache_path(audio_path: str | os.PathLike[str], params: PipelineParams)
      "version": CACHE_VERSION,
      "source_sha256": _sha256_file(audio_path),
      "stem": params.stem,
+     "separation_backend": params.separation_backend,
+     "spleeter_model": params.spleeter_model,
      "demucs_model": params.demucs_model,
      "demucs_shifts": params.demucs_shifts,
      "demucs_overlap": params.demucs_overlap,
@@ -401,19 +428,109 @@
      return STEM_CACHE_DIR / f"{key}.wav"


+ def _context_cache_path(stem_cache_path: Path) -> Path:
+     return stem_cache_path.with_name(f"{stem_cache_path.stem}.context.wav")
+
+
  def clear_disk_cache() -> None:
      if CACHE_DIR.exists():
          shutil.rmtree(CACHE_DIR)


- def _load_or_extract_stem(audio_path: str | os.PathLike[str], params: PipelineParams, progress_cb: Callable[[dict[str, Any]], None] | None = None) -> tuple[np.ndarray, int, str]:
-     if params.use_disk_cache:
-         cache_path = _stem_cache_path(audio_path, params)
-         if cache_path.exists():
-             audio, sr = sf.read(cache_path, dtype="float32", always_2d=False)
-             if progress_cb:
-                 progress_cb({"fraction": 1.0, "completed_units": 1, "total_units": 1, "detail": f"{params.stem} disk-cache hit"})
-             return np.asarray(audio, dtype=np.float32), int(sr), f"{params.stem} disk-cache hit"
+ def _load_spleeter_audio(path: Path, sr: int | None = None) -> tuple[np.ndarray, int]:
+     audio, loaded_sr = librosa.load(path, sr=sr, mono=True)
+     return _mono(audio), int(loaded_sr)
+
+
+ def _spleeter_output_file(root: Path, source_stem: str, stem: str) -> Path | None:
+     candidates = [root / source_stem / f"{stem}.wav", root / f"{stem}.wav"]
+     candidates.extend(root.glob(f"**/{stem}.wav"))
+     for candidate in candidates:
+         if candidate.exists():
+             return candidate
+     return None
+
+
+ def _extract_spleeter_separation(
+     audio_path: str | os.PathLike[str],
+     params: PipelineParams,
+     progress_cb: Callable[[dict[str, Any]], None] | None = None,
+ ) -> tuple[np.ndarray, int, np.ndarray | None, str]:
+     """Run Spleeter and return target stem plus the sum of non-target stems.
+
+     Progress is deliberately coarse because Spleeter does not expose reliable
+     chunk callbacks through the CLI/Python API. We report start/completion only;
+     the global UI therefore never interpolates fake progress.
+     """
+     if progress_cb:
+         progress_cb({"fraction": 0.0, "completed_units": 0, "total_units": 1, "detail": f"Spleeter {params.spleeter_model} starting"})
+
+     with tempfile.TemporaryDirectory(prefix="dse_spleeter_") as tmp:
+         tmpdir = Path(tmp)
+         commands = [
+             [sys.executable, "-m", "spleeter", "separate", "-p", params.spleeter_model, "-o", str(tmpdir), str(audio_path)],
+             ["spleeter", "separate", "-p", params.spleeter_model, "-o", str(tmpdir), str(audio_path)],
+         ]
+         failures: list[str] = []
+         completed = None
+         for cmd in commands:
+             try:
+                 completed = subprocess.run(cmd, capture_output=True, text=True, check=False, timeout=60 * 30)
+             except FileNotFoundError as exc:
+                 failures.append(str(exc))
+                 continue
+             if completed.returncode == 0:
+                 break
+             failures.append((completed.stderr or completed.stdout or "Spleeter failed").strip()[-1200:])
+         if completed is None or completed.returncode != 0:
+             raise RuntimeError("; ".join(part for part in failures if part) or "Spleeter failed")
+
+         source_stem = Path(audio_path).stem
+         stems = SPLEETER_STEMS[params.spleeter_model]
+         paths = {stem: _spleeter_output_file(tmpdir, source_stem, stem) for stem in stems}
+         missing = [stem for stem, path in paths.items() if path is None]
+         if missing:
+             raise RuntimeError(f"Spleeter finished but did not write expected stem(s): {', '.join(missing)}")
+         if params.stem not in paths:
+             raise RuntimeError(f"Spleeter model {params.spleeter_model} does not provide stem '{params.stem}'")
+
+         target, sr = _load_spleeter_audio(paths[params.stem])  # type: ignore[arg-type]
+         context_parts: list[np.ndarray] = []
+         for name, path in paths.items():
+             if name == params.stem or path is None:
+                 continue
+             part, _ = _load_spleeter_audio(path, sr=sr)
+             context_parts.append(_pad_or_trim(part, len(target)))
+         context = np.sum(np.stack(context_parts), axis=0).astype(np.float32) if context_parts else None
+
+     if progress_cb:
+         progress_cb({"fraction": 1.0, "completed_units": 1, "total_units": 1, "detail": f"Spleeter {params.spleeter_model} complete"})
+     return target.astype(np.float32), sr, context, f"{params.stem} via {params.spleeter_model}"
+
+
+ def _extract_demucs_separation(
+     audio_path: str | os.PathLike[str],
+     params: PipelineParams,
+     progress_cb: Callable[[dict[str, Any]], None] | None = None,
+ ) -> tuple[np.ndarray, int, np.ndarray | None, str]:
      audio, sr = extract_stem(
          str(audio_path),
          stem=params.stem,
@@ -423,12 +540,50 @@ def _load_or_extract_stem(audio_path: str | os.PathLike[str], params: PipelinePa
          overlap=float(params.demucs_overlap),
          progress_cb=progress_cb,
      )
-     detail = f"{params.stem} via {params.demucs_model}" if params.stem != "all" else "loaded full mix"
+     return audio, sr, None, f"{params.stem} via Demucs {params.demucs_model}"
+
+
+ def _load_or_extract_separation(audio_path: str | os.PathLike[str], params: PipelineParams, progress_cb: Callable[[dict[str, Any]], None] | None = None) -> tuple[np.ndarray, int, str, np.ndarray | None]:
+     if params.stem == "all" or params.separation_backend == "none":
+         audio, sr = librosa.load(audio_path, sr=44100, mono=True)
+         if progress_cb:
+             progress_cb({"fraction": 1.0, "completed_units": 1, "total_units": 1, "detail": "loaded full mix"})
+         return _mono(audio), int(sr), "loaded full mix", None
+
+     cache_path = _stem_cache_path(audio_path, params)
+     context_cache = _context_cache_path(cache_path)
+     if params.use_disk_cache and cache_path.exists():
+         audio, sr = sf.read(cache_path, dtype="float32", always_2d=False)
+         context = None
+         if context_cache.exists():
+             context_audio, context_sr = sf.read(context_cache, dtype="float32", always_2d=False)
+             context = _pad_or_trim(_mono(context_audio), len(_mono(audio))) if int(context_sr) == int(sr) else _mono(librosa.resample(_mono(context_audio), orig_sr=int(context_sr), target_sr=int(sr)))
+         if progress_cb:
+             progress_cb({"fraction": 1.0, "completed_units": 1, "total_units": 1, "detail": f"{params.stem} disk-cache hit"})
+         return np.asarray(audio, dtype=np.float32), int(sr), f"{params.stem} disk-cache hit", context
+
+     context: np.ndarray | None = None
+     if params.separation_backend == "spleeter":
+         try:
+             audio, sr, context, detail = _extract_spleeter_separation(audio_path, params, progress_cb=progress_cb)
+         except Exception as exc:
+             if not params.allow_backend_fallback:
+                 raise
+             if progress_cb:
+                 progress_cb({"fraction": 0.0, "completed_units": 0, "total_units": 1, "detail": f"Spleeter unavailable; falling back to Demucs: {exc}"})
+             audio, sr, context, demucs_detail = _extract_demucs_separation(audio_path, params, progress_cb=progress_cb)
+             detail = f"Spleeter unavailable ({exc}); fallback {demucs_detail}"
+     elif params.separation_backend == "demucs":
+         audio, sr, context, detail = _extract_demucs_separation(audio_path, params, progress_cb=progress_cb)
+     else:
+         raise ValueError(f"Unsupported separation backend: {params.separation_backend}")
+
      if params.use_disk_cache:
-         cache_path = _stem_cache_path(audio_path, params)
          _write_audio(cache_path, audio, sr, subtype="PCM_16")
          detail += " · cached"
-     return audio, sr, detail
@@ -605,7 +760,7 @@
              work_total=event.get("total_units"),
          )

-     raw_stem_audio, stem_sr, stem_detail = _load_or_extract_stem(audio_path, params, progress_cb=_stem_progress)
      source_raw = _load_source_mix(audio_path, stem_sr)
      length = max(len(raw_stem_audio), len(source_raw))
      raw_stem_audio = _pad_or_trim(raw_stem_audio, length)
@@ -613,8 +768,9 @@
      gain = _common_gain(raw_stem_audio if params.stem != "all" else source_raw, source_raw)
      stem_audio = (raw_stem_audio / gain).astype(np.float32)
      source_audio = (source_raw / gain).astype(np.float32)
-     context_bed = np.zeros_like(source_audio) if params.stem == "all" else (source_audio - stem_audio).astype(np.float32)
-     stage.detail = stem_detail + (" · reproduction uses full mix" if params.stem == "all" else " · reproduction uses residual non-target stems")
      _write_audio(out / "source.wav", _soft_limit(source_audio), stem_sr, subtype="PCM_16")
      _write_audio(out / "stem.wav", _soft_limit(stem_audio), stem_sr, subtype="PCM_16")
      _write_audio(out / "context_bed.wav", _soft_limit(context_bed), stem_sr, subtype="PCM_16")
583
+ if context is not None:
584
+ _write_audio(context_cache, context, sr, subtype="PCM_16")
585
  detail += " · cached"
586
+ return audio, sr, detail, context
587
 
588
 
589
 
 
760
  work_total=event.get("total_units"),
761
  )
762
 
763
+ raw_stem_audio, stem_sr, stem_detail, separated_context = _load_or_extract_separation(audio_path, params, progress_cb=_stem_progress)
764
  source_raw = _load_source_mix(audio_path, stem_sr)
765
  length = max(len(raw_stem_audio), len(source_raw))
766
  raw_stem_audio = _pad_or_trim(raw_stem_audio, length)
 
768
  gain = _common_gain(raw_stem_audio if params.stem != "all" else source_raw, source_raw)
769
  stem_audio = (raw_stem_audio / gain).astype(np.float32)
770
  source_audio = (source_raw / gain).astype(np.float32)
771
+ context_bed = np.zeros_like(source_audio) if params.stem == "all" else (_pad_or_trim(separated_context, length) / gain if separated_context is not None else (source_audio - stem_audio)).astype(np.float32)
772
+ context_detail = "full mix" if params.stem == "all" else ("separated non-target stems" if separated_context is not None else "residual non-target stems")
773
+ stage.detail = stem_detail + f" · reproduction uses {context_detail}"
774
  _write_audio(out / "source.wav", _soft_limit(source_audio), stem_sr, subtype="PCM_16")
775
  _write_audio(out / "stem.wav", _soft_limit(stem_audio), stem_sr, subtype="PCM_16")
776
  _write_audio(out / "context_bed.wav", _soft_limit(context_bed), stem_sr, subtype="PCM_16")
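The context bed built from the non-target Spleeter stems relies on length-matching every stem to the target before summation. A minimal standalone sketch of that step, with `pad_or_trim` standing in for the repo's `_pad_or_trim` helper:

```python
import numpy as np

def pad_or_trim(x: np.ndarray, length: int) -> np.ndarray:
    # Zero-pad short arrays and truncate long ones so every stem matches the target length.
    if len(x) >= length:
        return x[:length]
    return np.pad(x, (0, length - len(x)))

# Two non-target stems of unequal length, summed into one context bed of length 4.
parts = [np.ones(3, dtype=np.float32), np.ones(5, dtype=np.float32)]
context = np.sum(np.stack([pad_or_trim(p, 4) for p in parts]), axis=0).astype(np.float32)
print(context.tolist())  # [2.0, 2.0, 2.0, 1.0]
```

Stacking only works because every part has been forced to the same length first, which is exactly why the pipeline pads or trims before `np.stack`.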
requirements-spleeter.txt ADDED
@@ -0,0 +1,3 @@
+# Optional Spleeter backend.
+# Install in a dedicated Python environment compatible with Spleeter/TensorFlow.
+spleeter
scripts/test_selected_export_card_actions.py ADDED
@@ -0,0 +1,95 @@
+#!/usr/bin/env python3
+"""Smoke-test card selection semantics, selected-only export, draw-next, and immediate clip edits."""
+from __future__ import annotations
+
+import io
+import json
+import sys
+import time
+import zipfile
+from pathlib import Path
+from urllib.parse import quote
+
+import soundfile as sf
+from fastapi.testclient import TestClient
+
+ROOT = Path(__file__).resolve().parents[1]
+if str(ROOT) not in sys.path:
+    sys.path.insert(0, str(ROOT))
+
+from app import app  # noqa: E402
+from synth_generator import generate_test_song  # noqa: E402
+
+
+def wait_for_job(client: TestClient, job_id: str) -> dict:
+    for _ in range(120):
+        payload = client.get(f"/api/jobs/{job_id}").json()
+        if payload["status"] in {"complete", "error"}:
+            return payload
+        time.sleep(0.1)
+    raise TimeoutError(job_id)
+
+
+def main() -> int:
+    song = generate_test_song(pattern_name="funk", bars=1, bpm=124, add_bass=False)
+    buf = io.BytesIO()
+    sf.write(buf, song.drums_only, song.sr, format="WAV")
+    buf.seek(0)
+
+    client = TestClient(app)
+    response = client.post(
+        "/api/jobs",
+        files={"file": ("cards.wav", buf, "audio/wav")},
+        data={"params": json.dumps({"stem": "all", "clustering_mode": "online_preview", "target_min": 3, "target_max": 10})},
+    )
+    response.raise_for_status()
+    job_id = response.json()["id"]
+    job = wait_for_job(client, job_id)
+    assert job["status"] == "complete", job.get("error")
+    samples = job["result"]["samples"]
+    assert samples, "expected at least one sample"
+    labels = [sample["label"] for sample in samples[: min(2, len(samples))]]
+
+    selected = client.post(f"/api/jobs/{job_id}/export-selected", json={"labels": labels})
+    selected.raise_for_status()
+    selected_payload = selected.json()["export"]
+    assert selected_payload["kind"] == "selected-sample-export"
+    assert selected_payload["selected_labels"] == sorted(labels)
+    archive_response = client.get(selected_payload["file_urls"]["archive"])
+    archive_response.raise_for_status()
+    with zipfile.ZipFile(io.BytesIO(archive_response.content)) as zf:
+        names = zf.namelist()
+        assert any(name.endswith(".wav") for name in names), names
+
+    label = labels[0]
+    draw = client.post(f"/api/jobs/{job_id}/samples/{quote(label, safe='')}/draw", json={})
+    draw.raise_for_status()
+    drawn = draw.json()["sample"]
+    assert drawn["label"] == label
+    assert drawn["url"]
+    drawn_audio = client.get(drawn["url"])
+    drawn_audio.raise_for_status()
+    assert drawn_audio.content[:4] == b"RIFF"
+
+    edit = client.post(f"/api/jobs/{job_id}/samples/{quote(label, safe='')}/edit", json={"start_offset_ms": 5, "tail_offset_ms": 30})
+    edit.raise_for_status()
+    edited = edit.json()["sample"]
+    assert edited["label"] == label
+    assert "overrides/hits" in edited["file"], edited
+    edited_audio = client.get(edited["url"])
+    edited_audio.raise_for_status()
+    assert edited_audio.content[:4] == b"RIFF"
+
+    print(json.dumps({
+        "status": "ok",
+        "job_id": job_id,
+        "selected_labels": labels,
+        "drawn_representative_hit_index": drawn["representative_hit_index"],
+        "edited_file": edited["file"],
+        "archive": selected_payload["files"]["archive"],
+    }, indent=2))
+    return 0


+if __name__ == "__main__":
+    raise SystemExit(main())
supervised_export.py CHANGED
@@ -178,6 +178,9 @@ def export_supervised_state(
     synthesize: bool = True,
     quantize: bool | None = None,
     subdivision: int | None = None,
+    selected_labels: set[str] | list[str] | None = None,
+    export_dir_name: str = "supervised",
+    kind: str = "supervised-export",
 ) -> dict[str, Any]:
     """Create edited artifacts from ``supervision_state.json``.

@@ -188,13 +191,22 @@
     state = load_or_create_state(job_id, out)
     recompute_scores(state)

-    export_dir = out / "supervised"
+    safe_export_dir_name = "".join(ch if ch.isalnum() or ch in {"-", "_"} else "_" for ch in str(export_dir_name or "supervised")).strip("_") or "supervised"
+    export_prefix = safe_export_dir_name
+    selected_label_set = {str(label) for label in selected_labels} if selected_labels else None
+
+    export_dir = out / safe_export_dir_name
     if export_dir.exists():
         shutil.rmtree(export_dir)
     samples_dir = export_dir / "samples"
     samples_dir.mkdir(parents=True, exist_ok=True)

     clusters = _state_to_clusters(out, state)
+    if selected_label_set is not None:
+        clusters = [cluster for cluster in clusters if cluster.label in selected_label_set]
+        missing = sorted(selected_label_set - {cluster.label for cluster in clusters})
+        if missing:
+            raise ValueError(f"Selected sample label(s) not found in current state: {', '.join(missing)}")
     bpm = float(manifest.get("bpm") or 120.0)
     sr = int(manifest.get("sample_rate") or 44100)
     params = manifest.get("params") or {}
@@ -238,13 +250,13 @@
     rendered = _make_reproduction_mix(target_rendered, context_bed, max(source_length, len(target_rendered)))
     _write_audio(export_dir / "target_reconstruction.wav", _soft_limit(target_rendered), sr, subtype="PCM_16")
     _write_audio(export_dir / "reconstruction.wav", rendered, sr, subtype="PCM_16")
-    files["midi"] = "supervised/reconstruction.mid"
-    files["target_reconstruction"] = "supervised/target_reconstruction.wav"
-    files["reconstruction"] = "supervised/reconstruction.wav"
+    files["midi"] = f"{export_prefix}/reconstruction.mid"
+    files["target_reconstruction"] = f"{export_prefix}/target_reconstruction.wav"
+    files["reconstruction"] = f"{export_prefix}/reconstruction.wav"

     for cluster in sorted(clusters, key=lambda item: item.count, reverse=True):
         best = cluster.best_hit
-        sample_file = f"supervised/samples/{cluster.label}.wav"
+        sample_file = f"{export_prefix}/samples/{cluster.label}.wav"
         best.save(str(out / sample_file))
         quality = sample_quality_score(best.audio, best.sr, cluster.label.rsplit("_", 1)[0])
         samples.append(
@@ -262,7 +274,7 @@
             }
         )
         if cluster.synthesized is not None:
-            _write_audio(out / f"supervised/samples/{cluster.label}__synth.wav", cluster.synthesized, sr, subtype="PCM_24")
+            _write_audio(out / f"{export_prefix}/samples/{cluster.label}__synth.wav", cluster.synthesized, sr, subtype="PCM_24")

     archive_tmp = build_archive(
         clusters,
@@ -272,7 +284,7 @@
         rendered_audio=rendered,
         target_rendered_audio=target_rendered,
     )
-    archive_rel = "supervised/sample-pack.zip"
+    archive_rel = f"{export_prefix}/sample-pack.zip"
     shutil.copyfile(archive_tmp, out / archive_rel)
     try:
         os.unlink(archive_tmp)
@@ -282,7 +294,7 @@

     active_hits = [hit for hit in state.get("hits", {}).values() if not hit.get("suppressed")]
     export_manifest = {
-        "kind": "supervised-export",
+        "kind": kind,
         "job_id": job_id,
         "created_at": now(),
         "duration_sec": round(time.perf_counter() - started, 6),
@@ -293,6 +305,7 @@
         "hit_count": len(active_hits),
         "suppressed_hit_count": sum(1 for hit in state.get("hits", {}).values() if hit.get("suppressed")),
         "cluster_count": len(clusters),
+        "selected_labels": sorted(selected_label_set) if selected_label_set is not None else None,
         "quantize_midi": bool(quantize),
         "subdivision": int(subdivision),
         "samples": samples,
@@ -308,7 +321,8 @@
     state.setdefault("exports", []).append(
         {
             "created_at": export_manifest["created_at"],
-            "path": "supervised/manifest.json",
+            "path": f"{export_prefix}/manifest.json",
+            "kind": kind,
             "hit_count": export_manifest["hit_count"],
             "cluster_count": export_manifest["cluster_count"],
             "suppressed_hit_count": export_manifest["suppressed_hit_count"],
@@ -333,3 +347,26 @@
     state_path.write_text(json.dumps(state, indent=2, sort_keys=True), encoding="utf-8")

     return export_manifest
+
+
+def export_selected_samples(
+    output_dir: str | os.PathLike[str],
+    job_id: str,
+    *,
+    selected_labels: list[str] | set[str],
+    synthesize: bool = True,
+    quantize: bool | None = None,
+    subdivision: int | None = None,
+) -> dict[str, Any]:
+    if not selected_labels:
+        raise ValueError("selected_labels must contain at least one sample label")
+    return export_supervised_state(
+        output_dir,
+        job_id,
+        synthesize=synthesize,
+        quantize=quantize,
+        subdivision=subdivision,
+        selected_labels=set(map(str, selected_labels)),
+        export_dir_name="selected",
+        kind="selected-sample-export",
+    )
supervised_state.py CHANGED
@@ -904,3 +904,117 @@ def public_state(state: dict[str, Any], url_for: Callable[[str], str] | None = N
         "suggestions": open_suggestions[:50],
         "review_queue": review_queue(state, review_limit),
     }
+
+
+def pin_representative(output_dir: str | Path, job_id: str, cluster_id: str, hit_id: str, source: str = "user") -> dict[str, Any]:
+    """Persistently choose a representative hit for a cluster/card."""
+    state = load_or_create_state(job_id, output_dir)
+    clusters = state.get("clusters", {})
+    hits = state.get("hits", {})
+    if cluster_id not in clusters:
+        raise KeyError(f"Unknown cluster: {cluster_id}")
+    if hit_id not in hits:
+        raise KeyError(f"Unknown hit: {hit_id}")
+    if hit_id not in clusters[cluster_id].get("hit_ids", []):
+        raise ValueError(f"Hit {hit_id} is not a member of {cluster_id}")
+    _push_undo(state)
+    for hid in clusters[cluster_id].get("hit_ids", []):
+        if hid in hits:
+            hits[hid]["is_representative"] = (hid == hit_id)
+    clusters[cluster_id]["representative_hit_id"] = hit_id
+    hits[hit_id]["favorite"] = True
+    hits[hit_id]["review_status"] = "favorite"
+    hits[hit_id]["explicit"] = True
+    _constraint(state, "pin-representative", {"hit_id": hit_id, "cluster_id": cluster_id}, source=source)
+    _event(state, "cluster.representative_pinned", {"hit_id": hit_id, "cluster_id": cluster_id}, source=source)
+    recompute_scores(state)
+    return _write_state(output_dir, state)
+
+
+def draw_next_representative(output_dir: str | Path, job_id: str, cluster_id: str, source: str = "user") -> dict[str, Any]:
+    """Cycle a cluster/card to the next available non-suppressed candidate."""
+    state = load_or_create_state(job_id, output_dir)
+    clusters = state.get("clusters", {})
+    hits = state.get("hits", {})
+    if cluster_id not in clusters:
+        raise KeyError(f"Unknown cluster: {cluster_id}")
+    cluster = clusters[cluster_id]
+    active_ids = [hid for hid in cluster.get("hit_ids", []) if hid in hits and not hits[hid].get("suppressed")]
+    if not active_ids:
+        raise ValueError(f"Cluster {cluster_id} has no active hits")
+    current = cluster.get("representative_hit_id")
+    if current in active_ids:
+        next_id = active_ids[(active_ids.index(current) + 1) % len(active_ids)]
+    else:
+        next_id = active_ids[0]
+    return pin_representative(output_dir, job_id, cluster_id, next_id, source=source)
+
+
+def edit_hit_timing(
+    output_dir: str | Path,
+    job_id: str,
+    hit_id: str,
+    *,
+    start_offset_ms: float = 0.0,
+    tail_offset_ms: float = 0.0,
+    source: str = "user",
+) -> dict[str, Any]:
+    """Rewrite one hit preview from stem.wav and persist the timing edit.
+
+    ``start_offset_ms`` trims from the front when positive and extends earlier when
+    negative. ``tail_offset_ms`` extends when positive and trims the tail when
+    negative. The selected hit's file path is replaced so cards and supervised
+    exports immediately use the edited audio.
+    """
+    import numpy as np
+    import soundfile as sf
+    import librosa
+
+    out = Path(output_dir)
+    state = load_or_create_state(job_id, out)
+    hits = state.get("hits", {})
+    if hit_id not in hits:
+        raise KeyError(f"Unknown hit: {hit_id}")
+    hit = hits[hit_id]
+    stem_path = out / "stem.wav"
+    if not stem_path.exists():
+        raise FileNotFoundError("stem.wav is required for timing edits")
+    audio, sr = sf.read(stem_path, dtype="float32", always_2d=False)
+    if audio.ndim > 1:
+        audio = audio.mean(axis=1)
+    audio = np.asarray(audio, dtype=np.float32)
+
+    original_onset = _safe_float(hit.get("onset_sec"))
+    original_duration = max(0.02, _safe_float(hit.get("duration_ms"), 100.0) / 1000.0)
+    start_offset = _safe_float(start_offset_ms) / 1000.0
+    tail_offset = _safe_float(tail_offset_ms) / 1000.0
+    new_onset = max(0.0, original_onset + start_offset)
+    new_duration = max(0.02, original_duration - start_offset + tail_offset)
+    start = max(0, int(round(new_onset * sr)))
+    end = min(len(audio), start + int(round(new_duration * sr)))
+    if end <= start:
+        raise ValueError("Edited sample range is outside the available stem audio")
+    segment = audio[start:end].copy()
+    fade_len = min(int(0.003 * sr), len(segment) // 4)
+    if fade_len > 0:
+        segment[-fade_len:] *= np.linspace(1, 0, fade_len)
+    rms = float(np.sqrt(np.mean(segment**2))) if len(segment) else 0.0
+    spectral_centroid = float(librosa.feature.spectral_centroid(y=segment, sr=sr).mean()) if len(segment) >= 32 else 0.0
+    safe_label = _safe_file_component(hit.get("label") or "edited")
+    rel_file = f"overrides/hits/hit_{_safe_int(hit.get('index')):05d}_{safe_label}_edited.wav"
+    full_path = out / rel_file
+    full_path.parent.mkdir(parents=True, exist_ok=True)
+    sf.write(full_path, segment, sr, subtype="PCM_24")
+
+    _push_undo(state)
+    hit["onset_sec"] = round(new_onset, 6)
+    hit["duration_ms"] = round((len(segment) / sr) * 1000.0, 1)
+    hit["rms_energy"] = round(rms, 6)
+    hit["spectral_centroid_hz"] = round(spectral_centroid, 1)
+    hit["file"] = rel_file
+    hit["explicit"] = True
+    hit["review_status"] = "accepted"
+    _constraint(state, "edit-hit-timing", {"hit_id": hit_id, "start_offset_ms": round(_safe_float(start_offset_ms), 3), "tail_offset_ms": round(_safe_float(tail_offset_ms), 3)}, source=source)
+    _event(state, "hit.timing_edited", {"hit_id": hit_id, "file": rel_file, "onset_sec": hit["onset_sec"], "duration_ms": hit["duration_ms"]}, source=source)
+    recompute_scores(state)
+    return _write_state(out, state)
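`edit_hit_timing` derives the new sample window from millisecond offsets before slicing `stem.wav`. The offset arithmetic can be checked in isolation (the helper name is illustrative; the real function additionally clamps `end` to the stem length):

```python
def edited_window(onset_sec: float, duration_ms: float,
                  start_offset_ms: float, tail_offset_ms: float, sr: int) -> tuple[int, int]:
    # Positive start_offset trims the front (onset moves later);
    # positive tail_offset extends the tail. Duration is floored at 20 ms.
    start_offset = start_offset_ms / 1000.0
    tail_offset = tail_offset_ms / 1000.0
    new_onset = max(0.0, onset_sec + start_offset)
    new_duration = max(0.02, duration_ms / 1000.0 - start_offset + tail_offset)
    start = max(0, int(round(new_onset * sr)))
    end = start + int(round(new_duration * sr))
    return start, end

# A 100 ms hit at 1.0 s, trimmed 10 ms at the front and extended 20 ms at the tail.
print(edited_window(1.0, 100.0, 10, 20, 1000))  # (1010, 1120)
```

Trimming the front both moves the onset later and shortens the segment, so the tail position only moves when `tail_offset_ms` is non-zero.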
web/app.js CHANGED
@@ -1,10 +1,10 @@
1
  const $ = (id) => document.getElementById(id);
2
 
3
  const fields = [
4
- "stem", "demucs_model", "clustering_mode", "demucs_shifts", "demucs_overlap", "onset_mode", "onset_delta",
5
  "energy_threshold_db", "pre_pad", "min_dur", "max_dur", "min_gap", "ncc_threshold",
6
  "attack_ms", "mel_threshold", "linkage", "target_min", "target_max", "subdivision",
7
- "synthesize", "quantize_midi", "auto_tune", "use_disk_cache"
8
  ];
9
 
10
  let config = null;
@@ -27,6 +27,9 @@ let autoRunToken = 0;
27
  let dismissedSampleKeys = new Set();
28
  let extraDrawnSamples = [];
29
  let sampleEdits = new Map();
 
 
 
30
  let waveZoom = 1;
31
  let waveOffset = 0;
32
 
@@ -353,6 +356,8 @@ function setSelectOptions(select, values, labels = null) {
353
  }
354
 
355
  function populateConfig() {
 
 
356
  setSelectOptions($("demucs_model"), config.demucs_models);
357
  setSelectOptions($("clustering_mode"), Object.keys(config.clustering_modes ?? { batch_quality: "", online_preview: "" }), config.clustering_modes);
358
  const defaults = config.defaults;
@@ -367,13 +372,23 @@ function populateConfig() {
367
  }
368
 
369
  function updateStemOptions() {
370
- const model = $("demucs_model").value || config.defaults.demucs_model;
371
- const stems = config.demucs_stems[model] ?? ["drums", "bass", "other", "vocals", "all"];
 
 
 
 
 
 
 
 
 
372
  const current = $("stem").value || config.defaults.stem;
373
  setSelectOptions($("stem"), stems);
374
  $("stem").value = stems.includes(current) ? current : stems[0];
375
  }
376
 
 
377
  function collectParams() {
378
  const params = {};
379
  const defaults = config?.defaults ?? {};
@@ -701,11 +716,30 @@ function sampleType(sample) {
701
 
702
  function visibleSamples(result) {
703
  const base = [...(result?.samples ?? []), ...extraDrawnSamples];
704
- return base
 
 
 
 
705
  .map((sample) => ({ ...sample, _key: sampleKey(sample), _type: sampleType(sample), _edit: sampleEdits.get(sampleKey(sample)) || { startMs: 0, tailMs: 0 } }))
706
  .filter((sample) => !dismissedSampleKeys.has(sample._key));
707
  }
708
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
709
  function groupedSamples(samples) {
710
  const preferred = ["kick", "snare", "hihat", "cymbal", "tom", "perc", "other"];
711
  const map = new Map();
@@ -721,14 +755,38 @@ function groupedSamples(samples) {
721
  });
722
  }
723
 
724
- function updateSelectedExportCount(count) {
 
 
 
725
  const text = `${count} Selected`;
726
  if ($("selectedCountTop")) $("selectedCountTop").textContent = `(${count})`;
727
  if ($("selectedCountBottom")) $("selectedCountBottom").textContent = text;
728
- if ($("exportSelectedButton")) $("exportSelectedButton").disabled = count === 0 || !lastResult;
729
  if ($("exportAllButton")) $("exportAllButton").disabled = !lastResult;
730
  }
731
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
732
  function updateControlOutputs() {
733
  const pct = Math.round((Number($("onset_delta")?.value || 0) / 0.35) * 100);
734
  if ($("sensitivityOutput")) $("sensitivityOutput").textContent = Number.isFinite(pct) ? `${pct}%` : "Auto";
@@ -739,7 +797,9 @@ function updateControlOutputs() {
739
  }
740
 
741
  async function dismissSample(sample) {
742
- dismissedSampleKeys.add(sample._key || sampleKey(sample));
 
 
743
  renderSamples(lastResult || { samples: [] });
744
  const index = sample.representative_hit_index;
745
  if (activeJobId && index !== undefined && index !== null) {
@@ -751,32 +811,29 @@ async function dismissSample(sample) {
751
  }
752
  }
753
 
754
- function drawAnotherSample(type) {
755
- const used = new Set([...(lastResult?.samples ?? []), ...extraDrawnSamples].map((sample) => Number(sample.representative_hit_index)).filter(Number.isFinite));
756
- const hit = (lastResult?.hits ?? [])
757
- .filter((item) => sampleType(item) === type || sampleType({ classification: item.label }) === type)
758
- .filter((item) => !used.has(Number(item.index)))
759
- .sort((a, b) => Number(b.rms_energy || 0) - Number(a.rms_energy || 0))[0];
760
- if (!hit) {
761
- showError("No more candidates", new Error(`No additional ${type} candidates are available yet.`), "Try adding a missing onset on the waveform or rerun with higher sensitivity.");
762
  return;
763
  }
764
- extraDrawnSamples.push({
765
- label: `${type}_draw_${hit.index}`,
766
- classification: type,
767
- hits: 1,
768
- score: "candidate",
769
- duration_ms: hit.duration_ms,
770
- first_onset_sec: hit.onset_sec,
771
- representative_hit_index: hit.index,
772
- cluster_id: hit.cluster_id,
773
- file: hit.file,
774
- url: hit.url,
775
- _drawn: true,
776
- });
777
- renderSamples(lastResult || { samples: [] });
 
778
  }
779
 
 
780
  function updateSampleEdit(sample, patch) {
781
  const key = sample._key || sampleKey(sample);
782
  const current = sampleEdits.get(key) || { startMs: 0, tailMs: 0 };
@@ -787,25 +844,39 @@ function updateSampleEdit(sample, patch) {
787
  renderSamples(lastResult || { samples: [] });
788
  }
789
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
790
  async function saveSampleEdit(sample) {
791
  if (!activeJobId) return;
792
  const edit = sampleEdits.get(sample._key || sampleKey(sample));
793
  if (!edit) return;
794
- const start = Math.max(0, Number(sample.first_onset_sec || 0) + Number(edit.startMs || 0) / 1000);
795
- const duration = Math.max(25, Number(sample.duration_ms || 100) - Number(edit.startMs || 0) + Number(edit.tailMs || 0));
796
- const state = await jsonApi(`/api/jobs/${encodeURIComponent(activeJobId)}/hits/force-onset`, {
797
- onset_sec: start,
798
- duration_ms: duration,
799
- label: sample.classification || sample._type || "hit",
800
- });
801
- renderSupervisionState(state);
802
- showError("Edited clip saved", new Error("A forced hit was added from the adjusted card."), "Use Export edited pack to render the edited state.");
803
  }
804
 
 
805
  function renderSamples(result) {
806
  const samples = visibleSamples(result);
 
807
  if ($("sampleCountLabel")) $("sampleCountLabel").textContent = `(${samples.length})`;
808
- updateSelectedExportCount(samples.length);
809
 
810
  const grid = $("samplesGrid");
811
  if (grid) {
@@ -824,7 +895,8 @@ function renderSamples(result) {
824
  const edit = sample._edit || { startMs: 0, tailMs: 0 };
825
  const editLabel = (edit.startMs || edit.tailMs) ? ` · edit ${edit.startMs >= 0 ? "+" : ""}${edit.startMs}ms/${edit.tailMs >= 0 ? "+" : ""}${edit.tailMs}ms` : "";
826
  return `
827
- <article class="sample-card ${absoluteIndex === selectedSampleIndex ? "selected" : ""}" style="--card-color: ${esc(color)}" data-sample-card="${absoluteIndex}">
 
828
  <button class="sample-play-zone" type="button" data-sample-audition="${absoluteIndex}">
829
  <canvas class="sample-wave" data-wave-url="${esc(sample.url)}" data-wave-color="${esc(color)}"></canvas>
830
  <span class="sample-card-footer">
@@ -833,10 +905,11 @@ function renderSamples(result) {
833
  </span>
834
  </button>
835
  <div class="sample-card-actions">
836
- <button type="button" data-sample-dismiss="${absoluteIndex}">Dismiss</button>
837
- <button type="button" data-sample-trim-start="${absoluteIndex}">Trim start</button>
838
- <button type="button" data-sample-extend-tail="${absoluteIndex}">Extend tail</button>
839
- <button type="button" data-sample-save-edit="${absoluteIndex}" ${edit.startMs || edit.tailMs ? "" : "disabled"}>Save edit</button>
 
840
  </div>
841
  </article>
842
  `;
@@ -853,6 +926,14 @@ function renderSamples(result) {
853
  renderSamples(result);
854
  });
855
  }
 
 
 
 
 
 
 
 
856
  for (const button of grid.querySelectorAll("[data-sample-dismiss]")) {
857
  button.addEventListener("click", (event) => {
858
  event.stopPropagation();
@@ -863,19 +944,26 @@ function renderSamples(result) {
863
  for (const button of grid.querySelectorAll("[data-draw-type]")) {
864
  button.addEventListener("click", (event) => {
865
  event.stopPropagation();
866
- drawAnotherSample(button.dataset.drawType);
 
 
 
 
 
 
 
867
  });
868
  }
869
  for (const button of grid.querySelectorAll("[data-sample-trim-start]")) {
870
  button.addEventListener("click", (event) => {
871
  event.stopPropagation();
872
- updateSampleEdit(samples[Number(button.dataset.sampleTrimStart)], { startMs: 10 });
873
  });
874
  }
875
@@ see merged hunks below (old-side deletions folded into the corresponding new-side segments)
 
  const $ = (id) => document.getElementById(id);

  const fields = [
+ "stem", "separation_backend", "spleeter_model", "demucs_model", "clustering_mode", "demucs_shifts", "demucs_overlap", "onset_mode", "onset_delta",
  "energy_threshold_db", "pre_pad", "min_dur", "max_dur", "min_gap", "ncc_threshold",
  "attack_ms", "mel_threshold", "linkage", "target_min", "target_max", "subdivision",
+ "synthesize", "quantize_midi", "auto_tune", "use_disk_cache", "allow_backend_fallback"
  ];

  let config = null;

  let dismissedSampleKeys = new Set();
  let extraDrawnSamples = [];
  let sampleEdits = new Map();
+ let selectedSampleKeys = new Set();
+ let sampleOverrides = new Map();
+ let userChangedSampleSelection = false;
  let waveZoom = 1;
  let waveOffset = 0;

  }

  function populateConfig() {
+ if ($("separation_backend")) setSelectOptions($("separation_backend"), config.separation_backends ?? ["spleeter", "demucs", "none"], { spleeter: "Spleeter (default)", demucs: "Demucs", none: "No separation / full mix" });
+ if ($("spleeter_model")) setSelectOptions($("spleeter_model"), config.spleeter_models ?? ["spleeter:4stems"]);
  setSelectOptions($("demucs_model"), config.demucs_models);
  setSelectOptions($("clustering_mode"), Object.keys(config.clustering_modes ?? { batch_quality: "", online_preview: "" }), config.clustering_modes);
  const defaults = config.defaults;

  }

  function updateStemOptions() {
+ const backend = $("separation_backend")?.value || config.defaults.separation_backend || "spleeter";
+ let stems = ["drums", "bass", "other", "vocals", "all"];
+ if (backend === "spleeter") {
+ const model = $("spleeter_model")?.value || config.defaults.spleeter_model || "spleeter:4stems";
+ stems = config.spleeter_stems?.[model] ?? stems;
+ } else if (backend === "demucs") {
+ const model = $("demucs_model")?.value || config.defaults.demucs_model;
+ stems = config.demucs_stems?.[model] ?? stems;
+ } else {
+ stems = ["all"];
+ }
  const current = $("stem").value || config.defaults.stem;
  setSelectOptions($("stem"), stems);
  $("stem").value = stems.includes(current) ? current : stems[0];
  }

+
  function collectParams() {
  const params = {};
  const defaults = config?.defaults ?? {};

  function visibleSamples(result) {
  const base = [...(result?.samples ?? []), ...extraDrawnSamples];
+ const merged = base.map((sample) => {
+ const key = sampleKey(sample);
+ return { ...sample, ...(sampleOverrides.get(key) || {}) };
+ });
+ return merged
  .map((sample) => ({ ...sample, _key: sampleKey(sample), _type: sampleType(sample), _edit: sampleEdits.get(sampleKey(sample)) || { startMs: 0, tailMs: 0 } }))
  .filter((sample) => !dismissedSampleKeys.has(sample._key));
  }

+ function ensureSelectionForSamples(samples) {
+ if (userChangedSampleSelection) return;
+ for (const sample of samples) {
+ if (!sample._key) continue;
+ if (!dismissedSampleKeys.has(sample._key) && !selectedSampleKeys.has(sample._key) && sample._autoSelected !== false) {
+ selectedSampleKeys.add(sample._key);
+ }
+ }
+ }
+
+ function selectedVisibleSamples(samples = visibleSamples(lastResult || { samples: [] })) {
+ return samples.filter((sample) => selectedSampleKeys.has(sample._key));
+ }
+
+
  function groupedSamples(samples) {
  const preferred = ["kick", "snare", "hihat", "cymbal", "tom", "perc", "other"];
  const map = new Map();

  });
  }

+ function updateSelectedExportCount(_count = null) {
+ const visible = visibleSamples(lastResult || { samples: [] });
+ const selected = selectedVisibleSamples(visible);
+ const count = selected.length;
  const text = `${count} Selected`;
  if ($("selectedCountTop")) $("selectedCountTop").textContent = `(${count})`;
  if ($("selectedCountBottom")) $("selectedCountBottom").textContent = text;
+ if ($("exportSelectedButton")) $("exportSelectedButton").disabled = count === 0 || !activeJobId;
  if ($("exportAllButton")) $("exportAllButton").disabled = !lastResult;
  }

+ function setSampleSelected(sample, selected) {
+ userChangedSampleSelection = true;
+ const key = sample._key || sampleKey(sample);
+ if (selected) selectedSampleKeys.add(key);
+ else selectedSampleKeys.delete(key);
+ updateSelectedExportCount();
+ }
+
+ function selectAllVisibleSamples() {
+ userChangedSampleSelection = true;
+ for (const sample of visibleSamples(lastResult || { samples: [] })) selectedSampleKeys.add(sample._key);
+ renderSamples(lastResult || { samples: [] });
+ }
+
+ function clearSampleSelection() {
+ userChangedSampleSelection = true;
+ selectedSampleKeys.clear();
+ renderSamples(lastResult || { samples: [] });
+ }
+
+
  function updateControlOutputs() {
  const pct = Math.round((Number($("onset_delta")?.value || 0) / 0.35) * 100);
  if ($("sensitivityOutput")) $("sensitivityOutput").textContent = Number.isFinite(pct) ? `${pct}%` : "Auto";

  }

  async function dismissSample(sample) {
+ const key = sample._key || sampleKey(sample);
+ dismissedSampleKeys.add(key);
+ selectedSampleKeys.delete(key);
  renderSamples(lastResult || { samples: [] });
  const index = sample.representative_hit_index;
  if (activeJobId && index !== undefined && index !== null) {

  }
  }

+ async function drawAnotherSample(type, sample = null) {
+ if (!activeJobId) {
+ showError("No active extraction", new Error("Run extraction before drawing replacement cards."));
  return;
  }
+ const sourceSample = sample || visibleSamples(lastResult || { samples: [] }).find((item) => item._type === type);
+ if (!sourceSample?.label) {
+ showError("No card to redraw", new Error(`No ${type} card exists yet. Try a higher sensitivity or force a missing onset.`));
+ return;
+ }
+ try {
+ const payload = await jsonApi(`/api/jobs/${encodeURIComponent(activeJobId)}/samples/${encodeURIComponent(sourceSample.label)}/draw`, {});
+ const key = sampleKey(sourceSample);
+ sampleOverrides.set(key, { ...payload.sample, _autoSelected: true });
+ selectedSampleKeys.add(key);
+ renderSupervisionState(payload.state);
+ renderSamples(lastResult || { samples: [] });
+ } catch (error) {
+ showError("No more candidates", error, "Try adding a missing onset on the waveform or rerun with higher sensitivity.");
+ }
  }

+
  function updateSampleEdit(sample, patch) {
  const key = sample._key || sampleKey(sample);
  const current = sampleEdits.get(key) || { startMs: 0, tailMs: 0 };

  renderSamples(lastResult || { samples: [] });
  }

+ async function persistSampleEdit(sample, patch) {
+ if (!activeJobId || !sample?.label) return;
+ const key = sample._key || sampleKey(sample);
+ const current = sampleEdits.get(key) || { startMs: 0, tailMs: 0 };
+ const next = {
+ startMs: Math.max(-120, Math.min(250, Number(current.startMs || 0) + Number(patch.startMs || 0))),
+ tailMs: Math.max(-250, Math.min(500, Number(current.tailMs || 0) + Number(patch.tailMs || 0))),
+ };
+ sampleEdits.set(key, next);
+ const payload = await jsonApi(`/api/jobs/${encodeURIComponent(activeJobId)}/samples/${encodeURIComponent(sample.label)}/edit`, {
+ start_offset_ms: next.startMs,
+ tail_offset_ms: next.tailMs,
+ });
+ sampleOverrides.set(key, { ...payload.sample, _autoSelected: true });
+ selectedSampleKeys.add(key);
+ sampleEdits.set(key, { startMs: 0, tailMs: 0 });
+ renderSupervisionState(payload.state);
+ renderSamples(lastResult || { samples: [] });
+ }
+
  async function saveSampleEdit(sample) {
  if (!activeJobId) return;
  const edit = sampleEdits.get(sample._key || sampleKey(sample));
  if (!edit) return;
+ await persistSampleEdit(sample, { startMs: 0, tailMs: 0 });
  }

+
  function renderSamples(result) {
  const samples = visibleSamples(result);
+ ensureSelectionForSamples(samples);
  if ($("sampleCountLabel")) $("sampleCountLabel").textContent = `(${samples.length})`;
+ updateSelectedExportCount();

  const grid = $("samplesGrid");
  if (grid) {

  const edit = sample._edit || { startMs: 0, tailMs: 0 };
  const editLabel = (edit.startMs || edit.tailMs) ? ` · edit ${edit.startMs >= 0 ? "+" : ""}${edit.startMs}ms/${edit.tailMs >= 0 ? "+" : ""}${edit.tailMs}ms` : "";
  return `
+ <article class="sample-card ${absoluteIndex === selectedSampleIndex ? "selected" : ""} ${selectedSampleKeys.has(sample._key) ? "checked" : ""}" style="--card-color: ${esc(color)}" data-sample-card="${absoluteIndex}">
+ <label class="sample-select" title="Include in Export Selected"><input type="checkbox" data-sample-select="${absoluteIndex}" ${selectedSampleKeys.has(sample._key) ? "checked" : ""} /> <span></span></label>
  <button class="sample-play-zone" type="button" data-sample-audition="${absoluteIndex}">
  <canvas class="sample-wave" data-wave-url="${esc(sample.url)}" data-wave-color="${esc(color)}"></canvas>
  <span class="sample-card-footer">

  </span>
  </button>
  <div class="sample-card-actions">
+ <button type="button" data-sample-dismiss="${absoluteIndex}" title="Dismiss">Dismiss</button>
+ <button type="button" data-sample-draw="${absoluteIndex}" title="Draw another">Draw</button>
+ <button type="button" data-sample-trim-start="${absoluteIndex}" title="Trim start">Trim start</button>
+ <button type="button" data-sample-extend-tail="${absoluteIndex}" title="Extend tail">Extend tail</button>
+ <button type="button" data-sample-save-edit="${absoluteIndex}" title="Save timing edit" ${edit.startMs || edit.tailMs ? "" : "disabled"}>Save edit</button>
  </div>
  </article>
  `;

  renderSamples(result);
  });
  }
+ for (const input of grid.querySelectorAll("[data-sample-select]")) {
+ input.addEventListener("click", (event) => event.stopPropagation());
+ input.addEventListener("change", () => {
+ const sample = samples[Number(input.dataset.sampleSelect)];
+ setSampleSelected(sample, input.checked);
+ renderSamples(result);
+ });
+ }
  for (const button of grid.querySelectorAll("[data-sample-dismiss]")) {
  button.addEventListener("click", (event) => {
  event.stopPropagation();

  for (const button of grid.querySelectorAll("[data-draw-type]")) {
  button.addEventListener("click", (event) => {
  event.stopPropagation();
+ drawAnotherSample(button.dataset.drawType).catch((error) => showError("Could not draw another sample", error));
+ });
+ }
+ for (const button of grid.querySelectorAll("[data-sample-draw]")) {
+ button.addEventListener("click", (event) => {
+ event.stopPropagation();
+ const sample = samples[Number(button.dataset.sampleDraw)];
+ drawAnotherSample(sample._type, sample).catch((error) => showError("Could not draw another sample", error));
  });
  }
  for (const button of grid.querySelectorAll("[data-sample-trim-start]")) {
  button.addEventListener("click", (event) => {
  event.stopPropagation();
+ persistSampleEdit(samples[Number(button.dataset.sampleTrimStart)], { startMs: 10 }).catch((error) => showError("Could not trim sample", error));
  });
  }
  for (const button of grid.querySelectorAll("[data-sample-extend-tail]")) {
  button.addEventListener("click", (event) => {
  event.stopPropagation();
- updateSampleEdit(samples[Number(button.dataset.sampleExtendTail)], { tailMs: 20 });
+ persistSampleEdit(samples[Number(button.dataset.sampleExtendTail)], { tailMs: 20 }).catch((error) => showError("Could not extend sample", error));
  });
  }
  for (const button of grid.querySelectorAll("[data-sample-save-edit]")) {

@@ -1341,6 +1429,9 @@ function clearRunViews() {
  dismissedSampleKeys = new Set();
  extraDrawnSamples = [];
  sampleEdits = new Map();
+ selectedSampleKeys = new Set();
+ sampleOverrides = new Map();
+ userChangedSampleSelection = false;
  $("downloads").innerHTML = "";
  $("editedDownloads").innerHTML = "";
  $("supervisionSummary").textContent = "No interactive state loaded.";

@@ -1422,10 +1513,14 @@ async function boot() {
  }

  $("dismissErrorButton").addEventListener("click", clearError);
+ if ($("separation_backend")) $("separation_backend").addEventListener("change", updateStemOptions);
+ if ($("spleeter_model")) $("spleeter_model").addEventListener("change", updateStemOptions);
  $("demucs_model").addEventListener("change", updateStemOptions);
  $("fileInput").addEventListener("change", (event) => setFile(event.target.files?.[0] ?? null));
  $("runButton").addEventListener("click", () => runExtraction({ automatic: false }));
  $("usePreviewButton").addEventListener("click", () => {
+ $("separation_backend").value = "none";
+ updateStemOptions();
  $("stem").value = "all";
  $("clustering_mode").value = "online_preview";
  $("demucs_shifts").value = 0;
@@ -1436,6 +1531,8 @@ $("usePreviewButton").addEventListener("click", () => {
  $("resultSummary").textContent = "Fast preview preset applied: full mix, online grouping, no Demucs shifts.";
  });
  $("useQualityButton").addEventListener("click", () => {
+ $("separation_backend").value = "demucs";
+ updateStemOptions();
  if (($("stem").value || "") === "all") $("stem").value = "drums";
  $("clustering_mode").value = "batch_quality";
  $("demucs_shifts").value = 1;

@@ -1587,15 +1684,36 @@ for (const button of document.querySelectorAll("[data-zoom-command]")) {
  button.addEventListener("click", () => zoomWaveformAround($("waveform").getBoundingClientRect().left + $("waveform").getBoundingClientRect().width / 2, button.dataset.zoomCommand === "in" ? 1.35 : 1 / 1.35));
  }
  if ($("openToolsButton")) $("openToolsButton").addEventListener("click", () => { const drawer = $("toolsDrawer"); drawer.hidden = !drawer.hidden; });
- if ($("selectAllSamplesButton")) $("selectAllSamplesButton").addEventListener("click", () => updateSelectedExportCount(visibleSamples(lastResult || { samples: [] }).length));
- if ($("clearSelectionButton")) $("clearSelectionButton").addEventListener("click", () => updateSelectedExportCount(0));
+ if ($("selectAllSamplesButton")) $("selectAllSamplesButton").addEventListener("click", selectAllVisibleSamples);
+ if ($("clearSelectionButton")) $("clearSelectionButton").addEventListener("click", clearSampleSelection);
  function clickArchiveDownload() {
+ const url = lastResult?.file_urls?.archive;
+ if (url) { window.location.href = url; return; }
  const link = $("downloads")?.querySelector('a[href*="sample-pack"], a[download], a');
  if (link) link.click();
  else showError("Nothing to export yet", new Error("Run extraction first; sample-pack ZIP appears when processing completes."));
  }
+
+ async function exportSelectedSamples() {
+ if (!activeJobId) {
+ showError("Nothing selected", new Error("Run extraction first."));
+ return;
+ }
+ const samples = selectedVisibleSamples();
+ const labels = samples.map((sample) => sample.label).filter(Boolean);
+ if (!labels.length) {
+ showError("Nothing selected", new Error("Select at least one sample card."));
+ return;
+ }
+ const payload = await jsonApi(`/api/jobs/${encodeURIComponent(activeJobId)}/export-selected`, { labels, synthesize: true });
+ renderEditedExport(payload.export);
+ if (payload.state) renderSupervisionState(payload.state);
+ const archiveUrl = payload.export?.file_urls?.archive;
+ if (archiveUrl) window.location.href = archiveUrl;
+ }
  if ($("exportAllButton")) $("exportAllButton").addEventListener("click", clickArchiveDownload);
- if ($("exportSelectedButton")) $("exportSelectedButton").addEventListener("click", clickArchiveDownload);
+ if ($("exportSelectedButton")) $("exportSelectedButton").addEventListener("click", () => exportSelectedSamples().catch((error) => showError("Could not export selected samples", error)));
+
  if ($("resetUiButton")) $("resetUiButton").addEventListener("click", () => { populateConfig(); updateControlOutputs(); });
  if ($("groupSimilarToggle")) $("groupSimilarToggle").addEventListener("change", () => { $("clustering_mode").value = $("groupSimilarToggle").checked ? "batch_quality" : "online_preview"; });
  updateControlOutputs();
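A note on the selection bookkeeping above: every visible card is auto-selected via `ensureSelectionForSamples` until the user first toggles a checkbox (`userChangedSampleSelection`), after which only explicit choices are honored. A minimal standalone sketch of that pattern (simplified names, not the actual app code):

```javascript
// Auto-select everything until the user intervenes, then respect manual choices.
const selected = new Set();
let userTouched = false;

function ensureSelection(samples) {
  if (userTouched) return; // the user owns the selection now
  for (const s of samples) selected.add(s.key);
}

function setSelected(sample, on) {
  userTouched = true; // any manual toggle disables auto-select
  if (on) selected.add(sample.key);
  else selected.delete(sample.key);
}

function selectedOf(samples) {
  return samples.filter((s) => selected.has(s.key));
}

const cards = [{ key: "kick-0" }, { key: "snare-0" }];
ensureSelection(cards);       // both auto-selected
setSelected(cards[1], false); // user deselects the snare
ensureSelection(cards);       // no-op: auto-select is disabled
console.log(selectedOf(cards).map((s) => s.key)); // ["kick-0"]
```

This keeps "select all by default" ergonomics without silently re-checking cards the user has deliberately cleared.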
web/index.html CHANGED
@@ -142,6 +142,8 @@
  <details class="settings-section advanced-fold">
  <summary>Expert pipeline controls</summary>
  <div class="expert-grid">
+ <label>Separation engine<select id="separation_backend"><option value="spleeter">Spleeter (default)</option><option value="demucs">Demucs</option><option value="none">No separation / full mix</option></select></label>
+ <label>Spleeter model<select id="spleeter_model"></select></label>
  <label>Demucs model<select id="demucs_model"></select></label>
  <label>Clustering mode<select id="clustering_mode"><option value="batch_quality">batch quality</option><option value="online_preview">online preview</option></select></label>
  <label>Shifts<input id="demucs_shifts" type="number" min="0" max="8" step="1" /></label>
@@ -161,6 +163,7 @@
  <label><input id="quantize_midi" type="checkbox" /> quantize MIDI</label>
  <label><input id="auto_tune" type="checkbox" checked /> automatic parameter tuning</label>
  <label><input id="use_disk_cache" type="checkbox" /> disk cache stems/source loads</label>
+ <label><input id="allow_backend_fallback" type="checkbox" /> fallback to Demucs if Spleeter is unavailable</label>
  </div>
  <div class="preset-row">
  <button id="usePreviewButton" class="secondary-action" type="button">Fast preview</button>
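The new `separation_backend` and `spleeter_model` selects drive which stems the `stem` select may offer. A pure-function sketch of that resolution (the `config` shape here is an assumption for illustration; names mirror the diff):

```javascript
// Resolve the stem options implied by the chosen separation backend.
function resolveStems(backend, model, config) {
  const fallback = ["drums", "bass", "other", "vocals", "all"];
  if (backend === "none") return ["all"]; // full mix only, nothing to separate
  if (backend === "spleeter") return config.spleeter_stems?.[model] ?? fallback;
  if (backend === "demucs") return config.demucs_stems?.[model] ?? fallback;
  return fallback;
}

// Keep the current choice when it survives a backend switch, else pick the first.
function pickStem(stems, current) {
  return stems.includes(current) ? current : stems[0];
}

const config = { spleeter_stems: { "spleeter:2stems": ["vocals", "accompaniment", "all"] } };
const stems = resolveStems("spleeter", "spleeter:2stems", config);
console.log(pickStem(stems, "drums")); // "vocals" — old choice is gone, first option wins
```

Separating the lookup from the DOM makes the backend/model/stem dependency easy to unit-test, which the in-place `updateStemOptions` handler is not.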
web/styles.css CHANGED
@@ -86,6 +86,12 @@ button:disabled { cursor: not-allowed; opacity: .48; }
86
  .draw-card-button { border: 0; background: transparent; color: #596070; font-size: 20px; line-height: 1; padding: 0 2px; }
87
  .sample-column-list { min-height: 0; overflow-y: auto; padding: 12px; display: flex; flex-direction: column; gap: 10px; }
88
  .sample-card { position: relative; border: 1px solid color-mix(in srgb, var(--card-color, var(--purple)) 58%, var(--line)); border-radius: 9px; background: var(--panel); box-shadow: 0 8px 20px rgba(18, 21, 28, .05); overflow: hidden; }
 
 
 
 
 
 
89
  .sample-card.selected { box-shadow: 0 0 0 2px color-mix(in srgb, var(--card-color, var(--purple)) 24%, transparent), 0 10px 26px rgba(18,21,28,.08); }
90
  .sample-play-zone { width: 100%; border: 0; background: transparent; padding: 0; text-align: left; }
91
  .sample-wave { width: 100%; height: 74px; display: block; }
@@ -93,13 +99,14 @@ button:disabled { cursor: not-allowed; opacity: .48; }
93
  .play-dot { width: 14px; height: 14px; display: inline-grid; place-items: center; color: var(--purple); font-size: 10px; }
94
  .sample-name { display: block; color: #2c303a; font-size: 13px; line-height: 1.2; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; }
95
  .sample-meta { display: flex; justify-content: space-between; color: var(--muted); font-size: 11px; font-variant-numeric: tabular-nums; }
96
- .sample-card-actions { display: grid; grid-template-columns: repeat(4,1fr); border-top: 1px solid var(--line); }
97
  .sample-card-actions button { height: 30px; border: 0; border-right: 1px solid var(--line); background: #fff; color: #3e4350; font-size: 0; }
98
  .sample-card-actions button::before { font-size: 13px; }
99
  .sample-card-actions button:nth-child(1)::before { content: "×"; }
100
- .sample-card-actions button:nth-child(2)::before { content: ""; }
101
- .sample-card-actions button:nth-child(3)::before { content: ""; }
102
- .sample-card-actions button:nth-child(4)::before { content: ""; }
 
103
  .sample-card-actions button:last-child { border-right: 0; }
104
  .empty-drop-state, .empty { color: var(--muted); padding: 18px; font-size: 13px; }
105
 
 
86
  .draw-card-button { border: 0; background: transparent; color: #596070; font-size: 20px; line-height: 1; padding: 0 2px; }
87
  .sample-column-list { min-height: 0; overflow-y: auto; padding: 12px; display: flex; flex-direction: column; gap: 10px; }
88
  .sample-card { position: relative; border: 1px solid color-mix(in srgb, var(--card-color, var(--purple)) 58%, var(--line)); border-radius: 9px; background: var(--panel); box-shadow: 0 8px 20px rgba(18, 21, 28, .05); overflow: hidden; }
89
+ .sample-card.checked { box-shadow: inset 0 0 0 1px color-mix(in srgb, var(--card-color, var(--purple)) 34%, transparent), 0 8px 20px rgba(18, 21, 28, .05); }
90
+ .sample-select { position: absolute; z-index: 2; top: 8px; left: 8px; width: 18px; height: 18px; display: grid; place-items: center; }
91
+ .sample-select input { position: absolute; opacity: 0; pointer-events: none; }
92
+ .sample-select span { width: 16px; height: 16px; border-radius: 4px; border: 1px solid color-mix(in srgb, var(--card-color, var(--purple)) 70%, var(--line)); background: rgba(255,255,255,.92); box-shadow: 0 1px 2px rgba(18,21,28,.1); }
93
+ .sample-select input:checked + span { border-color: transparent; background: var(--purple); }
94
+ .sample-select input:checked + span::after { content: "✓"; display: block; color: #fff; font-size: 11px; line-height: 16px; text-align: center; font-weight: 800; }
95
  .sample-card.selected { box-shadow: 0 0 0 2px color-mix(in srgb, var(--card-color, var(--purple)) 24%, transparent), 0 10px 26px rgba(18,21,28,.08); }
96
  .sample-play-zone { width: 100%; border: 0; background: transparent; padding: 0; text-align: left; }
97
  .sample-wave { width: 100%; height: 74px; display: block; }
 
99
  .play-dot { width: 14px; height: 14px; display: inline-grid; place-items: center; color: var(--purple); font-size: 10px; }
100
  .sample-name { display: block; color: #2c303a; font-size: 13px; line-height: 1.2; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; }
101
  .sample-meta { display: flex; justify-content: space-between; color: var(--muted); font-size: 11px; font-variant-numeric: tabular-nums; }
102
+ .sample-card-actions { display: grid; grid-template-columns: repeat(5,1fr); border-top: 1px solid var(--line); }
103
  .sample-card-actions button { height: 30px; border: 0; border-right: 1px solid var(--line); background: #fff; color: #3e4350; font-size: 0; }
104
  .sample-card-actions button::before { font-size: 13px; }
105
  .sample-card-actions button:nth-child(1)::before { content: "×"; }
106
+ .sample-card-actions button:nth-child(2)::before { content: ""; }
107
+ .sample-card-actions button:nth-child(3)::before { content: ""; }
108
+ .sample-card-actions button:nth-child(4)::before { content: ""; }
109
+ .sample-card-actions button:nth-child(5)::before { content: "✓"; }
110
  .sample-card-actions button:last-child { border-right: 0; }
111
  .empty-drop-state, .empty { color: var(--muted); padding: 18px; font-size: 13px; }
112