Plan: Standalone Stem Separator App for HuggingFace Spaces
Context
The user wants a standalone web app (separate from Studio13-v3) deployed on HuggingFace Spaces for audio stem separation. Users upload audio, choose stems, run BS-RoFormer model inference, then play/download individual stems with SoundCloud-style waveform visualization. The app must be mobile-responsive.
Architecture
HuggingFace Docker Space (port 7860)
├── FastAPI backend (Python)
│   ├── File upload + temp storage
│   ├── audio-separator inference (BS-RoFormer 6-stem)
│   ├── SSE progress streaming
│   └── Serves built React frontend as static files
└── React frontend (Vite build → static)
    ├── Upload zone (drag-drop)
    ├── WaveSurfer.js v7 waveforms (SoundCloud-style bars)
    ├── Stem selection checkboxes
    ├── Progress bar (SSE-driven)
    └── Stem result rows (waveform + play + download)
Model: jarredou/BS-ROFO-SW-Fixed (699MB .ckpt) - BS-RoFormer, 6 stems: Vocals, Drums, Bass, Guitar, Piano, Other
Deliverable
A self-contained prompt (below) the user can paste into another Claude Code window to build the entire app from scratch.
Prompt to use in another window
The prompt is designed to be comprehensive and self-contained. Copy everything between the ---START PROMPT--- and ---END PROMPT--- markers.
---START PROMPT---
Build a standalone web app for HuggingFace Spaces that does audio stem separation. The app should be production-ready, responsive, and polished.
What the app does
- User uploads an audio file (drag-drop or file picker)
- Original track appears with a SoundCloud-style scrolling peak waveform + play button
- User selects which stems to separate via checkboxes (Vocals, Drums, Bass, Guitar, Piano, Other)
- User clicks "Separate" - progress bar shows real-time progress via SSE
- Once done, each stem appears in its own row: colored label + waveform (flex-grow) + play + download
- "Download All" button creates a ZIP of all stems
Tech Stack
- Frontend: React 19, TypeScript, Tailwind CSS v4, Vite 5
- Backend: Python 3.11, FastAPI, uvicorn
- Waveform: WaveSurfer.js v7 (wavesurfer.js npm package)
- Model: jarredou/BS-ROFO-SW-Fixed from HuggingFace (BS-RoFormer, 699MB .ckpt, 6 stems)
- Inference: audio-separator Python package
- Progress: SSE (Server-Sent Events) via sse-starlette
- Deploy: HuggingFace Docker Space, port 7860
Directory Structure
stem-separator/
├── Dockerfile
├── README.md                      # HF Spaces YAML front matter
├── .dockerignore
├── backend/
│   ├── main.py                    # FastAPI: routes, SSE, static serving
│   ├── separator.py               # audio-separator wrapper with progress callback
│   ├── file_manager.py            # Temp file lifecycle, cleanup
│   ├── task_queue.py              # asyncio queue (1 concurrent separation)
│   └── requirements.txt
└── frontend/
    ├── index.html
    ├── package.json
    ├── vite.config.ts
    ├── tsconfig.json
    └── src/
        ├── main.tsx
        ├── App.tsx
        ├── index.css              # Tailwind theme (dark, music-oriented)
        ├── api.ts                 # fetch wrappers + SSE EventSource
        ├── types.ts               # Shared interfaces
        ├── hooks/
        │   ├── useWaveSurfer.ts   # WaveSurfer.js v7 hook
        │   └── useSeparation.ts   # Upload->Separate->Results state machine
        └── components/
            ├── UploadZone.tsx
            ├── OriginalTrack.tsx
            ├── StemCheckboxes.tsx
            ├── SeparateButton.tsx
            ├── ProgressBar.tsx
            ├── WaveformPlayer.tsx # Reusable: [play] [waveform===] [time] [download?]
            ├── StemRow.tsx
            ├── StemResults.tsx
            └── Footer.tsx
Backend Details
API Endpoints
POST /api/upload - Multipart file upload (max 100MB), returns { job_id, filename }
POST /api/separate - Body: { job_id, stems: string[] }, enqueues task
GET /api/progress/{id} - SSE stream: { state, progress, message, stems? }
GET /api/audio/{id}/{f} - Serve audio for WaveSurfer playback
GET /api/download/{id}/all - ZIP of all stems, streamed (register before the /{f} route so "all" is not captured as a filename)
GET /api/download/{id}/{f} - Download stem with Content-Disposition: attachment
DELETE /api/job/{id} - Manual cleanup
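To make the progress payload concrete, here is a minimal sketch of how one { state, progress, message, stems? } event could be framed as a Server-Sent Events message. The names ProgressEvent and to_sse_frame are hypothetical; in the real app sse-starlette's EventSourceResponse would normally handle the framing, so this only illustrates the wire format.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class ProgressEvent:
    """Hypothetical payload mirroring { state, progress, message, stems? }."""
    state: str
    progress: float
    message: str
    stems: Optional[List[str]] = None  # only present once separation is done


def to_sse_frame(event: ProgressEvent) -> str:
    """Serialize one event as an SSE frame: a 'data:' line followed by a blank line."""
    # Drop unset optional fields so the client sees 'stems' only when it exists.
    payload = {key: value for key, value in asdict(event).items() if value is not None}
    return f"data: {json.dumps(payload)}\n\n"
```

A client-side EventSource would receive the JSON string in event.data and parse it back into the same shape.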
backend/separator.py - Separation Logic
# Singleton pattern - keep model loaded between requests
# Adapted from this working pattern:
from audio_separator.separator import Separator


class StemSeparatorService:
    _instance = None
    _model_loaded = False

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def load_model(self):
        if self._model_loaded:
            return
        self.separator = Separator(
            output_dir="/tmp/output",
            output_format="WAV",
            output_single_stem=None,
        )
        self.separator.load_model(model_filename="BS-Rofo-SW-Fixed.ckpt")
        self._model_loaded = True

    def separate(self, input_path, output_dir, stems, progress_callback):
        # Run separation
        self.separator.output_dir = output_dir
        output_files = self.separator.separate(input_path)
        # Map output files to stem names using aliases:
        #   vocals: [vocals, vocal, voice, singing]
        #   drums:  [drums, drum, percussion]
        #   bass:   [bass]
        #   guitar: [guitar, guitars]
        #   piano:  [piano, keys, keyboard]
        #   other:  [other, instrumental, residual, remainder, no_]
        # Rename files from "input_(Vocals).wav" -> "Vocals.wav"
        # Return dict of stem_name -> file_path
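The alias mapping described in the comments above could be sketched roughly as follows. This is an illustration under assumptions: STEM_ALIASES and match_stem are hypothetical names, the parenthesized label is assumed to carry the stem name, and the first matching stem in dictionary order wins.

```python
import re
from pathlib import Path
from typing import Optional

# Aliases copied from the plan; matching is case-insensitive substring matching.
STEM_ALIASES = {
    "Vocals": ["vocals", "vocal", "voice", "singing"],
    "Drums": ["drums", "drum", "percussion"],
    "Bass": ["bass"],
    "Guitar": ["guitar", "guitars"],
    "Piano": ["piano", "keys", "keyboard"],
    "Other": ["other", "instrumental", "residual", "remainder", "no_"],
}


def match_stem(filename: str) -> Optional[str]:
    """Return the canonical stem name for an output file, or None if nothing matches."""
    name = Path(filename).stem.lower()
    # Prefer the last "(...)" label, e.g. "input_(Vocals)" -> "vocals", so that
    # words in the uploaded file's own name do not cause false matches.
    labels = re.findall(r"\(([^()]+)\)", name)
    if labels:
        name = labels[-1]
    for stem, aliases in STEM_ALIASES.items():
        if any(alias in name for alias in aliases):
            return stem
    return None
```

The rename step would then move each matched file to "<Stem>.wav" inside the job directory and skip any stem the user did not request.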
tqdm monkey-patching for progress: Before importing audio-separator, patch tqdm.std.tqdm with a subclass that calls progress_callback("analyzing", fraction) in its update() method. Map tqdm progress 0-1 to overall progress 0.2-0.9.
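One way to sketch the tqdm patch is a small factory that subclasses whatever progress-bar class it is given. The names scale_progress and make_reporting_tqdm are hypothetical, and passing the scaled value (rather than the raw fraction) to the callback is an assumption about how the two sentences above fit together.

```python
from typing import Callable


def scale_progress(fraction: float, lo: float = 0.2, hi: float = 0.9) -> float:
    """Map a 0-1 tqdm fraction onto the overall lo-hi progress band."""
    fraction = min(max(fraction, 0.0), 1.0)
    return lo + fraction * (hi - lo)


def make_reporting_tqdm(base_cls: type,
                        progress_callback: Callable[[str, float], None]) -> type:
    """Build a subclass of base_cls (e.g. tqdm.std.tqdm) that reports on update()."""

    class ReportingTqdm(base_cls):
        def update(self, n=1):
            result = super().update(n)
            total = getattr(self, "total", None)
            if total:  # skip bars with unknown or zero length
                progress_callback("analyzing", scale_progress(self.n / total))
            return result

    return ReportingTqdm


# Hypothetical wiring, to run before audio-separator is imported:
#   import tqdm.std
#   tqdm.std.tqdm = make_reporting_tqdm(tqdm.std.tqdm, my_callback)
```

Since `from tqdm import tqdm` reads the `tqdm.tqdm` attribute of the package, that name may need the same replacement, depending on how audio-separator imports it.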
backend/task_queue.py - Concurrency
- asyncio.Queue(maxsize=5) - max 5 pending jobs, return 429 if full
- Single worker consuming tasks sequentially (BS-RoFormer needs ~4-6GB RAM)
- Job progress stored in a dict, consumed by SSE endpoints
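The queue behaviour listed above could be sketched like this. TaskQueue and QueueFullError are hypothetical names; the API layer would be responsible for turning QueueFullError into an HTTP 429, and the progress dict layout is an assumption.

```python
import asyncio
from typing import Awaitable, Callable, Dict

MAX_PENDING = 5  # from the plan: at most 5 queued jobs


class QueueFullError(Exception):
    """Hypothetical error the API layer would translate into HTTP 429."""


class TaskQueue:
    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=MAX_PENDING)
        self.progress: Dict[str, dict] = {}  # job_id -> latest progress payload

    def enqueue(self, job_id: str, job: Callable[[], Awaitable[None]]) -> None:
        """Queue a job without waiting; raise QueueFullError when 5 are pending."""
        try:
            self._queue.put_nowait((job_id, job))
        except asyncio.QueueFull:
            raise QueueFullError("too many pending jobs") from None
        self.progress[job_id] = {"state": "queued", "progress": 0.0}

    async def worker(self) -> None:
        """Single consumer: jobs run strictly one at a time."""
        while True:
            job_id, job = await self._queue.get()
            try:
                await job()
                self.progress[job_id] = {"state": "done", "progress": 1.0}
            except Exception as exc:
                self.progress[job_id] = {"state": "error", "message": str(exc)}
            finally:
                self._queue.task_done()

    async def join(self) -> None:
        """Wait until every queued job has been processed."""
        await self._queue.join()
```

Because model inference is blocking, each job coroutine would likely wrap the actual separation call in asyncio.to_thread so the event loop can keep serving SSE clients.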
backend/file_manager.py - File Lifecycle
- Base dir: /tmp/stem-sep/
- Each job: /tmp/stem-sep/{uuid}/ with input.{ext} and stem outputs
- Auto-cleanup: background task every 5 minutes, deletes dirs older than 30 minutes
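The cleanup rule above could be sketched with the standard library alone. The function names are hypothetical, and using the directory's modification time as the job's age is an assumption.

```python
import asyncio
import shutil
import time
from pathlib import Path
from typing import List, Optional

BASE_DIR = Path("/tmp/stem-sep")   # base dir from the plan
MAX_AGE_SECONDS = 30 * 60          # delete job dirs older than 30 minutes
CLEANUP_INTERVAL_SECONDS = 5 * 60  # sweep every 5 minutes


def cleanup_old_jobs(base_dir: Path = BASE_DIR,
                     max_age: float = MAX_AGE_SECONDS,
                     now: Optional[float] = None) -> List[str]:
    """Remove job directories whose mtime is older than max_age; return their names."""
    now = time.time() if now is None else now
    removed: List[str] = []
    if not base_dir.is_dir():
        return removed
    for job_dir in base_dir.iterdir():
        if job_dir.is_dir() and now - job_dir.stat().st_mtime > max_age:
            shutil.rmtree(job_dir, ignore_errors=True)
            removed.append(job_dir.name)
    return removed


async def cleanup_loop() -> None:
    """Background task: sweep once per interval until cancelled."""
    while True:
        cleanup_old_jobs()
        await asyncio.sleep(CLEANUP_INTERVAL_SECONDS)
```

cleanup_loop is meant to be started once at application startup via asyncio.create_task.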
backend/main.py - FastAPI App
- Register API routes BEFORE the static file mount
- Mount frontend/dist/ at / with html=True for SPA fallback
- On startup: launch queue worker + cleanup loop as asyncio.create_task
- SSE via sse-starlette's EventSourceResponse
backend/requirements.txt
fastapi>=0.104.0
uvicorn[standard]>=0.24.0
python-multipart>=0.0.6
sse-starlette>=1.8.0
audio-separator[cpu]>=0.17.0
pydub>=0.25.1
aiofiles>=23.2.1
Frontend Details
State Machine (useSeparation hook)
type AppState =
  | { phase: "idle" }
  | { phase: "uploading"; progress: number }
  | { phase: "uploaded"; jobId: string; filename: string }
  | { phase: "separating"; jobId: string; state: string; progress: number; message: string }
  | { phase: "done"; jobId: string; stems: StemResult[] }
  | { phase: "error"; message: string }
Use useReducer for clean state transitions. Open the SSE subscription inside the separate() action and close it when the job reaches "done" or "error".
WaveSurfer.js v7 Configuration (SoundCloud-style)
WaveSurfer.create({
  container: containerRef.current,
  url: audioUrl,
  waveColor: color + "66",   // 40% opacity
  progressColor: color,      // full opacity for played portion
  height: 64,                // 48 for stem rows
  barWidth: 2,
  barGap: 1,
  barRadius: 2,
  cursorWidth: 1,
  cursorColor: "#ffffff40",
  normalize: true,
  interact: true,            // click to seek
});
Import: import WaveSurfer from 'wavesurfer.js'
WaveformPlayer.tsx - Reusable Component
Layout: [play/pause circle] [waveform div (flex-grow)] [M:SS / M:SS] [download icon?]
- Play button: circle with play/pause icon
- Waveform container: flex-grow div, WaveSurfer renders into it
- Time: currentTime / duration in M:SS format
- Download: optional, shown via onDownload prop
Exclusive playback: When one player starts, dispatch window.dispatchEvent(new CustomEvent("stem-play", { detail: instanceId })). All other players listen and pause.
Stem Colors
const STEM_CONFIG = {
  Vocals: { color: "#ec4899", icon: "mic" },      // pink
  Drums:  { color: "#f97316", icon: "drum" },     // orange
  Bass:   { color: "#3b82f6", icon: "music" },    // blue
  Guitar: { color: "#a855f7", icon: "guitar" },   // purple
  Piano:  { color: "#06b6d4", icon: "piano" },    // cyan
  Other:  { color: "#22c55e", icon: "waveform" }, // green
};
UploadZone.tsx
Drag-and-drop zone with dashed border. Accepts: wav, mp3, flac, ogg, m4a, aac (max 100MB).
Shows file icon + "Drop audio file here or click to browse" + supported formats.
Drag-over state: border color changes to accent. Hidden <input type="file" accept="audio/*">.
StemRow.tsx
Desktop layout: [colored dot + label (w-24)] [WaveformPlayer (flex-grow)]
Mobile layout: label on top row, waveform on bottom row (flex-col sm:flex-row)
Mobile Responsive Strategy
- Main container: max-w-3xl mx-auto px-4
- StemCheckboxes: grid-cols-2 md:grid-cols-3
- StemRow: flex-col sm:flex-row (label stacks above waveform on mobile)
- Waveform height: h-12 md:h-16
- Touch targets: minimum 44px
- Font sizes: text-sm md:text-base
Theme (index.css)
@import "tailwindcss";
@theme {
  --color-bg-primary: #0a0a0f;
  --color-bg-secondary: #13131a;
  --color-bg-card: #1a1a24;
  --color-bg-hover: #252530;
  --color-text-primary: #e8e8ef;
  --color-text-secondary: #8888a0;
  --color-accent: #7c3aed;
  --color-accent-hover: #6d28d9;
  --color-border: #2a2a38;
}
body {
  background-color: var(--color-bg-primary);
  color: var(--color-text-primary);
}
Docker Setup
Dockerfile
FROM python:3.11-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
    ffmpeg curl && rm -rf /var/lib/apt/lists/*
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
    && apt-get install -y nodejs && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY backend/requirements.txt backend/requirements.txt
RUN pip install --no-cache-dir -r backend/requirements.txt
COPY frontend/ frontend/
RUN cd frontend && npm ci && npm run build
COPY backend/ backend/
EXPOSE 7860
RUN useradd -m -u 1000 user
USER user
ENV HOME=/home/user PATH=/home/user/.local/bin:$PATH
CMD ["python", "-m", "uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "7860"]
README.md (HF Spaces metadata)
---
title: Stem Separator
emoji: 🎵
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
license: mit
---
.dockerignore
frontend/node_modules
frontend/dist
**/__pycache__
*.pyc
.git
Implementation Order
- Scaffold project structure (all dirs + config files)
- backend/requirements.txt + backend/file_manager.py + backend/separator.py
- backend/task_queue.py + backend/main.py (all API endpoints + SSE)
- Frontend scaffold: package.json, vite.config.ts, tsconfig.json, index.html, index.css
- types.ts + api.ts (API client + SSE subscription)
- useWaveSurfer.ts hook
- useSeparation.ts hook (state machine)
- Components: UploadZone -> WaveformPlayer -> OriginalTrack -> StemCheckboxes -> SeparateButton -> ProgressBar -> StemRow -> StemResults -> Footer -> App.tsx
- Dockerfile + README.md + .dockerignore
Verification
- Local dev: cd frontend && npm run dev (with Vite proxy to backend)
- Local backend: uvicorn backend.main:app --port 7860 (run from the project root so the module path matches the Docker CMD)
- Docker build: docker build -t stem-sep .
- Docker run: docker run -p 7860:7860 stem-sep
- Test: upload a song, select all 6 stems, verify progress + waveforms + play + download
---END PROMPT---