# Plan: Standalone Stem Separator App for HuggingFace Spaces

## Context

The user wants a standalone web app (separate from Studio13-v3) deployed on HuggingFace Spaces for audio stem separation. Users upload audio, choose stems, run BS-RoFormer model inference, then play/download individual stems with SoundCloud-style waveform visualization. The app must be mobile-responsive.

## Architecture

```
HuggingFace Docker Space (port 7860)
├── FastAPI backend (Python)
│   ├── File upload + temp storage
│   ├── audio-separator inference (BS-RoFormer 6-stem)
│   ├── SSE progress streaming
│   └── Serves built React frontend as static files
└── React frontend (Vite build → static)
    ├── Upload zone (drag-drop)
    ├── WaveSurfer.js v7 waveforms (SoundCloud-style bars)
    ├── Stem selection checkboxes
    ├── Progress bar (SSE-driven)
    └── Stem result rows (waveform + play + download)
```

**Model**: `jarredou/BS-ROFO-SW-Fixed` (699MB .ckpt) - BS-RoFormer, 6 stems: Vocals, Drums, Bass, Guitar, Piano, Other

## Deliverable

A self-contained prompt (below) the user can paste into another Claude Code window to build the entire app from scratch.

---

## Prompt to use in another window

The prompt is designed to be comprehensive and self-contained. Copy everything between the `---START PROMPT---` and `---END PROMPT---` markers.

---START PROMPT---

Build a standalone web app for HuggingFace Spaces that does audio stem separation. The app should be production-ready, responsive, and polished.

## What the app does

1. User uploads an audio file (drag-drop or file picker)
2. Original track appears with a SoundCloud-style scrolling peak waveform + play button
3. User selects which stems to separate via checkboxes (Vocals, Drums, Bass, Guitar, Piano, Other)
4. User clicks "Separate" - progress bar shows real-time progress via SSE
5. Once done, each stem appears in its own row: colored label + waveform (flex-grow) + play + download
6. "Download All" button creates a ZIP of all stems

## Tech Stack

- **Frontend**: React 19, TypeScript, Tailwind CSS v4, Vite 5
- **Backend**: Python 3.11, FastAPI, uvicorn
- **Waveform**: WaveSurfer.js v7 (`wavesurfer.js` npm package)
- **Model**: `jarredou/BS-ROFO-SW-Fixed` from HuggingFace (BS-RoFormer, 699MB .ckpt, 6 stems)
- **Inference**: `audio-separator` Python package
- **Progress**: SSE (Server-Sent Events) via `sse-starlette`
- **Deploy**: HuggingFace Docker Space, port 7860

## Directory Structure

```
stem-separator/
├── Dockerfile
├── README.md                     # HF Spaces YAML front matter
├── .dockerignore
├── backend/
│   ├── main.py                   # FastAPI: routes, SSE, static serving
│   ├── separator.py              # audio-separator wrapper with progress callback
│   ├── file_manager.py           # Temp file lifecycle, cleanup
│   ├── task_queue.py             # asyncio queue (1 concurrent separation)
│   └── requirements.txt
├── frontend/
│   ├── index.html
│   ├── package.json
│   ├── vite.config.ts
│   ├── tsconfig.json
│   └── src/
│       ├── main.tsx
│       ├── App.tsx
│       ├── index.css             # Tailwind theme (dark, music-oriented)
│       ├── api.ts                # fetch wrappers + SSE EventSource
│       ├── types.ts              # Shared interfaces
│       ├── hooks/
│       │   ├── useWaveSurfer.ts  # WaveSurfer.js v7 hook
│       │   └── useSeparation.ts  # Upload->Separate->Results state machine
│       └── components/
│           ├── UploadZone.tsx
│           ├── OriginalTrack.tsx
│           ├── StemCheckboxes.tsx
│           ├── SeparateButton.tsx
│           ├── ProgressBar.tsx
│           ├── WaveformPlayer.tsx # Reusable: [play] [waveform===] [time] [download?]
│           ├── StemRow.tsx
│           ├── StemResults.tsx
│           └── Footer.tsx
```

## Backend Details

### API Endpoints

```
POST /api/upload          - Multipart file upload (max 100MB), returns { job_id, filename }
POST /api/separate        - Body: { job_id, stems: string[] }, enqueues task
GET  /api/progress/{id}   - SSE stream: { state, progress, message, stems? }
GET  /api/audio/{id}/{f}  - Serve audio for WaveSurfer playback
GET  /api/download/{id}/all - ZIP of all stems, streamed (register before the {f} route so "all" is not captured as a filename)
GET  /api/download/{id}/{f} - Download stem with Content-Disposition: attachment
DELETE /api/job/{id}      - Manual cleanup
```
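
One possible way to back the "Download All" endpoint is to write the archive into the job directory and let the framework stream that file. The sketch below assumes each stem has already been renamed to `<Stem>.wav` inside the job directory. The helper name `build_stems_zip` and the `stems.zip` filename are hypothetical.

```python
import zipfile
from pathlib import Path
from typing import Iterable


def build_stems_zip(job_dir: Path, stem_names: Iterable[str]) -> Path:
    """Write the requested stems into job_dir/stems.zip and return its path.

    Hypothetical helper: assumes each stem was saved as "<Stem>.wav" in
    job_dir. Stems whose file is missing are skipped.
    """
    zip_path = job_dir / "stems.zip"
    # WAV is uncompressed PCM, so DEFLATE usually shrinks it noticeably.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name in stem_names:
            wav_path = job_dir / f"{name}.wav"
            if wav_path.is_file():
                # arcname keeps the archive flat (no /tmp/... prefixes).
                archive.write(wav_path, arcname=wav_path.name)
    return zip_path
```

In FastAPI, the returned path could then be handed to `FileResponse(zip_path, filename="stems.zip")`, which sends the file in chunks instead of loading it into memory.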

### `backend/separator.py` - Separation Logic

```python
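# Singleton pattern - keep model loaded between requests
# Adapted from this working pattern:

import os
import re

from audio_separator.separator import Separator

# Lowercase substrings that identify each stem in an output filename.
STEM_ALIASES = {
    "Vocals": ["vocals", "vocal", "voice", "singing"],
    "Drums": ["drums", "drum", "percussion"],
    "Bass": ["bass"],
    "Guitar": ["guitar", "guitars"],
    "Piano": ["piano", "keys", "keyboard"],
    "Other": ["other", "instrumental", "residual", "remainder", "no_"],
}


def match_stem(filename):
    """Return the canonical stem name for an output file, or None."""
    # Prefer the parenthesised label: "input_(Vocals).wav" -> "vocals".
    label = re.search(r"\(([^)]+)\)", filename)
    text = (label.group(1) if label else filename).lower()
    for stem, aliases in STEM_ALIASES.items():
        if any(alias in text for alias in aliases):
            return stem
    return None


class StemSeparatorService:
    _instance = None
    _model_loaded = False

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def load_model(self):
        if self._model_loaded:
            return
        self.separator = Separator(
            output_dir="/tmp/output",
            output_format="WAV",
            output_single_stem=None,
        )
        self.separator.load_model(model_filename="BS-Rofo-SW-Fixed.ckpt")
        self._model_loaded = True

    def separate(self, input_path, output_dir, stems, progress_callback):
        """Separate input_path and return {stem_name: file_path} for `stems`."""
        self.load_model()
        self.separator.output_dir = output_dir
        # progress_callback is expected to be invoked by the tqdm patch
        # described below while this call runs.
        output_files = self.separator.separate(input_path)

        results = {}
        for name in output_files:
            # Entries may be bare filenames; resolve them against output_dir.
            # Assumes outputs land in output_dir; verify this for the
            # installed audio-separator version.
            src = name if os.path.isabs(name) else os.path.join(output_dir, name)
            stem = match_stem(os.path.basename(src))
            if stem in stems:
                # Rename "input_(Vocals).wav" -> "Vocals.wav".
                dst = os.path.join(output_dir, f"{stem}.wav")
                os.replace(src, dst)
                results[stem] = dst
            elif os.path.exists(src):
                os.remove(src)  # drop stems the user did not request
        return results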
```

**tqdm monkey-patching for progress**: Before importing audio-separator, patch `tqdm.std.tqdm` with a subclass that calls `progress_callback("analyzing", fraction)` in its `update()` method. Map tqdm progress 0-1 to overall progress 0.2-0.9.
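
A minimal sketch of that patch, under two assumptions: audio-separator reports its progress through tqdm, and it binds the class with `from tqdm import tqdm` (which is why both `tqdm.std.tqdm` and `tqdm.tqdm` are replaced). The names `scale_progress`, `make_reporting_tqdm`, and `install_tqdm_patch` are hypothetical.

```python
from typing import Callable

ProgressCallback = Callable[[str, float], None]


def scale_progress(fraction: float, lo: float = 0.2, hi: float = 0.9) -> float:
    """Map a tqdm fraction in [0, 1] onto the overall range [lo, hi]."""
    fraction = min(max(fraction, 0.0), 1.0)
    return lo + fraction * (hi - lo)


def make_reporting_tqdm(base_cls: type, callback: ProgressCallback) -> type:
    """Build a subclass of base_cls whose update() also reports progress."""

    class ReportingTqdm(base_cls):
        def update(self, n=1):
            result = super().update(n)
            total = getattr(self, "total", None)
            if total:  # skip bars with an unknown or zero total
                callback("analyzing", scale_progress(self.n / total))
            return result

    return ReportingTqdm


def install_tqdm_patch(callback: ProgressCallback) -> None:
    """Replace tqdm's class; call this before importing audio_separator."""
    import tqdm
    import tqdm.std

    patched = make_reporting_tqdm(tqdm.std.tqdm, callback)
    tqdm.std.tqdm = patched
    # `from tqdm import tqdm` reads this attribute, so patch it as well.
    tqdm.tqdm = patched
```

Passing the base class in as a parameter keeps the subclass logic testable without tqdm installed. Only `install_tqdm_patch` touches the real library.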

### `backend/task_queue.py` - Concurrency

- `asyncio.Queue(maxsize=5)` - max 5 pending jobs, return 429 if full
- Single worker consuming tasks sequentially (BS-RoFormer needs ~4-6GB RAM)
- Job progress stored in a dict, consumed by SSE endpoints
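
The queue described above could be sketched as follows. This assumes each job is a blocking callable (the model inference) that is pushed to a thread so the event loop stays responsive. `TaskQueue`, `QueueFullError`, and the progress dict shape are illustrative names, not a fixed design.

```python
import asyncio
from typing import Callable, Dict


class QueueFullError(Exception):
    """Raised when the pending queue is full; the route can map this to 429."""


class TaskQueue:
    def __init__(self, max_pending: int = 5) -> None:
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=max_pending)
        # job_id -> latest progress snapshot, read by the SSE endpoint.
        self.progress: Dict[str, dict] = {}

    def enqueue(self, job_id: str, job: Callable[[], None]) -> None:
        try:
            self._queue.put_nowait((job_id, job))
        except asyncio.QueueFull as exc:
            raise QueueFullError("too many pending jobs") from exc
        self.progress[job_id] = {"state": "queued", "progress": 0.0, "message": "Waiting"}

    async def worker(self) -> None:
        """Single consumer: jobs run strictly one at a time."""
        while True:
            job_id, job = await self._queue.get()
            try:
                self.progress[job_id] = {"state": "running", "progress": 0.0, "message": "Started"}
                # Run the blocking separation off the event loop.
                await asyncio.to_thread(job)
                self.progress[job_id] = {"state": "done", "progress": 1.0, "message": "Finished"}
            except Exception as exc:
                self.progress[job_id] = {"state": "error", "progress": 0.0, "message": str(exc)}
            finally:
                self._queue.task_done()

    async def join(self) -> None:
        """Wait until every queued job has been processed."""
        await self._queue.join()
```

`put_nowait` is used instead of `put` so a full queue fails immediately and the request is not left waiting for a slot.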

### `backend/file_manager.py` - File Lifecycle

- Base dir: `/tmp/stem-sep/`
- Each job: `/tmp/stem-sep/{uuid}/` with `input.{ext}` and stem outputs
- Auto-cleanup: background task every 5 minutes, deletes dirs older than 30 minutes
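
A minimal sketch of the cleanup pass, assuming a job directory's modification time is an acceptable proxy for its age. `cleanup_old_jobs` and `cleanup_loop` are hypothetical names; the `now` parameter exists only to make the function easy to test.

```python
import asyncio
import shutil
import time
from pathlib import Path
from typing import List, Optional


def cleanup_old_jobs(
    base_dir: Path,
    max_age_seconds: float = 30 * 60,
    now: Optional[float] = None,
) -> List[str]:
    """Delete job directories older than max_age_seconds; return their names."""
    now = time.time() if now is None else now
    removed: List[str] = []
    if not base_dir.exists():
        return removed
    for job_dir in base_dir.iterdir():
        if not job_dir.is_dir():
            continue
        if now - job_dir.stat().st_mtime > max_age_seconds:
            shutil.rmtree(job_dir, ignore_errors=True)
            removed.append(job_dir.name)
    return removed


async def cleanup_loop(base_dir: Path, interval_seconds: float = 5 * 60) -> None:
    """Background task: run one cleanup pass every interval_seconds."""
    while True:
        # Directory deletion is blocking I/O, so keep it off the event loop.
        await asyncio.to_thread(cleanup_old_jobs, base_dir)
        await asyncio.sleep(interval_seconds)
```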

### `backend/main.py` - FastAPI App

- Register API routes BEFORE the static file mount
- Mount `frontend/dist/` at `/` with `html=True` so that `/` serves the built `index.html`
- On startup: launch queue worker + cleanup loop as `asyncio.create_task`
- SSE via `sse-starlette`'s `EventSourceResponse`
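
The progress stream could be driven by an async generator like the one sketched below, which polls the shared progress dict and stops once the job reaches a terminal state. It assumes snapshots are JSON-serializable dicts with a `state` key. `progress_events` and the polling interval are hypothetical choices.

```python
import asyncio
import json
from typing import AsyncIterator, Dict


async def progress_events(
    jobs: Dict[str, dict],
    job_id: str,
    poll_interval: float = 0.5,
) -> AsyncIterator[dict]:
    """Yield one SSE payload per progress change until the job finishes."""
    last_payload = None
    while True:
        snapshot = jobs.get(job_id)
        if snapshot is None:
            error = {"state": "error", "progress": 0.0, "message": "Unknown job"}
            yield {"data": json.dumps(error)}
            return
        payload = json.dumps(snapshot, sort_keys=True)
        if payload != last_payload:  # only send when something changed
            last_payload = payload
            yield {"data": payload}
        if snapshot.get("state") in ("done", "error"):
            return
        await asyncio.sleep(poll_interval)
```

In the route handler this generator would be wrapped as `EventSourceResponse(progress_events(jobs, job_id))`. Because no `event` key is set, the browser receives the payloads through the default `message` event.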

### `backend/requirements.txt`

```
fastapi>=0.104.0
uvicorn[standard]>=0.24.0
python-multipart>=0.0.6
sse-starlette>=1.8.0
audio-separator[cpu]>=0.17.0
pydub>=0.25.1
aiofiles>=23.2.1
```

## Frontend Details

### State Machine (`useSeparation` hook)

```typescript
type AppState =
  | { phase: "idle" }
  | { phase: "uploading"; progress: number }
  | { phase: "uploaded"; jobId: string; filename: string }
  | { phase: "separating"; jobId: string; state: string; progress: number; message: string }
  | { phase: "done"; jobId: string; stems: StemResult[] }
  | { phase: "error"; message: string }
```

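Use `useReducer` for clean state transitions. Open the SSE subscription inside the `separate()` action and close the `EventSource` on `done` or `error`; otherwise the browser reconnects automatically when the server ends the stream.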

### WaveSurfer.js v7 Configuration (SoundCloud-style)

```typescript
WaveSurfer.create({
  container: containerRef.current,
  url: audioUrl,
  waveColor: color + "66",      // 40% opacity
  progressColor: color,          // full opacity for played portion
  height: 64,                    // 48 for stem rows
  barWidth: 2,
  barGap: 1,
  barRadius: 2,
  cursorWidth: 1,
  cursorColor: "#ffffff40",
  normalize: true,
  interact: true,                // click to seek
});
```

Import: `import WaveSurfer from 'wavesurfer.js'`

### `WaveformPlayer.tsx` - Reusable Component

Layout: `[play/pause circle] [waveform div (flex-grow)] [M:SS / M:SS] [download icon?]`

- Play button: circle with play/pause icon
- Waveform container: `flex-grow` div, WaveSurfer renders into it
- Time: `currentTime / duration` in `M:SS` format
- Download: optional, shown via `onDownload` prop

**Exclusive playback**: When one player starts, it calls `window.dispatchEvent(new CustomEvent("stem-play", { detail: instanceId }))`. Every other player listens for `stem-play` and pauses when the `detail` is not its own `instanceId`.

### Stem Colors

```typescript
const STEM_CONFIG = {
  Vocals: { color: "#ec4899", icon: "mic" },       // pink
  Drums:  { color: "#f97316", icon: "drum" },       // orange
  Bass:   { color: "#3b82f6", icon: "music" },      // blue
  Guitar: { color: "#a855f7", icon: "guitar" },     // purple
  Piano:  { color: "#06b6d4", icon: "piano" },      // cyan
  Other:  { color: "#22c55e", icon: "waveform" },   // green
};
```

### `UploadZone.tsx`

Drag-and-drop zone with dashed border. Accepts: wav, mp3, flac, ogg, m4a, aac (max 100MB).
Shows file icon + "Drop audio file here or click to browse" + supported formats.
Drag-over state: border color changes to accent. Hidden `<input type="file" accept="audio/*">`.

### `StemRow.tsx`

Desktop layout: `[colored dot + label (w-24)] [WaveformPlayer (flex-grow)]`
Mobile layout: label on top row, waveform on bottom row (`flex-col sm:flex-row`)

### Mobile Responsive Strategy

- Main container: `max-w-3xl mx-auto px-4`
- `StemCheckboxes`: `grid-cols-2 md:grid-cols-3`
- `StemRow`: `flex-col sm:flex-row` (label stacks above waveform on mobile)
- Waveform height: `h-12 md:h-16`
- Touch targets: minimum 44px
- Font sizes: `text-sm md:text-base`

### Theme (index.css)

```css
@import "tailwindcss";

@theme {
  --color-bg-primary: #0a0a0f;
  --color-bg-secondary: #13131a;
  --color-bg-card: #1a1a24;
  --color-bg-hover: #252530;
  --color-text-primary: #e8e8ef;
  --color-text-secondary: #8888a0;
  --color-accent: #7c3aed;
  --color-accent-hover: #6d28d9;
  --color-border: #2a2a38;
}

body {
  background-color: var(--color-bg-primary);
  color: var(--color-text-primary);
}
```

## Docker Setup

### Dockerfile

```dockerfile
FROM python:3.11-slim

RUN apt-get update && apt-get install -y --no-install-recommends \
    ffmpeg curl && rm -rf /var/lib/apt/lists/*

RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
    && apt-get install -y nodejs && rm -rf /var/lib/apt/lists/*

WORKDIR /app

COPY backend/requirements.txt backend/requirements.txt
RUN pip install --no-cache-dir -r backend/requirements.txt

COPY frontend/ frontend/
RUN cd frontend && npm ci && npm run build

COPY backend/ backend/

EXPOSE 7860

RUN useradd -m -u 1000 user
USER user
ENV HOME=/home/user PATH=/home/user/.local/bin:$PATH

CMD ["python", "-m", "uvicorn", "backend.main:app", "--host", "0.0.0.0", "--port", "7860"]
```

### README.md (HF Spaces metadata)

```yaml
---
title: Stem Separator
emoji: 🎵
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
license: mit
---
```

### .dockerignore

```
frontend/node_modules
frontend/dist
**/__pycache__
*.pyc
.git
```

## Implementation Order

1. Scaffold project structure (all dirs + config files)
2. `backend/requirements.txt` + `backend/file_manager.py` + `backend/separator.py`
3. `backend/task_queue.py` + `backend/main.py` (all API endpoints + SSE)
4. Frontend scaffold: `package.json`, `vite.config.ts`, `tsconfig.json`, `index.html`, `index.css`
5. `types.ts` + `api.ts` (API client + SSE subscription)
6. `useWaveSurfer.ts` hook
7. `useSeparation.ts` hook (state machine)
8. Components: `UploadZone` -> `WaveformPlayer` -> `OriginalTrack` -> `StemCheckboxes` -> `SeparateButton` -> `ProgressBar` -> `StemRow` -> `StemResults` -> `Footer` -> `App.tsx`
9. `Dockerfile` + `README.md` + `.dockerignore`

## Verification

1. Local dev: `cd frontend && npm run dev` (with Vite proxy to backend)
2. Local backend: from the project root, `uvicorn backend.main:app --port 7860` (same module path as the Dockerfile `CMD`)
3. Docker build: `docker build -t stem-sep .`
4. Docker run: `docker run -p 7860:7860 stem-sep`
5. Test: upload a song, select all 6 stems, verify progress + waveforms + play + download

---END PROMPT---