---
title: StemSplitter
app_file: src/stemsplitter/web.py
sdk: gradio
sdk_version: 6.6.0
---
# StemSplitter
Audio stem separation tool that splits songs into individual stems (vocals, drums, bass, and other instruments). It provides both a command-line interface and a Gradio web UI.
Powered by open-source models via [audio-separator](https://github.com/nomadkaraoke/python-audio-separator):
| Mode | Stems | Default Model |
|------|-------|---------------|
| 2-stem | Vocals, Instrumental | MelBand-RoFormer |
| 4-stem | Vocals, Drums, Bass, Other | Demucs htdemucs_ft |
## Prerequisites
- Python 3.10+
- [uv](https://docs.astral.sh/uv/getting-started/installation/) for dependency management
- FFmpeg (required by audio-separator for reading various audio formats)
Install FFmpeg if you don't have it:
```bash
# macOS
brew install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
# Windows (via chocolatey)
choco install ffmpeg
```
## Installation
```bash
git clone <repo-url>
cd StemSplitter
# Copy the example env file and adjust as needed
cp .env.example .env
# Install dependencies (CPU inference)
uv sync --extra dev
# Or, for GPU-accelerated inference (NVIDIA CUDA)
uv sync --extra dev --extra gpu
```
Models are downloaded automatically on first use (~200 MB for 2-stem, ~800 MB for 4-stem).
## Usage
### CLI
```bash
# Basic 2-stem separation (vocals + instrumental), outputs WAV
uv run stemsplitter song.mp3
# 4-stem separation with FLAC output
uv run stemsplitter song.wav -m 4stem -f FLAC
# MP3 output to a custom directory
uv run stemsplitter song.flac -m 2stem -f MP3 -o ./my_stems/
# Override the model
uv run stemsplitter song.mp3 --model htdemucs.yaml
# Show all options
uv run stemsplitter --help
```
**Supported input formats:** MP3, WAV, FLAC, OGG, M4A, and any other format FFmpeg can decode.
**Supported output formats:** WAV, MP3, FLAC (set via `-f` flag or `STEMSPLITTER_OUTPUT_FORMAT` in `.env`).
### Web UI
```bash
uv run stemsplitter-web
```
Opens a Gradio interface (default: `http://127.0.0.1:7860`) where you can:
1. Upload an audio file
2. Choose separation mode (2-stem or 4-stem)
3. Choose output format (WAV, MP3, FLAC)
4. Click **Separate** and download individual stems
A public share link is also generated automatically.
## Project Structure
```
src/stemsplitter/
    __init__.py        # Package version
    config.py          # Settings loaded from .env with sensible defaults
    separator.py       # Core StemSplitter class wrapping audio-separator
    cli.py             # Click-based CLI entry point
    web.py             # Gradio web UI
tests/
    conftest.py        # Shared fixtures (mock separator, synthetic audio)
    test_config.py     # Configuration loading tests
    test_separator.py  # Core separation logic tests
    test_cli.py        # CLI invocation tests
    test_web.py        # Web UI handler tests
```
### Components
- **config.py** -- Loads settings from a `.env` file using `python-dotenv`. All values are exposed as a frozen `Settings` dataclass. See `.env.example` for the full list of options.
- **separator.py** -- Wraps `audio-separator` with a `StemSplitter` class that handles model selection per mode, lazy initialization (so imports are fast), and model caching (the model stays loaded between calls).
- **cli.py** -- A Click command that accepts an input file and flags for mode, format, output directory, and model override.
- **web.py** -- A Gradio Blocks app with audio upload, mode/format radio buttons, and per-stem audio outputs. The 4-stem outputs (drums, bass) are hidden in 2-stem mode.
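The lazy initialization and model caching described for `separator.py` can be illustrated with a minimal sketch. This is not the project's actual code: `LazySplitter`, `Backend`, and `FakeBackend` are hypothetical names, and the real class presumably delegates to audio-separator rather than to an injected factory. The default model names are the ones listed in the configuration table.

```python
from typing import Callable, Dict, List, Optional, Protocol


class Backend(Protocol):
    """Hypothetical interface for whatever performs the separation."""

    def load_model(self, model_filename: str) -> None: ...
    def separate(self, path: str) -> List[str]: ...


# Default model per mode, copied from the configuration table.
DEFAULT_MODELS: Dict[str, str] = {
    "2stem": "model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt",
    "4stem": "htdemucs_ft.yaml",
}


class LazySplitter:
    """Creates the backend on first use and reloads a model only when it changes."""

    def __init__(self, backend_factory: Callable[[], Backend]) -> None:
        self._backend_factory = backend_factory
        self._backend: Optional[Backend] = None   # nothing heavy happens at import time
        self._loaded_model: Optional[str] = None  # name of the model currently in memory

    def separate(self, path: str, mode: str = "2stem",
                 model: Optional[str] = None) -> List[str]:
        if mode not in DEFAULT_MODELS:
            raise ValueError(f"unknown mode: {mode!r}")
        model_name = model or DEFAULT_MODELS[mode]
        if self._backend is None:
            self._backend = self._backend_factory()
        if self._loaded_model != model_name:
            # Cache miss: load the requested model and remember its name.
            self._backend.load_model(model_name)
            self._loaded_model = model_name
        return self._backend.separate(path)


class FakeBackend:
    """Stand-in backend that records model loads instead of running inference."""

    def __init__(self) -> None:
        self.loaded: List[str] = []

    def load_model(self, model_filename: str) -> None:
        self.loaded.append(model_filename)

    def separate(self, path: str) -> List[str]:
        return [f"{path}:{self.loaded[-1]}"]
```

With this shape, two consecutive calls in the same mode trigger a single model load, while switching from `2stem` to `4stem` triggers a second one.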
## Configuration
All settings are configurable via environment variables in `.env`:
| Variable | Default | Description |
|----------|---------|-------------|
| `STEMSPLITTER_OUTPUT_DIR` | `./output` | Directory for separated stems |
| `STEMSPLITTER_MODEL_DIR` | `/tmp/audio-separator-models/` | Where downloaded models are cached |
| `STEMSPLITTER_2STEM_MODEL` | `model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt` | Model for 2-stem separation |
| `STEMSPLITTER_4STEM_MODEL` | `htdemucs_ft.yaml` | Model for 4-stem separation |
| `STEMSPLITTER_OUTPUT_FORMAT` | `WAV` | Default output format (WAV, MP3, FLAC) |
| `STEMSPLITTER_OUTPUT_BITRATE` | `320k` | Bitrate for MP3 output |
| `STEMSPLITTER_SAMPLE_RATE` | `44100` | Output sample rate |
| `STEMSPLITTER_NORMALIZATION` | `0.9` | Peak normalization threshold |
| `STEMSPLITTER_LOG_LEVEL` | `WARNING` | Logging verbosity (DEBUG, INFO, WARNING, ERROR) |
| `STEMSPLITTER_WEB_HOST` | `127.0.0.1` | Web UI bind address |
| `STEMSPLITTER_WEB_PORT` | `7860` | Web UI port |
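As a rough illustration of how these variables could map onto a frozen dataclass, here is a minimal sketch covering a subset of the table. It is an assumption about the shape of `config.py`, not a copy of it: the field names and the `load_settings` helper are hypothetical, and the sketch reads from a plain mapping instead of calling `python-dotenv`.

```python
import os
from dataclasses import dataclass
from typing import Mapping

PREFIX = "STEMSPLITTER_"


@dataclass(frozen=True)
class Settings:
    """Immutable snapshot of some STEMSPLITTER_* variables (defaults from the table)."""

    output_dir: str = "./output"
    output_format: str = "WAV"
    output_bitrate: str = "320k"
    sample_rate: int = 44100
    normalization: float = 0.9
    web_host: str = "127.0.0.1"
    web_port: int = 7860


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, falling back to the documented defaults."""
    d = Settings()
    return Settings(
        output_dir=env.get(PREFIX + "OUTPUT_DIR", d.output_dir),
        output_format=env.get(PREFIX + "OUTPUT_FORMAT", d.output_format),
        output_bitrate=env.get(PREFIX + "OUTPUT_BITRATE", d.output_bitrate),
        # Environment values are strings, so numeric settings are converted here.
        sample_rate=int(env.get(PREFIX + "SAMPLE_RATE", d.sample_rate)),
        normalization=float(env.get(PREFIX + "NORMALIZATION", d.normalization)),
        web_host=env.get(PREFIX + "WEB_HOST", d.web_host),
        web_port=int(env.get(PREFIX + "WEB_PORT", d.web_port)),
    )
```

Passing the environment as a parameter keeps the loader easy to test: `load_settings({"STEMSPLITTER_WEB_PORT": "8080"})` overrides only the port and leaves every other field at its default.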
## Running Tests
```bash
# Run all tests
uv run pytest
# Verbose output
uv run pytest -v
# With coverage report
uv run pytest -v --cov=stemsplitter --cov-report=term-missing
# Run a specific test file
uv run pytest tests/test_separator.py
```
Tests use mocked models so no GPU or model downloads are required.
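A test in this style might look like the sketch below, which swaps the separator for a `MagicMock` so nothing is downloaded or run on a GPU. The `run_separation` function and the output file names are hypothetical and stand in for the project's real code and fixtures.

```python
from typing import List
from unittest.mock import MagicMock


def run_separation(separator, input_path: str) -> List[str]:
    """Hypothetical function under test: load a model, then separate one file."""
    separator.load_model("htdemucs_ft.yaml")
    return separator.separate(input_path)


def test_run_separation_uses_mock() -> None:
    # The mock replaces the real separator, so no model is loaded or downloaded.
    fake = MagicMock()
    fake.separate.return_value = ["song_(Vocals).wav", "song_(Drums).wav"]

    result = run_separation(fake, "song.mp3")

    assert result == ["song_(Vocals).wav", "song_(Drums).wav"]
    fake.load_model.assert_called_once_with("htdemucs_ft.yaml")
    fake.separate.assert_called_once_with("song.mp3")
```

Asserting on the recorded calls checks that the wrapper wires arguments through correctly without exercising any inference.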
## License
MIT