---
title: StemSplitter
app_file: src/stemsplitter/web.py
sdk: gradio
sdk_version: 6.6.0
---
# StemSplitter

Audio stem separation tool that splits songs into individual components (vocals, drums, bass, and other instruments). Provides both a command-line interface and a Gradio web UI.

Powered by open-source models via [audio-separator](https://github.com/nomadkaraoke/python-audio-separator):

| Mode | Stems | Default Model |
|------|-------|---------------|
| 2-stem | Vocals, Instrumental | MelBand-RoFormer |
| 4-stem | Vocals, Drums, Bass, Other | Demucs htdemucs_ft |
## Prerequisites

- Python 3.10+
- [uv](https://docs.astral.sh/uv/getting-started/installation/) for dependency management
- FFmpeg (required by audio-separator for reading various audio formats)

Install FFmpeg if you don't have it:

```bash
# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg

# Windows (via chocolatey)
choco install ffmpeg
```
## Installation

```bash
git clone <repo-url>
cd StemSplitter

# Copy the example env file and adjust as needed
cp .env.example .env

# Install dependencies (CPU inference)
uv sync --extra dev

# Or, for GPU-accelerated inference (NVIDIA CUDA)
uv sync --extra dev --extra gpu
```

Models are downloaded automatically on first use (~200 MB for 2-stem, ~800 MB for 4-stem).
## Usage

### CLI

```bash
# Basic 2-stem separation (vocals + instrumental), outputs WAV
uv run stemsplitter song.mp3

# 4-stem separation with FLAC output
uv run stemsplitter song.wav -m 4stem -f FLAC

# MP3 output to a custom directory
uv run stemsplitter song.flac -m 2stem -f MP3 -o ./my_stems/

# Override the model
uv run stemsplitter song.mp3 --model htdemucs.yaml

# Show all options
uv run stemsplitter --help
```

**Supported input formats:** MP3, WAV, FLAC, OGG, M4A, and anything else FFmpeg can decode.

**Supported output formats:** WAV, MP3, FLAC (set via the `-f` flag or `STEMSPLITTER_OUTPUT_FORMAT` in `.env`).
### Web UI

```bash
uv run stemsplitter-web
```

Opens a Gradio interface (default: `http://127.0.0.1:7860`) where you can:

1. Upload an audio file
2. Choose separation mode (2-stem or 4-stem)
3. Choose output format (WAV, MP3, FLAC)
4. Click **Separate** and download individual stems

A public share link is also generated automatically.
## Project Structure

```
src/stemsplitter/
    __init__.py        # Package version
    config.py          # Settings loaded from .env with sensible defaults
    separator.py       # Core StemSplitter class wrapping audio-separator
    cli.py             # Click-based CLI entry point
    web.py             # Gradio web UI
tests/
    conftest.py        # Shared fixtures (mock separator, synthetic audio)
    test_config.py     # Configuration loading tests
    test_separator.py  # Core separation logic tests
    test_cli.py        # CLI invocation tests
    test_web.py        # Web UI handler tests
```
### Components

- **config.py** -- Loads settings from a `.env` file using `python-dotenv`. All values are exposed as a frozen `Settings` dataclass. See `.env.example` for the full list of options.
- **separator.py** -- Wraps `audio-separator` with a `StemSplitter` class that handles model selection per mode, lazy initialization (so imports are fast), and model caching (the model stays loaded between calls).
- **cli.py** -- A Click command that accepts an input file and flags for mode, format, output directory, and model override.
- **web.py** -- A Gradio Blocks app with audio upload, mode/format radio buttons, and per-stem audio outputs. The 4-stem outputs (drums, bass) are hidden in 2-stem mode.
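
The lazy initialization and model caching described for `separator.py` can be sketched roughly as follows. This is an illustration of the pattern only, not the project's actual code: `LazySplitter`, `FakeBackend`, and `DEFAULT_MODELS` are hypothetical names, and the backend is injected through a factory so the sketch runs without `audio-separator` installed.

```python
from typing import Any, Callable, List, Optional

# Hypothetical mapping that mirrors the default model names documented in this README.
DEFAULT_MODELS = {
    "2stem": "model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt",
    "4stem": "htdemucs_ft.yaml",
}


class LazySplitter:
    """Illustrative wrapper: creates the backend on first use and reuses a loaded model."""

    def __init__(self, backend_factory: Callable[[], Any]) -> None:
        self._backend_factory = backend_factory  # called at most once
        self._backend: Any = None                # created lazily in separate()
        self._loaded_model: Optional[str] = None  # model currently held in memory

    def separate(self, path: str, mode: str = "2stem",
                 model: Optional[str] = None) -> List[str]:
        model_name = model or DEFAULT_MODELS[mode]
        if self._backend is None:
            # Lazy initialization: the heavy backend is built only when needed.
            self._backend = self._backend_factory()
        if self._loaded_model != model_name:
            # Model caching: reload only when a different model is requested.
            self._backend.load_model(model_name)
            self._loaded_model = model_name
        return self._backend.separate(path)


class FakeBackend:
    """Stand-in backend (an assumption) that records loads instead of running inference."""

    def __init__(self) -> None:
        self.loads: List[str] = []

    def load_model(self, name: str) -> None:
        self.loads.append(name)

    def separate(self, path: str) -> List[str]:
        return [f"{path}.stem.wav"]
```

With this shape, two consecutive 2-stem calls trigger a single model load, and switching to 4-stem mode triggers exactly one more.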
## Configuration

All settings are configurable via environment variables in `.env`:

| Variable | Default | Description |
|----------|---------|-------------|
| `STEMSPLITTER_OUTPUT_DIR` | `./output` | Directory for separated stems |
| `STEMSPLITTER_MODEL_DIR` | `/tmp/audio-separator-models/` | Where downloaded models are cached |
| `STEMSPLITTER_2STEM_MODEL` | `model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt` | Model for 2-stem separation |
| `STEMSPLITTER_4STEM_MODEL` | `htdemucs_ft.yaml` | Model for 4-stem separation |
| `STEMSPLITTER_OUTPUT_FORMAT` | `WAV` | Default output format (WAV, MP3, FLAC) |
| `STEMSPLITTER_OUTPUT_BITRATE` | `320k` | Bitrate for MP3 output |
| `STEMSPLITTER_SAMPLE_RATE` | `44100` | Output sample rate (Hz) |
| `STEMSPLITTER_NORMALIZATION` | `0.9` | Peak normalization threshold |
| `STEMSPLITTER_LOG_LEVEL` | `WARNING` | Logging verbosity (DEBUG, INFO, WARNING, ERROR) |
| `STEMSPLITTER_WEB_HOST` | `127.0.0.1` | Web UI bind address |
| `STEMSPLITTER_WEB_PORT` | `7860` | Web UI port |
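
As a rough sketch of how a frozen settings object might be built from these variables, the snippet below reads a subset of them with the documented defaults. It uses only the standard library; in the real project, `python-dotenv`'s `load_dotenv()` would presumably populate `os.environ` from `.env` first. The field names and the `load_settings` helper are assumptions for illustration, not the project's actual API.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Illustrative subset of the settings table (field names are assumptions)."""
    output_dir: str
    output_format: str
    sample_rate: int
    normalization: float
    web_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping of environment variables, using documented defaults."""
    return Settings(
        output_dir=env.get("STEMSPLITTER_OUTPUT_DIR", "./output"),
        output_format=env.get("STEMSPLITTER_OUTPUT_FORMAT", "WAV"),
        sample_rate=int(env.get("STEMSPLITTER_SAMPLE_RATE", "44100")),
        normalization=float(env.get("STEMSPLITTER_NORMALIZATION", "0.9")),
        web_port=int(env.get("STEMSPLITTER_WEB_PORT", "7860")),
    )
```

Because the dataclass is frozen, assigning to a field after construction raises an error, which keeps configuration read-only once loaded.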
## Running Tests

```bash
# Run all tests
uv run pytest

# Verbose output
uv run pytest -v

# With coverage report
uv run pytest -v --cov=stemsplitter --cov-report=term-missing

# Run a specific test file
uv run pytest tests/test_separator.py
```

Tests use mocked models, so no GPU or model downloads are required.
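
A test in this style might look like the sketch below, which replaces the separation backend with a `unittest.mock.MagicMock` so nothing is downloaded or run on a GPU. The `run_separation` helper and the output file names are hypothetical, chosen only to show the mocking pattern; the project's real fixtures live in `tests/conftest.py`.

```python
from unittest.mock import MagicMock

# Default 2-stem model name, as listed in the configuration table.
TWO_STEM_MODEL = "model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt"


def run_separation(backend, path, model_name):
    """Hypothetical function under test: load a model, then separate one file."""
    backend.load_model(model_name)
    return backend.separate(path)


def test_run_separation_uses_mocked_backend():
    # The mock stands in for the real backend; no model is loaded.
    backend = MagicMock()
    backend.separate.return_value = ["song_vocals.wav", "song_instrumental.wav"]

    result = run_separation(backend, "song.mp3", TWO_STEM_MODEL)

    # Verify the interaction with the backend rather than any audio output.
    backend.load_model.assert_called_once_with(TWO_STEM_MODEL)
    backend.separate.assert_called_once_with("song.mp3")
    assert result == ["song_vocals.wav", "song_instrumental.wav"]
```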
## License

MIT