BS-RoFormer: Vocal Extraction
Unofficial HuggingFace packaging of BS-RoFormer (Band-Split RoFormer), using the ViperX checkpoint. Adds a
from_pretrained interface and native Apple Silicon (MPS) support.
Extracts vocals from a music track. Outputs exactly 2 stems:
| Stem | Description |
|---|---|
| vocals | Isolated vocal track |
| other | Everything else (mix - vocals) |
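The "mix - vocals" relation in the table can be illustrated with a small sketch. It assumes the residual is a plain sample-wise subtraction on arrays of equal shape; the packaged model may compute the other stem differently, and `residual_stem` is a hypothetical helper, not part of this repository.

```python
import numpy as np


def residual_stem(mix: np.ndarray, vocals: np.ndarray) -> np.ndarray:
    """Return the 'other' stem as the sample-wise difference mix - vocals.

    Hypothetical helper for illustration only; it assumes both arrays
    share the same shape (same channel count and number of samples).
    """
    if mix.shape != vocals.shape:
        raise ValueError(f"shape mismatch: {mix.shape} vs {vocals.shape}")
    return mix - vocals


# Toy three-sample signal standing in for real audio.
mix = np.array([0.5, -0.2, 0.1])
vocals = np.array([0.2, -0.1, 0.0])
other = residual_stem(mix, vocals)

# Adding the two stems back together recovers the mix (up to float error).
reconstructed = vocals + other
```

Under this assumption the two stems always sum back to the input mix, which is a quick sanity check to run on real outputs.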
Quick start
pip install transformers torch librosa soundfile \
einops beartype rotary-embedding-torch
from transformers import AutoModel
model = AutoModel.from_pretrained("puar-playground/bs-roformer", trust_remote_code=True)
stems = model.separate("song.wav", output_dir="output/")
# output/vocals.wav → isolated vocals
# output/other.wav → instrumental
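To process several tracks, the same call can be wrapped in a loop. The sketch below is one possible approach, not part of the repository: `separate_folder` is a hypothetical helper, and it relies only on the `separate(path, output_dir=...)` call and the `vocals.wav` / `other.wav` file names shown above. It does not use the return value of `separate`, since its type is not documented here.

```python
from pathlib import Path


def separate_folder(model, input_dir, output_root, pattern="*.wav"):
    """Run model.separate on every matching file, one output folder per track.

    Hypothetical helper: `model` is any object exposing
    separate(path, output_dir=...), as in the quick start.
    Returns a dict mapping each input file name to its expected stem paths.
    """
    output_root = Path(output_root)
    results = {}
    for track in sorted(Path(input_dir).glob(pattern)):
        # Give each track its own folder so stems are not overwritten.
        track_out = output_root / track.stem
        track_out.mkdir(parents=True, exist_ok=True)
        model.separate(str(track), output_dir=str(track_out))
        results[track.name] = {
            "vocals": track_out / "vocals.wav",
            "other": track_out / "other.wav",
        }
    return results
```

With the model loaded as above, `separate_folder(model, "songs/", "output/")` would write `output/<track name>/vocals.wav` and `output/<track name>/other.wav` for each input file, assuming the stem file names match the quick start.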
Device selection
# Auto-detect: CUDA → MPS (Apple Silicon) → CPU
model = AutoModel.from_pretrained("puar-playground/bs-roformer", trust_remote_code=True)
# Explicit
model = AutoModel.from_pretrained("puar-playground/bs-roformer", trust_remote_code=True,
device="mps") # or "cuda", "cpu"
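The auto-detect order described above can be reproduced outside the model, for example to log which device will be used. This is a minimal sketch of the CUDA, then MPS, then CPU fallback, not the repository's actual implementation; `pick_device` and `detect_device` are hypothetical names.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Apply the priority order CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends and pick one.

    Falls back to "cpu" if PyTorch is missing or lacks the MPS backend.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    # torch.backends.mps is absent in older PyTorch builds.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)
```

The returned string can be passed as the `device` argument shown above, for example `device=detect_device()`.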
Repository layout
bs-roformer/
├── modeling_bs_roformer.py   # BSRoFormerConfig + BSRoFormerForSourceSeparation
├── config.json
├── requirements.txt
├── bs_roformer.yaml          # model architecture config
├── bs_roformer.ckpt          # model weights (Git LFS)
└── bs_roformer_src/          # vendored BS-RoFormer source
Credits
- BS-RoFormer architecture: lucidrains/BS-RoFormer
- Checkpoint: BS-RoFormer (ViperX edition) from ZFTurbo/Music-Source-Separation-Training
- Paper: Music Source Separation with Band-Split RoPE Transformer (Lu et al., 2024)