---
license: mit
tags:
- audio
- music
- source-separation
- stem-separation
- roformer
- safetensors
- maestraea
pipeline_tag: audio-to-audio
---
# RoFormer Stem Separation Models (Safetensors)
**BS-RoFormer & MelBand RoFormer — State-of-the-art music source separation**
> Pretrained weights converted to safetensors format for use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea).
## Models
### BS-RoFormer (Band-Split RoPE Transformer)
| Variant | SDR | Task | Path |
|---------|-----|------|------|
| Vocals (viperx) | 12.97 | Vocal/instrumental separation | `bs_roformer/vocals_viperx/` |
| Multi-stem | 9.65 | 4-stem (bass/drums/vocals/other) | `bs_roformer/multistem/` |
### MelBand RoFormer (Mel-Band RoPE Transformer)
| Variant | SDR | Task | Path |
|---------|-----|------|------|
| Vocals (KimberleyJensen) | 10.98 | Best vocal isolation | `mel_band_roformer/vocals_kj/` |
| Vocals (viperx) | 11.43 | Vocal/instrumental separation | `mel_band_roformer/vocals_viperx/` |
| Dereverb (anvuew) | 19.17 | Remove reverb from audio | `mel_band_roformer/dereverb/` |
| Denoise (aufr33) | 27.99 | Remove noise from audio | `mel_band_roformer/denoise/` |
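
The `Path` column above can be turned into a small lookup. The following is a minimal sketch, assuming the repository has been downloaded to a local folder. The dictionary keys and the `model_files` helper are hypothetical names chosen for this example; the sub-directories and the two file names are the ones listed in this README.

```python
from pathlib import Path

# Sub-directories copied from the tables above. The keys are
# hypothetical labels invented for this example.
MODEL_DIRS = {
    "bs_vocals_viperx": "bs_roformer/vocals_viperx",
    "bs_multistem": "bs_roformer/multistem",
    "mel_vocals_kj": "mel_band_roformer/vocals_kj",
    "mel_vocals_viperx": "mel_band_roformer/vocals_viperx",
    "mel_dereverb": "mel_band_roformer/dereverb",
    "mel_denoise": "mel_band_roformer/denoise",
}


def model_files(repo_root, variant):
    """Return (weights_path, config_path) for one variant."""
    if variant not in MODEL_DIRS:
        raise KeyError(
            f"unknown variant {variant!r}; choose from {sorted(MODEL_DIRS)}"
        )
    model_dir = Path(repo_root) / MODEL_DIRS[variant]
    return model_dir / "model.safetensors", model_dir / "config.yaml"
```

For example, `model_files("checkpoints", "mel_denoise")` returns the paths to `model.safetensors` and `config.yaml` under `checkpoints/mel_band_roformer/denoise/`.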
## Architecture
Both model families use the RoPE Transformer (RoFormer) implementations from [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer). They differ in how the spectrogram is split into frequency bands:
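- **BS-RoFormer**: Splits the spectrogram into non-overlapping subbands of predefined widths (narrower at low frequencies, wider at high frequencies)
- **MelBand RoFormer**: Splits the spectrogram into overlapping bands spaced on the mel scale (perceptually motivated)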
Both significantly outperform HTDemucs on vocal separation tasks.
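
The two band-splitting ideas can be illustrated without any audio library. This is a rough sketch, not the code used by the models: `split_bins` assigns contiguous, non-overlapping bin ranges from a list of band widths, and `mel_band_edges` spaces band edges evenly on the mel scale (using the common HTK-style formula) so that neighbouring bands overlap. The function names, band counts, widths, and the 44.1 kHz sample rate are all assumptions made for illustration; the real values come from each model's `config.yaml`.

```python
import math


def split_bins(widths):
    """Turn per-band bin counts into non-overlapping (start, stop) bin ranges."""
    ranges, start = [], 0
    for width in widths:
        ranges.append((start, start + width))
        start += width
    return ranges


def hz_to_mel(freq_hz):
    """HTK-style Hz to mel conversion."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(num_bands, sample_rate=44100):
    """Return (low_hz, high_hz) per band; each band spans two mel steps,
    so it overlaps its neighbours, as triangular mel filters do."""
    top = hz_to_mel(sample_rate / 2)
    points = [mel_to_hz(top * i / (num_bands + 1)) for i in range(num_bands + 2)]
    return [(points[i], points[i + 2]) for i in range(num_bands)]


if __name__ == "__main__":
    # Made-up widths, only to show the shape of the output.
    print(split_bins([2, 2, 4, 8]))
    for low, high in mel_band_edges(4):
        print(f"{low:8.1f} Hz .. {high:8.1f} Hz")
```

In the mel-spaced version the bands get wider in Hz as frequency increases, which gives finer resolution at low frequencies.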
## Usage
Each model directory contains:
- `model.safetensors` — Model weights
- `config.yaml` — Architecture configuration (required for model instantiation)
Loading the models requires the `bs-roformer` Python package: `pip install bs-roformer`
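
Before loading a checkpoint, it can be useful to check what a `model.safetensors` file contains. The sketch below reads only the file header, using the standard library: a safetensors file starts with an 8-byte little-endian header length followed by a JSON header describing each tensor. The helper names `read_safetensors_header` and `summarize` are hypothetical and are not part of any package mentioned here.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # Optional free-form metadata is stored under a reserved key; drop it.
    header.pop("__metadata__", None)
    return header


def summarize(path):
    """Map each tensor name to its (dtype, shape)."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in read_safetensors_header(path).items()
    }
```

To load the tensors themselves, `safetensors.torch.load_file` returns a dictionary of tensors suitable for `load_state_dict`. How the keys in `config.yaml` map onto the model constructor depends on the files in this repository and is not shown here.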
## Credits
- **Architecture**: [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer)
- **Training framework**: [ZFTurbo/Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training)
- **BS-RoFormer vocals**: [viperx](https://github.com/playdasegunda) via [TRvlvr](https://github.com/TRvlvr/model_repo)
- **MelBand vocals**: [KimberleyJensen](https://github.com/KimberleyJensen), [viperx](https://github.com/playdasegunda)
- **MelBand dereverb**: [anvuew](https://github.com/anvuew)
- **MelBand denoise**: [aufr33](https://github.com/aufr33)
- **Conversion & Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)
## License
MIT — same as all upstream model releases.