---
license: mit
tags:
  - audio
  - music
  - source-separation
  - stem-separation
  - roformer
  - safetensors
  - maestraea
pipeline_tag: audio-to-audio
---

# RoFormer Stem Separation Models (Safetensors)

**BS-RoFormer & MelBand RoFormer — State-of-the-art music source separation**

> Pretrained weights converted to safetensors format for use with [Mæstræa AI Workstation](https://github.com/AEmotionStudio/Maestraea).

## Models

### BS-RoFormer (Band-Split RoPE Transformer)

| Variant | SDR | Task | Path |
|---------|-----|------|------|
| Vocals (viperx) | 12.97 | Vocal/instrumental separation | `bs_roformer/vocals_viperx/` |
| Multi-stem | 9.65 | 4-stem (bass/drums/vocals/other) | `bs_roformer/multistem/` |

### MelBand RoFormer (Mel-Band RoPE Transformer)

| Variant | SDR | Task | Path |
|---------|-----|------|------|
| Vocals (KimberleyJensen) | 10.98 | Best vocal isolation | `mel_band_roformer/vocals_kj/` |
| Vocals (viperx) | 11.43 | Vocal/instrumental separation | `mel_band_roformer/vocals_viperx/` |
| Dereverb (anvuew) | 19.17 | Remove reverb from audio | `mel_band_roformer/dereverb/` |
| Denoise (aufr33) | 27.99 | Remove noise from audio | `mel_band_roformer/denoise/` |
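
The SDR column in both tables is a signal-to-distortion ratio in dB, where higher is better. The exact evaluation set and SDR variant behind each number are not stated here, so the following is only a minimal sketch of the basic definition, 10·log10(signal power / error power), using a hypothetical helper named `sdr_db`:

```python
import math

def sdr_db(reference, estimate, eps=1e-8):
    """Signal-to-distortion ratio in dB for two equal-length sample sequences.

    Illustrative helper only; the scores in the tables may have been computed
    with a different SDR variant or averaging scheme.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_power = sum(r * r for r in reference)
    error_power = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps avoids division by zero (perfect estimate) and log of zero (silent reference)
    return 10.0 * math.log10((signal_power + eps) / (error_power + eps))

reference = [1.0, -1.0, 1.0, -1.0]
estimate = [0.9 * r for r in reference]  # 10% amplitude error on every sample
print(round(sdr_db(reference, estimate), 2))
```

Because the scale is logarithmic, each additional 10 dB corresponds to a tenfold drop in error power relative to the target signal.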

## Architecture

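Both model families are RoPE Transformer architectures implemented in [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer). They differ in how the spectrogram is split into bands:

- **BS-RoFormer**: Splits the spectrogram into non-overlapping subbands of predefined widths (narrower at low frequencies, wider at high frequencies)
- **MelBand RoFormer**: Splits the spectrogram into overlapping bands spaced on the mel scale (perceptually weighted)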

Both architectures are reported to outperform HTDemucs on vocal separation.
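
To make the difference between the two band-splitting schemes concrete, here is a small sketch in plain Python. It is an illustration, not the package's implementation: the function names, the example band widths, and the HTK-style mel formula are all assumptions chosen for clarity, and the real models take their band layout from `config.yaml`.

```python
import math

def fixed_bands(widths):
    """Turn band widths (in frequency bins) into non-overlapping (start, end) ranges."""
    bands, start = [], 0
    for width in widths:
        bands.append((start, start + width))
        start += width
    return bands

def hz_to_mel(f):
    # HTK-style mel formula (an assumption; other mel variants exist)
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_bands(num_bands, num_bins, sample_rate):
    """Overlapping (start, end) bin ranges with edges evenly spaced on the mel scale."""
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # num_bands + 2 edges: band i spans edge i to edge i + 2, so neighbours overlap
    edges_hz = [mel_to_hz(max_mel * i / (num_bands + 1)) for i in range(num_bands + 2)]
    edges_bin = [round(hz / nyquist * (num_bins - 1)) for hz in edges_hz]
    return [(edges_bin[i], min(edges_bin[i + 2] + 1, num_bins)) for i in range(num_bands)]

# Hypothetical widths: narrow bands at low frequencies, wider ones higher up
print(fixed_bands([2, 2, 4, 8]))
# Eight mel-spaced bands over a 1025-bin spectrogram at 44.1 kHz
print(mel_bands(8, 1025, 44100))
```

In the first scheme every frequency bin belongs to exactly one band. In the second, neighbouring bands share bins and grow wider toward high frequencies, which mirrors the coarser frequency resolution of human hearing there.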

## Usage

Each model directory contains:
- `model.safetensors` — Model weights
- `config.yaml` — Architecture configuration (required for model instantiation)

Requires the `bs-roformer` Python package: `pip install bs-roformer`
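
Before instantiating a model, it can help to check that a downloaded `model.safetensors` file contains the tensors you expect. The sketch below uses only the standard library and relies on the safetensors file layout (an 8-byte little-endian header length followed by a JSON header). The helper names `read_safetensors_header` and `summarize` are hypothetical, and the sketch does not load weights or build a model.

```python
import json
import struct

def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # 8-byte little-endian length
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header

def summarize(path):
    """Map each tensor name to its (dtype, shape)."""
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in read_safetensors_header(path).items()
    }
```

For example, assuming the repository has been downloaded locally, `summarize("bs_roformer/vocals_viperx/model.safetensors")` would return a dictionary of tensor names with their dtypes and shapes, which can then be compared against the dimensions declared in the matching `config.yaml`.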

## Credits

- **Architecture**: [lucidrains/BS-RoFormer](https://github.com/lucidrains/BS-RoFormer)
- **Training framework**: [ZFTurbo/Music-Source-Separation-Training](https://github.com/ZFTurbo/Music-Source-Separation-Training)
- **BS-RoFormer vocals**: [viperx](https://github.com/playdasegunda) via [TRvlvr](https://github.com/TRvlvr/model_repo)
- **MelBand vocals**: [KimberleyJensen](https://github.com/KimberleyJensen), [viperx](https://github.com/playdasegunda)
- **MelBand dereverb**: [anvuew](https://github.com/anvuew)
- **MelBand denoise**: [aufr33](https://github.com/aufr33)
- **Conversion & Mirror by**: [AEmotionStudio](https://huggingface.co/AEmotionStudio)

## License

MIT — same as all upstream model releases.