---
license: mit
library_name: mlx
tags:
- mlx
- audio
- music-source-separation
- source-separation
- demucs
- htdemucs
- hdemucs
- apple-silicon
base_model: adefossez/demucs
pipeline_tag: audio-to-audio
---
> Originally from: [iky1e/demucs-mlx](https://huggingface.co/iky1e/demucs-mlx)
>
> Float16 variant: [mlx-community/demucs-mlx-fp16](https://huggingface.co/mlx-community/demucs-mlx-fp16)
# Demucs – MLX
MLX-compatible weights for all 8 pretrained [Demucs](https://github.com/adefossez/demucs) models, converted to `safetensors` format for inference on Apple Silicon.
Demucs is a music source separation model that splits audio into stems: `drums`, `bass`, `other`, and `vocals` (plus `guitar` and `piano` for the 6-source model `htdemucs_6s`).
## Models
| Model | What it is | Architecture | Sub-models | Sources | Weights | Tensors |
|-------|-----------|-------------|-----------|---------|---------|---------|
| `htdemucs` | Default v4 model, best speed/quality balance | HTDemucs (v4) | 1 | 4 | 160 MB | 573 |
| `htdemucs_ft` | Fine-tuned v4, best overall quality | HTDemucs (v4) | 4 (fine-tuned) | 4 | 641 MB | 2292 |
| `htdemucs_6s` | 6-source v4 (adds guitar + piano stems) | HTDemucs (v4) | 1 | 6 | 105 MB | 565 |
| `hdemucs_mmi` | v3 hybrid, trained on more data | HDemucs (v3) | 1 | 4 | 319 MB | 379 |
| `mdx` | v3 bag-of-models ensemble | Demucs + HDemucs | 4 (bag) | 4 | 1.3 GB | 1298 |
| `mdx_extra` | v3 ensemble trained on extra data | HDemucs | 4 (bag) | 4 | 1.2 GB | 1516 |
| `mdx_q` | Quantized v3 ensemble (same quality, smaller) | Demucs + HDemucs | 4 (bag) | 4 | 1.3 GB | 1298 |
| `mdx_extra_q` | Quantized v3 extra ensemble | HDemucs | 4 (bag) | 4 | 1.2 GB | 1516 |
All models output stereo audio at 44.1 kHz.
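The bag-of-models entries (`mdx`, `mdx_extra`, and their `_q` variants) are ensembles: each sub-model separates the input independently and the per-stem outputs are combined. A minimal sketch of that combining step, using plain Python lists in place of real audio tensors; the real ensemble may weight sub-models per stem, while this sketch uses a uniform mean, and the function name is illustrative, not part of any repo API:

```python
# Bag-of-models combination, sketched as a uniform mean: each sub-model
# returns one waveform per stem, and the final output averages them
# element-wise across sub-models. Plain nested lists stand in for real
# (stems, samples) audio tensors.

def average_bag(outputs):
    """outputs[m][s] is sub-model m's waveform (list of samples) for stem s."""
    n_models = len(outputs)
    n_stems = len(outputs[0])
    averaged = []
    for s in range(n_stems):
        n_samples = len(outputs[0][s])
        stem = [
            sum(outputs[m][s][i] for m in range(n_models)) / n_models
            for i in range(n_samples)
        ]
        averaged.append(stem)
    return averaged

# Two hypothetical sub-models, two stems, three samples each:
bag = [
    [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]],  # sub-model 0
    [[3.0, 2.0, 1.0], [2.0, 2.0, 2.0]],  # sub-model 1
]
print(average_bag(bag))  # [[2.0, 2.0, 2.0], [1.0, 1.0, 1.0]]
```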
## Origin
- Original model/repo: [adefossez/demucs](https://github.com/adefossez/demucs)
- License: MIT (same as original Demucs)
- Conversion path: PyTorch checkpoints → safetensors + JSON config (direct, no intermediary)
- Swift MLX port: [iky1e/demucs-mlx-swift](https://github.com/iky1e/demucs-mlx-swift)
No fine-tuning or additional quantization was applied during conversion: these are direct exports of the original pretrained weights (the `mdx_q` and `mdx_extra_q` checkpoints were quantized upstream and are stored here as float32).
## Files
Each model consists of two files at the repo root:
- `{model_name}.safetensors` – model weights (float32)
- `{model_name}_config.json` – model class, architecture config, and bag-of-models metadata
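For reference, the safetensors layout is simple enough to inspect without any ML framework: the file begins with an 8-byte little-endian header length, followed by a JSON header mapping tensor names to dtype, shape, and byte offsets into the data area. A stdlib-only sketch that writes and reads back a minimal file of this shape; the tensor name and values are made up for illustration and do not come from the real checkpoints:

```python
import json
import struct

# Build a minimal safetensors blob in memory: one fake float32 tensor, shape (2, 2).
data = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)           # 16 bytes of raw tensor data
header = {
    "encoder.0.weight": {                               # illustrative tensor name
        "dtype": "F32",
        "shape": [2, 2],
        "data_offsets": [0, len(data)],                 # byte range within the data area
    },
}
header_bytes = json.dumps(header).encode("utf-8")
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + data

# Read it back the way a loader would: header length, then JSON, then raw bytes.
(n,) = struct.unpack_from("<Q", blob, 0)
parsed = json.loads(blob[8 : 8 + n])
for name, info in parsed.items():
    start, end = info["data_offsets"]
    values = struct.unpack(f"<{(end - start) // 4}f", blob[8 + n + start : 8 + n + end])
    print(name, info["shape"], values)  # encoder.0.weight [2, 2] (1.0, 2.0, 3.0, 4.0)
```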
## Usage
### Swift (demucs-mlx-swift)
Models are downloaded automatically from this repo. No manual setup required.
```bash
# Separate a song into stems with the default model
demucs-mlx-swift -n htdemucs song.wav
# Use a different model
demucs-mlx-swift -n htdemucs_ft song.wav
# Two-stem mode (vocals + instrumental)
demucs-mlx-swift -n htdemucs --two-stems vocals song.wav
```
Or use the Swift API directly:
```swift
import DemucsMLX
let separator = try DemucsSeparator(modelName: "htdemucs")
let result = try separator.separate(fileAt: URL(fileURLWithPath: "song.wav"))
```
### Python (demucs-mlx)
```bash
pip install demucs-mlx
demucs-mlx -n htdemucs song.wav
```
## Converting from PyTorch
To reproduce the export directly from PyTorch Demucs checkpoints:
```bash
pip install demucs safetensors numpy
# Export all 8 models
python export_from_pytorch.py --out-dir ./output
# Export specific models
python export_from_pytorch.py --models htdemucs htdemucs_ft --out-dir ./output
```
The conversion script (`export_from_pytorch.py`) is available in the [demucs-mlx-swift](https://github.com/iky1e/demucs-mlx-swift) repo under `scripts/`.
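The core of such an export is splitting each loaded checkpoint into the two files described above: a flat `{name: tensor}` mapping for the `.safetensors` file and a JSON-serializable config for `{model_name}_config.json`. A framework-free sketch of that split, where the checkpoint field names (`klass`, `kwargs`, `state`) and values are stand-ins for whatever the real checkpoints contain, and a plain list stands in for a tensor; the real script also handles dtype conversion and bag-of-models metadata:

```python
import json

# A toy stand-in for a loaded Demucs checkpoint: model class name,
# hyperparameters, and a flat {name: tensor} state dict.
checkpoint = {
    "klass": "HTDemucs",                                  # illustrative field names
    "kwargs": {"sources": ["drums", "bass", "other", "vocals"]},
    "state": {"encoder.0.weight": [0.1, 0.2]},
}

def split_checkpoint(ckpt):
    """Separate what goes into {model}.safetensors from {model}_config.json."""
    weights = ckpt["state"]
    config = {"model_class": ckpt["klass"], "config": ckpt["kwargs"]}
    return weights, config

weights, config = split_checkpoint(checkpoint)
print(json.dumps(config, indent=2))
```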
## Citation
```bibtex
@inproceedings{rouard2022hybrid,
  title={Hybrid Transformers for Music Source Separation},
  author={Rouard, Simon and Massa, Francisco and Defossez, Alexandre},
  booktitle={ICASSP 23},
  year={2023}
}
@inproceedings{defossez2021hybrid,
  title={Hybrid Spectrogram and Waveform Source Separation},
  author={Defossez, Alexandre},
  booktitle={Proceedings of the ISMIR 2021 Workshop on Music Source Separation},
  year={2021}
}
```