---
license: mit
library_name: mlx
tags:
- mlx
- audio
- music-source-separation
- source-separation
- demucs
- htdemucs
- hdemucs
- apple-silicon
base_model: adefossez/demucs
pipeline_tag: audio-to-audio
---

> Originally from: [iky1e/demucs-mlx](https://huggingface.co/iky1e/demucs-mlx)
>
> Float16 variant: [mlx-community/demucs-mlx-fp16](https://huggingface.co/mlx-community/demucs-mlx-fp16)

# Demucs — MLX

MLX-compatible weights for all 8 pretrained [Demucs](https://github.com/adefossez/demucs) models, converted to `safetensors` format for inference on Apple Silicon.

Demucs is a music source separation model that splits audio into stems: `drums`, `bass`, `other`, `vocals` (plus `guitar` and `piano` for the 6-source model).

## Models

| Model | What it is | Architecture | Sub-models | Sources | Weights | Tensors |
|-------|------------|--------------|------------|---------|---------|---------|
| `htdemucs` | Default v4 model, best speed/quality balance | HTDemucs (v4) | 1 | 4 | 160 MB | 573 |
| `htdemucs_ft` | Fine-tuned v4, best overall quality | HTDemucs (v4) | 4 (fine-tuned) | 4 | 641 MB | 2292 |
| `htdemucs_6s` | 6-source v4 (adds guitar + piano stems) | HTDemucs (v4) | 1 | 6 | 105 MB | 565 |
| `hdemucs_mmi` | v3 hybrid, trained on more data | HDemucs (v3) | 1 | 4 | 319 MB | 379 |
| `mdx` | v3 bag-of-models ensemble | Demucs + HDemucs | 4 (bag) | 4 | 1.3 GB | 1298 |
| `mdx_extra` | v3 ensemble trained on extra data | HDemucs | 4 (bag) | 4 | 1.2 GB | 1516 |
| `mdx_q` | Quantized v3 ensemble (same quality, smaller) | Demucs + HDemucs | 4 (bag) | 4 | 1.3 GB | 1298 |
| `mdx_extra_q` | Quantized v3 extra ensemble | HDemucs | 4 (bag) | 4 | 1.2 GB | 1516 |

All models output stereo audio at 44.1 kHz.
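The stem sets in the table can be summarized in a small lookup: seven of the eight models emit the standard four stems, and `htdemucs_6s` adds guitar and piano. A minimal Python sketch (the `stems_for` helper is hypothetical, for illustration only; the stem names are from the table above):

```python
# Stems produced by each pretrained Demucs model (per the table above).
FOUR_STEMS = ["drums", "bass", "other", "vocals"]

MODEL_STEMS = {name: FOUR_STEMS for name in [
    "htdemucs", "htdemucs_ft", "hdemucs_mmi",
    "mdx", "mdx_extra", "mdx_q", "mdx_extra_q",
]}
# Only the 6-source model adds the two extra stems.
MODEL_STEMS["htdemucs_6s"] = FOUR_STEMS + ["guitar", "piano"]

def stems_for(model_name: str) -> list[str]:
    """Return the ordered stem names a given model produces."""
    return MODEL_STEMS[model_name]

print(stems_for("htdemucs_6s"))
# → ['drums', 'bass', 'other', 'vocals', 'guitar', 'piano']
```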
## Origin

- Original model/repo: [adefossez/demucs](https://github.com/adefossez/demucs)
- License: MIT (same as original Demucs)
- Conversion path: PyTorch checkpoints → safetensors + JSON config (direct, no intermediary)
- Swift MLX port: [iky1e/demucs-mlx-swift](https://github.com/iky1e/demucs-mlx-swift)

No fine-tuning or quantization was applied — these are direct conversions of the original pretrained weights.

## Files

Each model consists of two files at the repo root:

- `{model_name}.safetensors` — model weights (float32)
- `{model_name}_config.json` — model class, architecture config, and bag-of-models metadata

## Usage

### Swift (demucs-mlx-swift)

Models are downloaded automatically from this repo. No manual setup required.

```bash
# Separate a song into stems
demucs-mlx-swift -n htdemucs song.wav

# Use a specific model
demucs-mlx-swift -n htdemucs_ft song.wav

# Two-stem mode (vocals + instrumental)
demucs-mlx-swift -n htdemucs --two-stems vocals song.wav
```

Or use the Swift API directly:

```swift
import DemucsMLX

let separator = try DemucsSeparator(modelName: "htdemucs")
let result = try separator.separate(fileAt: URL(fileURLWithPath: "song.wav"))
```

### Python (demucs-mlx)

```bash
pip install demucs-mlx
demucs-mlx -n htdemucs song.wav
```

## Converting from PyTorch

To reproduce the export directly from PyTorch Demucs checkpoints:

```bash
pip install demucs safetensors numpy

# Export all 8 models
python export_from_pytorch.py --out-dir ./output

# Export specific models
python export_from_pytorch.py --models htdemucs htdemucs_ft --out-dir ./output
```

The conversion script (`export_from_pytorch.py`) is available in the [demucs-mlx-swift](https://github.com/iky1e/demucs-mlx-swift) repo under `scripts/`.
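The `.safetensors` container used for the exported weights is a deliberately simple format: an 8-byte little-endian header length, a JSON header mapping tensor names to dtype, shape, and byte offsets, then the raw tensor data. A stdlib-only sketch that writes and reads a toy file (the tensor name here is hypothetical, not a real Demucs key; real exports are produced with the `safetensors` library):

```python
import json
import struct

def save_toy_safetensors(path, tensors):
    """Write a minimal safetensors file of float32 tensors given as
    (shape, flat value list) pairs. Sketch of the format only."""
    header, data = {}, b""
    for name, (shape, values) in tensors.items():
        raw = struct.pack(f"<{len(values)}f", *values)
        header[name] = {
            "dtype": "F32",
            "shape": shape,
            "data_offsets": [len(data), len(data) + len(raw)],
        }
        data += raw
    hdr = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(hdr)))  # 8-byte header length
        f.write(hdr)                          # JSON header
        f.write(data)                         # raw tensor bytes

def load_toy_safetensors(path):
    """Read the file back into {name: flat value list}."""
    with open(path, "rb") as f:
        (n,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(n))
        data = f.read()
    out = {}
    for name, meta in header.items():
        start, end = meta["data_offsets"]
        out[name] = list(struct.unpack(f"<{(end - start) // 4}f", data[start:end]))
    return out

# Hypothetical tensor name, for illustration only.
save_toy_safetensors("toy.safetensors",
                     {"encoder.0.conv.weight": ([2, 2], [1.0, 2.0, 3.0, 4.0])})
print(load_toy_safetensors("toy.safetensors")["encoder.0.conv.weight"])
# → [1.0, 2.0, 3.0, 4.0]
```

In practice you would load the exported weights with the `safetensors` package rather than hand-parsing, but the sketch shows why the format is convenient for conversion: the header alone tells you every tensor's name, dtype, and shape without reading the weights.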
## Citation

```bibtex
@inproceedings{rouard2022hybrid,
  title={Hybrid Transformers for Music Source Separation},
  author={Rouard, Simon and Massa, Francisco and Defossez, Alexandre},
  booktitle={ICASSP 23},
  year={2023}
}

@inproceedings{defossez2021hybrid,
  title={Hybrid Spectrogram and Waveform Source Separation},
  author={Defossez, Alexandre},
  booktitle={Proceedings of the ISMIR 2021 Workshop on Music Source Separation},
  year={2021}
}
```