Kyle Howells committed on
Commit c4137e6 · 1 Parent(s): 0ef541a

Update README: add model descriptions, update conversion path to direct PyTorch export

Files changed (1)
  1. README.md +22 -32
README.md CHANGED
@@ -22,16 +22,16 @@ Demucs is a music source separation model that splits audio into stems: `drums`,

  ## Models

- | Model | Architecture | Sub-models | Sources | Weights | Tensors |
- |-------|-------------|-----------|---------|---------|---------|
- | `htdemucs` | HTDemucs (v4) | 1 | 4 | 160 MB | 573 |
- | `htdemucs_ft` | HTDemucs (v4) | 4 (fine-tuned) | 4 | 641 MB | 2292 |
- | `htdemucs_6s` | HTDemucs (v4) | 1 | 6 | 105 MB | 565 |
- | `hdemucs_mmi` | HDemucs (v3) | 1 | 4 | 319 MB | 379 |
- | `mdx` | Demucs + HDemucs | 4 (bag) | 4 | 1.3 GB | 1298 |
- | `mdx_extra` | HDemucs | 4 (bag) | 4 | 1.2 GB | 1516 |
- | `mdx_q` | Demucs + HDemucs | 4 (bag) | 4 | 1.3 GB | 1298 |
- | `mdx_extra_q` | HDemucs | 4 (bag) | 4 | 1.2 GB | 1516 |

  All models output stereo audio at 44.1 kHz.

@@ -39,8 +39,7 @@ All models output stereo audio at 44.1 kHz.

  - Original model/repo: [adefossez/demucs](https://github.com/adefossez/demucs)
  - License: MIT (same as original Demucs)
- - Conversion path: PyTorch checkpoints → `demucs-mlx` pickle → safetensors + JSON config
- - MLX Python port: [ssmall256/demucs-mlx](https://github.com/ssmall256/demucs-mlx)
  - Swift MLX port: [iky1e/demucs-mlx-swift](https://github.com/iky1e/demucs-mlx-swift)

  No fine-tuning or quantization was applied; these are direct conversions of the original pretrained weights.
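
For readers unfamiliar with the conversion target, here is a minimal, standard-library-only sketch of what a "safetensors + JSON config" pair looks like on disk. It is not the export script from this commit: the helper name `write_model_files`, the tensor names, and the config keys are hypothetical, and a real export would presumably use the `safetensors` package rather than hand-packing bytes.

```python
import json
import struct
from pathlib import Path


def write_model_files(out_dir, model_name, tensors, config):
    """Write `{model_name}.safetensors` and `{model_name}_config.json`.

    `tensors` maps a tensor name to (shape, flat list of floats).
    Hypothetical helper, for illustration only.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    header = {}
    payload = bytearray()
    for name, (shape, values) in tensors.items():
        count = 1
        for dim in shape:
            count *= dim
        if count != len(values):
            raise ValueError(f"{name}: shape {shape} does not match {len(values)} values")
        begin = len(payload)
        # Store values as little-endian float32.
        payload += struct.pack(f"<{len(values)}f", *values)
        header[name] = {
            "dtype": "F32",
            "shape": list(shape),
            "data_offsets": [begin, len(payload)],
        }

    header_bytes = json.dumps(header, separators=(",", ":")).encode("utf-8")
    weights_path = out_dir / f"{model_name}.safetensors"
    with open(weights_path, "wb") as f:
        # safetensors layout: 8-byte little-endian header length,
        # then the JSON header, then the raw tensor bytes.
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(payload)

    config_path = out_dir / f"{model_name}_config.json"
    config_path.write_text(json.dumps(config, indent=2))
    return weights_path, config_path
```

Because the weights are written as plain float32 with no further processing, a file produced this way stays the same size as the source tensors, which is in line with the "direct conversions" note above.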
@@ -52,14 +51,6 @@ Each model consists of two files at the repo root:
  - `{model_name}.safetensors`: model weights (float32)
  - `{model_name}_config.json`: model class, architecture config, and bag-of-models metadata
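
As a rough illustration of this two-file layout, the sketch below inspects one model's files using only the Python standard library. It reads the safetensors header (an 8-byte little-endian length followed by a JSON object) to count tensors and total weight bytes, then loads the config JSON. The function names are hypothetical, and no config schema is assumed beyond a top-level JSON object, so only its keys are listed.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the tensor entries from a safetensors file header."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def describe_model(model_dir, model_name):
    """Summarize `{model_name}.safetensors` and `{model_name}_config.json`."""
    model_dir = Path(model_dir)
    header = read_safetensors_header(model_dir / f"{model_name}.safetensors")
    config = json.loads((model_dir / f"{model_name}_config.json").read_text())
    return {
        "tensors": len(header),
        "dtypes": sorted({entry["dtype"] for entry in header.values()}),
        "weight_bytes": sum(
            entry["data_offsets"][1] - entry["data_offsets"][0]
            for entry in header.values()
        ),
        "config_keys": sorted(config),
    }
```

Calling `describe_model(".", "htdemucs")` in a checkout of the repo would be one way to check the tensor count and dtype against the Models table, assuming the table's "Tensors" column counts safetensors entries.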

- Conversion scripts are also included:
-
- | Script | Description |
- |--------|-------------|
- | `export_all_models.py` | Batch export all demucs-mlx pickle checkpoints to safetensors |
- | `export_mdx.py` | Specialized PyTorch → safetensors converter for heterogeneous mdx bags |
- | `convert_demucs_mlx_checkpoint.py` | Single checkpoint converter (demucs-mlx pickle → safetensors) |
-
  ## Usage

  ### Swift (demucs-mlx-swift)
@@ -93,23 +84,22 @@ pip install demucs-mlx
  demucs-mlx -n htdemucs song.wav
  ```

- ## Converting from demucs-mlx checkpoints

- To reproduce the export from existing `demucs-mlx` cache checkpoints:

  ```bash
- # Export all models at once
- python export_all_models.py \
-     --cache-dir ~/.cache/demucs-mlx \
-     --out-dir ./output
-
- # Export a single model
- python convert_demucs_mlx_checkpoint.py \
-     --checkpoint ~/.cache/demucs-mlx/htdemucs_mlx.pkl \
-     --out-dir ./output \
-     --name htdemucs
  ```

  ## Citation

  ```bibtex
 

  ## Models

+ | Model | What it is | Architecture | Sub-models | Sources | Weights | Tensors |
+ |-------|-----------|-------------|-----------|---------|---------|---------|
+ | `htdemucs` | Default v4 model, best speed/quality balance | HTDemucs (v4) | 1 | 4 | 160 MB | 573 |
+ | `htdemucs_ft` | Fine-tuned v4, best overall quality | HTDemucs (v4) | 4 (fine-tuned) | 4 | 641 MB | 2292 |
+ | `htdemucs_6s` | 6-source v4 (adds guitar + piano stems) | HTDemucs (v4) | 1 | 6 | 105 MB | 565 |
+ | `hdemucs_mmi` | v3 hybrid, trained on more data | HDemucs (v3) | 1 | 4 | 319 MB | 379 |
+ | `mdx` | v3 bag-of-models ensemble | Demucs + HDemucs | 4 (bag) | 4 | 1.3 GB | 1298 |
+ | `mdx_extra` | v3 ensemble trained on extra data | HDemucs | 4 (bag) | 4 | 1.2 GB | 1516 |
+ | `mdx_q` | Quantized v3 ensemble (same quality, smaller) | Demucs + HDemucs | 4 (bag) | 4 | 1.3 GB | 1298 |
+ | `mdx_extra_q` | Quantized v3 extra ensemble | HDemucs | 4 (bag) | 4 | 1.2 GB | 1516 |

  All models output stereo audio at 44.1 kHz.
 

  - Original model/repo: [adefossez/demucs](https://github.com/adefossez/demucs)
  - License: MIT (same as original Demucs)
+ - Conversion path: PyTorch checkpoints → safetensors + JSON config (direct, no intermediary)
  - Swift MLX port: [iky1e/demucs-mlx-swift](https://github.com/iky1e/demucs-mlx-swift)

  No fine-tuning or quantization was applied; these are direct conversions of the original pretrained weights.
 
  - `{model_name}.safetensors`: model weights (float32)
  - `{model_name}_config.json`: model class, architecture config, and bag-of-models metadata

  ## Usage

  ### Swift (demucs-mlx-swift)
 
  demucs-mlx -n htdemucs song.wav
  ```

+ ## Converting from PyTorch

+ To reproduce the export directly from PyTorch Demucs checkpoints:

  ```bash
+ pip install demucs safetensors numpy
+
+ # Export all 8 models
+ python export_from_pytorch.py --out-dir ./output
+
+ # Export specific models
+ python export_from_pytorch.py --models htdemucs htdemucs_ft --out-dir ./output
  ```

+ The conversion script (`export_from_pytorch.py`) is available in the [demucs-mlx-swift](https://github.com/iky1e/demucs-mlx-swift) repo under `scripts/`.
+
  ## Citation

  ```bibtex