---

license: mit
library_name: mlx
tags:
  - mlx
  - audio
  - speech-enhancement
  - noise-suppression
  - deepfilternet
  - apple-silicon
base_model:
  - DeepFilterNet/DeepFilterNet
  - DeepFilterNet/DeepFilterNet2
  - DeepFilterNet/DeepFilterNet3
pipeline_tag: audio-to-audio
---


# DeepFilterNet — MLX

MLX-compatible weights for [DeepFilterNet](https://github.com/Rikorose/DeepFilterNet), a real-time speech enhancement framework that suppresses background noise in full-band 48 kHz audio.

This repository contains all three model versions (v1, v2, v3), converted directly from the original PyTorch checkpoints to `safetensors` format for use with [MLX](https://github.com/ml-explore/mlx) on Apple Silicon. No fine-tuning or quantization was applied — the weights are numerically identical to the originals.

## Models

Each version is stored in its own subfolder:

| Version | Subfolder | Weights | Paper |
|---------|-----------|---------|-------|
| DeepFilterNet v1 | `v1/` | ~7.2 MB (float32) | [arXiv:2110.05588](https://arxiv.org/abs/2110.05588) |
| DeepFilterNet v2 | `v2/` | ~8.9 MB (float32) | [arXiv:2205.05474](https://arxiv.org/abs/2205.05474) |
| DeepFilterNet v3 | `v3/` | ~8.3 MB (float32) | [arXiv:2305.08227](https://arxiv.org/abs/2305.08227) |
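
Because the weights are stored as float32, each parameter takes 4 bytes, so the file sizes in the table give a rough parameter count. The sketch below assumes 1 MB = 10^6 bytes and ignores the safetensors header and any non-parameter buffers, so the results are estimates rather than exact counts. `estimate_parameters` is a hypothetical helper, not part of this repository.

```python
BYTES_PER_FLOAT32 = 4

# Approximate weight sizes in MB, copied from the table above.
WEIGHT_SIZES_MB = {"v1": 7.2, "v2": 8.9, "v3": 8.3}


def estimate_parameters(size_mb: float) -> int:
    """Estimate a float32 parameter count from a file size in MB (1 MB = 10**6 bytes)."""
    return round(size_mb * 1_000_000 / BYTES_PER_FLOAT32)


ESTIMATES = {version: estimate_parameters(mb) for version, mb in WEIGHT_SIZES_MB.items()}

for version, count in ESTIMATES.items():
    print(f"{version}: ~{count / 1e6:.1f}M parameters")
```

Under these assumptions the three versions come out at roughly 1.8M, 2.2M, and 2.1M parameters.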

## Model Details

All versions share the same audio parameters:

| Parameter | Value |
|-----------|-------|
| Sample rate | 48 kHz |
| FFT size | 960 |
| Hop size | 480 |
| ERB bands | 32 |
| DF bins | 96 |
| DF order | 5 |

The embedding hidden dimension differs between versions:

| Version | Embedding hidden dimension |
|---------|---------------------|
| v1 | 512 |
| v2 | 256 |
| v3 | 256 |
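
The shared audio parameters translate directly into frame timing and frequency resolution. The sketch below works through that arithmetic. It assumes the analysis window is as long as the FFT size and that the DF bins are the lowest 96 STFT bins; both are assumptions here rather than facts stated on this page.

```python
SAMPLE_RATE = 48_000  # Hz
FFT_SIZE = 960        # samples per frame
HOP_SIZE = 480        # samples between frame starts
DF_BINS = 96          # bins processed by deep filtering

frame_ms = 1000 * FFT_SIZE / SAMPLE_RATE   # frame duration in milliseconds
hop_ms = 1000 * HOP_SIZE / SAMPLE_RATE     # frame advance in milliseconds
overlap = 1 - HOP_SIZE / FFT_SIZE          # fraction shared by adjacent frames
num_bins = FFT_SIZE // 2 + 1               # one-sided spectrum size
bin_width_hz = SAMPLE_RATE / FFT_SIZE      # frequency resolution per bin
df_upper_hz = DF_BINS * bin_width_hz       # upper edge of the deep-filtering range

print(f"frame: {frame_ms} ms, hop: {hop_ms} ms, overlap: {overlap:.0%}")
print(f"bins: {num_bins}, bin width: {bin_width_hz} Hz, DF range: 0-{df_upper_hz} Hz")
```

Each frame therefore spans 20 ms and advances by 10 ms, the one-sided spectrum has 481 bins of 50 Hz each, and 96 DF bins cover the band up to 4.8 kHz.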

## Files

```
convert_deepfilternet.py   # PyTorch → MLX conversion script
v1/
  config.json              # v1 architecture configuration
  model.safetensors        # v1 weights
v2/
  config.json              # v2 architecture configuration
  model.safetensors        # v2 weights
v3/
  config.json              # v3 architecture configuration
  model.safetensors        # v3 weights
```
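
A `model.safetensors` file can be inspected without MLX or PyTorch. The format begins with an 8-byte little-endian length, followed by a JSON header describing each tensor's dtype, shape, and byte offsets. The sketch below reads only that header using the standard library; the helper names are hypothetical and not part of this repository.

```python
import json
import struct
from math import prod


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # little-endian uint64
        header = json.loads(f.read(header_len).decode("utf-8"))
    header.pop("__metadata__", None)  # optional free-form metadata, not a tensor
    return header


def count_parameters(header):
    """Sum the element counts of all tensors described in the header."""
    return sum(prod(info["shape"]) for info in header.values())


def summarize(path):
    """Print every tensor's name, dtype, and shape, then the total element count."""
    header = read_safetensors_header(path)
    for name, info in sorted(header.items()):
        print(f"{name}: {info['dtype']} {info['shape']}")
    print(f"total elements: {count_parameters(header)}")
```

From a local checkout, `summarize("v3/model.safetensors")` would list the tensors in the v3 weights file and their total element count.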

## Usage

### Python (mlx-audio)

```python
from mlx_audio.sts.models.deepfilternet import DeepFilterNetModel

# Load v3 (default)
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx")

# Load a specific version
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx", subfolder="v1")

# Enhance a file
enhanced = model.enhance("noisy.wav")
```
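
The models operate on 48 kHz audio, so it can help to check an input file's sample rate before enhancing it. This page does not say whether `enhance` resamples automatically, so the sketch below only reports the rate. It uses the standard-library `wave` module, which handles PCM WAV files only, and the helper names are hypothetical.

```python
import wave

EXPECTED_SAMPLE_RATE = 48_000  # Hz, the rate all three model versions share


def wav_sample_rate(path):
    """Return the sample rate in Hz of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def needs_resampling(path, expected=EXPECTED_SAMPLE_RATE):
    """Return True when the file's sample rate differs from the expected one."""
    return wav_sample_rate(path) != expected
```

A caller could run `needs_resampling("noisy.wav")` first and resample with a tool of their choice when it returns `True`.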

### Swift (mlx-audio-swift)

```swift
import MLXAudioSTS

let model = try await DeepFilterNetModel.fromPretrained("mlx-community/DeepFilterNet-mlx", subfolder: "v3")
let enhanced = try model.enhance(audioArray)
```

## Converting from PyTorch

To re-create these weights from the original DeepFilterNet checkpoints:

```bash
# Clone the original repo to get the pretrained checkpoints
git clone https://github.com/Rikorose/DeepFilterNet

# Convert each version
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet  --output v1 --name DeepFilterNet
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet2 --output v2 --name DeepFilterNet2
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet3 --output v3 --name DeepFilterNet3
```

Each input directory should contain a `config.ini` and a `checkpoints/` folder from the original repo.

The conversion script requires `torch` and `mlx` to be installed.
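
Before running the conversion, it may be worth confirming that an input directory has the expected layout. The sketch below checks for a `config.ini` file and a `checkpoints/` folder using `pathlib`. `missing_checkpoint_items` is a hypothetical helper and does not reflect how `convert_deepfilternet.py` validates its input.

```python
from pathlib import Path


def missing_checkpoint_items(model_dir):
    """Return the names of required items that are absent from model_dir."""
    root = Path(model_dir)
    missing = []
    if not (root / "config.ini").is_file():
        missing.append("config.ini")
    if not (root / "checkpoints").is_dir():
        missing.append("checkpoints/")
    return missing
```

An empty list means the directory has both items; otherwise the list names what is missing, for example `missing_checkpoint_items("DeepFilterNet/DeepFilterNet3")`.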

## Origin

- **Original model:** [DeepFilterNet](https://github.com/Rikorose/DeepFilterNet) by Hendrik Schroeter
- **License:** MIT (same as the original)
- **Conversion:** PyTorch → `safetensors` via `convert_deepfilternet.py`

## Citations

```bibtex
@inproceedings{schroeter2022deepfilternet,
  title={{DeepFilterNet}: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={ICASSP 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2022},
  organization={IEEE}
}

@inproceedings{schroeter2022deepfilternet2,
  title={{DeepFilterNet2}: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022)},
  year={2022},
}

@inproceedings{schroeter2023deepfilternet3,
  title={DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement},
  author={Schr{\"o}ter, Hendrik and Rosenkranz, Tobias and Escalante-B., Alberto N. and Maier, Andreas},
  booktitle={INTERSPEECH},
  year={2023}
}
```