---

license: mit
library_name: mlx
tags:
  - mlx
  - audio
  - speech-enhancement
  - noise-suppression
  - deepfilternet
  - apple-silicon
base_model:
  - DeepFilterNet/DeepFilterNet
  - DeepFilterNet/DeepFilterNet2
  - DeepFilterNet/DeepFilterNet3
pipeline_tag: audio-to-audio
---


# DeepFilterNet — MLX

MLX-compatible weights for [DeepFilterNet](https://github.com/Rikorose/DeepFilterNet), a real-time speech enhancement framework that suppresses background noise in full-band 48 kHz audio.

This repository contains all three model versions (v1, v2, v3), converted directly from the original PyTorch checkpoints to `safetensors` format for use with [MLX](https://github.com/ml-explore/mlx) on Apple Silicon. No fine-tuning or quantization was applied, so the weights are numerically identical to the originals.

## Models

Each version is stored in its own subfolder:

| Version | Subfolder | Weights | Paper |
|---------|-----------|---------|-------|
| DeepFilterNet v1 | `v1/` | ~7.2 MB (float32) | [arXiv:2110.05588](https://arxiv.org/abs/2110.05588) |
| DeepFilterNet v2 | `v2/` | ~8.9 MB (float32) | [arXiv:2205.05474](https://arxiv.org/abs/2205.05474) |
| DeepFilterNet v3 | `v3/` | ~8.3 MB (float32) | [arXiv:2305.08227](https://arxiv.org/abs/2305.08227) |

## Model Details

All versions share the same audio parameters:

| Parameter | Value |
|-----------|-------|
| Sample rate | 48 kHz |
| FFT size | 960 |
| Hop size | 480 |
| ERB bands | 32 |
| DF bins | 96 |
| DF order | 5 |

The embedding hidden dimension differs between versions:

| Version | Embedding hidden dim |
|---------|---------------------|
| v1 | 512 |
| v2 | 256 |
| v3 | 256 |
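
The shared audio parameters imply a few derived quantities that are useful when preparing input audio or reasoning about latency. The short sketch below works them out from the table values alone. It assumes that the 96 DF bins are the lowest bins of the one-sided spectrum, as described in the DeepFilterNet papers; the variable names are illustrative and are not taken from the `config.json` files.

```python
# Derived STFT quantities for the shared audio parameters listed above.
SAMPLE_RATE = 48_000  # Hz
FFT_SIZE = 960        # samples per analysis frame
HOP_SIZE = 480        # samples between consecutive frames
DF_BINS = 96          # assumed: the lowest bins, handled by deep filtering

frame_ms = 1000 * FFT_SIZE / SAMPLE_RATE  # frame length in milliseconds
hop_ms = 1000 * HOP_SIZE / SAMPLE_RATE    # frame shift in milliseconds
freq_bins = FFT_SIZE // 2 + 1             # size of the one-sided spectrum
bin_hz = SAMPLE_RATE / FFT_SIZE           # width of one frequency bin
df_max_hz = DF_BINS * bin_hz              # approximate upper edge of the DF range

print(frame_ms, hop_ms, freq_bins, bin_hz, df_max_hz)
# prints: 20.0 10.0 481 50.0 4800.0
```

In other words, each frame spans 20 ms with a 10 ms shift, the spectrum has 481 bins of 50 Hz each, and deep filtering covers roughly the band below 4.8 kHz.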

## Files

```
convert_deepfilternet.py   # PyTorch → MLX conversion script
v1/
  config.json              # v1 architecture configuration
  model.safetensors        # v1 weights
v2/
  config.json              # v2 architecture configuration
  model.safetensors        # v2 weights
v3/
  config.json              # v3 architecture configuration
  model.safetensors        # v3 weights
```
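
To check what a `model.safetensors` file contains without installing `mlx` or `torch`, it is enough to read the file header. The sketch below relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header) and the Python standard library. The helper names `read_safetensors_header` and `summarize` are hypothetical and are not part of this repository, and the path assumes a local copy of the repo.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(path):
    """Count tensors, parameters, and tensor-data bytes listed in the header."""
    header = read_safetensors_header(path)
    n_params = 0
    n_bytes = 0
    for info in header.values():
        count = 1
        for dim in info["shape"]:
            count *= dim
        n_params += count
        start, end = info["data_offsets"]
        n_bytes += end - start
    return {"tensors": len(header), "parameters": n_params, "bytes": n_bytes}


if __name__ == "__main__":
    # Assumed location: a local clone or download of this repository.
    weights = Path("v3/model.safetensors")
    if weights.exists():
        print(summarize(weights))
```

For float32 weights, the reported byte count should be about four times the parameter count, which gives a quick sanity check against the sizes listed in the Models table.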

## Usage

### Python (mlx-audio)

```python
from mlx_audio.sts.models.deepfilternet import DeepFilterNetModel

# Load v3 (default)
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx")

# Load a specific version
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx", subfolder="v1")

# Enhance a file
enhanced = model.enhance("noisy.wav")
```

### Swift (mlx-audio-swift)

```swift
import MLXAudioSTS

let model = try await DeepFilterNetModel.fromPretrained("mlx-community/DeepFilterNet-mlx", subfolder: "v3")
let enhanced = try model.enhance(audioArray)
```

## Converting from PyTorch

To re-create these weights from the original DeepFilterNet checkpoints:

```bash
# Clone the original repo to get the pretrained checkpoints
git clone https://github.com/Rikorose/DeepFilterNet

# Convert each version
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet  --output v1 --name DeepFilterNet
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet2 --output v2 --name DeepFilterNet2
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet3 --output v3 --name DeepFilterNet3
```

Each input directory should contain a `config.ini` and a `checkpoints/` folder from the original repo.

The conversion script requires both `torch` and `mlx` to be installed.
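
Before running the conversion, it can help to confirm that each input directory has the layout described above. The following is a minimal sketch using only the standard library. `missing_inputs` is a hypothetical helper, not part of `convert_deepfilternet.py`, and the directory names simply mirror the `--input` arguments shown in the commands.

```python
from pathlib import Path


def missing_inputs(model_dir):
    """Return the expected entries that are absent from a checkpoint directory."""
    root = Path(model_dir)
    missing = []
    if not (root / "config.ini").is_file():
        missing.append("config.ini")
    if not (root / "checkpoints").is_dir():
        missing.append("checkpoints/")
    return missing


if __name__ == "__main__":
    # Assumed paths: the same --input directories used in the commands above.
    for name in ("DeepFilterNet", "DeepFilterNet2", "DeepFilterNet3"):
        path = Path("DeepFilterNet") / name
        problems = missing_inputs(path)
        status = "ok" if not problems else "missing " + ", ".join(problems)
        print(f"{path}: {status}")
```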

## Origin

- **Original model:** [DeepFilterNet](https://github.com/Rikorose/DeepFilterNet) by Hendrik Schroeter
- **License:** MIT (same as the original)
- **Conversion:** PyTorch → `safetensors` via `convert_deepfilternet.py`

## Citations

```bibtex
@inproceedings{schroeter2022deepfilternet,
  title={{DeepFilterNet}: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={ICASSP 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2022},
  organization={IEEE}
}

@inproceedings{schroeter2022deepfilternet2,
  title={{DeepFilterNet2}: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022)},
  year={2022}
}

@inproceedings{schroeter2023deepfilternet3,
  title={DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement},
  author={Schr{\"o}ter, Hendrik and Rosenkranz, Tobias and Escalante-B., Alberto N. and Maier, Andreas},
  booktitle={INTERSPEECH},
  year={2023}
}
```