---
license: mit
library_name: mlx
tags:
- mlx
- audio
- speech-enhancement
- noise-suppression
- deepfilternet
- apple-silicon
base_model:
- DeepFilterNet/DeepFilterNet
- DeepFilterNet/DeepFilterNet2
- DeepFilterNet/DeepFilterNet3
pipeline_tag: audio-to-audio
---

# DeepFilterNet — MLX

MLX-compatible weights for [DeepFilterNet](https://github.com/Rikorose/DeepFilterNet), a real-time speech enhancement framework that suppresses background noise in full-band 48 kHz audio.

This repository contains all three model versions (v1, v2, v3), converted directly from the original PyTorch checkpoints to `safetensors` format for use with [MLX](https://github.com/ml-explore/mlx) on Apple Silicon. No fine-tuning or quantization was applied; the weights are numerically identical to the originals.

## Models

Each version is stored in its own subfolder:

| Version | Subfolder | Weights | Paper |
|---------|-----------|---------|-------|
| DeepFilterNet v1 | `v1/` | ~7.2 MB (float32) | [arXiv:2110.05588](https://arxiv.org/abs/2110.05588) |
| DeepFilterNet v2 | `v2/` | ~8.9 MB (float32) | [arXiv:2205.05474](https://arxiv.org/abs/2205.05474) |
| DeepFilterNet v3 | `v3/` | ~8.3 MB (float32) | [arXiv:2305.08227](https://arxiv.org/abs/2305.08227) |

## Model Details

All versions share the same audio parameters:

| Parameter | Value |
|-----------|-------|
| Sample rate | 48 kHz |
| FFT size | 960 |
| Hop size | 480 |
| ERB bands | 32 |
| DF bins | 96 |
| DF order | 5 |

The versions differ in the embedding hidden dimension:

| Version | Embedding hidden dim |
|---------|---------------------|
| v1 | 512 |
| v2 | 256 |
| v3 | 256 |

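The framing these parameters imply can be derived with a few lines of arithmetic. The sketch below is purely illustrative (the model performs its STFT internally); it only restates the table above in time/frequency terms:

```python
# STFT framing implied by the shared audio parameters above.
sample_rate = 48_000   # Hz
fft_size = 960         # samples per analysis window
hop_size = 480         # samples between successive frames

frame_ms = 1000 * fft_size / sample_rate   # window length in milliseconds
hop_ms = 1000 * hop_size / sample_rate     # frame step in milliseconds
overlap = 1 - hop_size / fft_size          # fraction shared between adjacent windows

print(frame_ms, hop_ms, overlap)  # 20.0 ms windows, 10.0 ms hop, 50% overlap

# A 960-point FFT has 481 unique frequency bins; the deep-filtering stage
# operates on the lowest 96 of them (the "DF bins" row), i.e. up to:
df_cutoff_hz = 96 * sample_rate / fft_size
print(df_cutoff_hz)  # 4800.0 Hz
```

So each model processes audio in 20 ms windows advanced every 10 ms, with deep filtering applied below roughly 4.8 kHz and ERB-band gains covering the rest of the spectrum.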
## Files

```
convert_deepfilternet.py   # PyTorch → MLX conversion script
v1/
  config.json              # v1 architecture configuration
  model.safetensors        # v1 weights
v2/
  config.json              # v2 architecture configuration
  model.safetensors        # v2 weights
v3/
  config.json              # v3 architecture configuration
  model.safetensors        # v3 weights
```

## Usage

### Python (mlx-audio)

```python
from mlx_audio.sts.models.deepfilternet import DeepFilterNetModel

# Load v3 (default)
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx")

# Load a specific version
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx", subfolder="v1")

# Enhance a file
enhanced = model.enhance("noisy.wav")
```

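If the enhanced output comes back as a float waveform rather than a file, it can be written to disk without extra dependencies. This is a hedged sketch, not part of the mlx-audio API: it assumes `enhanced` is a 1-D mono array of floats in [-1, 1] at the repository's 48 kHz sample rate, and the helper name `write_wav_48k` is illustrative.

```python
import wave

import numpy as np

def write_wav_48k(path, samples):
    """Write a mono float waveform in [-1, 1] to a 16-bit 48 kHz WAV file."""
    pcm = (np.clip(np.asarray(samples), -1.0, 1.0) * 32767).astype(np.int16)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)       # mono
        f.setsampwidth(2)       # 16-bit samples
        f.setframerate(48_000)  # DeepFilterNet's sample rate
        f.writeframes(pcm.tobytes())

# e.g. write_wav_48k("enhanced.wav", enhanced)
```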
### Swift (mlx-audio-swift)

```swift
import MLXAudioSTS

let model = try await DeepFilterNetModel.fromPretrained("mlx-community/DeepFilterNet-mlx", subfolder: "v3")
let enhanced = try model.enhance(audioArray)
```

## Converting from PyTorch

To re-create these weights from the original DeepFilterNet checkpoints:

```bash
# Clone the original repo to get the pretrained checkpoints
git clone https://github.com/Rikorose/DeepFilterNet

# Convert each version
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet --output v1 --name DeepFilterNet
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet2 --output v2 --name DeepFilterNet2
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet3 --output v3 --name DeepFilterNet3
```

Each input directory should contain a `config.ini` and a `checkpoints/` folder from the original repo. Conversion requires `torch` and `mlx` to be installed.

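Beyond renaming keys, one detail such a conversion typically has to handle is tensor layout: PyTorch stores `Conv2d` kernels as `(out, in, kH, kW)`, while MLX expects channels-last `(out, kH, kW, in)`. The sketch below illustrates that step in isolation with NumPy; it is a simplified stand-in, not the actual logic of `convert_deepfilternet.py`, and the 4-D heuristic assumes every 4-D tensor in the checkpoint is a conv kernel.

```python
import numpy as np

def to_mlx_layout(state_dict):
    """Illustrative: transpose 4-D conv kernels from PyTorch's
    (out, in, kH, kW) layout to MLX's (out, kH, kW, in) layout,
    leaving all other tensors unchanged."""
    converted = {}
    for name, tensor in state_dict.items():
        arr = np.asarray(tensor)
        if arr.ndim == 4:  # treat any 4-D tensor as a conv kernel
            arr = arr.transpose(0, 2, 3, 1)
        converted[name] = arr
    return converted

# A dummy checkpoint with one conv kernel and one bias:
ckpt = {"enc.conv.weight": np.zeros((64, 1, 3, 3)), "enc.conv.bias": np.zeros(64)}
out = to_mlx_layout(ckpt)
print(out["enc.conv.weight"].shape)  # (64, 3, 3, 1)
```

The resulting dictionary of arrays can then be written out with a safetensors writer for MLX to load.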
## Origin

- **Original model:** [DeepFilterNet](https://github.com/Rikorose/DeepFilterNet) by Hendrik Schröter
- **License:** MIT (same as the original)
- **Conversion:** PyTorch → `safetensors` via `convert_deepfilternet.py`

## Citations

```bibtex
@inproceedings{schroeter2022deepfilternet,
  title={{DeepFilterNet}: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2022},
  organization={IEEE}
}

@inproceedings{schroeter2022deepfilternet2,
  title={{DeepFilterNet2}: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022)},
  year={2022}
}

@inproceedings{schroeter2023deepfilternet3,
  title={{DeepFilterNet}: Perceptually Motivated Real-Time Speech Enhancement},
  author={Schr{\"o}ter, Hendrik and Rosenkranz, Tobias and Escalante-B., Alberto N. and Maier, Andreas},
  booktitle={INTERSPEECH},
  year={2023}
}
```