---
license: mit
library_name: mlx
tags:
- mlx
- audio
- speech-enhancement
- noise-suppression
- deepfilternet
- apple-silicon
base_model:
- DeepFilterNet/DeepFilterNet
- DeepFilterNet/DeepFilterNet2
- DeepFilterNet/DeepFilterNet3
pipeline_tag: audio-to-audio
---

# DeepFilterNet — MLX

MLX-compatible weights for [DeepFilterNet](https://github.com/Rikorose/DeepFilterNet), a real-time speech enhancement framework that suppresses background noise from full-band 48 kHz audio.

This repository contains all three model versions (v1, v2, v3), converted directly from the original PyTorch checkpoints to `safetensors` format for use with [MLX](https://github.com/ml-explore/mlx) on Apple Silicon. No fine-tuning or quantization was applied; the weights are numerically identical to the originals.

## Models

Each version is stored in its own subfolder:

| Version | Subfolder | Weights | Paper |
|---------|-----------|---------|-------|
| DeepFilterNet v1 | `v1/` | ~7.2 MB (float32) | [arXiv:2110.05588](https://arxiv.org/abs/2110.05588) |
| DeepFilterNet v2 | `v2/` | ~8.9 MB (float32) | [arXiv:2205.05474](https://arxiv.org/abs/2205.05474) |
| DeepFilterNet v3 | `v3/` | ~8.3 MB (float32) | [arXiv:2305.08227](https://arxiv.org/abs/2305.08227) |

## Model Details

All versions share the same audio parameters:

| Parameter | Value |
|-----------|-------|
| Sample rate | 48 kHz |
| FFT size | 960 |
| Hop size | 480 |
| ERB bands | 32 |
| DF bins | 96 |
| DF order | 5 |

| Version | Embedding hidden dim |
|---------|---------------------|
| v1 | 512 |
| v2 | 256 |
| v3 | 256 |

## Files

```
convert_deepfilternet.py   # PyTorch → MLX conversion script
v1/
  config.json              # v1 architecture configuration
  model.safetensors        # v1 weights
v2/
  config.json              # v2 architecture configuration
  model.safetensors        # v2 weights
v3/
  config.json              # v3 architecture configuration
  model.safetensors        # v3 weights
```

## Usage

### Python (mlx-audio)

```python
from mlx_audio.sts.models.deepfilternet import DeepFilterNetModel

# Load v3 (default)
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx")

# Load a specific version
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx", subfolder="v1")

# Enhance a file
enhanced = model.enhance("noisy.wav")
```

### Swift (mlx-audio-swift)

```swift
import MLXAudioSTS

let model = try await DeepFilterNetModel.fromPretrained("mlx-community/DeepFilterNet-mlx", subfolder: "v3")
let enhanced = try model.enhance(audioArray)
```

## Converting from PyTorch

To re-create these weights from the original DeepFilterNet checkpoints:

```bash
# Clone the original repo to get the pretrained checkpoints
git clone https://github.com/Rikorose/DeepFilterNet

# Convert each version
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet --output v1 --name DeepFilterNet
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet2 --output v2 --name DeepFilterNet2
python convert_deepfilternet.py --input DeepFilterNet/DeepFilterNet3 --output v3 --name DeepFilterNet3
```

Each input directory should contain a `config.ini` and a `checkpoints/` folder from the original repo. Requires `torch` and `mlx` to be installed.

## Origin

- **Original model:** [DeepFilterNet](https://github.com/Rikorose/DeepFilterNet) by Hendrik Schroeter
- **License:** MIT (same as the original)
- **Conversion:** PyTorch → `safetensors` via `convert_deepfilternet.py`

## Citations

```bibtex
@inproceedings{schroeter2022deepfilternet,
  title={{DeepFilterNet}: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={ICASSP 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2022},
  organization={IEEE}
}

@inproceedings{schroeter2022deepfilternet2,
  title={{DeepFilterNet2}: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022)},
  year={2022},
}

@inproceedings{schroeter2023deepfilternet3,
  title={DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement},
  author={Schr{\"o}ter, Hendrik and Rosenkranz, Tobias and Escalante-B., Alberto N. and Maier, Andreas},
  booktitle={INTERSPEECH},
  year={2023}
}
```
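
## Frame timing

The shared audio parameters listed under Model Details imply a fixed frame timing, which can be useful when buffering audio for real-time use. The sketch below derives those figures with plain arithmetic. It is not part of `mlx-audio` or of the conversion script: `frame_timing` is a hypothetical helper, and it assumes the analysis window is as long as the FFT size and that the DF bins are the lowest-frequency bins of the spectrum.

```python
# Shared audio parameters, copied from the Model Details table.
SAMPLE_RATE_HZ = 48_000
FFT_SIZE = 960
HOP_SIZE = 480
DF_BINS = 96


def frame_timing(sample_rate_hz: int, fft_size: int, hop_size: int, df_bins: int) -> dict:
    """Derive STFT timing figures (hypothetical helper, not a library API).

    Assumes the window length equals the FFT size and that deep filtering
    covers the lowest `df_bins` frequency bins.
    """
    bin_width_hz = sample_rate_hz / fft_size
    return {
        "window_ms": 1000 * fft_size / sample_rate_hz,
        "hop_ms": 1000 * hop_size / sample_rate_hz,
        "overlap": 1 - hop_size / fft_size,
        "freq_bins": fft_size // 2 + 1,  # one-sided spectrum of a real signal
        "bin_width_hz": bin_width_hz,
        "frames_per_second": sample_rate_hz / hop_size,
        "df_upper_hz": df_bins * bin_width_hz,  # assumption: DF bins start at 0 Hz
    }


timing = frame_timing(SAMPLE_RATE_HZ, FFT_SIZE, HOP_SIZE, DF_BINS)
for name, value in timing.items():
    print(f"{name}: {value}")
```

Under these assumptions, each frame spans 20 ms with a 10 ms hop (50% overlap), giving 100 frames per second, 481 frequency bins of 50 Hz each, and deep filtering up to roughly 4.8 kHz.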