DeepFilterNet-mlx / README.md
Kyle Howells
Add all DeepFilterNet versions (v1, v2, v3) in subdirectories
fc73d82
metadata
license: mit
library_name: mlx
tags:
  - mlx
  - audio
  - speech-enhancement
  - noise-suppression
  - deepfilternet
  - apple-silicon
base_model:
  - DeepFilterNet/DeepFilterNet
  - DeepFilterNet/DeepFilterNet2
  - DeepFilterNet/DeepFilterNet3
pipeline_tag: audio-to-audio

DeepFilterNet — MLX

MLX-compatible weights for DeepFilterNet, a real-time speech enhancement framework that suppresses background noise in full-band 48 kHz audio.

This repository contains all three model versions (v1, v2, v3), converted directly from the original PyTorch checkpoints to safetensors format for use with MLX on Apple Silicon. No fine-tuning or quantization was applied, so the weights are numerically identical to the originals.

Models

Each version is stored in its own subfolder:

| Version | Subfolder | Weights | Paper |
|---|---|---|---|
| DeepFilterNet v1 | v1/ | ~7.2 MB (float32) | arXiv:2110.05588 |
| DeepFilterNet v2 | v2/ | ~8.9 MB (float32) | arXiv:2205.05474 |
| DeepFilterNet v3 | v3/ | ~8.3 MB (float32) | arXiv:2305.08227 |

Model Details

All versions share the same audio parameters:

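| Parameter | Value |
|---|---|
| Sample rate | 48 kHz |
| FFT size | 960 |
| Hop size | 480 |
| ERB bands | 32 |
| DF bins | 96 |
| DF order | 5 |

The embedding hidden dimension differs between versions:

| Version | Embedding hidden dim |
|---|---|
| v1 | 512 |
| v2 | 256 |
| v3 | 256 |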
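
The shared audio parameters imply a few derived quantities that are useful when preparing input or reasoning about latency. The following is a minimal sketch of that arithmetic, not code from this repository. It assumes the analysis window is as long as the FFT size, that a one-sided (real) FFT is used, and that the DF bins are the lowest-frequency bins; the function name and dictionary keys are illustrative.

```python
# Shared audio parameters from the table above.
SAMPLE_RATE_HZ = 48_000
FFT_SIZE = 960
HOP_SIZE = 480
DF_BINS = 96


def derived_stft_parameters(
    sample_rate=SAMPLE_RATE_HZ,
    fft_size=FFT_SIZE,
    hop_size=HOP_SIZE,
    df_bins=DF_BINS,
):
    """Derive timing and frequency figures from the STFT settings.

    Assumptions (not stated in the model card): window length == FFT size,
    one-sided FFT, and deep filtering applied to the lowest `df_bins` bins.
    """
    bin_width_hz = sample_rate / fft_size
    return {
        "window_ms": 1000 * fft_size / sample_rate,   # duration of one frame
        "hop_ms": 1000 * hop_size / sample_rate,      # step between frames
        "frames_per_second": sample_rate / hop_size,
        "overlap": 1 - hop_size / fft_size,           # fraction shared by neighbours
        "freq_bins": fft_size // 2 + 1,               # one-sided spectrum size
        "bin_width_hz": bin_width_hz,
        "df_upper_hz": df_bins * bin_width_hz,        # top of the DF range
    }


if __name__ == "__main__":
    for name, value in derived_stft_parameters().items():
        print(f"{name}: {value}")
```

Under these assumptions each frame spans 20 ms with a 10 ms hop, the spectrum has 481 bins of 50 Hz each, and the 96 DF bins cover frequencies up to 4.8 kHz.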

Files

v1/
  config.json          # v1 architecture configuration
  model.safetensors    # v1 weights
v2/
  config.json          # v2 architecture configuration
  model.safetensors    # v2 weights
v3/
  config.json          # v3 architecture configuration
  model.safetensors    # v3 weights

Usage

Python (mlx-audio)

from mlx_audio.sts.models.deepfilternet import DeepFilterNetModel

# Load v3 (default)
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx")

# Load a specific version
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx", subfolder="v1")

# Enhance a file
enhanced = model.enhance("noisy.wav")
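
Because the models operate on 48 kHz audio, it can be worth checking an input file's sample rate before enhancing it. This model card does not say whether `enhance` resamples internally, so the sketch below is only a precaution. It uses the standard-library `wave` module, which reads PCM WAV files only; the helper names are illustrative and are not part of mlx-audio.

```python
import wave

EXPECTED_SAMPLE_RATE_HZ = 48_000  # sample rate listed under Model Details


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def needs_resampling(path, expected=EXPECTED_SAMPLE_RATE_HZ):
    """True if the file's sample rate differs from the expected rate."""
    return wav_sample_rate(path) != expected


if __name__ == "__main__":
    path = "noisy.wav"  # same placeholder file name as in the usage example
    if needs_resampling(path):
        print(f"{path} is {wav_sample_rate(path)} Hz; consider resampling to 48 kHz first.")
    else:
        print(f"{path} is already 48 kHz.")
```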

Swift (mlx-audio-swift)

import MLXAudioSTS

let model = try await DeepFilterNetModel.fromPretrained("mlx-community/DeepFilterNet-mlx", subfolder: "v3")
let enhanced = try model.enhance(audioArray)

Origin

  • Original model: DeepFilterNet by Hendrik Schroeter
  • License: MIT (same as the original)
  • Conversion: PyTorch → safetensors

Citations

@inproceedings{schroeter2022deepfilternet,
  title={{DeepFilterNet}: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={ICASSP 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2022},
  organization={IEEE}
}

@inproceedings{schroeter2022deepfilternet2,
  title={{DeepFilterNet2}: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022)},
  year={2022},
}

@inproceedings{schroeter2023deepfilternet3,
  title={{DeepFilterNet}: Perceptually Motivated Real-Time Speech Enhancement},
  author={Schr{\"o}ter, Hendrik and Rosenkranz, Tobias and Escalante-B., Alberto N. and Maier, Andreas},
  booktitle={INTERSPEECH},
  year={2023}
}