Kyle Howells

license: mit
library_name: mlx
tags:
  - mlx
  - audio
  - speech-enhancement
  - noise-suppression
  - deepfilternet
  - apple-silicon
base_model:
  - DeepFilterNet/DeepFilterNet
  - DeepFilterNet/DeepFilterNet2
  - DeepFilterNet/DeepFilterNet3
pipeline_tag: audio-to-audio

DeepFilterNet — MLX

MLX-compatible weights for DeepFilterNet, a real-time speech enhancement framework that suppresses background noise in full-band 48 kHz audio.

This repository contains all three model versions (v1, v2, v3), converted directly from the original PyTorch checkpoints to safetensors format for use with MLX on Apple Silicon. No fine-tuning or quantization was applied, so the weights are numerically identical to the originals.

Models

Each version is stored in its own subfolder:

Version	Subfolder	Weights size (dtype)	Paper
DeepFilterNet v1	`v1/`	~7.2 MB (float32)	arXiv:2110.05588
DeepFilterNet v2	`v2/`	~8.9 MB (float32)	arXiv:2205.05474
DeepFilterNet v3	`v3/`	~8.3 MB (float32)	arXiv:2305.08227
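
The sizes and dtype in the table can be checked without loading the model, because a safetensors file starts with an 8-byte little-endian header length followed by a JSON header describing each tensor. The following is a minimal sketch using only the standard library; the helper names `read_safetensors_header` and `summarize_weights` are hypothetical and not part of mlx-audio.

```python
import json
import struct
from math import prod


def read_safetensors_header(path):
    """Return the JSON header of a safetensors file as a dict."""
    with open(path, "rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize_weights(path):
    """Count parameters per dtype, skipping the optional __metadata__ entry."""
    counts = {}
    for name, entry in read_safetensors_header(path).items():
        if name == "__metadata__":
            continue
        # prod([]) is 1, so scalar tensors count as one parameter.
        counts[entry["dtype"]] = counts.get(entry["dtype"], 0) + prod(entry["shape"])
    return counts
```

Calling `summarize_weights("v3/model.safetensors")` on a local copy should report the parameter count under the `F32` key if the weights are float32 as stated; multiplying that count by 4 bytes gives the approximate file size.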

Model Details

All versions share the same audio parameters:

Parameter	Value
Sample rate	48 kHz
FFT size	960
Hop size	480
ERB bands	32
DF bins	96
DF order	5
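
The shared parameters above imply a few derived quantities that are useful when reasoning about latency and frequency resolution. The sketch below computes them with plain arithmetic. It assumes a one-sided real FFT and that the DF bins are the lowest-frequency bins; the function name `derived_parameters` is hypothetical.

```python
SAMPLE_RATE = 48_000  # Hz
FFT_SIZE = 960        # samples per analysis window
HOP_SIZE = 480        # samples between consecutive frames
DF_BINS = 96          # bins processed by deep filtering


def derived_parameters(sample_rate=SAMPLE_RATE, fft_size=FFT_SIZE,
                       hop_size=HOP_SIZE, df_bins=DF_BINS):
    """Derive frame timing and frequency resolution from the STFT settings."""
    bin_width_hz = sample_rate / fft_size
    return {
        "window_ms": 1000 * fft_size / sample_rate,
        "hop_ms": 1000 * hop_size / sample_rate,
        "overlap": 1 - hop_size / fft_size,
        # One-sided spectrum of a real signal: N/2 + 1 bins (assumption).
        "freq_bins": fft_size // 2 + 1,
        "bin_width_hz": bin_width_hz,
        # Upper edge of the DF range, assuming DF covers the lowest bins.
        "df_upper_hz": df_bins * bin_width_hz,
    }


print(derived_parameters())
```

With the values from the table this gives a 20 ms window, a 10 ms hop (50% overlap), 481 frequency bins of 50 Hz each, and a deep-filtering range of roughly 4.8 kHz under the stated assumption.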

The embedding hidden dimension differs between versions:

Version	Embedding hidden dim
v1	512
v2	256
v3	256

Files

v1/
  config.json          # v1 architecture configuration
  model.safetensors    # v1 weights
v2/
  config.json          # v2 architecture configuration
  model.safetensors    # v2 weights
v3/
  config.json          # v3 architecture configuration
  model.safetensors    # v3 weights
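
Given the layout above, a local copy of the repository can be sanity-checked before loading anything. This is a minimal sketch using only the standard library; `describe_checkout` is a hypothetical helper, and the contents of `config.json` are treated as an opaque dictionary because its keys are not listed here.

```python
import json
from pathlib import Path

VERSIONS = ("v1", "v2", "v3")


def describe_checkout(root):
    """Return {version: {"config": dict, "weights_bytes": int}} for complete subfolders."""
    root = Path(root)
    summary = {}
    for version in VERSIONS:
        config_path = root / version / "config.json"
        weights_path = root / version / "model.safetensors"
        # Skip versions that are missing either file.
        if not (config_path.is_file() and weights_path.is_file()):
            continue
        summary[version] = {
            "config": json.loads(config_path.read_text(encoding="utf-8")),
            "weights_bytes": weights_path.stat().st_size,
        }
    return summary
```

For example, `describe_checkout(".")` run from the repository root should list all three versions if the download is complete.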

Usage

Python (mlx-audio)

from mlx_audio.sts.models.deepfilternet import DeepFilterNetModel

# Load v3 (default)
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx")

# Load a specific version
model = DeepFilterNetModel.from_pretrained("mlx-community/DeepFilterNet-mlx", subfolder="v1")

# Enhance a file
enhanced = model.enhance("noisy.wav")
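
Because the models operate on 48 kHz audio, it can help to check an input file's sample rate before enhancing it. The sketch below uses Python's standard `wave` module and works for PCM WAV files only. Whether `enhance` resamples internally is not stated here, so treat this check as a precaution; `wav_info` and `is_full_band_48k` are hypothetical helper names.

```python
import wave

EXPECTED_SAMPLE_RATE = 48_000  # Hz, from the model details


def wav_info(path):
    """Read basic properties of a PCM WAV file without decoding the samples."""
    with wave.open(str(path), "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "frames": wav.getnframes(),
        }


def is_full_band_48k(path):
    """Return True if the file's sample rate matches the models' 48 kHz."""
    return wav_info(path)["sample_rate"] == EXPECTED_SAMPLE_RATE
```

If `is_full_band_48k("noisy.wav")` returns False, resampling the file to 48 kHz first is the conservative choice.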

Swift (mlx-audio-swift)

import MLXAudioSTS

let model = try await DeepFilterNetModel.fromPretrained("mlx-community/DeepFilterNet-mlx", subfolder: "v3")
let enhanced = try model.enhance(audioArray)

Origin

Original model: DeepFilterNet by Hendrik Schroeter
License: MIT (same as the original)
Conversion: PyTorch → safetensors

Citations

@inproceedings{schroeter2022deepfilternet,
  title={{DeepFilterNet}: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={ICASSP 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2022},
  organization={IEEE}
}

@inproceedings{schroeter2022deepfilternet2,
  title={{DeepFilterNet2}: Towards Real-Time Speech Enhancement on Embedded Devices for Full-Band Audio},
  author={Schr{\"o}ter, Hendrik and Escalante-B., Alberto N. and Rosenkranz, Tobias and Maier, Andreas},
  booktitle={17th International Workshop on Acoustic Signal Enhancement (IWAENC 2022)},
  year={2022},
}

@inproceedings{schroeter2023deepfilternet3,
  title={{DeepFilterNet}: Perceptually Motivated Real-Time Speech Enhancement},
  author={Schr{\"o}ter, Hendrik and Rosenkranz, Tobias and Escalante-B., Alberto N. and Maier, Andreas},
  booktitle={INTERSPEECH},
  year={2023}
}