Kyle Howells committed · Commit ee295a4 · 1 parent: 78e496a

Update model card with full description, usage, and attribution

Files changed (1): README.md (+89 -0)
---
license: mit
library_name: mlx
tags:
- mlx
- audio
- speech-enhancement
- noise-suppression
- deepfilternet
- apple-silicon
base_model: DeepFilterNet/DeepFilterNet3
pipeline_tag: audio-to-audio
---

# DeepFilterNet3 — MLX

MLX-compatible weights for [DeepFilterNet3](https://github.com/Rikorose/DeepFilterNet), a real-time speech enhancement model that suppresses background noise from audio.

This is a direct conversion of the original PyTorch weights to `safetensors` format for use with [MLX](https://github.com/ml-explore/mlx) on Apple Silicon.

## Origin

- **Original model:** [DeepFilterNet3](https://github.com/Rikorose/DeepFilterNet) by Hendrik Schröter
- **Paper:** [DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement](https://arxiv.org/abs/2305.08227)
- **License:** MIT (same as the original)
- **Conversion:** PyTorch → `safetensors` via the included `convert_deepfilternet.py` script

No fine-tuning or quantisation was applied — the weights are numerically identical to the original checkpoint.

## Files

| File | Description |
|---|---|
| `config.json` | Model architecture configuration |
| `model.safetensors` | Pre-converted weights (8.3 MB, float32) |
| `convert_deepfilternet.py` | Conversion script (PyTorch → MLX safetensors) |
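
The `model.safetensors` file can be inspected without MLX (or any ML framework) installed. The sketch below is a minimal, stdlib-only reader based on the published safetensors file format — an 8-byte little-endian header length followed by a UTF-8 JSON header describing each tensor; `read_safetensors_header` is an illustrative helper name, not part of this repo:

```python
import json
import struct

def read_safetensors_header(path):
    """Parse just the JSON header of a .safetensors file.

    Per the safetensors format, the file begins with an 8-byte
    little-endian unsigned integer N, followed by N bytes of UTF-8
    JSON mapping tensor names to their dtype, shape, and byte offsets.
    No tensor data is read.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return header

# Example: list tensor names and shapes without loading any weights.
# for name, info in read_safetensors_header("model.safetensors").items():
#     if name != "__metadata__":
#         print(name, info["dtype"], info["shape"])
```

This is handy for confirming dtypes and shapes match the original checkpoint before loading the weights.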

## Model Details

| Parameter | Value |
|---|---|
| Sample rate | 48 kHz |
| FFT size | 960 |
| Hop size | 480 |
| ERB bands | 32 |
| DF bins | 96 |
| DF order | 5 |
| Parameters | ~2M |
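
The STFT timing implied by these numbers works out with quick arithmetic — a 20 ms analysis window advancing in 10 ms hops at 100 frames per second. The deep-filtering cutoff line is an interpretation (assuming, as in the DeepFilterNet design, that the 96 DF bins cover the lowest frequencies):

```python
SAMPLE_RATE = 48_000  # Hz
FFT_SIZE = 960        # samples per analysis window
HOP_SIZE = 480        # samples between windows (50% overlap)
DF_BINS = 96          # bins processed by the deep-filtering stage

window_ms = FFT_SIZE / SAMPLE_RATE * 1000   # 20.0 ms window
hop_ms = HOP_SIZE / SAMPLE_RATE * 1000      # 10.0 ms hop
frames_per_second = SAMPLE_RATE / HOP_SIZE  # 100.0 frames/s
bin_hz = SAMPLE_RATE / FFT_SIZE             # 50.0 Hz per STFT bin
df_cutoff_hz = DF_BINS * bin_hz             # deep filtering spans the lowest 4800.0 Hz

print(window_ms, hop_ms, frames_per_second, df_cutoff_hz)
```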

## Usage

### Swift (mlx-audio-swift)

```swift
import MLXAudioSTS

let model = try await DeepFilterNetModel.fromPretrained("iky1e/DeepFilterNet3-MLX")
let enhanced = try model.enhance(audioArray)
```

### Python (mlx-audio)

```python
from mlx_audio.sts.models.deepfilternet import DeepFilterNetModel

model = DeepFilterNetModel.from_pretrained(version=3, model_dir="path/to/local/dir")
enhanced = model.enhance("noisy.wav")
```

## Converting from PyTorch

To re-create this conversion from the original DeepFilterNet checkpoint:

```bash
python convert_deepfilternet.py \
    --input /path/to/DeepFilterNet3 \
    --output ./DeepFilterNet3-MLX \
    --name DeepFilterNet3
```

The input directory should contain a `config.ini` and a `checkpoints/` folder from the original repo.
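
A quick pre-flight check for that layout can save a confusing traceback. The helper below is illustrative only (it is not part of `convert_deepfilternet.py`) and just verifies the two items the converter expects:

```python
from pathlib import Path

def check_checkpoint_dir(path):
    """Sanity-check a DeepFilterNet checkpoint directory before conversion.

    Expects the layout from the original repo: a config.ini at the top
    level and a non-empty checkpoints/ folder. Returns a list of
    problems; an empty list means the layout looks right.
    """
    root = Path(path)
    problems = []
    if not (root / "config.ini").is_file():
        problems.append("missing config.ini")
    ckpt = root / "checkpoints"
    if not ckpt.is_dir():
        problems.append("missing checkpoints/ folder")
    elif not any(ckpt.iterdir()):
        problems.append("checkpoints/ folder is empty")
    return problems
```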

## Citation

```bibtex
@inproceedings{schroeter2023deepfilternet3,
  title={DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement},
  author={Schr{\"o}ter, Hendrik and Rosenkranz, Tobias and Maier, Andreas and Escalante-B., Alberto N.},
  booktitle={INTERSPEECH},
  year={2023}
}
```