Ceva-IP/DPDFNet
@@ -1,3 +1,4 @@
 ---
 license: apache-2.0
 pipeline_tag: audio-to-audio
@@ -5,125 +6,103 @@ tags:
   - speech_enhancement
   - noise_suppression
   - real_time
   - fullband
 ---
-# DPDFNet: Boosting DeepFilterNet2 via Dual-Path RNN
-DPDFNet is a family of causal, single-channel speech enhancement models for real-time noise suppression in challenging everyday environments. It extends the DeepFilterNet2 enhancement framework by inserting Dual-Path RNN (DPRNN) blocks into the encoder, strengthening long-range temporal and cross-band modeling while preserving a compact, streaming-friendly design.
-This repository provides TensorFlow Lite (TFLite) models optimized for mobile and edge deployment:
-**16kHz models**
-* `baseline.tflite`
-* `dpdfnet2.tflite`
-* `dpdfnet4.tflite`
-* `dpdfnet8.tflite`
-**48kHz model**
-* `dpdfnet2_48khz_hr.tflite`
 ---
-## Key Features
-* Causal and low-latency: Designed for streaming use cases such as telephony, conferencing, and embedded devices.
-* Dual-Path RNN integration: Improves temporal context and frequency-domain interactions for more robust enhancement in difficult noise conditions.
-* Scalable family: Choose baseline or dpdfnet2/4/8 to balance quality vs. compute.
-* Edge deployment focus: Demonstrated on Ceva NeuPro Nano NPUs in the accompanying work.
-* Fullband option: A dedicated 48kHz model is provided for fullband enhancement.
 ---
-## Model Variants and Footprint
-### 16kHz models
-| Model     | Params [M] | MACs [G] | TFLite Size [MB] |
-| --------- | :--------: | :------: | :--------------: |
-| Baseline  |    2.31    |   0.36   |       8.5        |
-| DPDFNet-2 |    2.49    |   1.35   |      10.7        |
-| DPDFNet-4 |    2.84    |   2.36   |      12.9        |
-| DPDFNet-8 |    3.54    |   4.37   |      17.2        |
-### 48kHz model
-| Model        | Params [M] | MACs [G] | TFLite Size [MB] |
-| ------------ | :--------: | :------: | :--------------: |
-| DPDFNet-2 HR |    2.58    |   2.42   |      11.6        |
----
-## Intended Use
-Primary task: Real-time, single-channel speech enhancement (noise suppression).
-Deployment targets: Mobile devices, embedded NPUs, and edge platforms.
-Input and Output:
-* **16kHz models**
-  * Input: 16kHz mono noisy speech waveform
-  * Output: 16kHz mono enhanced speech waveform
-* **48kHz model**
-  * Input: 48kHz mono noisy speech waveform
-  * Output: 48kHz mono enhanced speech waveform
-Typical applications:
-* Voice calls and VoIP
-* Video conferencing
-* Always-on voice interfaces
-* Wearables, earbuds, and embedded audio devices
 ---
-## Inference
-This repo includes an inference script, `run_tflite.py`, for running the TFLite models on WAV files using streaming-style, frame-by-frame inference.
-> **Note:** When using `dpdfnet2_48khz_hr`, the inference script automatically switches to the 48kHz processing pipeline.
-### Setup
-Install dependencies:
 ```bash
-pip install numpy soundfile librosa tqdm
-pip install tflite-runtime
 ```
-### Model placement
-By default, the script loads models from:
-* `./<model_name>.tflite`
-Create the folder and place the `.tflite` files there (or edit `TFLITE_DIR` in the script to match your layout).
-### Run enhancement on a folder of WAVs
-The script processes `*.wav` files non-recursively and writes enhanced outputs as 16-bit PCM WAVs:
-```bash
-python run_tflite.py --noisy_dir /path/to/noisy_wavs --enhanced_dir /path/to/out --model_name dpdfnet8
-```
-Available `--model_name` options: `baseline`, `dpdfnet2`, `dpdfnet4`, `dpdfnet8`, `dpdfnet2_48khz_hr`.
----
-## Training Data
-The models were trained using a mixture of public speech and noise datasets, including DNS4 (downsampled), MLS, MUSAN, and FSD50K.
 ---
 ## Citation
-If you use these models, please cite:
 ```bibtex
 @article{rika2025dpdfnet,
   title  = {DPDFNet: Boosting DeepFilterNet2 via Dual-Path RNN},

 ---
 license: apache-2.0
 pipeline_tag: audio-to-audio
   - speech_enhancement
   - noise_suppression
   - real_time
+  - streaming
+  - causal
+  - onnx
+  - tflite
   - fullband
 ---
+# DPDFNet
+DPDFNet is a family of **causal, single‑channel** speech enhancement models for **real‑time noise suppression**.\
+It builds on **DeepFilterNet2** by adding **Dual‑Path RNN (DPRNN)** blocks in the encoder for stronger long‑range modeling while staying streaming‑friendly.
+**Links**
+- Project page (audio samples + architecture): https://ceva-ip.github.io/DPDFNet/
+- Paper (arXiv): https://arxiv.org/abs/2512.16420
+- Code (GitHub): https://github.com/ceva-ip/DPDFNet
+- Demo Space: https://huggingface.co/spaces/Ceva-IP/DPDFNetDemo
+- Evaluation set: https://huggingface.co/datasets/Ceva-IP/DPDFNet_EvalSet
 ---
+## What’s in this repo
+- **TFLite**: `*.tflite` (root)
+- **ONNX**: `onnx/*.onnx`
+- **PyTorch checkpoints**: `checkpoints/*.pth`
 ---
+## Model variants
+### 16 kHz models
+| Model | DPRNN blocks | Params (M) | MACs (G) |
+|---|:---:|:---:|:---:|
+| `baseline` | 0 | 2.31 | 0.36 |
+| `dpdfnet2` | 2 | 2.49 | 1.35 |
+| `dpdfnet4` | 4 | 2.84 | 2.36 |
+| `dpdfnet8` | 8 | 3.54 | 4.37 |
+### 48 kHz fullband model
+| Model | DPRNN blocks | Params (M) | MACs (G) |
+|---|:---:|:---:|:---:|
+| `dpdfnet2_48khz_hr` | 2 | 2.58 | 2.42 |
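
To make the quality-versus-compute trade-off in the 16 kHz table concrete, the sketch below computes each variant's MACs relative to `baseline` and the extra MACs per DPRNN block. The numbers are copied from the table; the `VARIANTS` dictionary and the `cost_vs_baseline` helper are illustrative names, not part of the repository.

```python
# Params (M) and MACs (G) copied from the 16 kHz table above.
VARIANTS = {
    "baseline": {"blocks": 0, "params_m": 2.31, "macs_g": 0.36},
    "dpdfnet2": {"blocks": 2, "params_m": 2.49, "macs_g": 1.35},
    "dpdfnet4": {"blocks": 4, "params_m": 2.84, "macs_g": 2.36},
    "dpdfnet8": {"blocks": 8, "params_m": 3.54, "macs_g": 4.37},
}


def cost_vs_baseline(name):
    """Return (MACs ratio vs. baseline, extra G MACs per DPRNN block)."""
    base = VARIANTS["baseline"]
    row = VARIANTS[name]
    ratio = row["macs_g"] / base["macs_g"]
    # The baseline has no DPRNN blocks, so its per-block cost is reported as 0.
    if row["blocks"] == 0:
        per_block = 0.0
    else:
        per_block = (row["macs_g"] - base["macs_g"]) / row["blocks"]
    return ratio, per_block


for name in VARIANTS:
    ratio, per_block = cost_vs_baseline(name)
    print(f"{name}: {ratio:.2f}x baseline MACs, {per_block:.3f} G MACs per DPRNN block")
```

On these figures each DPRNN block adds roughly 0.5 G MACs, so `dpdfnet8` costs about 12 times the baseline in MACs while adding only about 1.2 M parameters.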
 ---
+## Recommended inference (CPU-only, ONNX)
 ```bash
+pip install dpdfnet
 ```
+### CLI
+```bash
+# Enhance one file
+dpdfnet enhance noisy.wav enhanced.wav --model dpdfnet4
+# Enhance a directory
+dpdfnet enhance-dir ./noisy_wavs ./enhanced_wavs --model dpdfnet2
+# Download models
+dpdfnet download
+dpdfnet download dpdfnet8
+dpdfnet download dpdfnet4 --force
+```
+### Python API
+```python
+import soundfile as sf
+import dpdfnet
+# In-memory enhancement:
+audio, sr = sf.read("noisy.wav")
+enhanced = dpdfnet.enhance(audio, sample_rate=sr, model="dpdfnet4")
+sf.write("enhanced.wav", enhanced, sr)
+# Enhance one file:
+out_path = dpdfnet.enhance_file("noisy.wav", model="dpdfnet2")
+print(out_path)
+# Model listing:
+for row in dpdfnet.available_models():
+    print(row["name"], row["ready"], row["cached"])
+# Download models:
+dpdfnet.download()            # All models
+dpdfnet.download("dpdfnet4")  # Specific model
+```
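
The models are single-channel, while `soundfile.read` returns a `(frames, channels)` array for multi-channel files. Whether `dpdfnet.enhance` downmixes such input itself is not stated here, so the sketch below assumes it does not and averages the channels first. `to_mono` is a hypothetical helper, not part of the `dpdfnet` package.

```python
import numpy as np


def to_mono(audio):
    """Downmix (frames, channels) audio to a 1-D float32 mono signal.

    A 1-D input is assumed to be mono already and is returned as float32.
    """
    audio = np.asarray(audio, dtype=np.float32)
    if audio.ndim == 1:
        return audio
    # Average across the channel axis (axis 1 in soundfile's layout).
    return audio.mean(axis=1)


# Tiny synthetic stereo signal standing in for the array from sf.read(...).
stereo = np.array([[0.2, 0.4], [-0.6, 0.0]], dtype=np.float32)
mono = to_mono(stereo)
print(mono.shape)
```

The result of `to_mono(audio)` can then be passed in place of `audio` to the `dpdfnet.enhance(audio, sample_rate=sr, model=...)` call shown above.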
 ---
 ## Citation
 ```bibtex
 @article{rika2025dpdfnet,
   title  = {DPDFNet: Boosting DeepFilterNet2 via Dual-Path RNN},