mispeech/dasheng-denoiser

Heinrich Dinkel committed 2 days ago

Commit 7a87364 · 1 Parent(s): e23a1f0

updated README

Files changed (1)

README.md +57 -0

@@ -1,3 +1,60 @@
 ---
 license: apache-2.0
 ---

 ---
+library_name: transformers
+pipeline_tag: audio-to-audio
+tags:
+- signal-processing
 license: apache-2.0
 ---
+<div align="center">
+    <h1>
+    Dasheng Denoiser
+    </h1>
+    <p>
+    Official PyTorch inference code for the Interspeech 2025 paper: <br>
+    <b><em>Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders</em></b>
+    </p>
+    <a href="https://arxiv.org/abs/2506.11514"><img src="https://img.shields.io/badge/arxiv-2506.11514-red" alt="arxiv"></a>
+    <a href="https://www.python.org"><img src="https://img.shields.io/badge/Python-3.10+-orange" alt="python"></a>
+    <a href="https://pytorch.org"><img src="https://img.shields.io/badge/PyTorch-2.0+-brightgreen" alt="pytorch"></a>
+    <a href="https://www.apache.org/licenses/LICENSE-2.0"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="apache-2.0"></a>
+    <a href="https://github.com/xiaomi-research/dasheng-denoiser"><img src="https://img.shields.io/github/stars/xiaomi-research/dasheng-denoiser?style=social" alt="stars"></a>
+</div>
+# Installation and Usage
+```bash
+uv pip install transformers torch torchaudio einops
+```
+```python
+import torch
+import torchaudio
+from transformers import AutoModel
+model = AutoModel.from_pretrained("mispeech/dasheng-denoiser", trust_remote_code=True)
+model.eval()
+# Load the audio file (the model only supports 16 kHz input)
+audio, sr = torchaudio.load("path/to/audio.wav")
+with torch.no_grad(), torch.autocast(device_type='cuda'):
+    enhanced = model(audio)
+torchaudio.save("enhanced_audio.wav", enhanced, sr)
+```
+# Acknowledgements
+This implementation draws on [Dasheng](https://github.com/XiaoMi/Dasheng) and [Vocos](https://github.com/gemelo-ai/vocos).
+# Citation
+```bibtex
+@inproceedings{xingwei2025dashengdenoiser,
+  title={Efficient Speech Enhancement via Embeddings from Pre-trained Generative Audioencoders},
+  author={Xingwei Sun and Heinrich Dinkel and Yadong Niu and Linzhang Wang and Junbo Zhang and Jian Luan},
+  booktitle={Interspeech 2025},
+  year={2025}
+}
+```
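
The usage snippet in the README notes that only 16 kHz input is supported, but it passes whatever `torchaudio.load` returns straight to the model. A minimal sketch of a pre-flight check follows, assuming the input is an uncompressed PCM WAV file. It uses only Python's standard `wave` module, and the names `TARGET_SAMPLE_RATE`, `wav_sample_rate`, and `needs_resampling` are hypothetical helpers for illustration, not part of the model's API.

```python
import wave

# Sample rate the README states the model supports (assumed constant name).
TARGET_SAMPLE_RATE = 16_000


def wav_sample_rate(path: str) -> int:
    """Return the sample rate stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def needs_resampling(path: str, target: int = TARGET_SAMPLE_RATE) -> bool:
    """Return True when the file's sample rate differs from the target."""
    return wav_sample_rate(path) != target
```

If `needs_resampling("path/to/audio.wav")` returns `True`, one option is to resample the loaded tensor with `torchaudio.functional.resample(audio, sr, 16000)` before calling the model and to save the output at 16 kHz.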