sarulab-speech/UTDUSS-Vocoder


Wataru committed on Mar 20, 2024

Commit 3768f68 · verified · 1 Parent(s): c7a5c60

Update README.md

Files changed (1)

README.md +45 -0


@@ -1,3 +1,48 @@
 ---
 license: cc-by-nc-4.0
 ---

+# UTDUSS vocoder model
+This repo provides the model weights of the [Descript Audio Codec](https://arxiv.org/abs/2306.06546) used for the [Interspeech2024 Speech Processing Using Discrete Speech Unit Challenge](https://www.wavlab.org/activities/2024/Interspeech2024-Discrete-Speech-Unit-Challenge/).
+# Prerequisites
+The [official DAC library](https://github.com/descriptinc/descript-audio-codec) is required. It can be installed with the following command:
+```bash
+pip install descript-audio-codec
+```
+# Provided weights
+## Vocoder task
+| model name on paper | model name on this repo |
+|---|---|
+|😀 | expresso_16k_2code.pth|
+|😀 w/o hyper-parameter tuning| expresso_16k_2code_official.pth|
+|😀 w/o data exclusion| expresso_16k_2code_wo_data.pth|
+|😀 w/o matching sampling rate| expresso_24k_2code_ab.pth|
+## Acoustic + Vocoder (TTS) task
+Please note that the weights for the acoustic model are not provided.
+| model name on paper | model name on this repo |
+|---|---|
+|😀 | expresso_16k_2code.pth|
+|😀 w/o hyper-parameter tuning| expresso_16k_2code_official.pth|
+|😀 w/o data exclusion| expresso_16k_2code_wo_data.pth|
+|😀 w/o matching sampling rate| expresso_24k_2code_ab.pth|
+# Sample code
+```python
+import dac
+import torch
+from pathlib import Path
+# Download the checkpoint into a local cache directory.
+model_url = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder-Expresso/resolve/main/expresso_16k_2code.pth"
+model_path = Path(f"/tmp/utduss/{model_url.split('/')[-1]}")
+model_path.parent.mkdir(parents=True, exist_ok=True)
+torch.hub.download_url_to_file(model_url, model_path)
+# Load the weights with the official DAC library.
+model = dac.DAC.load(model_path)
+```
+# Contributors
+* [中田 亘](https://wataru-nakata.github.io/)
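
The sample code in the diff above stops after loading a single checkpoint. As a minimal sketch of what could follow, the snippet below picks one of the four checkpoints listed in the tables and runs an encode/decode round trip with the `preprocess`, `encode`, and `decode` methods of the official DAC library. The variant keys (`default`, `wo_data_exclusion`, and so on), the helper names `checkpoint_url` and `resynthesize`, and the `cache_dir` argument are hypothetical and not part of the README. The base URL is copied from the sample code, and the snippet assumes these checkpoints behave like standard DAC models.

```python
from pathlib import Path

# File names copied from the tables in the README; the dictionary keys are
# hypothetical labels introduced only for this sketch.
CHECKPOINTS = {
    "default": "expresso_16k_2code.pth",
    "wo_hyperparameter_tuning": "expresso_16k_2code_official.pth",
    "wo_data_exclusion": "expresso_16k_2code_wo_data.pth",
    "wo_matching_sampling_rate": "expresso_24k_2code_ab.pth",
}

# Base URL taken from the README sample code.
BASE_URL = "https://huggingface.co/sarulab-speech/UTDUSS-Vocoder-Expresso/resolve/main"


def checkpoint_url(variant: str) -> str:
    """Build the download URL for one of the checkpoints listed above."""
    return f"{BASE_URL}/{CHECKPOINTS[variant]}"


def resynthesize(variant, wav, sample_rate, cache_dir="/tmp/utduss"):
    """Encode `wav` to discrete codes and decode it back to a waveform.

    `wav` is assumed to be a float tensor of shape (batch, 1, samples) whose
    sampling rate already matches the checkpoint.
    """
    # Heavy dependencies are imported lazily so the helpers above can be
    # used without torch or dac installed.
    import dac
    import torch

    path = Path(cache_dir) / CHECKPOINTS[variant]
    path.parent.mkdir(parents=True, exist_ok=True)
    if not path.exists():
        torch.hub.download_url_to_file(checkpoint_url(variant), str(path))

    model = dac.DAC.load(str(path))
    model.eval()
    with torch.no_grad():
        # preprocess() pads the input to a multiple of the model hop length.
        x = model.preprocess(wav, sample_rate)
        z, codes, _latents, _, _ = model.encode(x)
        audio = model.decode(z)
    return audio, codes
```

Calling `resynthesize("default", wav, 16000)` would return the reconstructed waveform together with the discrete codes, assuming the file name `expresso_16k_2code.pth` indicates a 16 kHz model.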