---
license: apache-2.0
datasets:
- amphion/Emilia-Dataset
language:
- de
- en
base_model:
- neuphonic/neucodec
tags:
- audio
- speech
---

## NeuCodec decoder fine-tuned for German speech

This repository contains only the decoder of [neuphonic/neucodec](https://huggingface.co/neuphonic/neucodec), fine-tuned on equal amounts of German and English speech data from Emilia-Yodas to improve the decoding quality of German speech.

Because only the decoder was fine-tuned, the codebook is identical to that of the base model, so this decoder works with codes produced by the regular NeuCodec encoder.

We supply a compact class, `NeuCodecDecoder` (in `NeuCodecDecoder.py`), for running inference with this decoder, since the NeuCodec codebase does not readily support loading model files from other Hugging Face repositories.

### Inference Example

```python
import torch
import torchaudio

from NeuCodecDecoder import NeuCodecDecoder

decoder_model = NeuCodecDecoder.from_pretrained("DigitalLearningGmbH/neucodec-decoder-ft-de")
decoder_model = decoder_model.eval().cuda()

# `tokens` must be defined beforehand: a flat sequence of NeuCodec code
# indices, e.g. as produced by the regular NeuCodec encoder.
with torch.no_grad():
    # Add batch and channel dimensions: (T,) -> (1, 1, T).
    codes = torch.tensor(tokens).unsqueeze(0).unsqueeze(0).to("cuda")
    decoded = decoder_model.decode_code(codes).cpu()

# Save the first batch item as 24 kHz audio.
torchaudio.save("decoded.wav", decoded[0, :, :], 24_000)
```
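Because the codebook is shared with the base model, the `tokens` used above can come from the regular NeuCodec encoder. The following is a minimal round-trip sketch under stated assumptions, not a definitive implementation. It assumes that the `neucodec` package exposes `NeuCodec.from_pretrained` and `encode_code` as described in the original model card, that the encoder expects 16 kHz mono input, and that codes are emitted at about 50 per second. The helper names `expected_num_samples` and `roundtrip` are hypothetical and not part of either repository.

```python
def expected_num_samples(num_codes, codes_per_second=50, sample_rate=24_000):
    """Approximate output length in samples for a given number of codes.

    Assumes 50 codes per second (taken from the base model card) and the
    24 kHz output rate used in the inference example.
    """
    return num_codes * sample_rate // codes_per_second


def roundtrip(path_in, path_out="roundtrip.wav", device="cuda"):
    """Encode a file with the base NeuCodec encoder, decode with this decoder."""
    # Heavy imports are kept local so the helper above works without them.
    import torch
    import torchaudio
    from neucodec import NeuCodec  # assumed API, see the original model card
    from NeuCodecDecoder import NeuCodecDecoder

    encoder = NeuCodec.from_pretrained("neuphonic/neucodec").eval().to(device)
    decoder = NeuCodecDecoder.from_pretrained(
        "DigitalLearningGmbH/neucodec-decoder-ft-de"
    ).eval().to(device)

    wav, sr = torchaudio.load(path_in)
    wav = wav.mean(dim=0, keepdim=True)  # downmix to mono: (1, T)
    if sr != 16_000:
        # Assumption: the encoder expects 16 kHz input.
        wav = torchaudio.functional.resample(wav, sr, 16_000)

    with torch.no_grad():
        codes = encoder.encode_code(wav.unsqueeze(0))  # input shaped (1, 1, T)
        audio = decoder.decode_code(codes.to(device)).cpu()

    # Drop the batch dimension and write 24 kHz audio.
    torchaudio.save(path_out, audio[0], 24_000)
    return codes.shape[-1], audio.shape[-1]
```

Calling `roundtrip("sample_de.wav")` (a hypothetical file name) writes the reconstruction and returns the number of codes and output samples. The sample count can be compared against `expected_num_samples` as a rough sanity check.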

For more information, please refer to [the original model card](https://huggingface.co/neuphonic/neucodec).