---
license: apache-2.0
datasets:
- amphion/Emilia-Dataset
language:
- de
- en
base_model:
- neuphonic/neucodec
tags:
- audio
- speech
---
## NeuCodec decoder fine-tuned for German speech
This repository contains only the decoder of [neuphonic/neucodec](https://huggingface.co/neuphonic/neucodec), fine-tuned on equal amounts of German and English speech data from Emilia-Yodas to improve the decoding quality of German speech.
Since we only fine-tuned the decoder, the codebook is identical to that of the base model, so this model can be used with the regular NeuCodec encoder.
We supply a compact class, `NeuCodecDecoder` (defined in `NeuCodecDecoder.py`), to run inference with this decoder, since the NeuCodec codebase doesn't easily allow loading model files from other HuggingFace repos.
### Inference Example
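```python
import torch
import torchaudio
from NeuCodecDecoder import NeuCodecDecoder

# Load the fine-tuned decoder and move it to the GPU in inference mode
decoder_model = NeuCodecDecoder.from_pretrained("DigitalLearningGmbH/neucodec-decoder-ft-de")
decoder_model = decoder_model.eval().cuda()

# `tokens` is a flat sequence of NeuCodec code indices, e.g. produced by the regular NeuCodec encoder
with torch.no_grad():
    # Add two leading singleton dimensions: (T,) -> (1, 1, T)
    codes = torch.tensor(tokens).unsqueeze(0).unsqueeze(0).to("cuda")
    decoded = decoder_model.decode_code(codes).cpu()

# Save the decoded waveform at 24 kHz
torchaudio.save("decoded.wav", decoded[0, :, :], 24_000)
```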
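The example above reshapes `tokens` inline and assumes a CUDA device. As a minimal sketch, the reshaping step could be wrapped in a small helper that also falls back to the CPU when no GPU is available. The name `prepare_codes` is hypothetical and not part of this repo; the sketch also assumes that `tokens` is a flat sequence of integer code indices and that the decoder accepts 64-bit integer codes, as the inline `torch.tensor(tokens)` call on a list of Python ints would produce.

```python
import torch


def prepare_codes(tokens, device=None):
    """Turn a flat sequence of code indices into a (1, 1, T) integer tensor.

    Hypothetical helper for illustration; not part of NeuCodecDecoder.py.
    """
    if device is None:
        # Fall back to the CPU when CUDA is not available
        device = "cuda" if torch.cuda.is_available() else "cpu"
    codes = torch.as_tensor(tokens, dtype=torch.long)
    if codes.dim() != 1:
        raise ValueError(
            f"expected a flat sequence of code indices, got shape {tuple(codes.shape)}"
        )
    # Same reshaping as in the inference example: (T,) -> (1, 1, T)
    return codes.unsqueeze(0).unsqueeze(0).to(device)


# Illustrative code indices only, not real encoder output
codes = prepare_codes([101, 2048, 7, 65535], device="cpu")
print(codes.shape)
```

The result could then be passed to `decoder_model.decode_code(...)` in place of the inline expression, provided the model has been moved to the same device.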
For more information, please refer to [the original model card](https://huggingface.co/neuphonic/neucodec).