DigitalLearningGmbH/educa-ai-voice-preview

LenDigLearn committed on Dec 19, 2025

Commit ddccf91 · verified · 1 Parent(s): b2f8b87

Update README.md

Files changed (1)

README.md +3 -2

@@ -99,6 +99,8 @@ with torch.no_grad():
 torchaudio.save("tts.wav", recon[0, :, :], 24_000)
 ```
 ### What's to come
 As stated in the model's name, this is a preview model, mainly meant to showcase the capability of the base model.
@@ -107,8 +109,7 @@ We trained on a small dataset of a single speaker without any special emotion ta
 We are actively working on
  - multiple speakers with emotional control and nonverbal elements (fillers, laughing, ...)
  - fine-tuning for general zero-shot voice cloning
  - post-training with reinforcement learning
-Also, we have a fine-tuned version of NeuCodec which we used to generate the speech examples above, which we also plan on realeasing.
 Stay tuned - january 2026 is going to be exciting!

 torchaudio.save("tts.wav", recon[0, :, :], 24_000)
 ```
+For even higher fidelity in German speech, use our [finetuned NeuCodec decoder](https://huggingface.co/DigitalLearningGmbH/neucodec-decoder-ft-de).
 ### What's to come
 As stated in the model's name, this is a preview model, mainly meant to showcase the capability of the base model.
 We are actively working on
  - multiple speakers with emotional control and nonverbal elements (fillers, laughing, ...)
  - fine-tuning for general zero-shot voice cloning
+ - phoneme-based / hybrid generation
  - post-training with reinforcement learning
 Stay tuned - january 2026 is going to be exciting!