LenDigLearn committed on
Commit ddccf91 · verified · 1 Parent(s): b2f8b87

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -99,6 +99,8 @@ with torch.no_grad():
 torchaudio.save("tts.wav", recon[0, :, :], 24_000)
 ```
 
+For even higher fidelity in German speech, use our [finetuned NeuCodec decoder](https://huggingface.co/DigitalLearningGmbH/neucodec-decoder-ft-de).
+
 ### What's to come
 
 As stated in the model's name, this is a preview model, mainly meant to showcase the capability of the base model.
@@ -107,8 +109,7 @@ We trained on a small dataset of a single speaker without any special emotion ta
 We are actively working on
 - multiple speakers with emotional control and nonverbal elements (fillers, laughing, ...)
 - fine-tuning for general zero-shot voice cloning
+- phoneme-based / hybrid generation
 - post-training with reinforcement learning
 
-Also, we have a fine-tuned version of NeuCodec which we used to generate the speech examples above, which we also plan on realeasing.
-
 Stay tuned - january 2026 is going to be exciting!