Training code
Thank you so much for open sourcing VibeVoice - have been waiting so long for a long-form + expressive TTS model :)
Is there any chance the training or finetuning code could be released? Would love to try finetuning on some data.
Thanks for your interest.
We are now preparing the training and finetuning code.
We hope to release it in 2-3 weeks.
Thank you! Another question - is the 7B model also MIT?
dot
I have downloaded VibeVoice Large pt and put everything in a folder named "VibeVoice-Large-pt" under the path ComfyUI-V51-0355\ComfyUI\models\tts\VibeVoice\, but it complains that the large model cannot be found. I have been racking my brains over it and just can't find a solution to this problem. Can anybody help me out? Thanks.
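One way to narrow this down is to check what is actually on disk. Below is a minimal sketch, not an official diagnostic. I don't know which files the ComfyUI node looks for, so `config.json` and `.safetensors`/`.bin` weights are assumptions based on the usual Hugging Face folder layout, and `inspect_model_dir` is a hypothetical helper name. It also lists subfolders, since a download can end up nested one level deeper than expected.

```python
from pathlib import Path


def inspect_model_dir(model_dir):
    """Report what is actually present in a local model folder."""
    root = Path(model_dir)
    report = {
        "exists": root.is_dir(),
        "has_config": False,   # assumption: the loader wants a config.json
        "weight_files": [],    # assumption: weights are .safetensors or .bin
        "nested_dirs": [],     # subfolders that may hold the real files
    }
    if not report["exists"]:
        return report
    report["has_config"] = (root / "config.json").is_file()
    report["weight_files"] = sorted(
        p.name
        for p in root.iterdir()
        if p.is_file() and p.suffix in {".safetensors", ".bin"}
    )
    report["nested_dirs"] = sorted(p.name for p in root.iterdir() if p.is_dir())
    return report


if __name__ == "__main__":
    # Path taken from the post above; adjust the prefix to your install location.
    folder = (
        Path("ComfyUI-V51-0355") / "ComfyUI" / "models" / "tts"
        / "VibeVoice" / "VibeVoice-Large-pt"
    )
    print(inspect_model_dir(folder))
```

If `exists` comes back `False`, the folder name or location is probably the issue (for example, stray quote characters in the name). If it exists but `weight_files` is empty and `nested_dirs` is not, the files may be sitting one folder too deep.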
The Argentinian accent does not work. How can we train a LoRA?
Working on unofficial training code for VibeVoice... will publish in a bit.
Will be available here: https://github.com/vibevoice-community/VibeVoice/