Instructions to use microsoft/VibeVoice-1.5B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use microsoft/VibeVoice-1.5B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-to-speech", model="microsoft/VibeVoice-1.5B")
```

```python
# Load model directly
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("microsoft/VibeVoice-1.5B", dtype="auto")
```
- Notebooks
- Google Colab
- Kaggle
gguf / Quants (#3), opened by PsiPi
Someone had to say it
I get it's tiny.
I'd guess some testing would be needed to see how much quality is retained at the Q4 or Q5 level.
It would be particularly interesting to see whether quantization affects the "vibe" of the voice, and whether the compression significantly impacts the frame rate, which seems to be a key feature of how this model excels.
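As a rough illustration of what Q4-level quantization does to weight precision, here is a minimal sketch in plain NumPy. It uses simple block-wise absmax quantization to 4-bit signed levels, loosely modeled on GGUF-style Q4 schemes but not the actual ggml format, and measures the reconstruction error on random weights. Real quality testing would of course compare generated audio, not weight RMSE.

```python
import numpy as np

def quantize_q4_blockwise(x, block_size=32):
    """Simplified block-wise 4-bit absmax quantization.

    Loosely modeled on GGUF-style Q4 schemes (not the exact ggml
    bit layout): each block stores one float scale plus 4-bit ints.
    """
    x = x.reshape(-1, block_size)
    # 4-bit signed range used here: -7..7, so scale maps the block max to 7
    scale = np.abs(x).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0  # avoid division by zero for all-zero blocks
    q = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate float weights from ints and per-block scales
    return (q.astype(np.float32) * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)  # stand-in for model weights
q, s = quantize_q4_blockwise(w)
w_hat = dequantize(q, s)
rmse = float(np.sqrt(np.mean((w - w_hat) ** 2)))
print(f"RMSE after simulated Q4: {rmse:.4f}")
```

The point of the sketch is the trade-off the thread is discussing: a coarser grid (Q4 vs Q5 vs Q8) means larger per-block quantization steps and therefore larger reconstruction error, which may or may not be audible in the generated speech.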
Yeah, the thought was a Q8 to start, and with the advent of the incoming 0.5 (and perhaps a 7?), quants may be less (or more) important.
In some cases, simply being in the correct format "wrapper" also lets people use their preferred tooling.