Music played at the start of the 0.5B model
#9
by acoloss - opened
Spent a long time setting this up to test, but it plays intro music like from a news channel.
Any way to prevent this?
Currently, the model may occasionally generate background music based on the text content, for example when the text opens with a phrase like "Welcome to". You can reduce the likelihood by adjusting the text or switching the speaker. We also plan to tune the model and provide a version that does not trigger music generation in the future.
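As a rough illustration of the "adjust the text" suggestion, here is a minimal sketch that flags scripts opening with a suspect phrase before you send them to the model. It is plain Python and does not call the model. The helper name `find_suspect_opener` and the `SUSPECT_OPENERS` tuple are hypothetical; only "Welcome to" comes from the reply above, and whether rewording actually removes the music in a given case is something you would need to check by listening.

```python
from typing import Optional

# Only "welcome to" is taken from the reply above. Extend this tuple with
# other openers that trigger music in your own tests (assumption: the list
# is yours to maintain, it is not published by the model authors).
SUSPECT_OPENERS = ("welcome to",)


def find_suspect_opener(text: str) -> Optional[str]:
    """Return the suspect phrase the text opens with, or None if there is none."""
    # Ignore leading whitespace and opening quote marks, compare case-insensitively.
    head = text.lstrip().lstrip("\"'“‘").lower()
    for phrase in SUSPECT_OPENERS:
        if head.startswith(phrase):
            return phrase
    return None


script = "Welcome to the evening news."
hit = find_suspect_opener(script)
if hit is not None:
    print(f"Script opens with {hit!r}; consider rewording it or trying another speaker.")
```

The check only looks at the start of the script, since the reply mentions opening phrases specifically.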
What kind of music did you hear? 🤣