Update README.md
README.md
CHANGED
@@ -26,7 +26,7 @@ The model can synthesize speech up to **90 minutes** long with up to **4 distinc
 <img src="figures/Fig1.png" alt="VibeVoice Overview" height="250px">
 </p>
 
-## Training
+## Training Details
 Transformer-based Large Language Model (LLM) integrated with specialized acoustic and semantic tokenizers and a diffusion-based decoding head.
 - LLM: [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B) for this release.
 - Tokenizers: