microsoft/VibeVoice-1.5B

text-generation


unilm committed on Aug 28, 2025

Commit cf42b8f · verified · 1 parent: 217a7fe

Update README.md

Files changed (1)

README.md +1 -1


@@ -26,7 +26,7 @@ The model can synthesize speech up to **90 minutes** long with up to **4 distinc
   <img src="figures/Fig1.png" alt="VibeVoice Overview" height="250px">
 </p>
-## Training details
+## Training Details
 Transformer-based Large Language Model (LLM) integrated with specialized acoustic and semantic tokenizers and a diffusion-based decoding head.
 - LLM: [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B) for this release.
 - Tokenizers: