Text-to-Speech
Transformers
Safetensors
English
Chinese
speech-recognition
tts
asr
voice-cloning
long-form
multi-speaker
streaming
mirror
Instructions to use AEmotionStudio/vibevoice-models with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use AEmotionStudio/vibevoice-models with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-to-speech", model="AEmotionStudio/vibevoice-models")

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("AEmotionStudio/vibevoice-models", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
{
  "processor_class": "VibeVoiceStreamingProcessor",
  "speech_tok_compress_ratio": 3200,
  "db_normalize": true,
  "audio_processor": {
    "feature_extractor_type": "VibeVoiceTokenizerProcessor",
    "sampling_rate": 24000,
    "normalize_audio": true,
    "target_dB_FS": -25,
    "eps": 1e-06
  },
  "language_model_pretrained_name": "Qwen/Qwen2.5-0.5B"
}
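
The numeric fields in this config can be read off with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that the config itself does not confirm: that `speech_tok_compress_ratio` is the number of audio samples represented by one speech token, and that `target_dB_FS` with `eps` describes an RMS-based loudness normalization. The helper names `frames_per_second` and `normalize_to_db_fs` are hypothetical and are not part of the VibeVoice code.

```python
import json
import math

# The config shown above, embedded so the example is self-contained.
CONFIG_JSON = """
{
  "processor_class": "VibeVoiceStreamingProcessor",
  "speech_tok_compress_ratio": 3200,
  "db_normalize": true,
  "audio_processor": {
    "feature_extractor_type": "VibeVoiceTokenizerProcessor",
    "sampling_rate": 24000,
    "normalize_audio": true,
    "target_dB_FS": -25,
    "eps": 1e-06
  },
  "language_model_pretrained_name": "Qwen/Qwen2.5-0.5B"
}
"""

config = json.loads(CONFIG_JSON)
audio_cfg = config["audio_processor"]


def frames_per_second(cfg):
    """Speech tokens per second, ASSUMING the compress ratio counts samples per token."""
    return cfg["audio_processor"]["sampling_rate"] / cfg["speech_tok_compress_ratio"]


def rms(samples):
    """Root-mean-square amplitude of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_to_db_fs(samples, target_db_fs, eps):
    """Scale samples so their RMS level is approximately target_db_fs (dB full scale).

    ASSUMPTION: a generic RMS normalization, not necessarily the exact
    formula used by VibeVoiceTokenizerProcessor. eps guards against
    division by zero on silent input.
    """
    scalar = 10 ** (target_db_fs / 20) / (rms(samples) + eps)
    return [s * scalar for s in samples]


# One second of a 440 Hz sine at the configured sampling rate.
sr = audio_cfg["sampling_rate"]
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]

normalized = normalize_to_db_fs(tone, audio_cfg["target_dB_FS"], audio_cfg["eps"])
level_db = 20 * math.log10(rms(normalized))

print(f"speech tokens per second: {frames_per_second(config)}")
print(f"normalized level: {level_db:.2f} dBFS")
```

Under the first assumption, 24000 samples per second divided by 3200 samples per token gives 7.5 speech tokens per second of audio, which fits the repository's `long-form` tag: a low token rate keeps sequences short for long recordings.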