Text-to-Speech
Transformers
Safetensors
English
Chinese
speech-recognition
tts
asr
voice-cloning
long-form
multi-speaker
streaming
mirror
Instructions to use AEmotionStudio/vibevoice-models with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use AEmotionStudio/vibevoice-models with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-to-speech", model="AEmotionStudio/vibevoice-models")

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("AEmotionStudio/vibevoice-models", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
add: tts-large-fp8/preprocessor_config.json (FP8 LM-backbone shard, pre-quantized from aoi-ot/VibeVoice-Large)
tts-large-fp8/preprocessor_config.json
ADDED
@@ -0,0 +1,12 @@
+{
+  "processor_class": "VibeVoiceProcessor",
+  "speech_tok_compress_ratio": 3200,
+  "db_normalize": true,
+  "audio_processor": {
+    "feature_extractor_type": "VibeVoiceTokenizerProcessor",
+    "sampling_rate": 24000,
+    "normalize_audio": true,
+    "target_dB_FS": -25,
+    "eps": 1e-06
+  }
+}
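As a rough illustration of what the added values imply, the sketch below parses the same JSON and derives two quantities from it. It rests on assumptions that the diff itself does not confirm: that `speech_tok_compress_ratio` is the number of audio samples per speech token, and that `target_dB_FS` and `eps` feed an RMS-based gain. The helper names `frames_per_second` and `normalize_to_db_fs` are hypothetical and are not part of any VibeVoice API.

```python
import json
import math

# The configuration exactly as added in this commit.
CONFIG_TEXT = """
{
  "processor_class": "VibeVoiceProcessor",
  "speech_tok_compress_ratio": 3200,
  "db_normalize": true,
  "audio_processor": {
    "feature_extractor_type": "VibeVoiceTokenizerProcessor",
    "sampling_rate": 24000,
    "normalize_audio": true,
    "target_dB_FS": -25,
    "eps": 1e-06
  }
}
"""

config = json.loads(CONFIG_TEXT)


def frames_per_second(cfg):
    """Speech tokens per second of audio.

    ASSUMPTION: the compress ratio counts audio samples per speech token.
    """
    sampling_rate = cfg["audio_processor"]["sampling_rate"]
    return sampling_rate / cfg["speech_tok_compress_ratio"]


def normalize_to_db_fs(samples, target_db_fs, eps):
    """Scale samples so their RMS level is close to target_db_fs.

    ASSUMPTION: an RMS-based gain; the real processor may differ.
    eps keeps the division finite for silent input.
    """
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    gain = 10 ** (target_db_fs / 20) / (rms + eps)
    return [s * gain for s in samples]


# One second of a 440 Hz tone at the configured sampling rate.
audio_cfg = config["audio_processor"]
rate = audio_cfg["sampling_rate"]
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]

normalized = normalize_to_db_fs(tone, audio_cfg["target_dB_FS"], audio_cfg["eps"])
normalized_rms = math.sqrt(sum(s * s for s in normalized) / len(normalized))

print(frames_per_second(config))
print(20 * math.log10(normalized_rms))
```

Under these assumptions, 24000 samples per second divided by 3200 samples per token gives 7.5 speech tokens per second, and the normalized tone lands very close to the -25 dBFS target.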