text-to-speech is not a valid pipeline
When I run the "Hosted inference API" widget, I get the error "text-to-speech is not a valid pipeline".
Indeed, there currently is no pipeline for the TTS task in Transformers.
Bump
Update: a TTS pipeline will be added once at least two different TTS models are present in the Transformers library. This ensures the design is robust enough to handle different models.
What is required to accomplish this?
Bump, problem still persists.
A text-to-audio pipeline is now available: https://github.com/huggingface/transformers/pull/24952, supporting SpeechT5 and Bark. Usage is as follows:
from transformers import pipeline
# Build the pipeline once, then call the returned object (not the `pipeline` factory)
pipe = pipeline(model="suno/bark")
output = pipe("Hey it's HuggingFace on the phone!")
audio = output["audio"]
sampling_rate = output["sampling_rate"]
The next step is to create a corresponding inference widget for it, cc @mishig.
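For anyone who wants to listen to the result, here is a minimal sketch of writing the `audio` and `sampling_rate` values from the snippet above to a WAV file using only the standard library. The helper name `save_wav` is hypothetical, and the sketch assumes the audio is a flat sequence of mono float samples roughly in the range [-1.0, 1.0]; check the shape and range of your own output before relying on it.

```python
import struct
import wave


def save_wav(path, samples, sampling_rate):
    """Write mono float samples as a 16-bit PCM WAV file.

    Assumption: `samples` is a flat iterable of floats in [-1.0, 1.0].
    Values outside that range are clipped rather than wrapped.
    """
    frames = bytearray()
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))
        # Scale to the signed 16-bit range and pack as little-endian
        frames += struct.pack("<h", int(clipped * 32767))

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(int(sampling_rate))
        wav_file.writeframes(bytes(frames))
```

With the pipeline output from above, a call might look like `save_wav("speech.wav", audio.ravel(), sampling_rate)`, where `ravel()` flattens the array in case it carries an extra leading dimension.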
