Instructions to use openbmb/VoxCPM-0.5B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- VoxCPM
How to use openbmb/VoxCPM-0.5B with VoxCPM:
```python
import soundfile as sf
from voxcpm import VoxCPM

model = VoxCPM.from_pretrained("openbmb/VoxCPM-0.5B")

wav = model.generate(
    text="VoxCPM is an innovative end-to-end TTS model from ModelBest, designed to generate highly expressive speech.",
    prompt_wav_path=None,               # optional: path to a prompt speech for voice cloning
    prompt_text=None,                   # optional: reference text
    cfg_value=2.0,                      # LM guidance on LocDiT, higher for better adherence to the prompt, but maybe worse
    inference_timesteps=10,             # LocDiT inference timesteps, higher for better result, lower for fast speed
    normalize=True,                     # enable external TN tool
    denoise=True,                       # enable external Denoise tool
    retry_badcase=True,                 # enable retrying mode for some bad cases (unstoppable)
    retry_badcase_max_times=3,          # maximum retrying times
    retry_badcase_ratio_threshold=6.0,  # maximum length restriction for bad case detection (simple but effective), it could be adjusted for slow pace speech
)

sf.write("output.wav", wav, 16000)
print("saved: output.wav")
```
- Notebooks
- Google Colab
- Kaggle
modelscope.cn
#8
by anujchopra - opened
I downloaded the model using huggingface_hub, but the code still tries to connect to modelscope.cn.
Is it safe? Why is it trying to connect?
In addition to downloading our model weights, we also use ZipEnhancer to denoise the prompt audio. The code therefore connects to ModelScope to download that denoising model.
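Since the denoiser is applied to the prompt audio, one possible workaround is to enable the `denoise` flag of `generate` only when a prompt recording is actually supplied. The following is a minimal sketch under that assumption. `build_generate_kwargs` and `synthesize` are hypothetical helpers, not part of VoxCPM, and this thread does not confirm whether `denoise=False` alone prevents the denoiser from being loaded when the model is constructed.

```python
def build_generate_kwargs(text, prompt_wav_path=None, prompt_text=None):
    """Hypothetical helper: assemble keyword arguments for VoxCPM.generate."""
    has_prompt = prompt_wav_path is not None
    return {
        "text": text,
        "prompt_wav_path": prompt_wav_path,
        "prompt_text": prompt_text,
        # Assumption: denoising only matters for prompt audio, so skip it otherwise.
        "denoise": has_prompt,
    }


def synthesize(text, out_path="output.wav", **prompt):
    """Hypothetical wrapper: generate speech and write it to a 16 kHz WAV file."""
    # Imported here so the helper above can be used without these packages.
    import soundfile as sf
    from voxcpm import VoxCPM

    model = VoxCPM.from_pretrained("openbmb/VoxCPM-0.5B")
    wav = model.generate(**build_generate_kwargs(text, **prompt))
    sf.write(out_path, wav, 16000)
    return out_path
```

Calling `synthesize("Hello there.")` would then request generation with `denoise=False`, while passing `prompt_wav_path` and `prompt_text` would turn denoising back on.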
It makes a connection every time I run it. Shouldn't it download the model once and then save it to disk?
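One way to check whether the denoising model was in fact saved to disk is to look for it in the local ModelScope cache. The sketch below uses only the standard library. The default cache location (`~/.cache/modelscope`) and the `MODELSCOPE_CACHE` override are assumptions about ModelScope's defaults, the directory layout may differ between versions, and `find_cached_models` is a hypothetical helper. A cached copy does not by itself rule out a network check on each run, which this thread leaves unanswered.

```python
import os
from pathlib import Path


def modelscope_cache_root():
    """Return the cache root (assumed default: ~/.cache/modelscope)."""
    default = Path.home() / ".cache" / "modelscope"
    return Path(os.environ.get("MODELSCOPE_CACHE", default))


def find_cached_models(keyword, root=None):
    """List directories under the cache root whose name contains `keyword`."""
    root = Path(root) if root is not None else modelscope_cache_root()
    if not root.is_dir():
        return []
    keyword = keyword.lower()
    return sorted(
        p for p in root.rglob("*") if p.is_dir() and keyword in p.name.lower()
    )
```

For example, `find_cached_models("zipenhancer")` returns the matching directories, or an empty list if nothing has been cached under the assumed root.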
The TTS quality is good, but I feel the model narrates very fast. There should be a parameter to control the speaking speed.
I currently use xtts.