kyutai/pocket-tts
Pocket TTS optimized for Hugging Face Spaces on CPU
Fast, multi-speaker TTS (44.1 kHz) with voice cloning
Qwen Image Editing Fusion Collection LoRA Demo
Evaluate speech accuracy with text and audio input
Create custom voices using sliders for dramatic changes
Actually, Kokoro can do anything Microsoft Edge TTS does. What about adding pitch support too? I don't think it's something to embed in the model, though. I guess we'd have to do it as a post-processing step, right?
Holy moly, my goodness, this model is amazing. Thank you for writing this blog and hosting a demo. This is literally the best TTS I've ever seen, 10 times better than any other model I've seen.