---
license: apache-2.0
language:
- en
pipeline_tag: text-to-speech
---

## LuxTTS

This is the model for LuxTTS, a lightweight ZipVoice-based text-to-speech model designed for high-quality voice cloning and realistic generation at speeds exceeding 150x realtime.

### Main features

- Voice cloning: SOTA voice cloning on par with models 10x larger.
- Clarity: Clear 48 kHz speech generation, unlike most TTS models, which are limited to 24 kHz.
- Speed: Reaches 150x realtime on a single GPU and runs faster than realtime on CPUs as well.
- Efficiency: Fits within 1 GB of VRAM, meaning it can fit on any local GPU.

### Details

- Based on ZipVoice, distilled to 4 steps.
- Uses a 48 kHz vocoder instead of a 24 kHz vocoder.
- Implements a higher-quality sampling technique than standard Euler.

### Usage

Please check out the repo for usage: https://github.com/ysharma3501/LuxTTS.git

### License

Model and code are released under the Apache-2.0 license.

If you find the model/code helpful, stars or likes would be appreciated. Thank you.
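
### Interpreting the speed figures

The "150x realtime" and 48 kHz figures above can be turned into concrete numbers with a little arithmetic. The sketch below is not part of the LuxTTS code; the helper names `realtime_factor`, `synthesis_time`, and `num_samples` are hypothetical, and the inputs are illustrative values rather than measurements. It assumes "Nx realtime" means N seconds of audio generated per second of wall-clock compute.

```python
def realtime_factor(audio_seconds: float, synthesis_seconds: float) -> float:
    """Seconds of audio produced per second of wall-clock compute."""
    if synthesis_seconds <= 0:
        raise ValueError("synthesis_seconds must be positive")
    return audio_seconds / synthesis_seconds


def synthesis_time(audio_seconds: float, factor: float) -> float:
    """Wall-clock seconds needed to generate audio at a given realtime factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return audio_seconds / factor


def num_samples(audio_seconds: float, sample_rate_hz: int) -> int:
    """Number of audio samples in a clip at the given sample rate."""
    return int(round(audio_seconds * sample_rate_hz))


if __name__ == "__main__":
    # Illustrative inputs only (assumed, not benchmark results).
    clip_seconds = 60.0
    print(synthesis_time(clip_seconds, 150.0))  # compute time at 150x realtime
    print(num_samples(clip_seconds, 48_000))    # samples at 48 kHz
    print(num_samples(clip_seconds, 24_000))    # samples at 24 kHz
```

Under these assumptions, one minute of speech at 150x realtime takes about 0.4 seconds to generate, and a 48 kHz clip holds twice as many samples as the same clip at 24 kHz. To measure the factor on your own hardware, time a generation call and pass the clip duration and elapsed time to `realtime_factor`.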