| Name | Size | Xet hash |
|---|---|---|
| vocoder | 2 items | |
| .gitattributes | 1.52 kB | 818ba6de |
| README.md | 1.46 kB | 2e16ba82 |
| config.json | 697 Bytes | 54f8d8ef |
| fm_decoder.onnx | 478 MB | 6be1f170 |
| fm_decoder_int8.onnx | 125 MB | a815b50f |
| model.pt | 491 MB | 879c4032 |
| text_encoder.onnx | 17.6 MB | 2747560b |
| text_encoder_int8.onnx | 5.57 MB | 3faae6fd |
| tokens.txt | 2.57 kB | 80444e4c |
LuxTTS
This is the model for LuxTTS, a lightweight ZipVoice-based text-to-speech model designed for high-quality voice cloning and realistic speech generation at speeds exceeding 150x real time.
Main features
- Voice cloning: State-of-the-art voice cloning on par with models 10x larger.
- Clarity: Clear 48 kHz speech generation, unlike most TTS models, which are limited to 24 kHz.
- Speed: Reaches 150x real time on a single GPU and runs faster than real time on CPUs as well.
- Efficiency: Fits within 1 GB of VRAM, so it can run on most local GPUs.
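To make the "150x real time" figure concrete, here is a minimal sketch of how a real-time speedup could be computed from a generated clip. The function names and the timing numbers are hypothetical and are not part of the LuxTTS code; only the 48 kHz sample rate comes from the feature list above.

```python
SAMPLE_RATE_HZ = 48_000  # output sample rate stated in the feature list


def audio_seconds(num_samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Duration of a clip, in seconds, given its sample count."""
    return num_samples / sample_rate_hz


def realtime_speedup(num_samples: int, wall_seconds: float,
                     sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Seconds of audio produced per second of wall-clock time.

    A value above 1.0 means generation is faster than real time.
    """
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds(num_samples, sample_rate_hz) / wall_seconds


# Hypothetical measurement: 10 s of 48 kHz audio generated in 0.05 s.
speedup = realtime_speedup(num_samples=480_000, wall_seconds=0.05)
print(f"speedup: {speedup:.0f}x real time")
```

When measuring on a GPU, the wall-clock time should include any synchronization needed to ensure generation has actually finished, otherwise the speedup will be overstated.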
Details
- Based on ZipVoice, distilled to 4 steps.
- Uses a 48 kHz vocoder instead of a 24 kHz vocoder.
- Implements a higher-quality sampling technique than standard Euler.
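For context on the sampling bullet, the sketch below shows what a standard 4-step Euler sampler looks like for a flow-matching style ODE. It illustrates only the baseline the list compares against, not the LuxTTS sampler, whose details are not given here. The scalar state and the toy velocity field are assumptions chosen so the example runs without a model.

```python
from typing import Callable


def euler_sample(velocity: Callable[[float, float], float],
                 x0: float, num_steps: int = 4) -> float:
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = x0
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt                   # evaluate at the start of each interval
        x = x + dt * velocity(x, t)  # first-order Euler update
    return x


# Toy velocity field (an assumption for illustration): a straight-line path
# toward a fixed target, v(x, t) = (target - x) / (1 - t). A real model would
# replace this with a learned network conditioned on text and a voice prompt.
TARGET = 2.0


def toy_velocity(x: float, t: float) -> float:
    return (TARGET - x) / (1.0 - t)


sample = euler_sample(toy_velocity, x0=0.0, num_steps=4)
print(sample)
```

Because the toy path is a straight line, Euler lands on the target after four steps. With a learned, curved velocity field, a first-order method accumulates error at low step counts, which is the usual motivation for a higher-order or otherwise improved sampler.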
Usage
Please check out the repo for usage: https://github.com/ysharma3501/LuxTTS.git
License
The model and code are released under the Apache-2.0 license.
If you find the model/code helpful, stars or likes would be appreciated. Thank you.
- Total size: 1.18 GB
- Files: 11
- Last updated: Jun 19