| | --- |
| | license: apache-2.0 |
| | language: |
| | - en |
| | tags: |
| | - ai |
| | - rvc |
| | - vc |
| | - voice-cloning |
| | - applio |
| | - titan |
| | - pretrained |
| | datasets: |
| | - blaise-tk/TITAN-Medium |
| | pipeline_tag: audio-to-audio |
| | --- |
| | |
| | # TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training |
| |
|
| | ## Overview |
| |
|
| | TITAN is a pretrained model for [Retrieval-based Voice Conversion](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/) (RVC) training. It serves as a robust starting point for training models that convert one speaker's voice characteristics to another's, aiming for high-quality results with minimal training effort. |
| |
|
| | ## Model Details |
| |
|
| | ### Titan-Medium |
| |
|
| | - Training Environment: Trained on an RTX 3060 Ti with [Applio v3.1.1](https://github.com/IAHispano/Applio), using a batch size of 8 over 3 weeks. |
| | - Iterations (48k): 1,018,660 steps (530 epochs) |
| | - Iterations (40k): 1,010,588 steps (467 epochs) |
| | - Iterations (32k): 1,001,469 steps (463 epochs) |
| | - Sampling rates: 48 kHz, 40 kHz, 32 kHz |
| | - Fine-tuning Process: Started from the RVC v2 pretrained model with pitch guidance and fine-tuned on an 11.15-hour dataset sourced from [Expresso](https://arxiv.org/abs/2308.05725), also available at [datasets/blaise-tk/TITAN-Medium](https://huggingface.co/datasets/blaise-tk/TITAN-Medium). |
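
The step and epoch counts listed above imply a steps-per-epoch figure for each sampling rate. The sketch below works that arithmetic out. It assumes that one step processes one batch of 8 training segments and that one epoch is a single pass over the data; neither is stated explicitly in this card, and the function names are illustrative only.

```python
# Step and epoch counts copied from the Titan-Medium details above.
RUNS = {
    "48k": (1_018_660, 530),
    "40k": (1_010_588, 467),
    "32k": (1_001_469, 463),
}
BATCH_SIZE = 8  # batch size stated in the training environment


def steps_per_epoch(steps: int, epochs: int) -> float:
    """Average number of training steps in one epoch."""
    return steps / epochs


def summarize(runs=RUNS, batch_size=BATCH_SIZE):
    """Return per-rate steps per epoch and the implied segments per epoch.

    Assumption: each step consumes exactly one batch of `batch_size` segments.
    """
    summary = {}
    for rate, (steps, epochs) in runs.items():
        per_epoch = steps_per_epoch(steps, epochs)
        summary[rate] = {
            "steps_per_epoch": per_epoch,
            "segments_per_epoch": per_epoch * batch_size,
        }
    return summary


if __name__ == "__main__":
    for rate, row in summarize().items():
        print(rate, row["steps_per_epoch"], row["segments_per_epoch"])
```

Under these assumptions the 40k and 32k runs come out at nearly the same number of steps per epoch, while the 48k run is somewhat lower.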
| |
|
| | #### Samples |
| | *All tests were run under the same conditions, using an early checkpoint at ~700k steps.* |
| |
|
| | <table style="width:100%; text-align:center;"> |
| | <tr> |
| | <th>Titan-Medium</th> |
| | <th>Ov2</th> |
| | <th>Ov2.1</th> |
| | </tr> |
| | <tr> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Titan.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Ov2.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | </tr> |
| | |
| | <tr> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Titan.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Ov2.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | </tr> |
| | |
| | <tr> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Titan.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Ov2.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | |
| | </tr> |
| | <tr> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Titan.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Ov2.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | </tr> |
| | |
| | <tr> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Titan.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.1.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | </tr> |
| | |
| | <tr> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Titan.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | <td> |
| | <audio controls> |
| | <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.1.wav?download=true" type="audio/wav"> |
| | Your browser does not support the audio element. |
| | </audio> |
| | </td> |
| | </tr> |
| | |
| | </table> |
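
The demo files in the table above follow a regular naming pattern (`Model N - Test M - Variant.wav`), and the file names contain spaces. As a small sketch, assuming that pattern holds for every sample, the helper below builds a percent-encoded download URL for a given sample. `demo_url` is a hypothetical name, not part of any library, and the function does not check that the file actually exists.

```python
from urllib.parse import quote

# Base path taken from the audio sources in the table above.
BASE = "https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/"


def demo_url(model: int, test: int, variant: str) -> str:
    """Build the download URL for one demo file.

    `variant` is one of the column names used in the file names,
    e.g. "Titan", "Ov2", or "Ov2.1". Spaces are percent-encoded.
    """
    filename = f"Model {model} - Test {test} - {variant}.wav"
    return BASE + quote(filename) + "?download=true"


if __name__ == "__main__":
    print(demo_url(3, 1, "Ov2.1"))
```

Note that only Model 3 has Ov2.1 samples in the table, so URLs for other Ov2.1 combinations may not resolve.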
| |
|
| | ### Titan-Large |
| |
|
| | - Details forthcoming... |
| |
|
| | ## Collaborators |
| |
|
| | We appreciate the contributions of our collaborators who have helped in the development and refinement of TITAN. |
| |
|
| | - Mustar |
| | - SimplCup |
| | - UnitedShoes |
| |
|
| | ## Beta Testers |
| |
|
| | We extend our gratitude to the beta testers who provided valuable feedback during the testing phase of TITAN. |
| |
|
| | - SimplCup |
| | - Leo_Frixi |
| | - Light |
| | - SCRFilms |
| | - Ryanz |
| | - Litsa_the_dancer |
| | |
| | ## Citation |
| | |
| | If you find TITAN useful for your research or projects, please cite this repository: |
| | |
| | ``` |
| | @article{titan, |
| | title={TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training}, |
| | author={Blaise}, |
| | journal={Hugging Face}, |
| | year={2024}, |
| | publisher={Blaise}, |
| | url={https://huggingface.co/blaise-tk/TITAN/} |
| | } |
| | ``` |
| | |