Commit 6728d21
Parent(s): c442a81
Upload 18 files
- Ov2Super/32kfix/f0Ov2Super32kD.pth +3 -0
- Ov2Super/32kfix/f0Ov2Super32kG.pth +3 -0
- Ov2Super/40k/f0Ov2Super40kD.pth +3 -0
- Ov2Super/40k/f0Ov2Super40kG.pth +3 -0
- Ov2Super/ov2.txt +18 -0
- SnowieV3.1-X-RinE3-40K/D_Snowie-X-Rin_40k.pth +3 -0
- SnowieV3.1-X-RinE3-40K/G_Snowie-X-Rin_40k.pth +3 -0
- SnowieV3.1/32k/D_SnowieV3.1_32k.pth +3 -0
- SnowieV3.1/32k/G_SnowieV3.1_32k.pth +3 -0
- SnowieV3.1/40k/D_SnowieV3.1_40k.pth +3 -0
- SnowieV3.1/40k/G_SnowieV3.1_40k.pth +3 -0
- SnowieV3.1/48k/D_SnowieV3.1_48k.pth +3 -0
- SnowieV3.1/48k/G_SnowieV3.1_48k.pth +3 -0
- TITAN/Medium/D-f040k-TITAN-Medium.pth +3 -0
- TITAN/Medium/D-f048k-TITAN-Medium.pth +3 -0
- TITAN/Medium/G-f040k-TITAN-Medium.pth +3 -0
- TITAN/Medium/G-f048k-TITAN-Medium.pth +3 -0
- TITAN/README.md +187 -0
Ov2Super/32kfix/f0Ov2Super32kD.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2401113af8524bc5c0fe2221a81997b32f85db782f2271bb21d268f2fbf15c56
size 857123266
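Each `.pth` entry in this commit is a Git LFS pointer (version, `oid`, size) rather than the weights themselves. As a minimal sketch, assuming you have downloaded the real file and want to check it against its pointer, the pointer can be parsed and compared like this. The function names `parse_lfs_pointer` and `file_matches_pointer` are hypothetical helpers written for this illustration, not part of Git LFS or RVC.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value.strip()
    # The oid line has the form "<algorithm>:<hex digest>".
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file has the size and digest recorded in the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first: a size mismatch means no need to hash
    hasher = hashlib.new(pointer["algorithm"])
    with path.open("rb") as handle:
        # Hash in chunks so an ~850 MB checkpoint is never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text copied from Ov2Super/32kfix/f0Ov2Super32kD.pth in this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:2401113af8524bc5c0fe2221a81997b32f85db782f2271bb21d268f2fbf15c56\n"
    "size 857123266\n"
)
print(pointer["algorithm"], pointer["size"])
```

Calling `file_matches_pointer(Path("f0Ov2Super32kD.pth"), pointer)` on a local download (the local path is an assumption) would then report whether the file is complete and uncorrupted.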
Ov2Super/32kfix/f0Ov2Super32kG.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e4af1279fb8fd15af9eacbb41687fc695e74009e9dd0edc634b6296453324db4
size 443230526
Ov2Super/40k/f0Ov2Super40kD.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:e319e21eb26137803c62847857202bc43f833c0696be72bec395a74b3c2178aa
size 857126469
Ov2Super/40k/f0Ov2Super40kG.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:0562bdf6fa73197a503ceddf376a79f33727d64e19b1c6371ebfd6872bceccbf
size 438183069
Ov2Super/ov2.txt
ADDED
@@ -0,0 +1,18 @@
I'd like to present a new test version of the Ov2 pretrain! I made it myself, improving its overall quality.

It was trained at a 40k sample rate with 18 new voices, which gives it a lot of variety.

To use it, copy both .pth files into the pretrained_v2 folder of your RVC installation. Then, in RVC, choose the 40k sample rate and V2 (versions for all sample rates will be released once I make the final version; this is just a test version). After you've preprocessed your dataset and extracted features, enter the names of the pretrains as shown in the second screenshot (don't mix up the G and D files). After that, you can begin training your model.

Here are rough guidelines for training a model on this pretrain:
The minimum dataset length for an ideal model is 1 minute, but you can train good models even with 10-second datasets on this pretrain.
Models trained on this pretrain don't require a huge epoch count: usually 40-60 epochs are enough for 3-5 minutes, 60-100 for 1-3 minutes, and 200-300 for 10-60 seconds.
Use clean datasets to get the best-sounding results :3

---
3-5 minutes: 40-60 epochs
1-3 minutes: 60-100 epochs
10-60 seconds: 200-300 epochs

The minimum duration for an ideal model is 1 minute, but you can train good models even with 10-second datasets on this pretrain.
Models trained on this pretrain do not require a large number of epochs: 40-60 is usually enough for 3-5 minutes, 60-100 for 1-3 minutes, and 200-300 for 10-60 seconds.
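The epoch guidelines in ov2.txt can be written as a small lookup. This is only a sketch of those rough ranges: the function name `suggested_epochs` is hypothetical, and the handling of boundary values is an assumption, since the note lists 60 seconds under both "10-60 seconds" and "1-3 minutes". Here a boundary value goes to the longer-dataset bucket, and lengths outside 10 seconds to 5 minutes return `None` because the note gives no guidance for them.

```python
from typing import Optional, Tuple


def suggested_epochs(dataset_seconds: float) -> Optional[Tuple[int, int]]:
    """Map dataset length in seconds to the rough (min, max) epoch range from ov2.txt."""
    if dataset_seconds < 10:
        return None  # shorter than anything the guidelines cover
    if dataset_seconds < 60:
        return (200, 300)  # 10-60 seconds
    if dataset_seconds < 180:
        return (60, 100)  # 1-3 minutes (assumption: exactly 60 s lands here)
    if dataset_seconds <= 300:
        return (40, 60)  # 3-5 minutes (assumption: exactly 180 s lands here)
    return None  # longer than 5 minutes: no guidance given in the note


for seconds in (30, 120, 240):
    print(seconds, suggested_epochs(seconds))
```

The ranges are starting points only; the note itself calls them rough guidelines, and dataset cleanliness matters as much as length.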
SnowieV3.1-X-RinE3-40K/D_Snowie-X-Rin_40k.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:61f831b0150fb99b61d92873535fd44ac6d9e76df2da4d7fefc4390e5c3b94e6
size 857125986
SnowieV3.1-X-RinE3-40K/G_Snowie-X-Rin_40k.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bf70da2f59cff4a414d3b10607cab15a3ab5d706e06e9b3bd1d653e6abe0f2d9
size 438176910
SnowieV3.1/32k/D_SnowieV3.1_32k.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:fd0e6067e06aee2962dd1600d0a7cb9ed3ab4da858003595feaed10b544c811e
size 857123266
SnowieV3.1/32k/G_SnowieV3.1_32k.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:94f8172dc7d975fae748a40e0a4ad09148a40166fb01a9c9aecadacedbad246a
size 443230526
SnowieV3.1/40k/D_SnowieV3.1_40k.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:33eb0b69d2eb1980105b044d7381d2c317652f011f7fa685a599aee846a68592
size 857123266
SnowieV3.1/40k/G_SnowieV3.1_40k.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:95aeacf9ac4c39830fc19bec5f11d780734f73bd53ce681d7f21b891f7da69e7
size 438167870
SnowieV3.1/48k/D_SnowieV3.1_48k.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:342c48ce64c691aeeec10be25e9821c42e9d81e64af454f58c4493807ae8530b
size 857123266
SnowieV3.1/48k/G_SnowieV3.1_48k.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:6883d5b2fe92f78e10ac7e87fed24a7d9ef53826835ebb908b2428003f0d1f92
size 452323646
TITAN/Medium/D-f040k-TITAN-Medium.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bc804da752ba9cb8e3aaa90ad102edc0a6e7f033c90db53bfcce135f100d5ad1
size 857119946
TITAN/Medium/D-f048k-TITAN-Medium.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4999a5b66aec0a9ab7e845063eeb242e47a611ab0e6260fb0b5ca44a7c5bbc44
size 857126469
TITAN/Medium/G-f040k-TITAN-Medium.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:217de2fd349f48856aa6a6d9d1257b07a0addccde7b69fc7b22097f1fee5924d
size 438156650
TITAN/Medium/G-f048k-TITAN-Medium.pth
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:51a6dc93687f7e1a6051be3dec958289958c9c738d5c520e4fa956e7886ee153
size 452338845
TITAN/README.md
ADDED
@@ -0,0 +1,187 @@
---
license: apache-2.0
language:
- en
tags:
- ai
- rvc
- vc
- voice-cloning
- applio
- titan
- pretrained
datasets:
- blaise-tk/TITAN-Medium
pipeline_tag: audio-to-audio
---

# TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training

## Overview

TITAN is a pretrained model designed for Retrieval-based Voice Conversion (https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/) training. It offers a robust base for converting voice characteristics from one speaker to another, providing high-quality results with minimal training effort.

## Model Details

### Titan-Medium

- Training Environment: Trained on an RTX 3060 Ti with Applio v3.1.1 (https://github.com/IAHispano/Applio), using a batch size of 8 over 3 weeks.
- Iterations (48k): 1018660 steps and 530 epochs
- Iterations (40k): 1010588 steps and 467 epochs
- Iterations (32k): 1001469 steps and 463 epochs
- Sampling rate: 48k, 40k, 32k
- Fine-tuning Process: Fine-tuned from the RVC v2 pretrained model with pitch guidance, using an 11.15-hour dataset sourced from Expresso (https://arxiv.org/abs/2308.05725), also available at [datasets/blaise-tk/TITAN-Medium](https://huggingface.co/datasets/blaise-tk/TITAN-Medium).
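
The step and epoch counts above imply a steps-per-epoch figure for each sample rate, which the following sketch works out. The segments-per-epoch estimate rests on an assumption that is not stated in the README: that one step processes exactly one batch of 8 training segments.

```python
# Step and epoch counts as listed for Titan-Medium.
RUNS = {
    "48k": (1018660, 530),
    "40k": (1010588, 467),
    "32k": (1001469, 463),
}
BATCH_SIZE = 8  # batch size from the training environment line


def steps_per_epoch(steps: int, epochs: int) -> float:
    """Average number of optimizer steps taken in one epoch."""
    return steps / epochs


for rate, (steps, epochs) in RUNS.items():
    per_epoch = steps_per_epoch(steps, epochs)
    # Assumption: one step consumes one batch, so segments ~= steps * batch size.
    approx_segments = per_epoch * BATCH_SIZE
    print(f"{rate}: {per_epoch:.0f} steps/epoch, ~{approx_segments:.0f} segments/epoch")
```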

#### Samples
*Tests were performed with an early checkpoint at ~700k steps, with all tests run under the same conditions.*

<table style="width:100%; text-align:center;">
  <tr>
    <th>Titan-Medium</th>
    <th>Ov2</th>
    <th>Ov2.1</th>
  </tr>
  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>
  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>
  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>
  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>
  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.1.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>
  <tr>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Titan.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
    <td>
      <audio controls>
        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.1.wav?download=true" type="audio/wav">
        Your browser does not support the audio element.
      </audio>
    </td>
  </tr>
</table>

### Titan-Large

- Details forthcoming...

## Collaborators

We appreciate the contributions of our collaborators, who helped develop and refine TITAN.

- Mustar
- SimplCup
- UnitedShoes

## Beta Testers

We thank the beta testers who provided valuable feedback during the testing phase of TITAN.

- SimplCup
- Leo_Frixi
- Light
- SCRFilms
- Ryanz
- Litsa_the_dancer

## Citation

If you find TITAN useful for your research or projects, please cite our repository:

```
@article{titan,
  title={TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training},
  author={Blaise},
  journal={Hugging Face},
  year={2024},
  publisher={Blaise},
  url={https://huggingface.co/blaise-tk/TITAN/}
}
```