Anonlestia committed on Nov 3, 2024

Commit

6728d21

1 Parent(s): c442a81

Upload 18 files

Files changed (18)

Ov2Super/32kfix/f0Ov2Super32kD.pth +3 -0
Ov2Super/32kfix/f0Ov2Super32kG.pth +3 -0
Ov2Super/40k/f0Ov2Super40kD.pth +3 -0
Ov2Super/40k/f0Ov2Super40kG.pth +3 -0
Ov2Super/ov2.txt +18 -0
SnowieV3.1-X-RinE3-40K/D_Snowie-X-Rin_40k.pth +3 -0
SnowieV3.1-X-RinE3-40K/G_Snowie-X-Rin_40k.pth +3 -0
SnowieV3.1/32k/D_SnowieV3.1_32k.pth +3 -0
SnowieV3.1/32k/G_SnowieV3.1_32k.pth +3 -0
SnowieV3.1/40k/D_SnowieV3.1_40k.pth +3 -0
SnowieV3.1/40k/G_SnowieV3.1_40k.pth +3 -0
SnowieV3.1/48k/D_SnowieV3.1_48k.pth +3 -0
SnowieV3.1/48k/G_SnowieV3.1_48k.pth +3 -0
TITAN/Medium/D-f040k-TITAN-Medium.pth +3 -0
TITAN/Medium/D-f048k-TITAN-Medium.pth +3 -0
TITAN/Medium/G-f040k-TITAN-Medium.pth +3 -0
TITAN/Medium/G-f048k-TITAN-Medium.pth +3 -0
TITAN/README.md +187 -0

Ov2Super/32kfix/f0Ov2Super32kD.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2401113af8524bc5c0fe2221a81997b32f85db782f2271bb21d268f2fbf15c56
+size 857123266
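Each `.pth` entry in this commit is a Git LFS pointer (a `version`, `oid`, and `size` line) rather than the weights themselves. The sketch below shows one way to parse such a pointer and check a downloaded file against it. The function names `parse_lfs_pointer` and `matches_pointer` are illustrative, not part of any Git LFS tooling, and the pointer text is copied from the hunk above.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<algorithm>:<hex digest>", e.g. "sha256:2401...".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file has the size and SHA-256 digest recorded in the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so that large checkpoints are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer for Ov2Super/32kfix/f0Ov2Super32kD.pth, copied from the diff.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2401113af8524bc5c0fe2221a81997b32f85db782f2271bb21d268f2fbf15c56
size 857123266
"""
pointer = parse_lfs_pointer(POINTER_TEXT)
```

After downloading the real file, `matches_pointer(Path("f0Ov2Super32kD.pth"), pointer)` should return `True` only if the download is complete and uncorrupted; the local path here is a placeholder.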

Ov2Super/32kfix/f0Ov2Super32kG.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e4af1279fb8fd15af9eacbb41687fc695e74009e9dd0edc634b6296453324db4
+size 443230526

Ov2Super/40k/f0Ov2Super40kD.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e319e21eb26137803c62847857202bc43f833c0696be72bec395a74b3c2178aa
+size 857126469

Ov2Super/40k/f0Ov2Super40kG.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0562bdf6fa73197a503ceddf376a79f33727d64e19b1c6371ebfd6872bceccbf
+size 438183069

Ov2Super/ov2.txt ADDED

	@@ -0,0 +1,18 @@

+I'd like to present a new test version of the Ov2 pretrain! I made it myself, improving its overall quality.
+It was trained at a 40k sample rate with 18 new voices, which gives it a lot of variety.
+To use it, copy both .pth files into the pretrained_v2 folder of your RVC installation. Then, in RVC, choose the 40k sample rate and V2 (versions for all sample rates will be released once I make the final version; this is just a test version). After you've preprocessed your dataset and extracted features, enter the names of the pretrains as shown in the second screenshot (don't mix up the G and D files). After that, you can begin training your model.
+Here are rough guidelines for training a model on this pretrain:
+The minimum dataset length for an ideal model is 1 minute, but you can train good models even with 10-second datasets on this pretrain.
+Models trained on this pretrain don't require a huge epoch count: usually 40-60 epochs are enough for 3-5 minutes of audio, 60-100 for 1-3 minutes, and 200-300 for 10-60 seconds.
+Use clean datasets to get the best-sounding results :3
+---
+3-5 minutes: 40-60 epochs
+1-3 minutes: 60-100 epochs
+10-60 seconds: 200-300 epochs
+(Translated from Russian) The minimum dataset length for an ideal model is 1 minute, but you can train good models even with 10-second datasets on this pretrain.
+(Translated from Russian) Models trained on this pretrain don't require a large number of epochs: usually 40-60 is enough for 3-5 minutes, 60-100 for 1-3 minutes, and 200-300 for 10-60 seconds.
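The setup and epoch guidance in ov2.txt can be sketched as two small helpers. This is a minimal illustration, not part of RVC: `recommended_epochs` and `install_pretrain` are hypothetical names, the location of the `pretrained_v2` folder is passed in because it varies between RVC installations, and the handling of the overlapping boundaries (exactly 1 or 3 minutes) is an assumption, since the note does not say which range they belong to.

```python
import shutil
from pathlib import Path
from typing import List, Optional, Tuple

# (min_seconds, max_seconds, (min_epochs, max_epochs)), from the ov2.txt guidelines.
# Assumption: buckets are checked longest first with inclusive bounds, so a dataset
# of exactly 1 or 3 minutes falls into the longer bucket.
EPOCH_GUIDELINES = [
    (180, 300, (40, 60)),
    (60, 180, (60, 100)),
    (10, 60, (200, 300)),
]


def recommended_epochs(dataset_seconds: float) -> Optional[Tuple[int, int]]:
    """Return the suggested (min, max) epoch range, or None outside the stated 10 s to 5 min span."""
    for low, high, epochs in EPOCH_GUIDELINES:
        if low <= dataset_seconds <= high:
            return epochs
    return None


def install_pretrain(generator: Path, discriminator: Path, pretrained_dir: Path) -> List[Path]:
    """Copy the G and D checkpoints into the given pretrained_v2 folder and return the new paths."""
    pretrained_dir.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(src, pretrained_dir)) for src in (generator, discriminator)]
```

For example, `recommended_epochs(150)` returns `(60, 100)` for a 2.5-minute dataset. The note gives no guidance for datasets longer than 5 minutes, so the helper returns `None` there rather than guessing.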

SnowieV3.1-X-RinE3-40K/D_Snowie-X-Rin_40k.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:61f831b0150fb99b61d92873535fd44ac6d9e76df2da4d7fefc4390e5c3b94e6
+size 857125986

SnowieV3.1-X-RinE3-40K/G_Snowie-X-Rin_40k.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bf70da2f59cff4a414d3b10607cab15a3ab5d706e06e9b3bd1d653e6abe0f2d9
+size 438176910

SnowieV3.1/32k/D_SnowieV3.1_32k.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:fd0e6067e06aee2962dd1600d0a7cb9ed3ab4da858003595feaed10b544c811e
+size 857123266

SnowieV3.1/32k/G_SnowieV3.1_32k.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:94f8172dc7d975fae748a40e0a4ad09148a40166fb01a9c9aecadacedbad246a
+size 443230526

SnowieV3.1/40k/D_SnowieV3.1_40k.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:33eb0b69d2eb1980105b044d7381d2c317652f011f7fa685a599aee846a68592
+size 857123266

SnowieV3.1/40k/G_SnowieV3.1_40k.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:95aeacf9ac4c39830fc19bec5f11d780734f73bd53ce681d7f21b891f7da69e7
+size 438167870

SnowieV3.1/48k/D_SnowieV3.1_48k.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:342c48ce64c691aeeec10be25e9821c42e9d81e64af454f58c4493807ae8530b
+size 857123266

SnowieV3.1/48k/G_SnowieV3.1_48k.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6883d5b2fe92f78e10ac7e87fed24a7d9ef53826835ebb908b2428003f0d1f92
+size 452323646

TITAN/Medium/D-f040k-TITAN-Medium.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bc804da752ba9cb8e3aaa90ad102edc0a6e7f033c90db53bfcce135f100d5ad1
+size 857119946

TITAN/Medium/D-f048k-TITAN-Medium.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4999a5b66aec0a9ab7e845063eeb242e47a611ab0e6260fb0b5ca44a7c5bbc44
+size 857126469

TITAN/Medium/G-f040k-TITAN-Medium.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:217de2fd349f48856aa6a6d9d1257b07a0addccde7b69fc7b22097f1fee5924d
+size 438156650

TITAN/Medium/G-f048k-TITAN-Medium.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:51a6dc93687f7e1a6051be3dec958289958c9c738d5c520e4fa956e7886ee153
+size 452338845

TITAN/README.md ADDED

	@@ -0,0 +1,187 @@

+---
+license: apache-2.0
+language:
+  - en
+tags:
+  - ai
+  - rvc
+  - vc
+  - voice-cloning
+  - applio
+  - titan
+  - pretrained
+datasets:
+  - blaise-tk/TITAN-Medium
+pipeline_tag: audio-to-audio
+---
+# TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training
+## Overview
+TITAN is a state-of-the-art pretrained model designed for Retrieval-based Voice Conversion (https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/) training. It offers a robust solution for transforming voice characteristics from one speaker to another, providing high-quality results with minimal training effort.
+## Model Details
+### Titan-Medium
+- Training Environment: Trained on an RTX 3060 Ti with Applio v3.1.1 (https://github.com/IAHispano/Applio), using a batch size of 8 over 3 weeks.
+- Iterations (48k): 1018660 Steps and 530 Epochs
+- Iterations (40k): 1010588 Steps and 467 Epochs
+- Iterations (32k): 1001469 Steps and 463 Epochs
+- Sampling rate: 48k, 40k, 32k
+- Fine-tuning Process: Fine-tuned from the RVC v2 pretrained model with pitch guidance, using an 11.15-hour dataset sourced from Expresso (https://arxiv.org/abs/2308.05725), also available at [datasets/blaise-tk/TITAN-Medium](https://huggingface.co/datasets/blaise-tk/TITAN-Medium).
+#### Samples
+*Tests were performed with an early checkpoint at ~700k steps, with all tests run under the same conditions.*
+<table style="width:100%; text-align:center;">
+  <tr>
+    <th>Titan-Medium</th>
+    <th>Ov2</th>
+    <th>Ov2.1</th>
+  </tr>
+    <tr>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Titan.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 1 - Ov2.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+  </tr>
+    <tr>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Titan.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 1 - Test 2 - Ov2.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+  </tr>
+  <tr>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Titan.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 1 - Ov2.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+  </tr>
+    <tr>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Titan.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 2 - Test 2 - Ov2.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+  </tr>
+    <tr>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Titan.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+        <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 1 - Ov2.1.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+  </tr>
+    <tr>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Titan.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+    <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+        <td>
+      <audio controls>
+        <source src="https://huggingface.co/blaise-tk/TITAN/resolve/main/demos/Model 3 - Test 2 - Ov2.1.wav?download=true" type="audio/wav">
+        Your browser does not support the audio element.
+      </audio>
+    </td>
+  </tr>
+</table>
+### Titan-Large
+- Details forthcoming...
+## Collaborators
+We appreciate the contributions of our collaborators who have helped in the development and refinement of TITAN.
+- Mustar
+- SimplCup
+- UnitedShoes
+## Beta Testers
+We extend our gratitude to the beta testers who provided valuable feedback during the testing phase of TITAN.
+- SimplCup
+- Leo_Frixi
+- Light
+- SCRFilms
+- Ryanz
+- Litsa_the_dancer
+## Citation
+If you find TITAN useful in your research or projects, please cite the repository:
+```
+@article{titan,
+  title={TITAN: A Versatile, Robust, and High-Quality Pretrained Model for Retrieval-based Voice Conversion (RVC) Training},
+  author={Blaise},
+  journal={Hugging Face},
+  year={2024},
+  publisher={Blaise},
+  url={https://huggingface.co/blaise-tk/TITAN/}
+}
+```
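As a rough sanity check on the Titan-Medium iteration counts in the README's Model Details, the step and epoch totals can be turned into steps per epoch and an approximate number of training segments seen per epoch. This sketch assumes that each step processes one batch of 8 segments and that every epoch has the same number of steps; the README does not state either point, and `epoch_stats` is an illustrative name.

```python
BATCH_SIZE = 8  # batch size stated in the README

# Sampling rate -> (total steps, total epochs), as listed in the README.
RUNS = {
    "48k": (1018660, 530),
    "40k": (1010588, 467),
    "32k": (1001469, 463),
}


def epoch_stats(total_steps: int, epochs: int, batch_size: int = BATCH_SIZE) -> dict:
    """Derive per-epoch figures from total steps and epochs (assumes equal-sized epochs)."""
    steps_per_epoch, remainder = divmod(total_steps, epochs)
    return {
        "steps_per_epoch": steps_per_epoch,
        "remainder": remainder,  # 0 means the totals divide evenly
        "segments_per_epoch": steps_per_epoch * batch_size,
    }


for rate, (steps, epochs) in RUNS.items():
    stats = epoch_stats(steps, epochs)
    print(f"{rate}: {stats['steps_per_epoch']} steps/epoch, "
          f"~{stats['segments_per_epoch']} segments/epoch")
```

Under these assumptions the three runs work out to 1922, 2164, and 2163 steps per epoch, or roughly 15,000 to 17,000 training segments per epoch.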