Upload folder using huggingface_hub

Files changed (4)

README.md +8 -5
contentvec/BigVGAN/D_135810.pth +3 -0
contentvec/BigVGAN/G_135810.pth +3 -0
contentvec/BigVGAN/config.json +111 -0

README.md CHANGED

@@ -20,18 +20,21 @@ library_name: pytorch
 # Convbased
-Gtihub: [https://github.com/Convbased/Convbased-Studio](https://github.com/Convbased/Convbased-Studio)
-本项目专注于训练高质量的预训练底模，为语音转换任务提供强大的基础模型支持。
-| 特征提取 | 声码器 | 采样率40k | 采样率48k |
 |-----------|--------|-----|-----|
 | contentvec | hifigannsf | ❌ | ✅ |
 | contentvec | sifigan | ❌ | ✅ |
 | spin | hifigannsf | ❌ | ✅ |
 | spin | sifigan | ❌ | ✅ |
 | chinese-hubert-base | hifigannsf | ❌ | ✅ |
-| chinese-hubert-base | sifigan | 🧱 | 🧱 |
-*致力于推进中文语音合成技术的发展，该底模已用于微调大部分模型于 [Convbased Studio](https://weights.chat/)*

 # Convbased
+Github: [https://github.com/Convbased/Convbased-Studio](https://github.com/Convbased/Convbased-Studio)
+This project focuses on training high-quality pre-trained models.
+| Feature Extraction | Vocoder | Sample Rate 40k | Sample Rate 48k |
 |-----------|--------|-----|-----|
 | contentvec | hifigannsf | ❌ | ✅ |
 | contentvec | sifigan | ❌ | ✅ |
+| contentvec | bigvgan | ✅ | ❌ |
 | spin | hifigannsf | ❌ | ✅ |
 | spin | sifigan | ❌ | ✅ |
+| spin-v2 | bigvgan | 🧱 | ❌ |
 | chinese-hubert-base | hifigannsf | ❌ | ✅ |
+*Training code from [Applio](https://github.com/IAHispano/Applio)*
+*Dedicated to advancing Chinese speech synthesis technology. These base models have been used for fine-tuning most models at [Convbased Studio](https://weights.chat/)*

contentvec/BigVGAN/D_135810.pth ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:57f00324dbc8145f89283d041dbe6f9178c85960499510f8ce2eab6f3b6385a9
+size 857123185

contentvec/BigVGAN/G_135810.pth ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:45fcd2f5d8af5655aa327c374268b62672b9516f82f206a2f29e5fa0c358aae2
+size 438608285
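
Note: the two `.pth` entries above are Git LFS pointer files, not the weights themselves. Each one records the SHA-256 digest and byte size of the real object. The following is a minimal sketch of checking a downloaded checkpoint against its pointer, assuming the file is already on local disk. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of any library.

```python
import hashlib
import os

# Pointer text for G_135810.pth, copied from the diff above (leading "+" removed).
G_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:45fcd2f5d8af5655aa327c374268b62672b9516f82f206a2f29e5fa0c358aae2
size 438608285
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest in `pointer`."""
    # Cheap check first: a size mismatch avoids hashing a large file.
    if os.path.getsize(path) != pointer["size"]:
        return False
    digest = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so a multi-hundred-megabyte checkpoint is not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == pointer["oid"]


# Hypothetical usage, assuming the checkpoint was downloaded to this local path:
# ok = matches_pointer("contentvec/BigVGAN/G_135810.pth", parse_lfs_pointer(G_POINTER))
```

The same check applies to `D_135810.pth` using the digest and size from its own pointer.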

contentvec/BigVGAN/config.json ADDED

@@ -0,0 +1,111 @@

+{
+    "process_pids": [
+        4048,
+        4278,
+        9099,
+        10214,
+        11439,
+        12155,
+        14658,
+        1796,
+        1797,
+        1798,
+        5406,
+        5407,
+        5408,
+        5409,
+        5410,
+        5411,
+        14883,
+        14884,
+        14885,
+        14886,
+        14887,
+        14888,
+        49208,
+        49209,
+        49210,
+        49211,
+        49212,
+        49213,
+        57786,
+        57787,
+        57788,
+        57789,
+        57790,
+        57791
+    ],
+    "train": {
+        "log_interval": 200,
+        "seed": 1234,
+        "learning_rate": 0.0001,
+        "betas": [
+            0.8,
+            0.99
+        ],
+        "eps": 1e-09,
+        "lr_decay": 0.999875,
+        "segment_size": 12800,
+        "c_mel": 45,
+        "c_kl": 1.0
+    },
+    "data": {
+        "max_wav_value": 32768.0,
+        "sample_rate": 40000,
+        "filter_length": 2048,
+        "hop_length": 400,
+        "win_length": 2048,
+        "n_mel_channels": 125,
+        "mel_fmin": 0.0,
+        "mel_fmax": null
+    },
+    "model": {
+        "inter_channels": 192,
+        "hidden_channels": 192,
+        "filter_channels": 768,
+        "text_enc_hidden_dim": 768,
+        "n_heads": 2,
+        "n_layers": 6,
+        "kernel_size": 3,
+        "p_dropout": 0,
+        "resblock": "1",
+        "resblock_kernel_sizes": [
+            3,
+            7,
+            11
+        ],
+        "resblock_dilation_sizes": [
+            [
+                1,
+                3,
+                5
+            ],
+            [
+                1,
+                3,
+                5
+            ],
+            [
+                1,
+                3,
+                5
+            ]
+        ],
+        "upsample_rates": [
+            10,
+            10,
+            2,
+            2
+        ],
+        "upsample_initial_channel": 512,
+        "upsample_kernel_sizes": [
+            16,
+            16,
+            4,
+            4
+        ],
+        "use_spectral_norm": false,
+        "gin_channels": 256,
+        "spk_embed_dim": 109
+    }
+}
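
Note: several values in this config are related arithmetically. The generator's `upsample_rates` multiply to 10 × 10 × 2 × 2 = 400, which equals `hop_length`, and `segment_size` (12800 samples) spans 32 frames, or 0.32 s at 40 kHz. The sketch below checks those relations on the values copied from the file. It assumes the HiFi-GAN-style convention that the total upsampling factor must equal the hop length. Whether the training code enforces this is not stated in this commit, and `check_vocoder_config` is a hypothetical helper.

```python
import math

# Subset of contentvec/BigVGAN/config.json, copied from the diff above.
config = {
    "train": {"segment_size": 12800},
    "data": {"sample_rate": 40000, "hop_length": 400},
    "model": {"upsample_rates": [10, 10, 2, 2]},
}


def check_vocoder_config(cfg):
    """Derive frame-level quantities and check them against hop_length.

    Assumption: each feature frame is expanded to exactly hop_length audio
    samples, so the product of upsample_rates must equal hop_length.
    """
    hop = cfg["data"]["hop_length"]
    sample_rate = cfg["data"]["sample_rate"]
    segment = cfg["train"]["segment_size"]
    total_upsampling = math.prod(cfg["model"]["upsample_rates"])

    if total_upsampling != hop:
        raise ValueError(
            f"upsample_rates multiply to {total_upsampling}, expected hop_length {hop}"
        )
    if segment % hop != 0:
        raise ValueError("segment_size is not a whole number of frames")

    return {
        "total_upsampling": total_upsampling,      # audio samples produced per frame
        "frames_per_segment": segment // hop,      # frames in one training segment
        "frame_rate_hz": sample_rate / hop,        # feature frames per second
        "segment_seconds": segment / sample_rate,  # duration of one training segment
    }


summary = check_vocoder_config(config)
print(summary)
```

To run the same check on the full file, load it with `json.load` and pass the resulting dict to `check_vocoder_config`.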