GFengG committed
Commit cb9ce06 · verified · 1 Parent(s): f346666

Add files using upload-large-folder tool

Files changed (29)
  1. README.md +274 -3
  2. avatar_multi/config.json +37 -0
  3. avatar_multi/diffusion_pytorch_model-00001-of-00006.safetensors +3 -0
  4. avatar_multi/diffusion_pytorch_model-00002-of-00006.safetensors +3 -0
  5. avatar_multi/diffusion_pytorch_model-00003-of-00006.safetensors +3 -0
  6. avatar_multi/diffusion_pytorch_model-00004-of-00006.safetensors +3 -0
  7. avatar_multi/diffusion_pytorch_model-00005-of-00006.safetensors +3 -0
  8. avatar_multi/diffusion_pytorch_model-00006-of-00006.safetensors +3 -0
  9. avatar_multi/diffusion_pytorch_model.safetensors.index.json +0 -0
  10. avatar_single/config.json +37 -0
  11. avatar_single/diffusion_pytorch_model-00001-of-00006.safetensors +3 -0
  12. avatar_single/diffusion_pytorch_model-00002-of-00006.safetensors +3 -0
  13. avatar_single/diffusion_pytorch_model-00003-of-00006.safetensors +3 -0
  14. avatar_single/diffusion_pytorch_model-00004-of-00006.safetensors +3 -0
  15. avatar_single/diffusion_pytorch_model-00005-of-00006.safetensors +3 -0
  16. avatar_single/diffusion_pytorch_model-00006-of-00006.safetensors +3 -0
  17. avatar_single/diffusion_pytorch_model.safetensors.index.json +0 -0
  18. chinese-wav2vec2-base/.gitattributes +27 -0
  19. chinese-wav2vec2-base/README.md +61 -0
  20. chinese-wav2vec2-base/chinese-wav2vec2-base-fairseq-ckpt.pt +3 -0
  21. chinese-wav2vec2-base/config.json +105 -0
  22. chinese-wav2vec2-base/preprocessor_config.json +8 -0
  23. chinese-wav2vec2-base/pytorch_model.bin +3 -0
  24. config.json +3 -0
  25. model_index.json +3 -0
  26. vocal_separator/Kim_Vocal_2.onnx +3 -0
  27. vocal_separator/download_checks.json +289 -0
  28. vocal_separator/mdx_model_data.json +482 -0
  29. vocal_separator/vr_model_data.json +137 -0
README.md CHANGED
@@ -1,3 +1,274 @@
1
- ---
2
- license: mit
3
- ---
1
+ # LongCat-Video
2
+
3
+ <div align="center">
4
+ <img src="assets/longcat-video_logo.svg" width="45%" alt="LongCat-Video" />
5
+ </div>
6
+ <hr>
7
+
8
+ <div align="center" style="line-height: 1;">
9
+ <a href='https://meituan-longcat.github.io/LongCat-Video/'><img src='https://img.shields.io/badge/Project-Page-green'></a>
10
+ <a href='https://arxiv.org/abs/2510.22200'><img src='https://img.shields.io/badge/Technique-Report-red'></a>
11
+ <a href='https://huggingface.co/meituan-longcat/LongCat-Video'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue'></a>
12
+ </div>
13
+
14
+ <div align="center" style="line-height: 1;">
15
+ <a href='https://github.com/meituan-longcat/LongCat-Flash-Chat/blob/main/figures/wechat_official_accounts.png'><img src='https://img.shields.io/badge/WeChat-LongCat-brightgreen?logo=wechat&logoColor=white'></a>
16
+ <a href='https://x.com/Meituan_LongCat'><img src='https://img.shields.io/badge/Twitter-LongCat-white?logo=x&logoColor=white'></a>
17
+ </div>
18
+
19
+ <div align="center" style="line-height: 1;">
20
+ <a href='LICENSE'><img src='https://img.shields.io/badge/License-MIT-f5de53?&color=f5de53'></a>
21
+ </div>
22
+
23
+ ## Model Introduction
24
+ We introduce LongCat-Video, a foundational video generation model with 13.6B parameters, delivering strong performance across *Text-to-Video*, *Image-to-Video*, and *Video-Continuation* generation tasks. It particularly excels in efficient and high-quality long video generation, representing our first step toward world models.
25
+
26
+ ### Key Features
27
+ - 🌟 **Unified architecture for multiple tasks**: LongCat-Video unifies *Text-to-Video*, *Image-to-Video*, and *Video-Continuation* tasks within a single video generation framework. It natively supports all these tasks with a single model and consistently delivers strong performance across each individual task.
28
+ - 🌟 **Long video generation**: LongCat-Video is natively pretrained on *Video-Continuation* tasks, enabling it to produce minutes-long videos without color drifting or quality degradation.
29
+ - 🌟 **Efficient inference**: LongCat-Video generates 720p, 30 fps videos within minutes by employing a coarse-to-fine generation strategy along both the temporal and spatial axes. Block Sparse Attention further enhances efficiency, particularly at high resolutions.
30
+ - 🌟 **Strong performance with multi-reward RLHF**: Powered by multi-reward Group Relative Policy Optimization (GRPO), comprehensive evaluations on both internal and public benchmarks demonstrate that LongCat-Video achieves performance comparable to leading open-source video generation models as well as the latest commercial solutions.
31
+
32
+ For more details, please refer to the comprehensive [***LongCat-Video Technical Report***](https://arxiv.org/abs/2510.22200).
33
+
34
+ ## 🎥 Teaser Video
35
+
36
+ <div align="center">
37
+ <video src="https://github.com/user-attachments/assets/00fa63f0-9c4e-461a-a79e-c662ad596d7d" width="2264" height="384"> </video>
38
+ </div>
39
+
40
+ ## 🔥 Latest News!!
41
+ - Dec 12, 2025: 🚀 We propose LongCat-Video-Avatar, a unified model for expressive and highly dynamic speech-driven character animation. It natively supports *Text-Speech-to-Video*, *Image-Speech-to-Video*, and *Video-Continuation* tasks, and is compatible with both *single-stream* and *multi-stream* speech inputs. We are releasing the [***LongCat-Video-Avatar Technical Report***](https://arxiv.org/abs/2510.22200) and inference code, and you can find the model weights at 🤗 [Huggingface](https://huggingface.co/meituan-longcat/LongCat-Video) !
42
+ - Oct 25, 2025: 🚀 We've released LongCat-Video, a foundational video generation model. Tech report and models are available at [***LongCat-Video Technical Report***](https://arxiv.org/abs/2510.22200) and 🤗 [Huggingface](https://huggingface.co/meituan-longcat/LongCat-Video) !
43
+
44
+
45
+
46
+ ## Quick Start
47
+
48
+ ### Installation
49
+
50
+ Clone the repo:
51
+
52
+ ```shell
53
+ git clone --single-branch --branch main https://github.com/meituan-longcat/LongCat-Video
54
+ cd LongCat-Video
55
+ ```
56
+
57
+ Install dependencies:
58
+
59
+ ```shell
60
+ # create conda environment
61
+ conda create -n longcat-video python=3.10
62
+ conda activate longcat-video
63
+
64
+ # install torch (configure according to your CUDA version)
65
+ pip install torch==2.6.0+cu124 torchvision==0.21.0+cu124 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu124
66
+
67
+ # install flash-attn-2
68
+ pip install ninja
69
+ pip install psutil
70
+ pip install packaging
71
+ pip install flash_attn==2.7.4.post1
72
+
73
+ # install other requirements
74
+ pip install -r requirements.txt
75
+
76
+ # install longcat-video-avatar requirements
77
+ pip install -r requirements_avatar.txt
78
+ conda install -c conda-forge librosa
79
+ conda install -c conda-forge ffmpeg
80
+ ```
81
+
82
+ FlashAttention-2 is enabled in the model config by default; you can also change the model config ("./weights/LongCat-Video/dit/config.json") to use FlashAttention-3 or xformers once installed.
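A minimal sketch of that config change, assuming the DiT config uses the same boolean flags (`enable_flashattn2`, `enable_flashattn3`, `enable_xformers`) that appear in the avatar `config.json` files in this commit, and assuming only one backend should be enabled at a time. The helper name `select_attention_backend` is hypothetical and not part of the repository.

```python
import json
from pathlib import Path

# Flag names as they appear in the avatar config.json files of this commit.
BACKEND_FLAGS = ("enable_flashattn2", "enable_flashattn3", "enable_xformers")


def select_attention_backend(config_path, backend):
    """Enable exactly one attention backend flag in a model config file.

    Assumption: the model reads these booleans at load time and expects
    at most one of them to be true.
    """
    if backend not in BACKEND_FLAGS:
        raise ValueError(f"unknown backend {backend!r}; expected one of {BACKEND_FLAGS}")

    path = Path(config_path)
    config = json.loads(path.read_text())

    # Turn the requested backend on and the others off.
    for flag in BACKEND_FLAGS:
        config[flag] = (flag == backend)

    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `select_attention_backend("./weights/LongCat-Video/dit/config.json", "enable_flashattn3")` would switch to FlashAttention-3, provided that package is installed.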
83
+
84
+ ### Model Download
85
+
86
+ | Models | Description | Download Link |
87
+ | --- | --- | --- |
88
+ | LongCat-Video | foundational video generation | 🤗 [Huggingface](https://huggingface.co/meituan-longcat/LongCat-Video) |
89
+ | LongCat-Video-Avatar-Single | single-character speech-driven video generation | 🤗 [Huggingface](xxx) |
90
+ | LongCat-Video-Avatar-Multi | multi-character speech-driven video generation | 🤗 [Huggingface](xxx) |
91
+
92
+ Download models using huggingface-cli:
93
+ ```shell
94
+ pip install "huggingface_hub[cli]"
95
+ huggingface-cli download meituan-longcat/LongCat-Video --local-dir ./weights/LongCat-Video
96
+ huggingface-cli download meituan-longcat/LongCat-Video-Avatar --local-dir ./weights/LongCat-Video-Avatar
97
+ ```
98
+
99
+ ### Run Text-to-Video
100
+
101
+ ```shell
102
+ # Single-GPU inference
103
+ torchrun run_demo_text_to_video.py --checkpoint_dir=./weights/LongCat-Video --enable_compile
104
+
105
+ # Multi-GPU inference
106
+ torchrun --nproc_per_node=2 run_demo_text_to_video.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --enable_compile
107
+ ```
108
+
109
+ ### Run Image-to-Video
110
+
111
+ ```shell
112
+ # Single-GPU inference
113
+ torchrun run_demo_image_to_video.py --checkpoint_dir=./weights/LongCat-Video --enable_compile
114
+
115
+ # Multi-GPU inference
116
+ torchrun --nproc_per_node=2 run_demo_image_to_video.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --enable_compile
117
+ ```
118
+
119
+ ### Run Video-Continuation
120
+
121
+ ```shell
122
+ # Single-GPU inference
123
+ torchrun run_demo_video_continuation.py --checkpoint_dir=./weights/LongCat-Video --enable_compile
124
+
125
+ # Multi-GPU inference
126
+ torchrun --nproc_per_node=2 run_demo_video_continuation.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --enable_compile
127
+ ```
128
+
129
+ ### Run Long-Video Generation
130
+
131
+ ```shell
132
+ # Single-GPU inference
133
+ torchrun run_demo_long_video.py --checkpoint_dir=./weights/LongCat-Video --enable_compile
134
+
135
+ # Multi-GPU inference
136
+ torchrun --nproc_per_node=2 run_demo_long_video.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --enable_compile
137
+ ```
138
+
139
+ ### Run Interactive Video Generation
140
+
141
+ ```shell
142
+ # Single-GPU inference
143
+ torchrun run_demo_interactive_video.py --checkpoint_dir=./weights/LongCat-Video --enable_compile
144
+
145
+ # Multi-GPU inference
146
+ torchrun --nproc_per_node=2 run_demo_interactive_video.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --enable_compile
147
+ ```
148
+
149
+ ### Run LongCat-Video-Avatar
150
+ 💡 Some tips
151
+ > - Lip synchronization accuracy: Audio CFG works best between 3 and 5. Increase the audio CFG value for better synchronization.
152
+ > - Prompt Enhancement: Incorporate talking-related descriptions in the prompt to better guide lip movements.
153
+ > - Mitigate repeated actions: Setting the reference image index (--ref_img_index, default 10) between 0 and 24 ensures better consistency, while selecting an index outside that range helps reduce repeated actions. Additionally, increasing the mask frame range (--mask_frame_range, default 0; 0 or 1 is recommended) can further help mitigate repeated actions.
154
+ > - Super resolution: Our model is compatible with both 480P and 720P, which can be controlled via --resolution.
155
+
156
+ - Single-Speech-to-Video Generation
157
+ ```shell
158
+ # Speech-Text-to-Video
159
+ torchrun --nproc_per_node=2 run_demo_avatar_single_speech_to_video.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --stage_1=at2v --input_json=assets/avatar/single_example_1.json
160
+
161
+ # Speech-Image-to-Video
162
+ torchrun --nproc_per_node=2 run_demo_avatar_single_speech_to_video.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --stage_1=ai2v --input_json=assets/avatar/single_example_1.json
163
+
164
+ # Speech-Text-to-Video and Video-Continuation
165
+ torchrun --nproc_per_node=2 run_demo_avatar_single_speech_to_video.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --stage_1=at2v --input_json=assets/avatar/single_example_1.json --num_segments=5 --ref_img_index=30 --mask_frame_range=0
166
+
167
+ # Speech-Image-to-Video and Video-Continuation
168
+ torchrun --nproc_per_node=2 run_demo_avatar_single_speech_to_video.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --stage_1=ai2v --input_json=assets/avatar/single_example_1.json --num_segments=5 --ref_img_index=30 --mask_frame_range=0
169
+ ```
170
+
171
+ - Multi-Speech-to-Video Generation
172
+ ```shell
173
+ # Speech-Image-to-Video
174
+ torchrun --nproc_per_node=2 run_demo_avatar_multi_speech_to_video.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --input_json=assets/avatar/multi_example_1.json
175
+
176
+ # Speech-Image-to-Video and Video-Continuation
177
+ torchrun --nproc_per_node=2 run_demo_avatar_multi_speech_to_video.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --input_json=assets/avatar/multi_example_1.json --num_segments=5 --ref_img_index=30 --mask_frame_range=0
178
+ ```
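The avatar commands above differ only in a handful of flags, so they can be assembled programmatically. The sketch below builds the single-speech command line from the flags shown in this README. The helper `build_avatar_command` is hypothetical, and tying `--context_parallel_size` to `--nproc_per_node` is an assumption taken from the examples, where both are 2.

```python
import shlex

AVATAR_SCRIPT = "run_demo_avatar_single_speech_to_video.py"
VALID_STAGES = ("at2v", "ai2v")  # values of --stage_1 used in the README examples


def build_avatar_command(input_json, stage_1="ai2v",
                         checkpoint_dir="./weights/LongCat-Video",
                         num_gpus=2, num_segments=None,
                         ref_img_index=None, mask_frame_range=None):
    """Return the torchrun argument list for a single-speech avatar run."""
    if stage_1 not in VALID_STAGES:
        raise ValueError(f"stage_1 must be one of {VALID_STAGES}, got {stage_1!r}")

    cmd = [
        "torchrun",
        f"--nproc_per_node={num_gpus}",
        AVATAR_SCRIPT,
        f"--context_parallel_size={num_gpus}",  # assumption: matches the GPU count
        f"--checkpoint_dir={checkpoint_dir}",
        f"--stage_1={stage_1}",
        f"--input_json={input_json}",
    ]

    # Video-continuation flags are appended only when explicitly set.
    optional = {
        "num_segments": num_segments,
        "ref_img_index": ref_img_index,
        "mask_frame_range": mask_frame_range,
    }
    for name, value in optional.items():
        if value is not None:
            cmd.append(f"--{name}={value}")
    return cmd


if __name__ == "__main__":
    args = build_avatar_command("assets/avatar/single_example_1.json",
                                stage_1="at2v", num_segments=5,
                                ref_img_index=30, mask_frame_range=0)
    print(shlex.join(args))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting issues with paths.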
179
+
180
+ ### Run Streamlit
181
+
182
+ ```shell
183
+ # Single-GPU inference
184
+ streamlit run ./run_streamlit.py --server.fileWatcherType none --server.headless=false
185
+ ```
186
+
187
+
188
+
189
+ ## Evaluation Results
190
+
191
+ ### Text-to-Video
192
+ The *Text-to-Video* MOS evaluation results on our internal benchmark.
193
+
194
+ | **MOS score** | **Veo3** | **PixVerse-V5** | **Wan 2.2-T2V-A14B** | **LongCat-Video** |
195
+ |---------------|-------------------|--------------------|-------------|-------------|
196
+ | **Accessibility** | Proprietary | Proprietary | Open Source | Open Source |
197
+ | **Architecture** | - | - | MoE | Dense |
198
+ | **# Total Params** | - | - | 28B | 13.6B |
199
+ | **# Activated Params** | - | - | 14B | 13.6B |
200
+ | Text-Alignment↑ | 3.99 | 3.81 | 3.70 | 3.76 |
201
+ | Visual Quality↑ | 3.23 | 3.13 | 3.26 | 3.25 |
202
+ | Motion Quality↑ | 3.86 | 3.81 | 3.78 | 3.74 |
203
+ | Overall Quality↑ | 3.48 | 3.36 | 3.35 | 3.38 |
204
+
205
+ ### Image-to-Video
206
+ The *Image-to-Video* MOS evaluation results on our internal benchmark.
207
+
208
+ | **MOS score** | **Seedance 1.0** | **Hailuo-02** | **Wan 2.2-I2V-A14B** | **LongCat-Video** |
209
+ |---------------|-------------------|--------------------|-------------|-------------|
210
+ | **Accessibility** | Proprietary | Proprietary | Open Source | Open Source |
211
+ | **Architecture** | - | - | MoE | Dense |
212
+ | **# Total Params** | - | - | 28B | 13.6B |
213
+ | **# Activated Params** | - | - | 14B | 13.6B |
214
+ | Image-Alignment↑ | 4.12 | 4.18 | 4.18 | 4.04 |
215
+ | Text-Alignment↑ | 3.70 | 3.85 | 3.33 | 3.49 |
216
+ | Visual Quality↑ | 3.22 | 3.18 | 3.23 | 3.27 |
217
+ | Motion Quality↑ | 3.77 | 3.80 | 3.79 | 3.59 |
218
+ | Overall Quality↑ | 3.35 | 3.27 | 3.26 | 3.17 |
219
+
220
+ ## Community Works
221
+
222
+ Community works are welcome! Please open a PR or an issue to add your work.
223
+
224
+ - [CacheDiT](https://github.com/vipshop/cache-dit) offers Fully Cache Acceleration support for LongCat-Video with DBCache and TaylorSeer, achieving a nearly 1.7x speedup without obvious loss of precision. Visit their [example](https://github.com/vipshop/cache-dit/blob/main/examples/pipeline/run_longcat_video.py) for more details.
225
+
226
+
227
+ ## License Agreement
228
+
229
+ The **model weights** are released under the **MIT License**.
230
+
231
+ Any contributions to this repository are licensed under the MIT License, unless otherwise stated. This license does not grant any rights to use Meituan trademarks or patents.
232
+
233
+ See the [LICENSE](LICENSE) file for the full license text.
234
+
235
+
236
+ ## Usage Considerations
237
+ This model has not been specifically designed or comprehensively evaluated for every possible downstream application.
238
+
239
+ Developers should take into account the known limitations of large language models, including performance variations across different languages, and carefully assess accuracy, safety, and fairness before deploying the model in sensitive or high-risk scenarios.
240
+ It is the responsibility of developers and downstream users to understand and comply with all applicable laws and regulations relevant to their use case, including but not limited to data protection, privacy, and content safety requirements.
241
+
242
+ Nothing in this Model Card should be interpreted as altering or restricting the terms of the MIT License under which the model is released.
243
+
244
+ ## Citation
245
+ We kindly encourage citation of our work if you find it useful.
246
+
247
+ ```
248
+ @misc{meituanlongcatteam2025longcatvideotechnicalreport,
249
+ title={LongCat-Video Technical Report},
250
+ author={Meituan LongCat Team and Xunliang Cai and Qilong Huang and Zhuoliang Kang and Hongyu Li and Shijun Liang and Liya Ma and Siyu Ren and Xiaoming Wei and Rixu Xie and Tong Zhang},
251
+ year={2025},
252
+ eprint={2510.22200},
253
+ archivePrefix={arXiv},
254
+ primaryClass={cs.CV},
255
+ url={https://arxiv.org/abs/2510.22200},
256
+ }
257
+ @misc{meituanlongcatteam2025longcatvideoavatartechnicalreport,
258
+ title={LongCat-Video-Avatar Technical Report},
259
+ author={Meituan LongCat Team},
260
+ year={2025},
261
+ eprint={xxx},
262
+ archivePrefix={arXiv},
263
+ primaryClass={cs.CV},
264
+ url={xxx},
265
+ }
266
+ ```
267
+
268
+ ## Acknowledgements
269
+
270
+ We would like to thank the contributors to the [Wan](https://huggingface.co/Wan-AI), [UMT5-XXL](https://huggingface.co/google/umt5-xxl), [Diffusers](https://github.com/huggingface/diffusers) and [HuggingFace](https://huggingface.co) repositories, for their open research.
271
+
272
+
273
+ ## Contact
274
+ Please contact us at <a href="mailto:longcat-team@meituan.com">longcat-team@meituan.com</a> or join our WeChat Group if you have any questions.
avatar_multi/config.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "_class_name": "LongCatVideoAvatarTransformer3DModel",
3
+ "_diffusers_version": "0.32.0",
4
+ "adaln_tembed_dim": 512,
5
+ "bsa_params":{
6
+ "sparsity": 0.9375,
7
+ "chunk_3d_shape_q": [4, 4, 4],
8
+ "chunk_3d_shape_k": [4, 4, 4]
9
+ },
10
+ "caption_channels": 4096,
11
+ "cp_split_hw": null,
12
+ "depth": 48,
13
+ "enable_bsa": false,
14
+ "enable_flashattn3": false,
15
+ "enable_flashattn2": true,
16
+ "enable_xformers": false,
17
+ "frequency_embedding_size": 256,
18
+ "hidden_size": 4096,
19
+ "in_channels": 16,
20
+ "text_tokens_zero_pad": true,
21
+ "mlp_ratio": 4,
22
+ "num_heads": 32,
23
+ "out_channels": 16,
24
+ "patch_size": [
25
+ 1,
26
+ 2,
27
+ 2
28
+ ],
29
+ "audio_window": 5,
30
+ "intermediate_dim": 512,
31
+ "output_dim": 768,
32
+ "context_tokens": 32,
33
+ "vae_scale": 4,
34
+ "audio_prenorm": false,
35
+ "class_range": 24,
36
+ "class_interval": 4
37
+ }
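A few shapes follow directly from the values in this config. The sketch below derives them from a subset of the fields. Reading `sparsity` as the fraction of key chunks skipped per query chunk is an assumption about Block Sparse Attention, not something the config states, and `enable_bsa` is `false` here, so those parameters are inactive in this checkpoint's default setup. The function name `derive_shapes` is illustrative only.

```python
import json
from math import prod

# Subset of avatar_multi/config.json, copied from the diff above.
CONFIG_JSON = """
{
  "hidden_size": 4096,
  "num_heads": 32,
  "mlp_ratio": 4,
  "depth": 48,
  "bsa_params": {
    "sparsity": 0.9375,
    "chunk_3d_shape_q": [4, 4, 4],
    "chunk_3d_shape_k": [4, 4, 4]
  }
}
"""


def derive_shapes(config):
    """Compute per-head and per-chunk sizes implied by the config values."""
    bsa = config["bsa_params"]
    return {
        # Width of each attention head.
        "head_dim": config["hidden_size"] // config["num_heads"],
        # Hidden width of the feed-forward block.
        "mlp_hidden": config["hidden_size"] * config["mlp_ratio"],
        # Tokens grouped into one 3D query chunk (time x height x width).
        "tokens_per_q_chunk": prod(bsa["chunk_3d_shape_q"]),
        # Assumption: sparsity is the share of key chunks that are skipped.
        "kept_key_fraction": 1.0 - bsa["sparsity"],
    }


if __name__ == "__main__":
    for name, value in derive_shapes(json.loads(CONFIG_JSON)).items():
        print(f"{name}: {value}")
```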
avatar_multi/diffusion_pytorch_model-00001-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e02cec39c5897baee8934633e93bebdb2b5c877bf0ad0f2e7e273ad312537ed8
3
+ size 10718471704
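Each `.safetensors` entry in this commit is a Git LFS pointer (version, `oid`, `size`) rather than the weights themselves. As a sketch of how a downloaded shard could be checked against its pointer, the code below parses the three fields and compares size and SHA-256. The helper names `parse_lfs_pointer` and `verify_file` are hypothetical, and the local file path is whatever the download produced.

```python
import hashlib
from pathlib import Path

# Pointer contents for avatar_multi shard 00001, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e02cec39c5897baee8934633e93bebdb2b5c877bf0ad0f2e7e273ad312537ed8
size 10718471704
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_file(path, pointer, chunk_size=1 << 20):
    """Return True when the file's size and hash both match the pointer."""
    path = Path(path)
    # Cheap check first: a size mismatch means the file cannot match.
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Stream in chunks so multi-gigabyte shards are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Checking the size before hashing lets a truncated download fail quickly without reading roughly 10 GB per shard.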
avatar_multi/diffusion_pytorch_model-00002-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1b9f0be46dda53ed6675145d2c36f700c0bc8d0afd7f1862bdf37a38ae739dd6
3
+ size 10707244312
avatar_multi/diffusion_pytorch_model-00003-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d399a783eac783650badfa55bbda2c07180b8dd6f4246b5c7b202cf1de3091f4
3
+ size 10640118144
avatar_multi/diffusion_pytorch_model-00004-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d062b5be0390982f357c9cec60193602254f2d337947d4104e2f319b82843876
3
+ size 10707244640
avatar_multi/diffusion_pytorch_model-00005-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:13470f851b0fcd3e5141746601a8b5e05d01854579d03080a99db05e1b29a970
3
+ size 10665318256
avatar_multi/diffusion_pytorch_model-00006-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:25a74791743e648094db3689f007470ab9955793d0fcfddc1aba8f8ba4bf53b2
3
+ size 10048417104
avatar_multi/diffusion_pytorch_model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
avatar_single/config.json ADDED
@@ -0,0 +1,37 @@
1
+ {
2
+ "_class_name": "LongCatVideoAvatarTransformer3DModel",
3
+ "_diffusers_version": "0.32.0",
4
+ "adaln_tembed_dim": 512,
5
+ "bsa_params":{
6
+ "sparsity": 0.9375,
7
+ "chunk_3d_shape_q": [4, 4, 4],
8
+ "chunk_3d_shape_k": [4, 4, 4]
9
+ },
10
+ "caption_channels": 4096,
11
+ "cp_split_hw": null,
12
+ "depth": 48,
13
+ "enable_bsa": false,
14
+ "enable_flashattn3": false,
15
+ "enable_flashattn2": true,
16
+ "enable_xformers": false,
17
+ "frequency_embedding_size": 256,
18
+ "hidden_size": 4096,
19
+ "in_channels": 16,
20
+ "text_tokens_zero_pad": true,
21
+ "mlp_ratio": 4,
22
+ "num_heads": 32,
23
+ "out_channels": 16,
24
+ "patch_size": [
25
+ 1,
26
+ 2,
27
+ 2
28
+ ],
29
+ "audio_window": 5,
30
+ "intermediate_dim": 512,
31
+ "output_dim": 768,
32
+ "context_tokens": 32,
33
+ "vae_scale": 4,
34
+ "audio_prenorm": false,
35
+ "class_range": 24,
36
+ "class_interval": 4
37
+ }
avatar_single/diffusion_pytorch_model-00001-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e065fdebd25e3aace72c228a36e6da99ef7d90262458dd082835446bbc6fb9d7
3
+ size 10718471704
avatar_single/diffusion_pytorch_model-00002-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c983697c66644cb18f22136597e6488282d0b6a11344c7d97384f39a40af5522
3
+ size 10707244312
avatar_single/diffusion_pytorch_model-00003-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ecadc8f493ebb5fa9e12af6d495d2e9e3462f49bda0b0e93b87d65d0d08a8df9
3
+ size 10640118144
avatar_single/diffusion_pytorch_model-00004-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d9d55f6e892b38f19a80494ca4dfbb5e4c2a1468bcd133211070742da79d36dd
3
+ size 10707244640
avatar_single/diffusion_pytorch_model-00005-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a17d6702d4f1f864dcf7fefe5374cff2bd2b717bfde5cd6777be450bb24b2b9f
3
+ size 10665318256
avatar_single/diffusion_pytorch_model-00006-of-00006.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c2e3116c0a88638abad265546825749514780539d0d1a5275ce8280a55aecc8a
3
+ size 10048417104
avatar_single/diffusion_pytorch_model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
chinese-wav2vec2-base/.gitattributes ADDED
@@ -0,0 +1,27 @@
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ftz filter=lfs diff=lfs merge=lfs -text
6
+ *.gz filter=lfs diff=lfs merge=lfs -text
7
+ *.h5 filter=lfs diff=lfs merge=lfs -text
8
+ *.joblib filter=lfs diff=lfs merge=lfs -text
9
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
10
+ *.model filter=lfs diff=lfs merge=lfs -text
11
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
12
+ *.onnx filter=lfs diff=lfs merge=lfs -text
13
+ *.ot filter=lfs diff=lfs merge=lfs -text
14
+ *.parquet filter=lfs diff=lfs merge=lfs -text
15
+ *.pb filter=lfs diff=lfs merge=lfs -text
16
+ *.pt filter=lfs diff=lfs merge=lfs -text
17
+ *.pth filter=lfs diff=lfs merge=lfs -text
18
+ *.rar filter=lfs diff=lfs merge=lfs -text
19
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
20
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
21
+ *.tflite filter=lfs diff=lfs merge=lfs -text
22
+ *.tgz filter=lfs diff=lfs merge=lfs -text
23
+ *.wasm filter=lfs diff=lfs merge=lfs -text
24
+ *.xz filter=lfs diff=lfs merge=lfs -text
25
+ *.zip filter=lfs diff=lfs merge=lfs -text
26
+ *.zstandard filter=lfs diff=lfs merge=lfs -text
27
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
chinese-wav2vec2-base/README.md ADDED
@@ -0,0 +1,61 @@
1
+ ---
2
+ license: mit
3
+ ---
4
+ Pretrained on the 10k-hour WenetSpeech L subset. More details are available at [TencentGameMate/chinese_speech_pretrain](https://github.com/TencentGameMate/chinese_speech_pretrain)
5
+
6
+ This model does not have a tokenizer as it was pretrained on audio alone.
7
+ In order to use this model for speech recognition, a tokenizer should be created and the model should be fine-tuned on labeled text data.
8
+
9
+ python package:
10
+ transformers==4.16.2
11
+
12
+ ```python
13
+
14
+
15
+ import torch
16
+ import torch.nn.functional as F
17
+ import soundfile as sf
18
+ from fairseq import checkpoint_utils
19
+
20
+ from transformers import (
21
+ Wav2Vec2FeatureExtractor,
22
+ Wav2Vec2ForPreTraining,
23
+ Wav2Vec2Model,
24
+ )
25
+ from transformers.models.wav2vec2.modeling_wav2vec2 import _compute_mask_indices
26
+
27
+ model_path=""
28
+ wav_path=""
29
+ mask_prob=0.0
30
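+ mask_length=10
+ # assumption: use CUDA when available; the original snippet left `device` undefined
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")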
31
+
32
+ feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_path)
33
+ model = Wav2Vec2Model.from_pretrained(model_path)
34
+
35
+ # for pretrain: Wav2Vec2ForPreTraining
36
+ # model = Wav2Vec2ForPreTraining.from_pretrained(model_path)
37
+
38
+ model = model.to(device)
39
+ model = model.half()
40
+ model.eval()
41
+
42
+ wav, sr = sf.read(wav_path)
43
+ input_values = feature_extractor(wav, return_tensors="pt").input_values
44
+ input_values = input_values.half()
45
+ input_values = input_values.to(device)
46
+
47
+ # for Wav2Vec2ForPreTraining
48
+ # batch_size, raw_sequence_length = input_values.shape
49
+ # sequence_length = model._get_feat_extract_output_lengths(raw_sequence_length)
50
+ # mask_time_indices = _compute_mask_indices((batch_size, sequence_length), mask_prob=0.0, mask_length=2)
51
+ # mask_time_indices = torch.tensor(mask_time_indices, device=input_values.device, dtype=torch.long)
52
+
53
+ with torch.no_grad():
54
+ outputs = model(input_values)
55
+ last_hidden_state = outputs.last_hidden_state
56
+
57
+ # for Wav2Vec2ForPreTraining
58
+ # outputs = model(input_values, mask_time_indices=mask_time_indices, output_hidden_states=True)
59
+ # last_hidden_state = outputs.hidden_states[-1]
60
+
61
+ ```
chinese-wav2vec2-base/chinese-wav2vec2-base-fairseq-ckpt.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a75e04e426977dd399415b7f586b18978bc6836a3e8514ae1bb29e468fb17184
3
+ size 1140632401
chinese-wav2vec2-base/config.json ADDED
@@ -0,0 +1,105 @@
1
+ {
2
+ "activation_dropout": 0.1,
3
+ "adapter_kernel_size": 3,
4
+ "adapter_stride": 2,
5
+ "add_adapter": false,
6
+ "apply_spec_augment": true,
7
+ "architectures": [
8
+ "Wav2Vec2ForPreTraining"
9
+ ],
10
+ "attention_dropout": 0.1,
11
+ "bos_token_id": 1,
12
+ "classifier_proj_size": 256,
13
+ "codevector_dim": 256,
14
+ "contrastive_logits_temperature": 0.1,
15
+ "conv_bias": false,
16
+ "conv_dim": [
17
+ 512,
18
+ 512,
19
+ 512,
20
+ 512,
21
+ 512,
22
+ 512,
23
+ 512
24
+ ],
25
+ "conv_kernel": [
26
+ 10,
27
+ 3,
28
+ 3,
29
+ 3,
30
+ 3,
31
+ 2,
32
+ 2
33
+ ],
34
+ "conv_stride": [
35
+ 5,
36
+ 2,
37
+ 2,
38
+ 2,
39
+ 2,
40
+ 2,
41
+ 2
42
+ ],
43
+ "ctc_loss_reduction": "sum",
44
+ "ctc_zero_infinity": false,
45
+ "diversity_loss_weight": 0.1,
46
+ "do_stable_layer_norm": false,
47
+ "eos_token_id": 2,
48
+ "feat_extract_activation": "gelu",
49
+ "feat_extract_norm": "group",
50
+ "feat_proj_dropout": 0.0,
51
+ "feat_quantizer_dropout": 0.0,
52
+ "final_dropout": 0.1,
53
+ "hidden_act": "gelu",
54
+ "hidden_dropout": 0.1,
55
+ "hidden_size": 768,
56
+ "initializer_range": 0.02,
57
+ "intermediate_size": 3072,
58
+ "layer_norm_eps": 1e-05,
59
+ "layerdrop": 0.1,
60
+ "mask_feature_length": 10,
61
+ "mask_feature_min_masks": 0,
62
+ "mask_feature_prob": 0.0,
63
+ "mask_time_length": 10,
64
+ "mask_time_min_masks": 2,
65
+ "mask_time_prob": 0.05,
66
+ "model_type": "wav2vec2",
67
+ "num_adapter_layers": 3,
68
+ "num_attention_heads": 12,
69
+ "num_codevector_groups": 2,
70
+ "num_codevectors_per_group": 320,
71
+ "num_conv_pos_embedding_groups": 16,
72
+ "num_conv_pos_embeddings": 128,
73
+ "num_feat_extract_layers": 7,
74
+ "num_hidden_layers": 12,
75
+ "num_negatives": 100,
76
+ "output_hidden_size": 768,
77
+ "pad_token_id": 0,
78
+ "proj_codevector_dim": 256,
79
+ "tdnn_dilation": [
80
+ 1,
81
+ 2,
82
+ 3,
83
+ 1,
84
+ 1
85
+ ],
86
+ "tdnn_dim": [
87
+ 512,
88
+ 512,
89
+ 512,
90
+ 512,
91
+ 1500
92
+ ],
93
+ "tdnn_kernel": [
94
+ 5,
95
+ 3,
96
+ 3,
97
+ 1,
98
+ 1
99
+ ],
100
+ "torch_dtype": "float32",
101
+ "transformers_version": "4.16.2",
102
+ "use_weighted_layer_sum": false,
103
+ "vocab_size": 32,
104
+ "xvector_output_dim": 512
105
+ }
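The `conv_kernel` and `conv_stride` lists in this config determine how many feature frames the encoder emits per second of audio. The sketch below applies the standard unpadded 1D convolution length formula layer by layer, using the 16 kHz sampling rate from `preprocessor_config.json`. The function name `feature_frames` is illustrative; it mirrors the arithmetic only and does not call the model.

```python
from math import prod

# Values copied from chinese-wav2vec2-base/config.json in this commit.
CONV_KERNEL = [10, 3, 3, 3, 3, 2, 2]
CONV_STRIDE = [5, 2, 2, 2, 2, 2, 2]
# Value copied from chinese-wav2vec2-base/preprocessor_config.json.
SAMPLING_RATE = 16000


def feature_frames(num_samples):
    """Number of encoder frames produced for a waveform of num_samples."""
    length = num_samples
    for kernel, stride in zip(CONV_KERNEL, CONV_STRIDE):
        # Output length of an unpadded 1D convolution.
        length = (length - kernel) // stride + 1
    return length


if __name__ == "__main__":
    hop = prod(CONV_STRIDE)  # samples between consecutive frames
    print(f"hop: {hop} samples = {1000 * hop / SAMPLING_RATE} ms")
    print(f"frames for 1 s of audio: {feature_frames(SAMPLING_RATE)}")
```

With these values the total stride is 320 samples, so frames are 20 ms apart, and one second of 16 kHz audio yields 49 frames of width `hidden_size` (768).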
chinese-wav2vec2-base/preprocessor_config.json ADDED
@@ -0,0 +1,8 @@
1
+ {
2
+ "do_normalize": false,
3
+ "feature_size": 1,
4
+ "padding_side": "right",
5
+ "padding_value": 0.0,
6
+ "return_attention_mask": false,
7
+ "sampling_rate": 16000
8
+ }
chinese-wav2vec2-base/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:be2da40c9e7ae26bfc904a3ed79ebb9e8f060bec6dba85d6a6ae86114bc38901
3
+ size 380261837
config.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ {
2
+ "model_name": "LongCat-Video-Avatar"
3
+ }
model_index.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ {
2
+ "model_name": "LongCat-Video-Avatar"
3
+ }
vocal_separator/Kim_Vocal_2.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ce74ef3b6a6024ce44211a07be9cf8bc6d87728cc852a68ab34eb8e58cde9c8b
3
+ size 66759214
vocal_separator/download_checks.json ADDED
@@ -0,0 +1,289 @@
1
+ {
2
+ "current_version": "UVR_Patch_10_6_23_4_27",
3
+ "current_version_ocl": "UVR_Patch_10_6_23_4_27",
4
+ "current_version_mac": "UVR_Patch_10_6_23_4_27",
5
+ "current_version_linux": "UVR_Patch_10_6_23_4_27",
6
+ "vr_download_list": {
7
+ "VR Arch Single Model v5: 1_HP-UVR": "1_HP-UVR.pth",
8
+ "VR Arch Single Model v5: 2_HP-UVR": "2_HP-UVR.pth",
9
+ "VR Arch Single Model v5: 3_HP-Vocal-UVR": "3_HP-Vocal-UVR.pth",
10
+ "VR Arch Single Model v5: 4_HP-Vocal-UVR": "4_HP-Vocal-UVR.pth",
11
+ "VR Arch Single Model v5: 5_HP-Karaoke-UVR": "5_HP-Karaoke-UVR.pth",
12
+ "VR Arch Single Model v5: 6_HP-Karaoke-UVR": "6_HP-Karaoke-UVR.pth",
13
+ "VR Arch Single Model v5: 7_HP2-UVR": "7_HP2-UVR.pth",
14
+ "VR Arch Single Model v5: 8_HP2-UVR": "8_HP2-UVR.pth",
15
+ "VR Arch Single Model v5: 9_HP2-UVR": "9_HP2-UVR.pth",
16
+ "VR Arch Single Model v5: 10_SP-UVR-2B-32000-1": "10_SP-UVR-2B-32000-1.pth",
17
+ "VR Arch Single Model v5: 11_SP-UVR-2B-32000-2": "11_SP-UVR-2B-32000-2.pth",
18
+ "VR Arch Single Model v5: 12_SP-UVR-3B-44100": "12_SP-UVR-3B-44100.pth",
19
+ "VR Arch Single Model v5: 13_SP-UVR-4B-44100-1": "13_SP-UVR-4B-44100-1.pth",
20
+ "VR Arch Single Model v5: 14_SP-UVR-4B-44100-2": "14_SP-UVR-4B-44100-2.pth",
21
+ "VR Arch Single Model v5: 15_SP-UVR-MID-44100-1": "15_SP-UVR-MID-44100-1.pth",
22
+ "VR Arch Single Model v5: 16_SP-UVR-MID-44100-2": "16_SP-UVR-MID-44100-2.pth",
23
+ "VR Arch Single Model v5: 17_HP-Wind_Inst-UVR": "17_HP-Wind_Inst-UVR.pth",
24
+ "VR Arch Single Model v5: UVR-De-Echo-Aggressive by FoxJoy": "UVR-De-Echo-Aggressive.pth",
25
+ "VR Arch Single Model v5: UVR-De-Echo-Normal by FoxJoy": "UVR-De-Echo-Normal.pth",
26
+ "VR Arch Single Model v5: UVR-DeEcho-DeReverb by FoxJoy": "UVR-DeEcho-DeReverb.pth",
27
+ "VR Arch Single Model v5: UVR-DeNoise-Lite by FoxJoy": "UVR-DeNoise-Lite.pth",
28
+ "VR Arch Single Model v5: UVR-DeNoise by FoxJoy": "UVR-DeNoise.pth",
29
+ "VR Arch Single Model v5: UVR-BVE-4B_SN-44100-1": "UVR-BVE-4B_SN-44100-1.pth",
30
+ "VR Arch Single Model v4: MGM_HIGHEND_v4": "MGM_HIGHEND_v4.pth",
31
+ "VR Arch Single Model v4: MGM_LOWEND_A_v4": "MGM_LOWEND_A_v4.pth",
32
+ "VR Arch Single Model v4: MGM_LOWEND_B_v4": "MGM_LOWEND_B_v4.pth",
33
+ "VR Arch Single Model v4: MGM_MAIN_v4": "MGM_MAIN_v4.pth"
34
+ },
35
+
36
+ "mdx_download_list": {
37
+ "MDX-Net Model: UVR-MDX-NET Inst HQ 1": "UVR-MDX-NET-Inst_HQ_1.onnx",
38
+ "MDX-Net Model: UVR-MDX-NET Inst HQ 2": "UVR-MDX-NET-Inst_HQ_2.onnx",
39
+ "MDX-Net Model: UVR-MDX-NET Inst HQ 3": "UVR-MDX-NET-Inst_HQ_3.onnx",
40
+ "MDX-Net Model: UVR-MDX-NET Inst HQ 4": "UVR-MDX-NET-Inst_HQ_4.onnx",
41
+ "MDX-Net Model: UVR-MDX-NET Inst HQ 5": "UVR-MDX-NET-Inst_HQ_5.onnx",
42
+ "MDX-Net Model: UVR-MDX-NET Main": "UVR_MDXNET_Main.onnx",
43
+ "MDX-Net Model: UVR-MDX-NET Inst Main": "UVR-MDX-NET-Inst_Main.onnx",
44
+ "MDX-Net Model: UVR-MDX-NET 1": "UVR_MDXNET_1_9703.onnx",
45
+ "MDX-Net Model: UVR-MDX-NET 2": "UVR_MDXNET_2_9682.onnx",
46
+ "MDX-Net Model: UVR-MDX-NET 3": "UVR_MDXNET_3_9662.onnx",
47
+ "MDX-Net Model: UVR-MDX-NET Inst 1": "UVR-MDX-NET-Inst_1.onnx",
48
+ "MDX-Net Model: UVR-MDX-NET Inst 2": "UVR-MDX-NET-Inst_2.onnx",
49
+ "MDX-Net Model: UVR-MDX-NET Inst 3": "UVR-MDX-NET-Inst_3.onnx",
50
+ "MDX-Net Model: UVR-MDX-NET Karaoke": "UVR_MDXNET_KARA.onnx",
51
+ "MDX-Net Model: UVR-MDX-NET Karaoke 2": "UVR_MDXNET_KARA_2.onnx",
52
+ "MDX-Net Model: UVR_MDXNET_9482": "UVR_MDXNET_9482.onnx",
53
+ "MDX-Net Model: UVR-MDX-NET Voc FT": "UVR-MDX-NET-Voc_FT.onnx",
54
+ "MDX-Net Model: Kim Vocal 1": "Kim_Vocal_1.onnx",
55
+ "MDX-Net Model: Kim Vocal 2": "Kim_Vocal_2.onnx",
56
+ "MDX-Net Model: Kim Inst": "Kim_Inst.onnx",
57
+ "MDX-Net Model: Reverb HQ By FoxJoy": "Reverb_HQ_By_FoxJoy.onnx",
58
+ "MDX-Net Model: UVR-MDX-NET Crowd HQ 1 By Aufr33": "UVR-MDX-NET_Crowd_HQ_1.onnx",
59
+ "MDX-Net Model: kuielab_a_vocals": "kuielab_a_vocals.onnx",
60
+ "MDX-Net Model: kuielab_a_other": "kuielab_a_other.onnx",
61
+ "MDX-Net Model: kuielab_a_bass": "kuielab_a_bass.onnx",
62
+ "MDX-Net Model: kuielab_a_drums": "kuielab_a_drums.onnx",
63
+ "MDX-Net Model: kuielab_b_vocals": "kuielab_b_vocals.onnx",
64
+ "MDX-Net Model: kuielab_b_other": "kuielab_b_other.onnx",
65
+ "MDX-Net Model: kuielab_b_bass": "kuielab_b_bass.onnx",
66
+ "MDX-Net Model: kuielab_b_drums": "kuielab_b_drums.onnx"
67
+ },
68
+
69
+ "demucs_download_list":{
70
+
71
+ "Demucs v4: htdemucs_ft":{
72
+ "f7e0c4bc-ba3fe64a.th":"https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/f7e0c4bc-ba3fe64a.th",
73
+ "d12395a8-e57c48e6.th":"https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/d12395a8-e57c48e6.th",
74
+ "92cfc3b6-ef3bcb9c.th":"https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/92cfc3b6-ef3bcb9c.th",
75
+ "04573f0d-f3cf25b2.th":"https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/04573f0d-f3cf25b2.th",
76
+ "htdemucs_ft.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/htdemucs_ft.yaml"
77
+ },
78
+
79
+ "Demucs v4: htdemucs":{
80
+ "955717e8-8726e21a.th": "https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/955717e8-8726e21a.th",
81
+ "htdemucs.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/htdemucs.yaml"
82
+ },
83
+
84
+ "Demucs v4: hdemucs_mmi":{
85
+ "75fc33f5-1941ce65.th": "https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/75fc33f5-1941ce65.th",
86
+ "hdemucs_mmi.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/hdemucs_mmi.yaml"
87
+ },
88
+ "Demucs v4: htdemucs_6s":{
89
+ "5c90dfd2-34c22ccb.th": "https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/5c90dfd2-34c22ccb.th",
90
+ "htdemucs_6s.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/htdemucs_6s.yaml"
91
+ },
92
+ "Demucs v3: mdx":{
93
+ "0d19c1c6-0f06f20e.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/0d19c1c6-0f06f20e.th",
94
+ "7ecf8ec1-70f50cc9.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/7ecf8ec1-70f50cc9.th",
95
+ "c511e2ab-fe698775.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/c511e2ab-fe698775.th",
96
+ "7d865c68-3d5dd56b.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/7d865c68-3d5dd56b.th",
97
+ "mdx.yaml": "https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx.yaml"
98
+ },
99
+
100
+ "Demucs v3: mdx_q":{
101
+ "6b9c2ca1-3fd82607.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/6b9c2ca1-3fd82607.th",
102
+ "b72baf4e-8778635e.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/b72baf4e-8778635e.th",
103
+ "42e558d4-196e0e1b.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/42e558d4-196e0e1b.th",
104
+ "305bc58f-18378783.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/305bc58f-18378783.th",
105
+ "mdx_q.yaml": "https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx_q.yaml"
106
+ },
107
+
108
+ "Demucs v3: mdx_extra":{
109
+ "e51eebcc-c1b80bdd.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/e51eebcc-c1b80bdd.th",
110
+ "a1d90b5c-ae9d2452.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/a1d90b5c-ae9d2452.th",
111
+ "5d2d6c55-db83574e.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/5d2d6c55-db83574e.th",
112
+ "cfa93e08-61801ae1.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/cfa93e08-61801ae1.th",
113
+ "mdx_extra.yaml": "https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx_extra.yaml"
114
+ },
115
+
116
+ "Demucs v3: mdx_extra_q": {
117
+ "83fc094f-4a16d450.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/83fc094f-4a16d450.th",
118
+ "464b36d7-e5a9386e.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/464b36d7-e5a9386e.th",
119
+ "14fc6a69-a89dd0ee.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/14fc6a69-a89dd0ee.th",
120
+ "7fd6ef75-a905dd85.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/7fd6ef75-a905dd85.th",
121
+ "mdx_extra_q.yaml": "https://raw.githubusercontent.com/facebookresearch/demucs/main/demucs/remote/mdx_extra_q.yaml"
122
+ },
123
+
124
+ "Demucs v3: UVR Model":{
125
+ "ebf34a2db.th": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/ebf34a2db.th",
126
+ "UVR_Demucs_Model_1.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/UVR_Demucs_Model_1.yaml"
127
+ },
128
+
129
+ "Demucs v3: repro_mdx_a":{
130
+ "9a6b4851-03af0aa6.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/9a6b4851-03af0aa6.th",
131
+ "1ef250f1-592467ce.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/1ef250f1-592467ce.th",
132
+ "fa0cb7f9-100d8bf4.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/fa0cb7f9-100d8bf4.th",
133
+ "902315c2-b39ce9c9.th": "https://dl.fbaipublicfiles.com/demucs/mdx_final/902315c2-b39ce9c9.th",
134
+ "repro_mdx_a.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/repro_mdx_a.yaml"
135
+ },
136
+
137
+ "Demucs v3: repro_mdx_a_time_only":{
138
+ "9a6b4851-03af0aa6.th":"https://dl.fbaipublicfiles.com/demucs/mdx_final/9a6b4851-03af0aa6.th",
139
+ "1ef250f1-592467ce.th":"https://dl.fbaipublicfiles.com/demucs/mdx_final/1ef250f1-592467ce.th",
140
+ "repro_mdx_a_time_only.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/repro_mdx_a_time_only.yaml"
141
+ },
142
+
143
+ "Demucs v3: repro_mdx_a_hybrid_only":{
144
+ "fa0cb7f9-100d8bf4.th":"https://dl.fbaipublicfiles.com/demucs/mdx_final/fa0cb7f9-100d8bf4.th",
145
+ "902315c2-b39ce9c9.th":"https://dl.fbaipublicfiles.com/demucs/mdx_final/902315c2-b39ce9c9.th",
146
+ "repro_mdx_a_hybrid_only.yaml": "https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/repro_mdx_a_hybrid_only.yaml"
147
+ },
148
+
149
+ "Demucs v2: demucs": {
150
+ "demucs-e07c671f.th": "https://dl.fbaipublicfiles.com/demucs/v3.0/demucs-e07c671f.th"
151
+ },
152
+
153
+ "Demucs v2: demucs_extra": {
154
+ "demucs_extra-3646af93.th":"https://dl.fbaipublicfiles.com/demucs/v3.0/demucs_extra-3646af93.th"
155
+ },
156
+
157
+ "Demucs v2: demucs48_hq": {
158
+ "demucs48_hq-28a1282c.th":"https://dl.fbaipublicfiles.com/demucs/v3.0/demucs48_hq-28a1282c.th"
159
+ },
160
+
161
+ "Demucs v2: tasnet": {
162
+ "tasnet-beb46fac.th":"https://dl.fbaipublicfiles.com/demucs/v3.0/tasnet-beb46fac.th"
163
+ },
164
+
165
+ "Demucs v2: tasnet_extra": {
166
+ "tasnet_extra-df3777b2.th":"https://dl.fbaipublicfiles.com/demucs/v3.0/tasnet_extra-df3777b2.th"
167
+ },
168
+
169
+ "Demucs v2: demucs_unittest": {
170
+ "demucs_unittest-09ebc15f.th":"https://dl.fbaipublicfiles.com/demucs/v3.0/demucs_unittest-09ebc15f.th"
171
+ },
172
+
173
+ "Demucs v1: demucs": {
174
+ "demucs.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/demucs.th"
175
+ },
176
+
177
+ "Demucs v1: demucs_extra": {
178
+ "demucs_extra.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/demucs_extra.th"
179
+ },
180
+
181
+ "Demucs v1: light": {
182
+ "light.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/light.th"
183
+ },
184
+
185
+ "Demucs v1: light_extra": {
186
+ "light_extra.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/light_extra.th"
187
+ },
188
+
189
+ "Demucs v1: tasnet": {
190
+ "tasnet.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/tasnet.th"
191
+ },
192
+
193
+ "Demucs v1: tasnet_extra": {
194
+ "tasnet_extra.th":"https://dl.fbaipublicfiles.com/demucs/v2.0/tasnet_extra.th"
195
+ }
196
+ },
197
+
198
+ "mdx_download_vip_list": {
199
+ "MDX-Net Model VIP: UVR-MDX-NET_Main_340": "UVR-MDX-NET_Main_340.onnx",
200
+ "MDX-Net Model VIP: UVR-MDX-NET_Main_390": "UVR-MDX-NET_Main_390.onnx",
201
+ "MDX-Net Model VIP: UVR-MDX-NET_Main_406": "UVR-MDX-NET_Main_406.onnx",
202
+ "MDX-Net Model VIP: UVR-MDX-NET_Main_427": "UVR-MDX-NET_Main_427.onnx",
203
+ "MDX-Net Model VIP: UVR-MDX-NET_Main_438": "UVR-MDX-NET_Main_438.onnx",
204
+ "MDX-Net Model VIP: UVR-MDX-NET_Inst_82_beta": "UVR-MDX-NET_Inst_82_beta.onnx",
205
+ "MDX-Net Model VIP: UVR-MDX-NET_Inst_90_beta": "UVR-MDX-NET_Inst_90_beta.onnx",
206
+ "MDX-Net Model VIP: UVR-MDX-NET_Inst_187_beta": "UVR-MDX-NET_Inst_187_beta.onnx",
207
+ "MDX-Net Model VIP: UVR-MDX-NET-Inst_full_292": "UVR-MDX-NET-Inst_full_292.onnx"
208
+ },
209
+
210
+ "mdx23_download_list": {
211
+ "MDX23C Model: MDX23C_D1581": {"MDX23C_D1581.ckpt":"model_2_stem_061321.yaml"}
212
+ },
213
+
214
+ "mdx23c_download_list": {
215
+ "MDX23C Model: MDX23C-InstVoc HQ": {"MDX23C-8KFFT-InstVoc_HQ.ckpt":"model_2_stem_full_band_8k.yaml"}
216
+ },
217
+
218
+ "roformer_download_list": {
219
+ "Roformer Model: BS-Roformer-Viperx-1297": {"model_bs_roformer_ep_317_sdr_12.9755.ckpt":"model_bs_roformer_ep_317_sdr_12.9755.yaml"},
220
+ "Roformer Model: BS-Roformer-Viperx-1296": {"model_bs_roformer_ep_368_sdr_12.9628.ckpt":"model_bs_roformer_ep_368_sdr_12.9628.yaml"},
221
+ "Roformer Model: BS-Roformer-Viperx-1053": {"model_bs_roformer_ep_937_sdr_10.5309.ckpt":"model_bs_roformer_ep_937_sdr_10.5309.yaml"},
222
+ "Roformer Model: Mel-Roformer-Viperx-1143": {"model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt":"model_mel_band_roformer_ep_3005_sdr_11.4360.yaml"},
223
+ "Roformer Model: MelBand Roformer Kim | Inst V1 by Unwa": {"melband_roformer_inst_v1.ckpt":"config_melbandroformer_inst.yaml"},
224
+ "Roformer Model: MelBand Roformer Kim | Inst V2 by Unwa": {"melband_roformer_inst_v2.ckpt":"config_melbandroformer_inst_v2.yaml"},
225
+ "Roformer Model: MelBand Roformer Kim | InstVoc Duality V1 by Unwa": {"melband_roformer_instvoc_duality_v1.ckpt":"config_melbandroformer_instvoc_duality.yaml"},
226
+ "Roformer Model: MelBand Roformer Kim | InstVoc Duality V2 by Unwa": {"melband_roformer_instvox_duality_v2.ckpt":"config_melbandroformer_instvoc_duality.yaml"}
227
+ },
228
+
229
+ "other_network_list": {
230
+ "Roformer Model: BS-Roformer-Viperx-1297": {"model_bs_roformer_ep_317_sdr_12.9755.ckpt":"https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/model_bs_roformer_ep_317_sdr_12.9755.ckpt",
231
+ "model_bs_roformer_ep_317_sdr_12.9755.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/model_bs_roformer_ep_317_sdr_12.9755.yaml"},
232
+ "Roformer Model: BS-Roformer-Viperx-1296": {"model_bs_roformer_ep_368_sdr_12.9628.ckpt":"https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/model_bs_roformer_ep_368_sdr_12.9628.ckpt",
233
+ "model_bs_roformer_ep_368_sdr_12.9628.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/model_bs_roformer_ep_368_sdr_12.9628.yaml"},
234
+ "Roformer Model: BS-Roformer-Viperx-1053": {"model_bs_roformer_ep_937_sdr_10.5309.ckpt":"https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/model_bs_roformer_ep_937_sdr_10.5309.ckpt",
235
+ "model_bs_roformer_ep_937_sdr_10.5309.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/model_bs_roformer_ep_937_sdr_10.5309.yaml"},
236
+ "Roformer Model: Mel-Roformer-Viperx-1143": {"model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt":"https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/model_mel_band_roformer_ep_3005_sdr_11.4360.ckpt",
237
+ "model_mel_band_roformer_ep_3005_sdr_11.4360.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/model_mel_band_roformer_ep_3005_sdr_11.4360.yaml"},
238
+ "Roformer Model: MelBand Roformer Kim | Inst V1 by Unwa": {"melband_roformer_inst_v1.ckpt":"https://huggingface.co/pcunwa/Mel-Band-Roformer-Inst/resolve/main/melband_roformer_inst_v1.ckpt",
239
+ "config_melbandroformer_inst.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_melbandroformer_inst.yaml"},
240
+ "Roformer Model: MelBand Roformer Kim | Inst V2 by Unwa": {"melband_roformer_inst_v2.ckpt":"https://huggingface.co/pcunwa/Mel-Band-Roformer-Inst/resolve/main/melband_roformer_inst_v2.ckpt",
241
+ "config_melbandroformer_inst_v2.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_melbandroformer_inst_v2.yaml"},
242
+ "Roformer Model: MelBand Roformer Kim | InstVoc Duality V1 by Unwa": {"melband_roformer_instvoc_duality_v1.ckpt":"https://huggingface.co/pcunwa/Mel-Band-Roformer-InstVoc-Duality/resolve/main/melband_roformer_instvoc_duality_v1.ckpt",
243
+ "config_melbandroformer_instvoc_duality.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_melbandroformer_instvoc_duality.yaml"},
244
+ "Roformer Model: MelBand Roformer Kim | InstVoc Duality V2 by Unwa": {"melband_roformer_instvox_duality_v2.ckpt":"https://huggingface.co/pcunwa/Mel-Band-Roformer-InstVoc-Duality/resolve/main/melband_roformer_instvox_duality_v2.ckpt",
245
+ "config_melbandroformer_instvoc_duality.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_melbandroformer_instvoc_duality.yaml"},
246
+ "Roformer Model: MelBand Roformer Kim | Inst V1 (E) by Unwa": {"inst_v1e.ckpt":"https://huggingface.co/pcunwa/Mel-Band-Roformer-Inst/resolve/main/inst_v1e.ckpt",
247
+ "config_melbandroformer_inst.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_melbandroformer_inst.yaml"},
248
+ "Roformer Model: MelBand Roformer Kim": {"MelBandRoformer.ckpt":"https://huggingface.co/KimberleyJSN/melbandroformer/resolve/main/MelBandRoformer.ckpt",
249
+ "config_vocals_mel_band_roformer_kim.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_vocals_mel_band_roformer_kim.yaml"}
250
+ },
251
+
252
+ "other_network_list_new": {
253
+
254
+ "Roformer Model: BS-Roformer-Inst-EXP-Value-Residual | (by Unwa)": {"BS_Inst_EXP_VRL.ckpt":"https://huggingface.co/pcunwa/BS-Roformer-Inst-EXP-Value-Residual/resolve/main/BS_Inst_EXP_VRL.ckpt",
255
+ "BS_Inst_EXP_VRL.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/BS_Inst_EXP_VRL.yaml"},
256
+ "Roformer Model: Karaoke MelBand Roformer | (by aufr33 & viperx)": {"mel_band_roformer_karaoke_aufr33_viperx_sdr_10.1956.ckpt":"https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/mel_band_roformer_karaoke_aufr33_viperx_sdr_10.1956.ckpt",
257
+ "config_mel_band_roformer_karaoke.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_mel_band_roformer_karaoke.yaml"},
258
+ "Roformer Model: BS Roformer Dereverb | (anvuew edition)": {"deverb_bs_roformer_8_256dim_8depth.ckpt":"https://huggingface.co/anvuew/deverb_bs_roformer/resolve/main/deverb_bs_roformer_8_256dim_8depth.ckpt",
259
+ "deverb_bs_roformer_8_256dim_8depth.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/deverb_bs_roformer_8_256dim_8depth.yaml"},
260
+ "Bandit Plus: Cinematic Bandit Plus | (by kwatcharasupat)": {"model_bandit_plus_dnr_sdr_11.47.ckpt":"https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/model_bandit_plus_dnr_sdr_11.47.ckpt",
261
+ "config_dnr_bandit_bsrnn_multi_mus64.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_dnr_bandit_bsrnn_multi_mus64.yaml"},
262
+ "Bandit v2: Cinematic Bandit v2 Multilang | (by kwatcharasupat)": {"checkpoint-multi_fixed.ckpt":"https://github.com/TRvlvr/model_repo/releases/download/all_public_uvr_models/checkpoint-multi_fixed.ckpt",
263
+ "config_dnr_bandit_v2_mus64.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_dnr_bandit_v2_mus64.yaml"},
264
+ "SCnet: 4-stems SCNet_MUSDB18 | (by starrytong)": {"scnet_checkpoint_musdb18.ckpt":"https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v.1.0.6/scnet_checkpoint_musdb18.ckpt",
265
+ "config_musdb18_scnet.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_musdb18_scnet.yaml"},
266
+ "SCnet: 4-stems SCNet Large | (by starrytong)": {"SCNet-large_starrytong_fixed.ckpt":"https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.9/SCNet-large_starrytong_fixed.ckpt",
267
+ "config_musdb18_scnet_large_starrytong.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_musdb18_scnet_large_starrytong.yaml"},
268
+ "SCnet: 4-stems SCNet Large": {"model_scnet_sdr_9.3244.ckpt":"https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.8/model_scnet_sdr_9.3244.ckpt",
269
+ "config_musdb18_scnet_large.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_musdb18_scnet_large.yaml"},
270
+ "SCnet: 4-stems SCNet-XL": {"model_scnet_ep_54_sdr_9.8051.ckpt":"https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.13/model_scnet_ep_54_sdr_9.8051.ckpt",
271
+ "config_musdb18_scnet_xl.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_musdb18_scnet_xl.yaml"},
272
+ "MDX23C Model: DrumSep 6stem | (by aufr33 & jarredou)": {"aufr33-jarredou_DrumSep_model_mdx23c_ep_141_sdr_10.8059.ckpt":"https://github.com/jarredou/models/releases/download/aufr33-jarredou_MDX23C_DrumSep_model_v0.1/aufr33-jarredou_DrumSep_model_mdx23c_ep_141_sdr_10.8059.ckpt",
273
+ "aufr33-jarredou_DrumSep_model_mdx23c_ep_141_sdr_10.8059.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/aufr33-jarredou_DrumSep_model_mdx23c_ep_141_sdr_10.8059.yaml"},
274
+ "MDX23C Model: Phantom Centre extraction | (by wesleyr36)": {"model_mdx23c_ep_271_l1_freq_72.2383.ckpt":"https://github.com/ZFTurbo/Music-Source-Separation-Training/releases/download/v1.0.10/model_mdx23c_ep_271_l1_freq_72.2383.ckpt",
275
+ "config_mdx23c_similarity.yaml":"https://raw.githubusercontent.com/TRvlvr/application_data/main/mdx_model_data/mdx_c_configs/config_mdx23c_similarity.yaml"}
276
+ },
277
+
278
+ "mdx23c_download_vip_list": {
279
+ "MDX23C Model VIP: MDX23C_D1581": {"MDX23C_D1581.ckpt":"model_2_stem_061321.yaml"},
280
+ "MDX23C Model VIP: MDX23C-InstVoc HQ 2": {"MDX23C-8KFFT-InstVoc_HQ_2.ckpt":"model_2_stem_full_band_8k.yaml"}
281
+ },
282
+
283
+ "roll_back_win_url": "https://github.com/Anjok07/ultimatevocalremovergui/releases/download/v5.6/UVR_v5.6.0_setup.exe",
284
+ "roll_back_macos_x86_64_url": "https://github.com/Anjok07/ultimatevocalremovergui/releases/download/v5.6/Ultimate_Vocal_Remover_v5_6_MacOS_x86_64.dmg",
285
+ "roll_back_macos_arm64_url": "https://github.com/Anjok07/ultimatevocalremovergui/releases/download/v5.6/Ultimate_Vocal_Remover_v5_6_MacOS_arm64.dmg",
286
+
287
+ "vr_download_vip_list": [],
288
+ "demucs_download_vip_list": []
289
+ }
vocal_separator/mdx_model_data.json ADDED
@@ -0,0 +1,482 @@
1
+ {
2
+ "0ddfc0eb5792638ad5dc27850236c246": {
3
+ "compensate": 1.035,
4
+ "mdx_dim_f_set": 2048,
5
+ "mdx_dim_t_set": 8,
6
+ "mdx_n_fft_scale_set": 6144,
7
+ "primary_stem": "Vocals"
8
+ },
9
+ "26d308f91f3423a67dc69a6d12a8793d": {
10
+ "compensate": 1.035,
11
+ "mdx_dim_f_set": 2048,
12
+ "mdx_dim_t_set": 9,
13
+ "mdx_n_fft_scale_set": 8192,
14
+ "primary_stem": "Other"
15
+ },
16
+ "2cdd429caac38f0194b133884160f2c6": {
17
+ "compensate": 1.045,
18
+ "mdx_dim_f_set": 3072,
19
+ "mdx_dim_t_set": 8,
20
+ "mdx_n_fft_scale_set": 7680,
21
+ "primary_stem": "Instrumental"
22
+ },
23
+ "2f5501189a2f6db6349916fabe8c90de": {
24
+ "compensate": 1.035,
25
+ "mdx_dim_f_set": 2048,
26
+ "mdx_dim_t_set": 8,
27
+ "mdx_n_fft_scale_set": 6144,
28
+ "primary_stem": "Vocals",
29
+ "is_karaoke": true
30
+ },
31
+ "398580b6d5d973af3120df54cee6759d": {
32
+ "compensate": 1.75,
33
+ "mdx_dim_f_set": 3072,
34
+ "mdx_dim_t_set": 8,
35
+ "mdx_n_fft_scale_set": 7680,
36
+ "primary_stem": "Vocals"
37
+ },
38
+ "488b3e6f8bd3717d9d7c428476be2d75": {
39
+ "compensate": 1.035,
40
+ "mdx_dim_f_set": 3072,
41
+ "mdx_dim_t_set": 8,
42
+ "mdx_n_fft_scale_set": 7680,
43
+ "primary_stem": "Instrumental"
44
+ },
45
+ "4910e7827f335048bdac11fa967772f9": {
46
+ "compensate": 1.035,
47
+ "mdx_dim_f_set": 2048,
48
+ "mdx_dim_t_set": 7,
49
+ "mdx_n_fft_scale_set": 4096,
50
+ "primary_stem": "Drums"
51
+ },
52
+ "53c4baf4d12c3e6c3831bb8f5b532b93": {
53
+ "compensate": 1.043,
54
+ "mdx_dim_f_set": 3072,
55
+ "mdx_dim_t_set": 8,
56
+ "mdx_n_fft_scale_set": 7680,
57
+ "primary_stem": "Vocals"
58
+ },
59
+ "5d343409ef0df48c7d78cce9f0106781": {
60
+ "compensate": 1.075,
61
+ "mdx_dim_f_set": 3072,
62
+ "mdx_dim_t_set": 8,
63
+ "mdx_n_fft_scale_set": 7680,
64
+ "primary_stem": "Vocals"
65
+ },
66
+ "5f6483271e1efb9bfb59e4a3e6d4d098": {
67
+ "compensate": 1.035,
68
+ "mdx_dim_f_set": 2048,
69
+ "mdx_dim_t_set": 9,
70
+ "mdx_n_fft_scale_set": 6144,
71
+ "primary_stem": "Vocals"
72
+ },
73
+ "65ab5919372a128e4167f5e01a8fda85": {
74
+ "compensate": 1.035,
75
+ "mdx_dim_f_set": 2048,
76
+ "mdx_dim_t_set": 8,
77
+ "mdx_n_fft_scale_set": 8192,
78
+ "primary_stem": "Other"
79
+ },
80
+ "6703e39f36f18aa7855ee1047765621d": {
81
+ "compensate": 1.035,
82
+ "mdx_dim_f_set": 2048,
83
+ "mdx_dim_t_set": 9,
84
+ "mdx_n_fft_scale_set": 16384,
85
+ "primary_stem": "Bass"
86
+ },
87
+ "6b31de20e84392859a3d09d43f089515": {
88
+ "compensate": 1.035,
89
+ "mdx_dim_f_set": 2048,
90
+ "mdx_dim_t_set": 8,
91
+ "mdx_n_fft_scale_set": 6144,
92
+ "primary_stem": "Vocals"
93
+ },
94
+ "867595e9de46f6ab699008295df62798": {
95
+ "compensate": 1.03,
96
+ "mdx_dim_f_set": 3072,
97
+ "mdx_dim_t_set": 8,
98
+ "mdx_n_fft_scale_set": 7680,
99
+ "primary_stem": "Vocals"
100
+ },
101
+ "a3cd63058945e777505c01d2507daf37": {
102
+ "compensate": 1.03,
103
+ "mdx_dim_f_set": 2048,
104
+ "mdx_dim_t_set": 8,
105
+ "mdx_n_fft_scale_set": 6144,
106
+ "primary_stem": "Vocals"
107
+ },
108
+ "b33d9b3950b6cbf5fe90a32608924700": {
109
+ "compensate": 1.03,
110
+ "mdx_dim_f_set": 3072,
111
+ "mdx_dim_t_set": 8,
112
+ "mdx_n_fft_scale_set": 7680,
113
+ "primary_stem": "Vocals"
114
+ },
115
+ "c3b29bdce8c4fa17ec609e16220330ab": {
116
+ "compensate": 1.035,
117
+ "mdx_dim_f_set": 2048,
118
+ "mdx_dim_t_set": 8,
119
+ "mdx_n_fft_scale_set": 16384,
120
+ "primary_stem": "Bass"
121
+ },
122
+ "ceed671467c1f64ebdfac8a2490d0d52": {
123
+ "compensate": 1.035,
124
+ "mdx_dim_f_set": 3072,
125
+ "mdx_dim_t_set": 8,
126
+ "mdx_n_fft_scale_set": 7680,
127
+ "primary_stem": "Instrumental"
128
+ },
129
+ "d2a1376f310e4f7fa37fb9b5774eb701": {
130
+ "compensate": 1.035,
131
+ "mdx_dim_f_set": 3072,
132
+ "mdx_dim_t_set": 8,
133
+ "mdx_n_fft_scale_set": 7680,
134
+ "primary_stem": "Instrumental"
135
+ },
136
+ "d7bff498db9324db933d913388cba6be": {
137
+ "compensate": 1.035,
138
+ "mdx_dim_f_set": 2048,
139
+ "mdx_dim_t_set": 8,
140
+ "mdx_n_fft_scale_set": 6144,
141
+ "primary_stem": "Vocals"
142
+ },
143
+ "d94058f8c7f1fae4164868ae8ae66b20": {
144
+ "compensate": 1.035,
145
+ "mdx_dim_f_set": 2048,
146
+ "mdx_dim_t_set": 8,
147
+ "mdx_n_fft_scale_set": 6144,
148
+ "primary_stem": "Vocals"
149
+ },
150
+ "dc41ede5961d50f277eb846db17f5319": {
151
+ "compensate": 1.035,
152
+ "mdx_dim_f_set": 2048,
153
+ "mdx_dim_t_set": 9,
154
+ "mdx_n_fft_scale_set": 4096,
155
+ "primary_stem": "Drums"
156
+ },
157
+ "e5572e58abf111f80d8241d2e44e7fa4": {
158
+ "compensate": 1.028,
159
+ "mdx_dim_f_set": 3072,
160
+ "mdx_dim_t_set": 8,
161
+ "mdx_n_fft_scale_set": 7680,
162
+ "primary_stem": "Instrumental"
163
+ },
164
+ "e7324c873b1f615c35c1967f912db92a": {
165
+ "compensate": 1.03,
166
+ "mdx_dim_f_set": 3072,
167
+ "mdx_dim_t_set": 8,
168
+ "mdx_n_fft_scale_set": 7680,
169
+ "primary_stem": "Vocals"
170
+ },
171
+ "1c56ec0224f1d559c42fd6fd2a67b154": {
172
+ "compensate": 1.025,
173
+ "mdx_dim_f_set": 2048,
174
+ "mdx_dim_t_set": 8,
175
+ "mdx_n_fft_scale_set": 5120,
176
+ "primary_stem": "Instrumental"
177
+ },
178
+ "f2df6d6863d8f435436d8b561594ff49": {
179
+ "compensate": 1.035,
180
+ "mdx_dim_f_set": 3072,
181
+ "mdx_dim_t_set": 8,
182
+ "mdx_n_fft_scale_set": 7680,
183
+ "primary_stem": "Instrumental"
184
+ },
185
+ "b06327a00d5e5fbc7d96e1781bbdb596": {
186
+ "compensate": 1.035,
187
+ "mdx_dim_f_set": 3072,
188
+ "mdx_dim_t_set": 8,
189
+ "mdx_n_fft_scale_set": 6144,
190
+ "primary_stem": "Instrumental"
191
+ },
192
+ "94ff780b977d3ca07c7a343dab2e25dd": {
193
+ "compensate": 1.039,
194
+ "mdx_dim_f_set": 3072,
195
+ "mdx_dim_t_set": 8,
196
+ "mdx_n_fft_scale_set": 6144,
197
+ "primary_stem": "Instrumental"
198
+ },
199
+ "73492b58195c3b52d34590d5474452f6": {
200
+ "compensate": 1.043,
201
+ "mdx_dim_f_set": 3072,
202
+ "mdx_dim_t_set": 8,
203
+ "mdx_n_fft_scale_set": 7680,
204
+ "primary_stem": "Vocals"
205
+ },
206
+ "970b3f9492014d18fefeedfe4773cb42": {
207
+ "compensate": 1.009,
208
+ "mdx_dim_f_set": 3072,
209
+ "mdx_dim_t_set": 8,
210
+ "mdx_n_fft_scale_set": 7680,
211
+ "primary_stem": "Vocals"
212
+ },
213
+ "1d64a6d2c30f709b8c9b4ce1366d96ee": {
214
+ "compensate": 1.065,
215
+ "mdx_dim_f_set": 2048,
216
+ "mdx_dim_t_set": 8,
217
+ "mdx_n_fft_scale_set": 5120,
218
+ "primary_stem": "Instrumental",
219
+ "is_karaoke": true
220
+ },
221
+ "203f2a3955221b64df85a41af87cf8f0": {
222
+ "compensate": 1.035,
223
+ "mdx_dim_f_set": 3072,
224
+ "mdx_dim_t_set": 8,
225
+ "mdx_n_fft_scale_set": 6144,
226
+ "primary_stem": "Instrumental"
227
+ },
228
+ "291c2049608edb52648b96e27eb80e95": {
229
+ "compensate": 1.035,
230
+ "mdx_dim_f_set": 3072,
231
+ "mdx_dim_t_set": 8,
232
+ "mdx_n_fft_scale_set": 6144,
233
+ "primary_stem": "Instrumental"
234
+ },
235
+ "ead8d05dab12ec571d67549b3aab03fc": {
236
+ "compensate": 1.035,
237
+ "mdx_dim_f_set": 3072,
238
+ "mdx_dim_t_set": 8,
239
+ "mdx_n_fft_scale_set": 6144,
240
+ "primary_stem": "Instrumental"
241
+ },
242
+ "cc63408db3d80b4d85b0287d1d7c9632": {
243
+ "compensate": 1.033,
244
+ "mdx_dim_f_set": 3072,
245
+ "mdx_dim_t_set": 8,
246
+ "mdx_n_fft_scale_set": 6144,
247
+ "primary_stem": "Instrumental"
248
+ },
249
+ "cd5b2989ad863f116c855db1dfe24e39": {
250
+ "compensate": 1.035,
251
+ "mdx_dim_f_set": 3072,
252
+ "mdx_dim_t_set": 9,
253
+ "mdx_n_fft_scale_set": 6144,
254
+ "primary_stem": "Reverb"
255
+ },
256
+ "55657dd70583b0fedfba5f67df11d711": {
257
+ "compensate": 1.022,
258
+ "mdx_dim_f_set": 3072,
259
+ "mdx_dim_t_set": 8,
260
+ "mdx_n_fft_scale_set": 6144,
261
+ "primary_stem": "Instrumental"
262
+ },
263
+ "b6bccda408a436db8500083ef3491e8b": {
264
+ "compensate": 1.02,
265
+ "mdx_dim_f_set": 3072,
266
+ "mdx_dim_t_set": 8,
267
+ "mdx_n_fft_scale_set": 7680,
268
+ "primary_stem": "Instrumental"
269
+ },
270
+ "8a88db95c7fb5dbe6a095ff2ffb428b1": {
271
+ "compensate": 1.026,
272
+ "mdx_dim_f_set": 2048,
273
+ "mdx_dim_t_set": 8,
274
+ "mdx_n_fft_scale_set": 5120,
275
+ "primary_stem": "Instrumental"
276
+ },
277
+ "b78da4afc6512f98e4756f5977f5c6b9": {
278
+ "compensate": 1.021,
279
+ "mdx_dim_f_set": 3072,
280
+ "mdx_dim_t_set": 8,
281
+ "mdx_n_fft_scale_set": 7680,
282
+ "primary_stem": "Instrumental"
283
+ },
284
+ "77d07b2667ddf05b9e3175941b4454a0": {
285
+ "compensate": 1.021,
286
+ "mdx_dim_f_set": 3072,
287
+ "mdx_dim_t_set": 8,
288
+ "mdx_n_fft_scale_set": 7680,
289
+ "primary_stem": "Vocals"
290
+ },
291
+ "0f2a6bc5b49d87d64728ee40e23bceb1": {
292
+ "compensate": 1.019,
293
+ "mdx_dim_f_set": 2560,
294
+ "mdx_dim_t_set": 8,
295
+ "mdx_n_fft_scale_set": 5120,
296
+ "primary_stem": "Instrumental"
297
+ },
298
+ "cb790d0c913647ced70fc6b38f5bea1a": {
299
+ "compensate": 1.01,
300
+ "mdx_dim_f_set": 2560,
301
+ "mdx_dim_t_set": 8,
302
+ "mdx_n_fft_scale_set": 5120,
303
+ "primary_stem": "Instrumental"
304
+ },
305
+ "b02be2d198d4968a121030cf8950b492": {
306
+ "compensate": 1.02,
307
+ "mdx_dim_f_set": 2560,
308
+ "mdx_dim_t_set": 8,
309
+ "mdx_n_fft_scale_set": 5120,
310
+ "primary_stem": "No Crowd"
311
+ },
312
+ "2154254ee89b2945b97a7efed6e88820": {
313
+ "config_yaml": "model_2_stem_061321.yaml"
314
+ },
315
+ "063aadd735d58150722926dcbf5852a9": {
316
+ "config_yaml": "model_2_stem_061321.yaml"
317
+ },
318
+ "c09f714d978b41d718facfe3427e6001": {
319
+ "config_yaml": "model_2_stem_061321.yaml"
320
+ },
321
+ "fe96801369f6a148df2720f5ced88c19": {
322
+ "config_yaml": "model3.yaml"
323
+ },
324
+ "02e8b226f85fb566e5db894b9931c640": {
325
+ "config_yaml": "model2.yaml"
326
+ },
327
+ "e3de6d861635ab9c1d766149edd680d6": {
328
+ "config_yaml": "model1.yaml"
329
+ },
330
+ "3f2936c554ab73ce2e396d54636bd373": {
331
+ "config_yaml": "modelB.yaml"
332
+ },
333
+ "890d0f6f82d7574bca741a9e8bcb8168": {
334
+ "config_yaml": "modelB.yaml"
335
+ },
336
+ "63a3cb8c37c474681049be4ad1ba8815": {
337
+ "config_yaml": "modelB.yaml"
338
+ },
339
+ "a7fc5d719743c7fd6b61bd2b4d48b9f0": {
340
+ "config_yaml": "modelA.yaml"
341
+ },
342
+ "3567f3dee6e77bf366fcb1c7b8bc3745": {
343
+ "config_yaml": "modelA.yaml"
344
+ },
345
+ "a28f4d717bd0d34cd2ff7a3b0a3d065e": {
346
+ "config_yaml": "modelA.yaml"
347
+ },
348
+ "c9971a18da20911822593dc81caa8be9": {
349
+ "config_yaml": "sndfx.yaml"
350
+ },
351
+ "57d94d5ed705460d21c75a5ac829a605": {
352
+ "config_yaml": "sndfx.yaml"
353
+ },
354
+ "e7a25f8764f25a52c1b96c4946e66ba2": {
355
+ "config_yaml": "sndfx.yaml"
356
+ },
357
+ "104081d24e37217086ce5fde09147ee1": {
358
+ "config_yaml": "model_2_stem_061321.yaml"
359
+ },
360
+ "1e6165b601539f38d0a9330f3facffeb": {
361
+ "config_yaml": "model_2_stem_061321.yaml"
362
+ },
363
+ "fe0108464ce0d8271be5ab810891bd7c": {
364
+ "config_yaml": "model_2_stem_full_band.yaml"
365
+ },
366
+ "e9b82ec90ee56c507a3a982f1555714c": {
367
+ "config_yaml": "model_2_stem_full_band_2.yaml"
368
+ },
369
+ "99b6ceaae542265a3b6d657bf9fde79f": {
370
+ "config_yaml": "model_2_stem_full_band_8k.yaml"
371
+ },
372
+ "116f6f9dabb907b53d847ed9f7a9475f": {
373
+ "config_yaml": "model_2_stem_full_band_8k.yaml"
374
+ },
375
+ "53f707017bfcbb56f5e1bfac420d6732": {
376
+ "config_yaml": "model_bs_roformer_ep_317_sdr_12.9755.yaml",
377
+ "is_roformer": true
378
+ },
379
+ "63e41acc264bf681a73aa9f7e5f606cc": {
380
+ "config_yaml": "model_mel_band_roformer_ep_3005_sdr_11.4360.yaml",
381
+ "is_roformer": true
382
+ },
383
+ "e733736763234047587931fc35322fd9": {
384
+ "config_yaml": "model_bs_roformer_ep_937_sdr_10.5309.yaml",
385
+ "is_roformer": true
386
+ },
387
+ "d7a256bee3e7c620f554bceaab2f68f6": {
388
+ "config_yaml": "config_melbandroformer_inst.yaml",
389
+ "is_roformer": true
390
+ },
391
+ "365ccfa0e04b31ac2e24bbb935142a81": {
392
+ "config_yaml": "config_melbandroformer_inst.yaml",
393
+ "is_roformer": true
394
+ },
395
+ "3c15abf122d8eccc4a0eb97bf84a3e58": {
396
+ "config_yaml": "config_melbandroformer_instvoc_duality.yaml",
397
+ "is_roformer": true
398
+ },
399
+ "9fb197af219c5172ea38703a33aceb79": {
400
+ "config_yaml": "config_melbandroformer_instvoc_duality.yaml",
401
+ "is_roformer": true
402
+ },
403
+ "d789065adfd747d6f585b27b495bcdae": {
404
+ "config_yaml": "model_bs_roformer_ep_368_sdr_12.9628.yaml",
405
+ "is_roformer": true
406
+ },
407
+ "e4ca75912fcff3224a19058e55facfbf": {
408
+ "config_yaml": "config_vocals_mel_band_roformer_kim.yaml",
409
+ "is_roformer": true
410
+ },
411
+ "951f8ef420a941a395a9919f5d55cce9": {
412
+ "config_yaml": "config_melbandroformer_inst_v2.yaml",
413
+ "is_roformer": true
414
+ },
415
+ "8d306349be961bd1de6b02ee598920b3": {
416
+ "config_yaml": "config_mel_band_roformer_karaoke.yaml",
417
+ "is_roformer": true,
418
+ "model_type": "MelBand-Roformer",
419
+ "is_karaoke": true
420
+ },
421
+ "b1e6014529f71b38dae30aec978c8eb4": {
422
+ "config_yaml": "aufr33-jarredou_DrumSep_model_mdx23c_ep_141_sdr_10.8059.yaml",
423
+ "is_roformer": false,
424
+ "model_type": "MDX23C",
425
+ "is_karaoke": false
426
+ },
427
+ "490993ba65cd8b61a97a7c61650ddb7d": {
428
+ "config_yaml": "BS_Inst_EXP_VRL.yaml",
429
+ "is_roformer": true,
430
+ "model_type": "BS-Roformer v2",
431
+ "is_karaoke": false
432
+ },
433
+ "cfa9cee352938476c31e92656f1b9074": {
434
+ "config_yaml": "config_dnr_bandit_v2_mus64.yaml",
435
+ "is_roformer": true,
436
+ "model_type": "Bandit v2",
437
+ "is_karaoke": false
438
+ },
439
+ "8318a54fe1278ddcf78aad32145c0a6f": {
440
+ "config_yaml": "deverb_bs_roformer_8_256dim_8depth.yaml",
441
+ "is_roformer": true,
442
+ "model_type": "BS-Roformer",
443
+ "is_karaoke": false
444
+ },
445
+ "3bff56e6709357854e71cb2e7802733a": {
446
+ "config_yaml": "config_dnr_bandit_bsrnn_multi_mus64.yaml",
447
+ "is_roformer": true,
448
+ "model_type": "Bandit",
449
+ "is_karaoke": false
450
+ },
451
+ "c7500d7fdb1c0fc24b14b698515462d2": {
452
+ "config_yaml": "config_mdx23c_similarity.yaml",
453
+ "is_roformer": false,
454
+ "model_type": "MDX23C",
455
+ "is_karaoke": false
456
+ },
457
+ "77dd1942c0feb5c04ad0b4effa34fbc6": {
458
+ "config_yaml": "config_musdb18_scnet_large.yaml",
459
+ "is_roformer": true,
460
+ "model_type": "SCNet",
461
+ "is_karaoke": false
462
+ },
463
+ "6e749f5803b8baa7988a10393bb96df1": {
464
+ "config_yaml": "config_musdb18_scnet.yaml",
465
+ "is_roformer": true,
466
+ "model_type": "SCNet",
467
+ "is_karaoke": false
468
+ },
469
+ "bb845b04a3cb0d7a32cdb1f2ce1e007c": {
470
+ "config_yaml": "config_musdb18_scnet_large_starrytong.yaml",
471
+ "is_roformer": true,
472
+ "model_type": "SCNet",
473
+ "is_karaoke": false
474
+ },
475
+ "85b1955d57c0719e94e832eaaeceb0a8": {
476
+ "config_yaml": "config_musdb18_scnet_xl.yaml",
477
+ "is_roformer": true,
478
+ "model_type": "SCNet",
479
+ "is_karaoke": false
480
+ }
481
+
482
+ }
vocal_separator/vr_model_data.json ADDED
@@ -0,0 +1,137 @@
1
+ {
2
+ "0d0e6d143046b0eecc41a22e60224582": {
3
+ "vr_model_param": "3band_44100_mid",
4
+ "primary_stem": "Instrumental"
5
+ },
6
+ "18b52f873021a0af556fb4ecd552bb8e": {
7
+ "vr_model_param": "2band_32000",
8
+ "primary_stem": "Instrumental"
9
+ },
10
+ "1fc66027c82b499c7d8f55f79e64cadc": {
11
+ "vr_model_param": "2band_32000",
12
+ "primary_stem": "Instrumental"
13
+ },
14
+ "2aa34fbc01f8e6d2bf509726481e7142": {
15
+ "vr_model_param": "4band_44100",
16
+ "primary_stem": "No Piano"
17
+ },
18
+ "3e18f639b11abea7361db1a4a91c2559": {
19
+ "vr_model_param": "4band_44100",
20
+ "primary_stem": "Instrumental"
21
+ },
22
+ "570b5f50054609a17741369a35007ddd": {
23
+ "vr_model_param": "4band_v3",
24
+ "primary_stem": "Instrumental"
25
+ },
26
+ "5a6e24c1b530f2dab045a522ef89b751": {
27
+ "vr_model_param": "1band_sr44100_hl512",
28
+ "primary_stem": "Instrumental"
29
+ },
30
+ "6b5916069a49be3fe29d4397ecfd73fa": {
31
+ "vr_model_param": "3band_44100_msb2",
32
+ "primary_stem": "Instrumental",
33
+ "is_karaoke": true
34
+ },
35
+ "74b3bc5fa2b69f29baf7839b858bc679": {
36
+ "vr_model_param": "4band_44100",
37
+ "primary_stem": "Instrumental"
38
+ },
39
+ "827213b316df36b52a1f3d04fec89369": {
40
+ "vr_model_param": "4band_44100",
41
+ "primary_stem": "Instrumental"
42
+ },
43
+ "911d4048eee7223eca4ee0efb7d29256": {
44
+ "vr_model_param": "4band_44100",
45
+ "primary_stem": "Vocals"
46
+ },
47
+ "941f3f7f0b0341f12087aacdfef644b1": {
48
+ "vr_model_param": "4band_v2",
49
+ "primary_stem": "Instrumental"
50
+ },
51
+ "a02827cf69d75781a35c0e8a327f3195": {
52
+ "vr_model_param": "1band_sr33075_hl384",
53
+ "primary_stem": "Instrumental"
54
+ },
55
+ "b165fbff113c959dba5303b74c6484bc": {
56
+ "vr_model_param": "3band_44100",
57
+ "primary_stem": "Instrumental"
58
+ },
59
+ "b5f988cd3e891dca7253bf5f0f3427c7": {
60
+ "vr_model_param": "4band_44100",
61
+ "primary_stem": "Instrumental"
62
+ },
63
+ "b99c35723bc35cb11ed14a4780006a80": {
64
+ "vr_model_param": "1band_sr44100_hl1024",
65
+ "primary_stem": "Instrumental"
66
+ },
67
+ "ba02fd25b71d620eebbdb49e18e4c336": {
68
+ "vr_model_param": "3band_44100_mid",
69
+ "primary_stem": "Instrumental"
70
+ },
71
+ "c4476ef424d8cba65f38d8d04e8514e2": {
72
+ "vr_model_param": "3band_44100_msb2",
73
+ "primary_stem": "Instrumental"
74
+ },
75
+ "da2d37b8be2972e550a409bae08335aa": {
76
+ "vr_model_param": "4band_44100",
77
+ "primary_stem": "Vocals"
78
+ },
79
+ "db57205d3133e39df8e050b435a78c80": {
80
+ "vr_model_param": "4band_44100",
81
+ "primary_stem": "Instrumental"
82
+ },
83
+ "ea83b08e32ec2303456fe50659035f69": {
84
+ "vr_model_param": "4band_v3",
85
+ "primary_stem": "Instrumental"
86
+ },
87
+ "f6ea8473ff86017b5ebd586ccacf156b": {
88
+ "vr_model_param": "4band_v2_sn",
89
+ "primary_stem": "Instrumental",
90
+ "is_karaoke": true
91
+ },
92
+ "fd297a61eafc9d829033f8b987c39a3d": {
93
+ "vr_model_param": "1band_sr32000_hl512",
94
+ "primary_stem": "Instrumental"
95
+ },
96
+ "0ec76fd9e65f81d8b4fbd13af4826ed8": {
97
+ "vr_model_param": "4band_v3",
98
+ "primary_stem": "No Woodwinds"
99
+ },
100
+ "0fb9249ffe4ffc38d7b16243f394c0ff": {
101
+ "vr_model_param": "4band_v3",
102
+ "primary_stem": "No Reverb"
103
+ },
104
+ "6857b2972e1754913aad0c9a1678c753": {
105
+ "vr_model_param": "4band_v3",
106
+ "primary_stem": "No Echo",
107
+ "nout": 48,
108
+ "nout_lstm": 128
109
+ },
110
+ "f200a145434efc7dcf0cd093f517ed52": {
111
+ "vr_model_param": "4band_v3",
112
+ "primary_stem": "No Echo",
113
+ "nout": 48,
114
+ "nout_lstm": 128
115
+ },
116
+ "44c55d8b5d2e3edea98c2b2bf93071c7": {
117
+ "vr_model_param": "4band_v3",
118
+ "primary_stem": "Noise",
119
+ "nout": 48,
120
+ "nout_lstm": 128
121
+ },
122
+ "51ea8c43a6928ed3c10ef5cb2707d57b": {
123
+ "vr_model_param": "1band_sr44100_hl1024",
124
+ "primary_stem": "Noise",
125
+ "nout": 16,
126
+ "nout_lstm": 128
127
+ },
128
+ "944950a9c5963a5eb70b445d67b7068a": {
129
+ "vr_model_param": "4band_v3_sn",
130
+ "primary_stem": "Vocals",
131
+ "nout": 64,
132
+ "nout_lstm": 128,
133
+ "is_karaoke": false,
134
+ "is_bv_model": true,
135
+ "is_bv_model_rebalanced": 0.9
136
+ }
137
+ }