Text-to-Audio
MLX
Safetensors
apple-silicon
singing-voice-synthesis
singing-voice-conversion
soulx-singer
How to use mlx-community/SoulX-Singer-bf16 with MLX:
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir SoulX-Singer-bf16 mlx-community/SoulX-Singer-bf16
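The download step above can also be done from Python. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; `download_model` and `LOCAL_DIR` are hypothetical names chosen for illustration, while the repository id and target directory are the ones used in the command above.

```python
# Sketch: fetch the model files with huggingface_hub instead of the CLI.
# REPO_ID and LOCAL_DIR mirror the CLI command; download_model is a
# hypothetical helper, not part of any library.
REPO_ID = "mlx-community/SoulX-Singer-bf16"
LOCAL_DIR = "SoulX-Singer-bf16"


def download_model(repo_id: str = REPO_ID, local_dir: str = LOCAL_DIR) -> str:
    """Download every file of the repository and return the local path."""
    # Imported lazily so this module loads even when huggingface_hub
    # is not installed yet.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `download_model()` should leave the weights and, once this commit is merged, config.yaml under `SoulX-Singer-bf16/`.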
Add config.yaml
Files changed: config.yaml (+37 -0)
config.yaml
ADDED
@@ -0,0 +1,37 @@
+infer:
+  n_steps: 32
+  cfg: 3
+
+audio:
+  hop_size: 480
+  sample_rate: 24000
+  max_length: 36000
+  n_fft: 1920
+  num_mels: 128
+  win_size: 1920
+  fmin: 0
+  fmax: 12000
+  mel_var: 8.14
+  mel_mean: -4.92
+
+model:
+  encoder:
+    vocab_size: 3000
+    text_dim: 512
+    pitch_dim: 512
+    type_dim: 512
+    f0_bin: 361
+    f0_dim: 512
+    num_layers: 4
+
+  flow_matching:
+    mel_dim: 128
+    hidden_size: 1024
+    num_layers: 22
+    num_heads: 16
+    cfg_drop_prob: 0.2
+    use_embedding: False
+    cond_codebook_size: 512
+    cond_scale_factor: 1
+    sigma: 1e-5
+    time_scheduler: cos
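As a quick sanity check on the values added in this diff, the sketch below derives a few quantities from them in Python. The numbers are copied by hand from config.yaml; the `derived_quantities` helper and its key names are hypothetical, and the FFT bin count and per-head width assume a one-sided STFT and an even split of `hidden_size` across attention heads, which this diff does not confirm.

```python
# Values copied from the audio and flow_matching sections of config.yaml.
AUDIO = {
    "hop_size": 480,
    "sample_rate": 24000,
    "n_fft": 1920,
    "win_size": 1920,
    "num_mels": 128,
    "fmin": 0,
    "fmax": 12000,
}
FLOW_MATCHING = {"mel_dim": 128, "hidden_size": 1024, "num_heads": 16}


def derived_quantities(audio: dict, flow: dict) -> dict:
    """Compute frame timing and size figures implied by the config values."""
    sr = audio["sample_rate"]
    return {
        # One mel frame is produced every hop_size samples.
        "frames_per_second": sr / audio["hop_size"],
        # Duration of one analysis window and one hop, in milliseconds.
        "window_ms": 1000 * audio["win_size"] / sr,
        "hop_ms": 1000 * audio["hop_size"] / sr,
        # Highest frequency representable at this sample rate.
        "nyquist_hz": sr / 2,
        # Assumption: one-sided STFT, so n_fft // 2 + 1 frequency bins.
        "fft_bins": audio["n_fft"] // 2 + 1,
        # Assumption: hidden_size is split evenly across attention heads.
        "head_dim": flow["hidden_size"] // flow["num_heads"],
    }


if __name__ == "__main__":
    for name, value in derived_quantities(AUDIO, FLOW_MATCHING).items():
        print(f"{name}: {value}")
```

With these settings mel frames arrive at 50 per second, each 1920-sample window spans 80 ms, and `fmax` sits exactly at the Nyquist frequency of the 24 kHz sample rate. Note also that `num_mels` under `audio` and `mel_dim` under `flow_matching` are both 128.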