---
library_name: mlx-audio
base_model:
- ResembleAI/chatterbox
tags:
- mlx
pipeline_tag: text-to-speech
---

# litmudoc/Chatterbox-Multilingual-MLX-v2-fp16

This model was converted to MLX format from [ResembleAI/chatterbox](https://huggingface.co/ResembleAI/chatterbox) using [mlx-audio](https://github.com/Blaizzy/mlx-audio) version **0.2.10**.

**Note:** This model requires the S3Tokenizer weights from [mlx-community/S3TokenizerV2](https://huggingface.co/mlx-community/S3TokenizerV2), which will be downloaded automatically.

## Use with mlx-audio

```bash
pip install -U mlx-audio
```

### Command line

```bash
curl -L -o ko.wav https://huggingface.co/litmudoc/Chatterbox-Multilingual-MLX-v2-fp16/resolve/main/ko.wav

mlx_audio.tts.generate \
  --model litmudoc/Chatterbox-Multilingual-MLX-v2-fp16 \
  --text ", 지난달 우리는 유튜브 채널에서 이십억 조회수라는 새로운 이정표에 도달했습니다." \
  --lang_code ko \
  --ref_audio ko.wav \
  --ref_text "우리는 정말로 허름한 호텔에 묵었지만, 그래도 행복했다." \
  --verbose --play
```

### Python

```python
from mlx_audio.tts.generate import generate_audio

generate_audio(
    text=", 지난달 우리는 유튜브 채널에서 이십억 조회수라는 새로운 이정표에 도달했습니다.",
    model="litmudoc/Chatterbox-Multilingual-MLX-v2-fp16",
    lang_code="ko",
    ref_audio="ko.wav",
    ref_text="우리는 정말로 허름한 호텔에 묵었지만, 그래도 행복했다.",
    file_prefix="output",
)
```
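
The Python example expects `ko.wav` to already exist in the working directory, which the `curl` step in the command-line section takes care of. As a minimal sketch, the same download could be done from Python with only the standard library. The helper names `ref_audio_url` and `ensure_ref_audio` are hypothetical and not part of mlx-audio; the URL pattern is copied from the `curl` command above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Repository name taken from the curl command in the "Command line" section.
REPO_ID = "litmudoc/Chatterbox-Multilingual-MLX-v2-fp16"


def ref_audio_url(filename: str, repo_id: str = REPO_ID) -> str:
    """Build the download URL for a file on the repository's main branch."""
    return f"https://huggingface.co/{repo_id}/resolve/main/{filename}"


def ensure_ref_audio(filename: str = "ko.wav", dest_dir: str = ".") -> Path:
    """Return the local path of the reference clip, downloading it only if missing."""
    dest = Path(dest_dir) / filename
    if not dest.exists():
        # Hypothetical convenience step: mirrors `curl -L -o ko.wav <url>`.
        urlretrieve(ref_audio_url(filename), str(dest))
    return dest
```

With this in place, the call above could pass `ref_audio=str(ensure_ref_audio())` instead of the hard-coded `"ko.wav"`, assuming network access on the first run.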