Fix base model reference: multilingual chatterbox; fix usage example
README.md
CHANGED
@@ -18,7 +18,7 @@ A fine-tuned version of [ResembleAI/chatterbox](https://huggingface.co/ResembleA

 ## Model Details

-- **Base model**: ResembleAI/chatterbox (
+- **Base model**: ResembleAI/chatterbox — multilingual ChatterBox (supports 23 languages)
 - **Fine-tuned on**: Bengali speech corpus (~99 hours, 58,820 samples)
 - ai4bharat/Shrutilipi (Bengali split) — 17,882 samples, ~28h
 - ai4bharat/Rasa (Bengali split) — 28,088 samples, ~51h
@@ -30,15 +30,16 @@ A fine-tuned version of [ResembleAI/chatterbox](https://huggingface.co/ResembleA
 ## Usage

 ```python
-import
+from huggingface_hub import snapshot_download
 from chatterbox.tts import ChatterboxTTS
+import torchaudio

-
+model_dir = snapshot_download("arijitx/chatterbox-bangla")
+model = ChatterboxTTS.from_local(model_dir, device="cuda")

 text = "আমি বাংলায় কথা বলতে পারি। এটি একটি পরীক্ষামূলক বাক্য।"
 wav = model.generate(text)

-import torchaudio
 torchaudio.save("output.wav", wav, model.sr)
 ```

@@ -67,4 +68,4 @@ Datasets sourced from AI4Bharat and SPRINGLab public datasets.

 - Optimized for Bengali; other languages may degrade
 - Best results with clear, well-punctuated Bengali text
-- Emotion control inherited from base ChatterBox model
+- Emotion control inherited from base ChatterBox multilingual model
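Review note: the updated usage example passes `device="cuda"` unconditionally, which presumably fails on machines without a GPU. The sketch below is not part of this commit; it shows one way the README could choose the device at runtime. `pick_device` is a hypothetical helper name introduced here for illustration, and the only library call it relies on is `torch.cuda.is_available()`.

```python
import importlib.util


def pick_device(preferred: str = "cuda") -> str:
    """Return `preferred` when it is usable, otherwise fall back to "cpu".

    Hypothetical helper (an assumption, not from the commit under review).
    """
    # Any explicit non-CUDA choice (e.g. "cpu") is passed through unchanged.
    if preferred != "cuda":
        return preferred
    # Without PyTorch installed there is no way to use CUDA, so report CPU.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"
```

With this in place, the loading line from the diff could read `ChatterboxTTS.from_local(model_dir, device=pick_device())`, assuming the fine-tuned checkpoint also loads on CPU, which the commit does not state.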