How to use ResembleAI/Chatterbox-Multilingual-hi with Chatterbox:

```python
# pip install chatterbox-tts
import torchaudio as ta
from chatterbox.tts import ChatterboxTTS

model = ChatterboxTTS.from_pretrained(device="cuda")

text = "Ezreal and Jinx teamed up with Ahri, Yasuo, and Teemo to take down the enemy's Nexus in an epic late-game pentakill."
wav = model.generate(text)
ta.save("test-1.wav", wav, model.sr)

# If you want to synthesize with a different voice, specify the audio prompt
AUDIO_PROMPT_PATH = "YOUR_FILE.wav"
wav = model.generate(text, audio_prompt_path=AUDIO_PROMPT_PATH)
ta.save("test-2.wav", wav, model.sr)
```

---
license: mit
language:
- hi
tags:
- chatterbox
- text-to-speech
- tts
- multilingual
- single-language-tts
- voice-cloning
- chatterbox-v3
pipeline_tag: text-to-speech
base_model: ResembleAI/chatterbox
base_model_relation: finetune
---

<!-- chatterbox-space-link -->
> 🎙️ **Live demo:** Try this model in the [`ResembleAI/Chatterbox-Multilingual-TTS-hi`](https://huggingface.co/spaces/ResembleAI/Chatterbox-Multilingual-TTS-hi) Space.
<!-- chatterbox-space-link -->

# Chatterbox Multilingual: Hindi

Chatterbox Multilingual: Hindi is a dedicated single-language finetune in the **Chatterbox Multilingual V3 Single Language Pack**. It is optimized for Hindi, with language- and region-specific behavior for expressive text-to-speech and voice cloning.

Use this model when you want tighter control over Hindi quality than the broad multilingual checkpoint offers. For a single model that covers all supported languages, use [`ResembleAI/chatterbox`](https://huggingface.co/ResembleAI/chatterbox).

## Demo

Try the hosted demo Space: [`ResembleAI/Chatterbox-Multilingual-TTS-hi`](https://huggingface.co/spaces/ResembleAI/Chatterbox-Multilingual-TTS-hi).

## Files

- `t3_hi.safetensors`: T3 state dict in safetensors format.
- `s3gen_v3.pt` / `s3gen_v3.safetensors`: V3 S3Gen speech decoder checkpoint.
- `grapheme_mtl_merged_expanded_v1.json`: multilingual tokenizer config.

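
The files listed above can be fetched with the `huggingface_hub` client. The following is a minimal sketch, not the loader the demo Space uses. It assumes the repository ID is `ResembleAI/Chatterbox-Multilingual-hi`, that the files sit at the repository root, and that the safetensors variant of the S3Gen checkpoint is the one wanted. `download_assets` is a hypothetical helper name.

```python
# Repo ID and filenames come from this model card; the helper itself is illustrative.
REPO_ID = "ResembleAI/Chatterbox-Multilingual-hi"
FILES = [
    "t3_hi.safetensors",
    "s3gen_v3.safetensors",
    "grapheme_mtl_merged_expanded_v1.json",
]


def download_assets(repo_id: str = REPO_ID, files: list[str] = FILES) -> dict[str, str]:
    """Download each file into the local Hugging Face cache.

    Returns a mapping of filename -> local path. Assumes the files live at the
    repository root (an assumption, not stated in the card).
    """
    # Imported lazily so this module can be imported without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return {name: hf_hub_download(repo_id=repo_id, filename=name) for name in files}
```

Calling `download_assets()` needs network access and roughly 2 GB of disk space for the T3 checkpoint alone. Later calls reuse the local cache.
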
## Language

- Locale: `hi`
- Chatterbox language ID: `hi`

## Checkpoint Metadata

- Source step: `131000`
- Source checkpoint: `t3_131000.pth.tar`
- Tensor count: `292`
- Dtype: `float32`
- Text embedding shape: `(2454, 1024)`
- Speech embedding shape: `(8194, 1024)`
- Size: `2143990224` bytes
- SHA256: `89fd813802e2cf7350d609959cec4dae63dd58f445651b05009262d7e24780f9`

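
The size and SHA256 above can be used to check a downloaded checkpoint for corruption. The sketch below uses only the standard library. It assumes the values describe `t3_hi.safetensors`, which the card implies (the source checkpoint is a T3 file) but does not state outright. `sha256_of` and `matches_metadata` are hypothetical helper names.

```python
import hashlib
from pathlib import Path

# Values copied from the Checkpoint Metadata list in this card.
EXPECTED_SIZE = 2143990224
EXPECTED_SHA256 = "89fd813802e2cf7350d609959cec4dae63dd58f445651b05009262d7e24780f9"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in 1 MiB chunks so a ~2 GB checkpoint is never fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_metadata(path, size: int = EXPECTED_SIZE, sha256: str = EXPECTED_SHA256) -> bool:
    """Return True only when both the byte size and the SHA256 digest match."""
    p = Path(path)
    # The cheap size check runs first; the hash is computed only if the size matches.
    return p.stat().st_size == size and sha256_of(p) == sha256
```

For example, `matches_metadata("t3_hi.safetensors")` should return `True` for an intact download, if the assumption about which file the metadata describes holds.
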
## Loader Notes

This repository contains Chatterbox Multilingual V3 single-language assets used by the linked demo Space. The T3 checkpoint is loaded with a multilingual text vocabulary size of `2454` and an S3 speech vocabulary size of `8194`, matching the embedding shapes listed under Checkpoint Metadata.

The demo combines these model-specific assets with the shared Chatterbox inference code and companion assets needed for end-to-end speech generation.

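
The vocabulary sizes can be checked without loading any weights, because a safetensors file begins with an 8-byte little-endian length followed by a JSON header that lists each tensor's dtype and shape. The sketch below reads only that header. The card does not give tensor names, so it searches by leading dimension instead. `read_safetensors_header` and `tensors_with_leading_dim` are hypothetical helper names.

```python
import json
import struct

TEXT_VOCAB = 2454    # multilingual text vocabulary size from this card
SPEECH_VOCAB = 8194  # S3 speech vocabulary size from this card


def read_safetensors_header(path) -> dict:
    """Parse only the JSON header of a .safetensors file; no tensor data is read."""
    with open(path, "rb") as fh:
        (header_len,) = struct.unpack("<Q", fh.read(8))  # little-endian uint64 length prefix
        header = json.loads(fh.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata entry, not a tensor
    return header


def tensors_with_leading_dim(header: dict, size: int) -> list[str]:
    """Names of tensors whose first dimension equals `size`."""
    return [name for name, info in header.items() if info["shape"][:1] == [size]]
```

On `t3_hi.safetensors`, the header would be expected to hold `292` entries with dtype `F32`, and both `tensors_with_leading_dim(header, TEXT_VOCAB)` and `tensors_with_leading_dim(header, SPEECH_VOCAB)` should be non-empty. Other tensors, such as output projections, may share those leading dimensions, so inspect the returned names instead of assuming a single match.
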