niobures/Chatterbox-TTS

niobures committed on Aug 22, 2025

Commit 9e8adec (verified) · 1 parent: 96b9357

Update fr/README.md

Files changed (1)

fr/README.md +115 -111

@@ -1,111 +1,115 @@
----
-license: cc-by-4.0
-datasets:
-- amphion/Emilia-Dataset
-language:
-- fr
-base_model:
-- ResembleAI/chatterbox
-pipeline_tag: text-to-speech
-tags:
-- french
-- audio
-- speech
-- tts
-- fine-tuning
-- chatterbox
-- Emilia
-- voice-cloning
-- zero-shot
----
-# Chatterbox TTS French 🥖
-**Chatterbox TTS French** is a fine-tuned text-to-speech model specialized for the French language. The model has been trained on high-quality voice data for natural and expressive speech synthesis.
-<div align="center"><img width="400px" src="https://ih1.redbubble.net/image.5397735048.6235/bg,f8f8f8-flat,750x,075,f-pad,750x1000,f8f8f8.jpg" alt="baguette-france-tour-eiffel-image" /></div>
-- 🔊 **Language**: French 🇫🇷
-- 🗣️ **Training dataset**: [Emilia Dataset (FR branch)](https://huggingface.co/datasets/amphion/Emilia-Dataset/tree/main/FR)
-- ⏱️ **Data quantity**: 1400 hours of audio
-## Usage Example
-Here’s how to generate speech using Chatterbox-TTS French:
-```python
-import torch
-import soundfile as sf
-from chatterbox.tts import ChatterboxTTS
-from huggingface_hub import hf_hub_download
-from safetensors.torch import load_file
-# Configuration
-MODEL_REPO = "Thomcles/Chatterbox-TTS-French"
-CHECKPOINT_FILENAME = "t3_cfg.safetensors"
-OUTPUT_PATH = "output_cloned_voice.wav"
-TEXT_TO_SYNTHESIZE = "Jean-Paul Sartre laisse à la postérité une œuvre considérable, tant littéraire que philosophique, ayant influencée à la fois la vie politique française d'après-guerre et les penseurs de son temps (Merleau-Ponty et Alain Badiou notamment)."
-def get_device() -> str:
-    return "cuda" if torch.cuda.is_available() else "cpu"
-def download_checkpoint(repo: str, filename: str) -> str:
-    return hf_hub_download(repo_id=repo, filename=filename)
-def load_tts_model(repo: str, checkpoint_file: str, device: str) -> ChatterboxTTS:
-    model = ChatterboxTTS.from_pretrained(device=device)
-    checkpoint_path = download_checkpoint(repo, checkpoint_file)
-    t3_state = load_file(checkpoint_path, device="cpu")
-    model.t3.load_state_dict(t3_state)
-    return model
-def synthesize_speech(model: ChatterboxTTS, text: str, audio_prompt_path:str, **kwargs) -> torch.Tensor:
-    with torch.inference_mode():
-        return model.generate(text, audio_prompt_path, **kwargs)
-def save_audio(waveform: torch.Tensor, path: str, sample_rate: int):
-    sf.write(path, waveform.squeeze().cpu().numpy(), sample_rate)
-def main():
-    print("Loading model...")
-    device = get_device()
-    model = load_tts_model(MODEL_REPO, CHECKPOINT_FILENAME, device)
-    print(f"Generating speech on {device}...")
-    wav = synthesize_speech(
-        model,
-        TEXT_TO_SYNTHESIZE,
-        audio_prompt_path=None
-        exaggeration=0.5,
-        temperature=0.6,
-        cfg_weight=0.3
-    )
-    print(f"Saving output to: {OUTPUT_PATH}")
-    save_audio(wav, OUTPUT_PATH, model.sr)
-    print("Done.")
-if __name__ == "__main__":
-    main()
-```
-Here is the output:
-<audio controls src="https://huggingface.co/Thomcles/Chatterbox-TTS-French/resolve/main/example.mp3">Your browser does not support audio.</audio>
-### Base model license
-The base model is licensed under the MIT License.
-Base model: [Chatterbox](https://huggingface.co/ResembleAI/chatterbox)
-License: [MIT](https://choosealicense.com/licenses/mit/)
-### Training Data License
-This model was fine-tuned using a dataset licensed under Creative Commons Attribution 4.0 (CC BY 4.0).
-Dataset: [Emilia](https://huggingface.co/datasets/amphion/Emilia-Dataset)
-License: [Creative Commons Attribution 4.0 International](https://choosealicense.com/licenses/cc-by-4.0/)
-### Contact me
-Interested in fine-tuning a TTS model in a specific language or building a multilingual voice solution? Don’t hesitate to reach out.

+---
+license: cc-by-4.0
+datasets:
+- amphion/Emilia-Dataset
+language:
+- fr
+base_model:
+- ResembleAI/chatterbox
+pipeline_tag: text-to-speech
+tags:
+- french
+- audio
+- speech
+- tts
+- fine-tuning
+- chatterbox
+- Emilia
+- voice-cloning
+- zero-shot
+---
+# Chatterbox TTS French 🥖
+**Chatterbox TTS French** is a fine-tuned text-to-speech model specialized for the French language. The model has been trained on high-quality voice data for natural and expressive speech synthesis.
+<div align="center"><img width="400px" src="https://ih1.redbubble.net/image.5397735048.6235/bg,f8f8f8-flat,750x,075,f-pad,750x1000,f8f8f8.jpg" alt="baguette-france-tour-eiffel-image" /></div>
+- 🔊 **Language**: French 🇫🇷
+- 🗣️ **Training dataset**: [Emilia Dataset (FR branch)](https://huggingface.co/datasets/amphion/Emilia-Dataset)
+- ⏱️ **Data quantity**: 1400 hours of audio
+## Usage Example
+Here’s how to generate speech using Chatterbox-TTS French:
+```python
+import torch
+import soundfile as sf
+from chatterbox.tts import ChatterboxTTS
+from huggingface_hub import hf_hub_download
+from safetensors.torch import load_file
+# Configuration
+MODEL_REPO = "Thomcles/Chatterbox-TTS-French"
+CHECKPOINT_FILENAME = "t3_cfg.safetensors"
+OUTPUT_PATH = "output_cloned_voice.wav"
+TEXT_TO_SYNTHESIZE = "Jean-Paul Sartre laisse à la postérité une œuvre considérable, tant littéraire que philosophique, ayant influencée à la fois la vie politique française d'après-guerre et les penseurs de son temps (Merleau-Ponty et Alain Badiou notamment)."
+def get_device() -> str:
+    return "cuda" if torch.cuda.is_available() else "cpu"
+def download_checkpoint(repo: str, filename: str) -> str:
+    return hf_hub_download(repo_id=repo, filename=filename)
+def load_tts_model(repo: str, checkpoint_file: str, device: str) -> ChatterboxTTS:
+    model = ChatterboxTTS.from_pretrained(device=device)
+    checkpoint_path = download_checkpoint(repo, checkpoint_file)
+    t3_state = load_file(checkpoint_path, device="cpu")
+    model.t3.load_state_dict(t3_state)
+    return model
+def synthesize_speech(model: ChatterboxTTS, text: str, audio_prompt_path:str, **kwargs) -> torch.Tensor:
+    with torch.inference_mode():
+        return model.generate(
+            text=text,
+            audio_prompt_path=audio_prompt_path,
+            **kwargs
+        )
+def save_audio(waveform: torch.Tensor, path: str, sample_rate: int):
+    sf.write(path, waveform.squeeze().cpu().numpy(), sample_rate)
+def main():
+    print("Loading model...")
+    device = get_device()
+    model = load_tts_model(MODEL_REPO, CHECKPOINT_FILENAME, device)
+    print(f"Generating speech on {device}...")
+    wav = synthesize_speech(
+        model,
+        TEXT_TO_SYNTHESIZE,
+        audio_prompt_path=None,
+        exaggeration=0.5,
+        temperature=0.6,
+        cfg_weight=0.3
+    )
+    print(f"Saving output to: {OUTPUT_PATH}")
+    save_audio(wav, OUTPUT_PATH, model.sr)
+    print("Done.")
+if __name__ == "__main__":
+    main()
+```
+Here is the output:
+<audio controls src="https://huggingface.co/Thomcles/Chatterbox-TTS-French/resolve/main/example.mp3">Your browser does not support audio.</audio>
+### Base model license
+The base model is licensed under the MIT License.
+Base model: [Chatterbox](https://huggingface.co/ResembleAI/chatterbox)
+License: [MIT](https://choosealicense.com/licenses/mit/)
+### Training Data License
+This model was fine-tuned using a dataset licensed under Creative Commons Attribution 4.0 (CC BY 4.0).
+Dataset: [Emilia](https://huggingface.co/datasets/amphion/Emilia-Dataset)
+License: [Creative Commons Attribution 4.0 International](https://choosealicense.com/licenses/cc-by-4.0/)
+### Contact me
+Interested in fine-tuning a TTS model in a specific language or building a multilingual voice solution? Don’t hesitate to reach out.
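
Review note: the two functional changes in this diff are the comma added after `audio_prompt_path=None` in `main()` and the switch to keyword arguments in `model.generate(...)`. The sketch below illustrates both without needing the model installed. It checks the old and new call text with the standard `ast` module, then uses `generate_stub`, a hypothetical stand-in whose parameter order is an assumption for illustration only (it is not the real `ChatterboxTTS.generate` signature), to show how a positional call can bind to the wrong parameter if the order ever differs.

```python
import ast

# Call as written on the removed side of the diff: no comma after
# audio_prompt_path=None, so the next keyword argument cannot be parsed.
OLD_CALL = """
wav = synthesize_speech(
    model,
    TEXT_TO_SYNTHESIZE,
    audio_prompt_path=None
    exaggeration=0.5,
    temperature=0.6,
    cfg_weight=0.3
)
"""

# Call as written on the added side: identical except for the comma.
NEW_CALL = OLD_CALL.replace("audio_prompt_path=None\n", "audio_prompt_path=None,\n")


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def generate_stub(text, repetition_penalty=1.2, audio_prompt_path=None, **kwargs):
    """Hypothetical stand-in for a generate() method.

    The parameter order here is an assumption chosen to show the failure
    mode; it only echoes back how the arguments were bound.
    """
    return {
        "text": text,
        "repetition_penalty": repetition_penalty,
        "audio_prompt_path": audio_prompt_path,
    }


# Positional call: the second argument lands on repetition_penalty.
positional = generate_stub("Bonjour", "voice.wav")

# Keyword call (the style adopted in this commit): binding is explicit.
keyword = generate_stub(text="Bonjour", audio_prompt_path="voice.wav")

if __name__ == "__main__":
    print("old call parses:", parses(OLD_CALL))
    print("new call parses:", parses(NEW_CALL))
    print("positional binding:", positional)
    print("keyword binding:", keyword)
```

Keyword arguments make the call independent of parameter order, which is a reasonable defensive choice for README snippets that readers copy against whichever library version they have installed.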