Commit 9e8adec by niobures (verified)
1 Parent(s): 96b9357

Update fr/README.md

Files changed (1)
  1. fr/README.md +115 -111
fr/README.md CHANGED
@@ -1,111 +1,115 @@
- ---
- license: cc-by-4.0
- datasets:
- - amphion/Emilia-Dataset
- language:
- - fr
- base_model:
- - ResembleAI/chatterbox
- pipeline_tag: text-to-speech
- tags:
- - french
- - audio
- - speech
- - tts
- - fine-tuning
- - chatterbox
- - Emilia
- - voice-cloning
- - zero-shot
- ---
-
- # Chatterbox TTS French 🥖
-
- **Chatterbox TTS French** is a fine-tuned text-to-speech model specialized for the French language. The model has been trained on high-quality voice data for natural and expressive speech synthesis.
-
- <div align="center"><img width="400px" src="https://ih1.redbubble.net/image.5397735048.6235/bg,f8f8f8-flat,750x,075,f-pad,750x1000,f8f8f8.jpg" alt="baguette-france-tour-eiffel-image" /></div>
-
- - 🔊 **Language**: French 🇫🇷
- - 🗣️ **Training dataset**: [Emilia Dataset (FR branch)](https://huggingface.co/datasets/amphion/Emilia-Dataset/tree/main/FR)
- - ⏱️ **Data quantity**: 1400 hours of audio
-
- ## Usage Example
-
- Here’s how to generate speech using Chatterbox-TTS French:
-
- ```python
- import torch
- import soundfile as sf
- from chatterbox.tts import ChatterboxTTS
- from huggingface_hub import hf_hub_download
- from safetensors.torch import load_file
-
- # Configuration
- MODEL_REPO = "Thomcles/Chatterbox-TTS-French"
- CHECKPOINT_FILENAME = "t3_cfg.safetensors"
- OUTPUT_PATH = "output_cloned_voice.wav"
- TEXT_TO_SYNTHESIZE = "Jean-Paul Sartre laisse à la postérité une œuvre considérable, tant littéraire que philosophique, ayant influencée à la fois la vie politique française d'après-guerre et les penseurs de son temps (Merleau-Ponty et Alain Badiou notamment)."
-
- def get_device() -> str:
-     return "cuda" if torch.cuda.is_available() else "cpu"
-
- def download_checkpoint(repo: str, filename: str) -> str:
-     return hf_hub_download(repo_id=repo, filename=filename)
-
- def load_tts_model(repo: str, checkpoint_file: str, device: str) -> ChatterboxTTS:
-     model = ChatterboxTTS.from_pretrained(device=device)
-     checkpoint_path = download_checkpoint(repo, checkpoint_file)
-     t3_state = load_file(checkpoint_path, device="cpu")
-     model.t3.load_state_dict(t3_state)
-     return model
-
- def synthesize_speech(model: ChatterboxTTS, text: str, audio_prompt_path:str, **kwargs) -> torch.Tensor:
-     with torch.inference_mode():
-         return model.generate(text, audio_prompt_path, **kwargs)
-
- def save_audio(waveform: torch.Tensor, path: str, sample_rate: int):
-     sf.write(path, waveform.squeeze().cpu().numpy(), sample_rate)
-
- def main():
-     print("Loading model...")
-     device = get_device()
-     model = load_tts_model(MODEL_REPO, CHECKPOINT_FILENAME, device)
-
-     print(f"Generating speech on {device}...")
-     wav = synthesize_speech(
-         model,
-         TEXT_TO_SYNTHESIZE,
-         audio_prompt_path=None
-         exaggeration=0.5,
-         temperature=0.6,
-         cfg_weight=0.3
-     )
-
-     print(f"Saving output to: {OUTPUT_PATH}")
-     save_audio(wav, OUTPUT_PATH, model.sr)
-     print("Done.")
-
- if __name__ == "__main__":
-     main()
- ```
-
- Here is the output:
-
- <audio controls src="https://huggingface.co/Thomcles/Chatterbox-TTS-French/resolve/main/example.mp3">Your browser does not support audio.</audio>
-
- ### Base model license
-
- The base model is licensed under the MIT License.
- Base model: [Chatterbox](https://huggingface.co/ResembleAI/chatterbox)
- License: [MIT](https://choosealicense.com/licenses/mit/)
-
- ### Training Data License
-
- This model was fine-tuned using a dataset licensed under Creative Commons Attribution 4.0 (CC BY 4.0).
- Dataset: [Emilia](https://huggingface.co/datasets/amphion/Emilia-Dataset)
- License: [Creative Commons Attribution 4.0 International](https://choosealicense.com/licenses/cc-by-4.0/)
-
-
- ### Contact me
-
- Interested in fine-tuning a TTS model in a specific language or building a multilingual voice solution? Don’t hesitate to reach out.
+ ---
+ license: cc-by-4.0
+ datasets:
+ - amphion/Emilia-Dataset
+ language:
+ - fr
+ base_model:
+ - ResembleAI/chatterbox
+ pipeline_tag: text-to-speech
+ tags:
+ - french
+ - audio
+ - speech
+ - tts
+ - fine-tuning
+ - chatterbox
+ - Emilia
+ - voice-cloning
+ - zero-shot
+ ---
+
+ # Chatterbox TTS French 🥖
+
+ **Chatterbox TTS French** is a fine-tuned text-to-speech model specialized for the French language. The model has been trained on high-quality voice data for natural and expressive speech synthesis.
+
+ <div align="center"><img width="400px" src="https://ih1.redbubble.net/image.5397735048.6235/bg,f8f8f8-flat,750x,075,f-pad,750x1000,f8f8f8.jpg" alt="baguette-france-tour-eiffel-image" /></div>
+
+ - 🔊 **Language**: French 🇫🇷
+ - 🗣️ **Training dataset**: [Emilia Dataset (FR branch)](https://huggingface.co/datasets/amphion/Emilia-Dataset)
+ - ⏱️ **Data quantity**: 1400 hours of audio
+
+ ## Usage Example
+
+ Here’s how to generate speech using Chatterbox-TTS French:
+
+ ```python
+ import torch
+ import soundfile as sf
+ from chatterbox.tts import ChatterboxTTS
+ from huggingface_hub import hf_hub_download
+ from safetensors.torch import load_file
+
+ # Configuration
+ MODEL_REPO = "Thomcles/Chatterbox-TTS-French"
+ CHECKPOINT_FILENAME = "t3_cfg.safetensors"
+ OUTPUT_PATH = "output_cloned_voice.wav"
+ TEXT_TO_SYNTHESIZE = "Jean-Paul Sartre laisse à la postérité une œuvre considérable, tant littéraire que philosophique, ayant influencée à la fois la vie politique française d'après-guerre et les penseurs de son temps (Merleau-Ponty et Alain Badiou notamment)."
+
+ def get_device() -> str:
+     return "cuda" if torch.cuda.is_available() else "cpu"
+
+ def download_checkpoint(repo: str, filename: str) -> str:
+     return hf_hub_download(repo_id=repo, filename=filename)
+
+ def load_tts_model(repo: str, checkpoint_file: str, device: str) -> ChatterboxTTS:
+     model = ChatterboxTTS.from_pretrained(device=device)
+     checkpoint_path = download_checkpoint(repo, checkpoint_file)
+     t3_state = load_file(checkpoint_path, device="cpu")
+     model.t3.load_state_dict(t3_state)
+     return model
+
+ def synthesize_speech(model: ChatterboxTTS, text: str, audio_prompt_path:str, **kwargs) -> torch.Tensor:
+     with torch.inference_mode():
+         return model.generate(
+             text=text,
+             audio_prompt_path=audio_prompt_path,
+             **kwargs
+         )
+
+ def save_audio(waveform: torch.Tensor, path: str, sample_rate: int):
+     sf.write(path, waveform.squeeze().cpu().numpy(), sample_rate)
+
+ def main():
+     print("Loading model...")
+     device = get_device()
+     model = load_tts_model(MODEL_REPO, CHECKPOINT_FILENAME, device)
+
+     print(f"Generating speech on {device}...")
+     wav = synthesize_speech(
+         model,
+         TEXT_TO_SYNTHESIZE,
+         audio_prompt_path=None,
+         exaggeration=0.5,
+         temperature=0.6,
+         cfg_weight=0.3
+     )
+
+     print(f"Saving output to: {OUTPUT_PATH}")
+     save_audio(wav, OUTPUT_PATH, model.sr)
+     print("Done.")
+
+ if __name__ == "__main__":
+     main()
+ ```
+
+ Here is the output:
+
+ <audio controls src="https://huggingface.co/Thomcles/Chatterbox-TTS-French/resolve/main/example.mp3">Your browser does not support audio.</audio>
+
+ ### Base model license
+
+ The base model is licensed under the MIT License.
+ Base model: [Chatterbox](https://huggingface.co/ResembleAI/chatterbox)
+ License: [MIT](https://choosealicense.com/licenses/mit/)
+
+ ### Training Data License
+
+ This model was fine-tuned using a dataset licensed under Creative Commons Attribution 4.0 (CC BY 4.0).
+ Dataset: [Emilia](https://huggingface.co/datasets/amphion/Emilia-Dataset)
+ License: [Creative Commons Attribution 4.0 International](https://choosealicense.com/licenses/cc-by-4.0/)
+
+
+ ### Contact me
+
+ Interested in fine-tuning a TTS model in a specific language or building a multilingual voice solution? Don’t hesitate to reach out.
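
Note on the `synthesize_speech` change: the new version passes `text` and `audio_prompt_path` to `model.generate` by keyword instead of by position. The commit does not state why. One plausible motivation is that keyword arguments keep binding to the intended parameter even if the callee's parameter order changes. The sketch below illustrates that general Python behaviour with two hypothetical stand-in functions; `generate_v1`, `generate_v2`, and the `repetition_penalty` parameter are assumptions made for illustration and are not taken from the Chatterbox source.

```python
# Hypothetical stand-ins for two versions of a generate() function.
# These signatures are illustrative assumptions, NOT the real ChatterboxTTS API.

def generate_v1(text, audio_prompt_path=None, exaggeration=0.5):
    """Older-style signature: audio_prompt_path is the second parameter."""
    return {
        "text": text,
        "audio_prompt_path": audio_prompt_path,
        "exaggeration": exaggeration,
    }


def generate_v2(text, repetition_penalty=1.2, audio_prompt_path=None, exaggeration=0.5):
    """Newer-style signature: a parameter was inserted before audio_prompt_path."""
    return {
        "text": text,
        "repetition_penalty": repetition_penalty,
        "audio_prompt_path": audio_prompt_path,
        "exaggeration": exaggeration,
    }


def call_positionally(generate, text, audio_prompt_path):
    # Mirrors the removed line: generate(text, audio_prompt_path, **kwargs)
    return generate(text, audio_prompt_path)


def call_by_keyword(generate, text, audio_prompt_path):
    # Mirrors the added lines: generate(text=..., audio_prompt_path=...)
    return generate(text=text, audio_prompt_path=audio_prompt_path)


if __name__ == "__main__":
    prompt = "reference.wav"  # hypothetical path, used only as a marker value

    # With the older signature both call styles bind the prompt correctly.
    print(call_positionally(generate_v1, "Bonjour", prompt))

    # With the newer signature the positional call silently binds the prompt
    # to repetition_penalty, while the keyword call still binds it correctly.
    print(call_positionally(generate_v2, "Bonjour", prompt))
    print(call_by_keyword(generate_v2, "Bonjour", prompt))
```

With these stand-ins, the positional call against `generate_v2` leaves `audio_prompt_path` as `None` and puts the path into `repetition_penalty`, with no error raised. The keyword call behaves the same against both signatures.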