Commit d7143b9 · 1 Parent(s): 948cc5f
Committed by eginhard

fix: update links and readme

Files changed (2)
  1. LICENSE +4 -4
  2. README.md +32 -28
LICENSE CHANGED
@@ -1,5 +1,5 @@
  # Coqui Public Model License 1.0.0
- https://coqui.ai/cpml.txt
+ https://tts-hub.github.io/cpml/LICENSE.txt


  This license allows only non-commercial use of a machine learning model and its outputs.
@@ -14,7 +14,7 @@ In order to get any license under these terms, you must agree to them as both st
  ## Licenses


- The licensor grants you a copyright license to do everything you might do with the model that would otherwise infringe the licensor's copyright in it, for any non-commercial purpose. The licensor grants you a patent license that covers patent claims the licensor can license, or becomes able to license, that you would infringe by using the model in the form provided by
+ The licensor grants you a copyright license to do everything you might do with the model that would otherwise infringe the licensor's copyright in it, for any non-commercial purpose. The licensor grants you a patent license that covers patent claims the licensor can license, or becomes able to license, that you would infringe by using the model in the form provided by
  the licensor, for any non-commercial purpose.


@@ -25,10 +25,10 @@ Non-commercial purposes include any of the following uses of the model or its ou


  ### Personal use for research, experiment, and testing for the benefit of public knowledge, personal study, private entertainment, hobby projects, amateur pursuits, or religious
- observance.
+ observance.


- ### Use by commercial or for-profit entities for testing, evaluation, or non-commercial research and development. Use of the model to train other models for commercial use is not a non-commercial purpose.
+ ### Use by commercial or for-profit entities for testing, evaluation, or non-commercial research and development. Use of the model to train other models for commercial use is not a non-commercial purpose.


  ### Use by any charitable organization for charitable purposes, or for testing or evaluation. Use for revenue-generating activity, including projects directly funded by government grants, is not a non-commercial purpose.
README.md CHANGED
@@ -1,25 +1,26 @@
  ---
  license: other
  license_name: coqui-public-model-license
- license_link: https://coqui.ai/cpml
+ license_link: https://tts-hub.github.io/cpml
  library_name: coqui
  pipeline_tag: text-to-speech
+ base_model: coqui/XTTS-v1
+ new_version: tts-hub/XTTS-v2
  ---

- # ⓍTTS
- ⓍTTS is a Voice generation model that lets you clone voices into different languages by using just a quick 6-second audio clip. Built on Tortoise,
- ⓍTTS has important model changes that make cross-language voice cloning and multi-lingual speech generation super easy.
+ # XTTS
+ XTTS is a Voice generation model that lets you clone voices into different languages by using just a quick 6-second audio clip. Built on Tortoise,
+ XTTS has important model changes that make cross-language voice cloning and multi-lingual speech generation super easy.
  There is no need for an excessive amount of training data that spans countless hours.

- This is the same model that powers [Coqui Studio](https://coqui.ai/), and [Coqui API](https://docs.coqui.ai/docs), however we apply
- a few tricks to make it faster and support streaming inference.
+ Paper: https://arxiv.org/abs/2406.04904

- ## NOTE: ⓍTTS V2 model is out here [XTTS V2](https://huggingface.co/coqui/XTTS-v2)
+ ## NOTE: XTTS V2 model is out here: [XTTS V2](https://huggingface.co/tts-hub/XTTS-v2)

  ### Features
- - Supports 14 languages.
+ - Supports 14 languages.
  - Voice cloning with just a 6-second audio clip.
- - Emotion and style transfer by cloning.
+ - Emotion and style transfer by cloning.
  - Cross-language voice cloning.
  - Multi-lingual speech generation.
  - 24khz sampling rate.
@@ -31,33 +32,36 @@ Polish, Turkish, Russian, Dutch, Czech, Arabic, Chinese, and Japanese**.
  Stay tuned as we continue to add support for more languages. If you have any language requests, please feel free to reach out!

  ### Code
- The current implementation supports inference and [fine-tuning](https://tts.readthedocs.io/en/latest/models/xtts.html#training).
+ The current implementation supports inference and [fine-tuning](https://coqui-tts.readthedocs.io/en/latest/models/xtts.html#training).

  ### License
- This model is licensed under [Coqui Public Model License](https://coqui.ai/cpml). There's a lot that goes into a license for generative models, and you can read more of [the origin story of CPML here](https://coqui.ai/blog/tts/cpml).
+ This model is licensed under [Coqui Public Model License](https://tts-hub.github.io/cpml). There's a lot that goes into a license for generative models, and you can read more of [the origin story of CPML here](https://web.archive.org/web/20240217095217/https://coqui.ai/blog/tts/cpml).

  ### Contact
- Come and join in our 🐸Community. We're active on [Discord](https://discord.gg/fBC58unbKE) and [Twitter](https://twitter.com/coqui_ai).
- You can also mail us at info@coqui.ai.
+ Come and join in our 🐸Community, we're active on [Discord](https://discord.gg/fBC58unbKE).

  Using 🐸TTS API:

  ```python
  from TTS.api import TTS
- tts = TTS("tts_models/multilingual/multi-dataset/xtts_v1", gpu=True)
-
- # generate speech by cloning a voice using default settings
- tts.tts_to_file(text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
-                 file_path="output.wav",
-                 speaker_wav="/path/to/target/speaker.wav",
-                 language="en")
-
- # generate speech by cloning a voice using custom settings
- tts.tts_to_file(text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
-                 file_path="output.wav",
-                 speaker_wav="/path/to/target/speaker.wav",
-                 language="en",
-                 decoder_iterations=30)
+ tts = TTS("tts_models/multilingual/multi-dataset/xtts_v1").to("cuda")
+
+ # Generate speech by cloning a voice using default settings
+ tts.tts_to_file(
+     text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
+     file_path="output.wav",
+     speaker_wav="/path/to/target/speaker.wav",
+     language="en"
+ )
+
+ # Generate speech by cloning a voice using custom settings
+ tts.tts_to_file(
+     text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
+     file_path="output.wav",
+     speaker_wav="/path/to/target/speaker.wav",
+     language="en",
+     decoder_iterations=30
+ )
  ```

  Using 🐸TTS Command line:
@@ -89,4 +93,4 @@ outputs = model.synthesize(
      gpt_cond_len=3,
      language="en",
  )
- ```
+ ```
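
Since this commit mostly swaps retired URLs for their replacements, a small scanner can help confirm that no old links remain in a file. The following is only a sketch and not part of the commit: the function name `find_stale_links` and the `STALE_HOSTS` set are hypothetical, and the hosts listed are simply the ones that appear on removed lines in the diff above. Treating them as stale is an assumption.

```python
import re
from urllib.parse import urlparse

# Assumption: hosts seen on removed ("-") lines of the diff are considered stale.
STALE_HOSTS = {"coqui.ai", "docs.coqui.ai", "tts.readthedocs.io"}

# Matches a URL up to the first whitespace, closing bracket, or quote.
URL_PATTERN = re.compile(r"https?://[^\s)\]>\"']+")


def find_stale_links(text, stale_hosts=STALE_HOSTS):
    """Return (line_number, url) pairs whose host is listed in stale_hosts."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for url in URL_PATTERN.findall(line):
            host = urlparse(url).hostname or ""
            if host in stale_hosts:
                hits.append((line_number, url))
    return hits


sample = (
    "license_link: https://coqui.ai/cpml\n"
    "[origin story](https://web.archive.org/web/20240217095217/https://coqui.ai/blog/tts/cpml)\n"
    "[fine-tuning](https://tts.readthedocs.io/en/latest/models/xtts.html#training)\n"
)
for line_number, url in find_stale_links(sample):
    print(f"line {line_number}: {url}")
# line 1: https://coqui.ai/cpml
# line 3: https://tts.readthedocs.io/en/latest/models/xtts.html#training
```

The check compares only the host of each matched URL, so an archived copy served from `web.archive.org` is not flagged even though the old address appears inside its path.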