FireRedTeam committed · Commit 9053525 · verified · 1 Parent(s): f7af46e

Update README.md

  1. README.md +116 -3
README.md CHANGED
@@ -1,3 +1,116 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ ---
4
+
5
+ <div align="center">
6
+
7
+
8
+
9
+ <h1>FireRedTTS-1S: An Upgraded Streamable Foundation
10
+ Text-to-Speech System</h1>
11
+
12
+ </div>
13
+
14
+ #### 👉🏻 [FireRedTTS-1S Paper](https://arxiv.org/abs/2503.20499) 👈🏻
15
+ #### 👉🏻 [FireRedTTS-1S Demos](https://fireredteam.github.io/demos/firered_tts_1s/) 👈🏻
16
+
17
+
18
+ ## News
19
+
20
+ - [2025/05/26] 🔥 We added a flow-matching decoder and updated the [technical report](https://arxiv.org/abs/2503.20499).
21
+ - [2025/03/25] 🔥 We released the [technical report](https://arxiv.org/abs/2503.20499) and [project page](https://fireredteam.github.io/demos/firered_tts_1s/).
22
+
23
+ ## Roadmap
24
+
25
+ - [x] 2025/04
26
+ - [x] Release the pre-trained checkpoints and inference code.
27
+
28
+ ## Usage
29
+
30
+ #### Clone and install
31
+
32
+ - Clone the repo
33
+
34
+ ```shell
35
+ git clone https://github.com/FireRedTeam/FireRedTTS.git
36
+ cd FireRedTTS
37
+ ```
38
+
39
+ - Create conda env
40
+
41
+ ```shell
42
+ # step 1: create and activate the env
43
+ conda create --name redtts python=3.10 && conda activate redtts
44
+
45
+ # step 2: install torch (the PyTorch build should match the CUDA version on your machine)
46
+ # CUDA 11.8
47
+ conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=11.8 -c pytorch -c nvidia
48
+ # CUDA 12.1
49
+ conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia
50
+
51
+ # step 3: install fireredtts from source
52
+ # run from the repository root (the FireRedTTS directory entered after cloning)
53
+ pip install -e .
54
+
55
+ # step 4: install other requirements
56
+ pip install -r requirements.txt
57
+ ```
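
After the install steps, a quick check that the environment can see the expected packages may save a confusing import error later. The following is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and the package list is an assumption based on the imports used in the usage example (`torch`, `torchaudio`, `fireredtts`).

```python
import importlib.util

# Packages the install steps are expected to provide (assumed from the
# imports in the README's usage example).
REQUIRED = ("torch", "torchaudio", "fireredtts")


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
print("missing:", ", ".join(missing) if missing else "none")

if "torch" not in missing:
    # Imported only when present, so the report above still prints cleanly.
    import torch

    print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
```

If `CUDA available` prints `False` on a GPU machine, the installed PyTorch build probably does not match the local CUDA version.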
58
+
59
+ #### Download models
60
+
61
+ Download the required model files from [**Model_Lists**](https://huggingface.co/FireRedTeam/FireRedTTS-1S/tree/main) and place them in the `pretrained_models` folder.
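
The download can also be scripted. The sketch below assumes the `huggingface_hub` package is installed (`pip install huggingface_hub`) and uses its `snapshot_download` function. The repo id is taken from the Model_Lists URL above. The helper names `has_model_files` and `download_models` are hypothetical. The check only tests that the folder is non-empty and does not verify which files are present.

```python
from pathlib import Path

REPO_ID = "FireRedTeam/FireRedTTS-1S"  # from the Model_Lists URL


def has_model_files(model_dir="pretrained_models"):
    """True if model_dir exists and contains at least one file (any depth)."""
    root = Path(model_dir)
    return root.is_dir() and any(p.is_file() for p in root.rglob("*"))


def download_models(model_dir="pretrained_models"):
    """Fetch the whole model repo into model_dir unless files are already there."""
    if has_model_files(model_dir):
        return Path(model_dir)
    # Assumption: huggingface_hub is available in the environment.
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=REPO_ID, local_dir=model_dir)
    return Path(model_dir)
```

Calling `download_models()` from the repository root should leave the files in `pretrained_models`, which is the path passed as `pretrained_path` in the usage example.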
62
+
63
+ #### Basic Usage
64
+
65
+ ```python
66
+ import os
67
+ import torchaudio
68
+ from fireredtts.fireredtts import FireRedTTS
69
+
70
+ # acoustic llm decoder
71
+ tts = FireRedTTS(
72
+ config_path="configs/config_24k.json",
73
+ pretrained_path="pretrained_models",  # folder holding the downloaded model files
74
+ )
75
+
76
+
77
+ """
78
+ # flow matching decoder
79
+ tts = FireRedTTS(
80
+ config_path="configs/config_24k_flow.json",
81
+ pretrained_path="pretrained_models",  # folder holding the downloaded model files
82
+ )
83
+ """
84
+
85
+ # same language
86
+ # For the test-hard evaluation, we set use_tn=True.
87
+ rec_wavs = tts.synthesize(
88
+ prompt_wav="examples/prompt_1.wav",
89
+ prompt_text="对,所以说你现在的话,这个账单的话,你既然说能处理,那你就想办法处理掉。",
90
+ text="小红书,是中国大陆的网络购物和社交平台,成立于二零一三年六月。",
91
+ lang="zh",
92
+ use_tn=True
93
+ )
94
+
95
+
96
+
97
+
98
+ rec_wavs = rec_wavs.detach().cpu()
99
+ out_wav_path = os.path.join("./example.wav")
100
+ torchaudio.save(out_wav_path, rec_wavs, 24000)
101
+
102
+ ```
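
To synthesize several sentences with the same reference prompt, the `synthesize` call shown above can be wrapped in a small loop. This is a sketch only: `synthesize_many` is a hypothetical helper, and it assumes nothing about `FireRedTTS` beyond the keyword arguments used in the example (`prompt_wav`, `prompt_text`, `text`, `lang`, `use_tn`).

```python
def synthesize_many(tts, prompt_wav, prompt_text, texts, lang="zh", use_tn=True):
    """Yield (index, waveform) for each non-empty text, reusing one prompt."""
    for index, text in enumerate(texts):
        if not text.strip():
            continue  # skip blank entries instead of synthesizing silence
        wav = tts.synthesize(
            prompt_wav=prompt_wav,
            prompt_text=prompt_text,
            text=text,
            lang=lang,
            use_tn=use_tn,
        )
        yield index, wav
```

Each yielded waveform can then be saved the same way as in the example, e.g. `torchaudio.save(f"out_{index}.wav", wav.detach().cpu(), 24000)`.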
103
+
104
+ ## Tips
105
+
106
+ - The reference audio should not be too long or too short; a duration of 3 to 10 seconds is recommended.
107
+ - The reference audio should be smooth and natural, and the accompanying text must be accurate to enhance the stability and naturalness of the synthesized audio.
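
The duration tip can be checked before synthesis. The sketch below uses only Python's standard `wave` module, so it assumes the reference audio is an uncompressed PCM WAV file. The helper names are hypothetical, and the 3 and 10 second bounds are the recommendation from the tips above.

```python
import wave


def wav_duration_seconds(path):
    """Duration of a PCM WAV file, computed from its header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def prompt_length_ok(path, min_seconds=3.0, max_seconds=10.0):
    """True if the reference audio is within the recommended duration range."""
    return min_seconds <= wav_duration_seconds(path) <= max_seconds
```

For example, `prompt_length_ok("examples/prompt_1.wav")` can be called before passing that file as `prompt_wav`.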
108
+
109
+
110
+ ## ⚠️ Usage Disclaimer ❗️❗️❗️❗️❗️❗️
111
+
112
+ - The project incorporates zero-shot voice cloning functionality. Please note that this capability is intended **solely for academic research purposes**.
113
+ - **DO NOT** use this model for **ANY illegal activities**❗️❗️❗️❗️❗️❗️
114
+ - The developers assume no liability for any misuse of this model.
115
+ - If you identify any instances of **abuse**, **misuse**, or **fraudulent** activities related to this project, **please report them to our team immediately.**
116
+