---
license: other
language:
- id
---

# Indonesian TTS Documentation

This guide explains how to set up and use an Indonesian Text-to-Speech (TTS) system built on a pretrained model. It covers downloading the required files, installing the necessary packages, and running a script that synthesizes speech from text.

## Prerequisites

Ensure you have `wget`, `pip`, and `pip3` installed on your system.

## Steps

### 1. Download the Pretrained Model and Configuration Files

Download the model checkpoint, the configuration file, and the speakers file into your working directory:

```bash
# Download checkpoint_1260000-inference.pth, config.json, and speakers.pth
# from acul3/TTS-TESTV3/upload/main
```
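
Before moving on, it can help to confirm that the downloaded files are where the later steps expect them. The following is a minimal sketch using only the Python standard library. The helper name `missing_files` and the constant `REQUIRED_FILES` are hypothetical, and the file names are the ones passed to the synthesizer in step 7.

```python
from pathlib import Path

# File names taken from the synthesizer setup in step 7 of this guide.
REQUIRED_FILES = ("checkpoint_1260000-inference.pth", "config.json", "speakers.pth")


def missing_files(model_dir=".", required=REQUIRED_FILES):
    """Return the required file names that are not present in model_dir."""
    base = Path(model_dir)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found.")
```

If the files live somewhere other than the current directory, pass that directory as `model_dir` and use the same paths when initializing the synthesizer.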

### 2. Install Required Packages

Install the TTS library and the Indonesian Grapheme-to-Phoneme (G2P) converter:

```bash
# In a Jupyter or Colab notebook cell, prefix each command with "!".
pip install TTS
pip3 install -U git+https://github.com/acul3/g2p-id
```
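
To confirm that the installation succeeded without loading the (fairly heavy) libraries, the module names can be looked up first. This is a small sketch rather than an official check. The helper name `missing_packages` is hypothetical, and the module names `TTS` and `g2p_id` are the ones imported in step 3.

```python
from importlib.util import find_spec


def missing_packages(names=("TTS", "g2p_id")):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```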

### 3. Import Libraries

Import the necessary libraries for TTS and G2P:

```python
from TTS.api import TTS
import torch
from TTS.utils.synthesizer import Synthesizer
from g2p_id import G2P
```

### 4. Check Device

Check if a GPU is available and set the device accordingly:

```python
device = "cuda" if torch.cuda.is_available() else "cpu"
```
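
The same check can be wrapped in a function that also falls back to the CPU when PyTorch cannot be imported at all. This is an optional sketch. The function name `pick_device` is hypothetical and not part of the TTS library.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU backend to query.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```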

### 5. Initialize G2P

Initialize the Indonesian G2P converter:

```python
g2p = G2P()
```

### 6. Prepare Text

Convert the input text to phonemes:

```python
text = g2p("progress nya baru sampai sini, belum bisa real time baru sekitar dua detik buat generate nya, harus butuh data lebih banyak, sekitar dua kali lebih banyak,")
```
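
For longer inputs, one option is to split the text at punctuation and convert each piece separately. The sketch below is an illustration under assumptions, not part of the original workflow. The helper names `split_into_chunks` and `phonemize_chunks` are hypothetical, and `g2p` can be any callable that maps a string to a string, such as the `G2P()` instance from step 5.

```python
import re


def split_into_chunks(text):
    """Split text at commas, periods, and question or exclamation marks."""
    pieces = re.split(r"[,.!?]+", text)
    # Drop empty pieces, for example the one left by a trailing comma.
    return [piece.strip() for piece in pieces if piece.strip()]


def phonemize_chunks(text, g2p):
    """Apply a G2P callable to each chunk and return the converted chunks."""
    return [g2p(chunk) for chunk in split_into_chunks(text)]


# str.upper stands in for the real G2P converter in this demonstration.
print(phonemize_chunks("halo dunia, apa kabar?", str.upper))
```

Whether chunked synthesis sounds better than passing the full sentence is something to check by listening, since splitting at commas can change the phrasing.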

### 7. Initialize Synthesizer

Initialize the TTS synthesizer with the downloaded checkpoint and configuration files:

```python
synthesizer = Synthesizer(
    tts_checkpoint="checkpoint_1260000-inference.pth",
    tts_config_path="config.json",
    tts_speakers_file="speakers.pth"
).to(device)
```

### 8. Synthesize Speech

Generate the speech audio from the text:

```python
wav = synthesizer.tts(text, speaker_name="wibowo")
```

### 9. Save the Audio File

Save the generated audio to a file:

```python
synthesizer.save_wav(wav, "wibowo.wav")
```
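
A quick way to sanity-check the result is to read the file back and look at its duration. The sketch below uses Python's standard `wave` module and assumes the output is a standard PCM WAV file. The helper name `wav_duration_seconds` is hypothetical.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()
```

For example, `wav_duration_seconds("wibowo.wav")` should return a value of a few seconds for the sentence used in step 6. A duration of zero would indicate that synthesis produced no audio.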

## Notes

- Ensure the paths to the checkpoint, config, and speakers files are correctly specified.
- Adjust the `speaker_name` parameter based on the available speakers in the `speakers.pth` file.
- The synthesized audio is saved as `wibowo.wav` in the current working directory, because step 9 passes a relative file name. Pass a full path to `save_wav` to write it elsewhere.

This completes the setup and usage guide for the Indonesian TTS system. For further customization and usage, refer to the official documentation of the TTS library and the G2P converter.