mah92/Khadijah-FA_EN-Matcha-TTS-Model


mah92 committed on Feb 28, 2025

Commit 4526d1a · verified · 1 Parent(s): 51fb765

Update README.md

Files changed (1)

README.md +2 -82


@@ -20,89 +20,9 @@ You can test this model [here](https://huggingface.co/spaces/k2-fsa/text-to-spee
 Enjoy!
-## Usage with the Sherpa-onnx repo
-Remember to add metadata to the ONNX file, as in:
-https://github.com/k2-fsa/icefall/blob/master/egs/ljspeech/TTS/matcha/export_onnx.py#L174
-## Usage with the Matcha-TTS repo
-1) In matcha/text/cleaners.py, phonemizer.backend.EspeakBackend part:
-```
-    language="fa",
-```
-2) pip install piper-phonemize
-3) In cleaners.py:
-add below persian_cleaners_piper:
-```
-import piper_phonemize
-def persian_cleaners_piper(text):
-    """Pipeline for Persian text, including abbreviation expansion. + punctuation + stress"""
-    #text = convert_to_ascii(text)
-    text = lowercase(text)
-    text = expand_abbreviations(text)
-    phonemes = "".join(piper_phonemize.phonemize_espeak(text=text, voice="fa")[0])
-    phonemes = collapse_whitespace(phonemes)
-    # Remove unwanted symbols (e.g., '1')
-    unwanted_symbols = {'1', '-'}  # Add any other unwanted symbols here
-    filtered_phonemes = "".join([char for char in phonemes if char not in unwanted_symbols])
-    return filtered_phonemes
-```
-4) In matcha/text/cleaners.py change this line to:
-```
-    intersperse(text_to_sequence(text, ["persian_cleaners_piper"])[0], 0),
-```
-5) Also set cleaner in configs/data/custom.yaml:
-cleaners: [persian_cleaners_piper]
-6) Replace the contents of symbols.py with:
-```
-def read_tokens():
-    tokens = []
-    with open("/home/oem/Basir/TTS/Matcha/Matcha-TTS/configs/tokens/tokens_sherpa_with_fa.txt", "r", encoding="utf-8") as f:
-        for line in f:
-            # Remove the newline character at the end
-            line = line.rstrip("\n")
-            # Split into token and number, preserving whitespace
-            if " " in line:
-                token = line[:line.index(" ")]  # Extract everything before the first space
-                if len(token) == 0: # White-space
-                    token = ' '
-            else:
-                token = line  # If there's no space, the entire line is the token
-            tokens.append(token)
-    return tokens
-symbols = read_tokens()
-```
-7) For possible errors, change save_figure_to_numpy to:
-```
-import numpy as np
-import matplotlib.pyplot as plt
-from PIL import Image
-import io
-def save_figure_to_numpy(fig):
-    buf = io.BytesIO()
-    fig.savefig(buf, format='png', bbox_inches='tight', pad_inches=0)
-    buf.seek(0)
-    img = Image.open(buf)
-    data = np.array(img)
-    buf.close()
-    return data
-```
-8) After exporting to ONNX, add the sherpa-onnx metadata if you want to use the model with sherpa-onnx:
-```
-python3 ./add_sherpa_metadata_to_matcha.py
-```
 ## Training results
 ![Training Results](khadijah-22050.png)

 Enjoy!
+## Training method
+See: [how_to_train_matcha_tts](https://huggingface.co/mah92/how_to_train_matcha_tts)
 ## Training results
 ![Training Results](khadijah-22050.png)
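
The `read_tokens` helper removed in this commit hard-codes an absolute path to the tokens file. As a minimal sketch, assuming the file holds one `token id` pair per line (as the removed comments suggest), the same parsing can take the path as an argument. The `path` parameter and the sample file contents below are illustrative assumptions, not part of the original README.

```python
import os
import tempfile


def read_tokens(path):
    """Return the token from each 'token id' line, in file order."""
    tokens = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            # Everything before the first space is the token; a line with
            # no space at all is taken as the token in its entirety.
            token, sep, _ = line.partition(" ")
            # A line that starts with a space encodes the whitespace token.
            if sep and token == "":
                token = " "
            tokens.append(token)
    return tokens


# Demo on a throwaway file; these three lines are made-up sample data.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("_ 0\n  1\na 2\n")

symbols = read_tokens(tmp.name)
os.remove(tmp.name)
print(symbols)
```

Passing the path in keeps the parsing independent of any one machine's directory layout; a real setup would point it at the tokens file used for training.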