Update README.md
README.md (changed)
````diff
@@ -1,14 +1,13 @@
+---
+license: cc-by-4.0
+base_model:
+- espnet/kan-bayashi_ljspeech_fastspeech2
+---
 # Latest Fastspeech2 Models using FLAT Start
 
-This repository …
+This repository contains new and high quality Fastspeech2 Models for Indian languages implemented using the Flat Start for speech synthesis. The models are capable of generating mel-spectrograms from text inputs and can be used to synthesize speech.
 
-**NOTE: The main …
-
-Clone this branch using the command:
-
-```
-git clone -b New-Models --single-branch https://github.com/smtiitm/Fastspeech2_HS.git
-```
+**NOTE: The main repo became large in size and underwent few changes in the inference and preprocessing scripts, necessitating the creation of a separate repo. Training information and the script will be shared after further code optimization and footprint reduction.**
 
 The Repo is large in size. New Models are in "language"_latest folder.
 
@@ -54,7 +53,7 @@ The directory paths are Relative. ( But if needed, Make changes to **text_prepro…
 
 Use the inference file to synthesize speech from text inputs:
 ```shell
-python inference.py --sample_text "Your input text here" --language <language> --gender <gender> --alpha <alpha> --output_file <file_name.wav OR path/to/file_name.wav>
+python inference.py --sample_text "Your input text here" --language <language>_latest --gender <gender> --alpha <alpha> --output_file <file_name.wav OR path/to/file_name.wav>
 ```
 
 **Example:**
@@ -72,7 +71,7 @@ If you use this Fastspeech2 Model in your research or work, please consider citi…
 
 “
 COPYRIGHT
-
+2025, Speech Technology Consortium,
 
 Bhashini, MeiTY and by Hema A Murthy & S Umesh,
 
@@ -93,4 +92,4 @@ This work is licensed under a…
 
 [cc-by]: http://creativecommons.org/licenses/by/4.0/
 [cc-by-image]: https://i.creativecommons.org/l/by/4.0/88x31.png
-[cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg
+[cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg
````