Update README.md
README.md (changed)
````diff
@@ -1,14 +1,13 @@
+---
+license: cc-by-4.0
+base_model:
+- espnet/kan-bayashi_ljspeech_fastspeech2
+---
 # Latest Fastspeech2 Models using FLAT Start
 
-This repository …
+This repository contains new and high quality Fastspeech2 Models for Indian languages implemented using the Flat Start for speech synthesis. The models are capable of generating mel-spectrograms from text inputs and can be used to synthesize speech.
 
-**NOTE: The main …
-
-Clone this branch using the command:
-
-```
-git clone -b New-Models --single-branch https://github.com/smtiitm/Fastspeech2_HS.git
-```
+**NOTE: The main repo became large in size and underwent few changes in the inference and preprocessing scripts, necessitating the creation of a separate repo. Training information and the script will be shared after further code optimization and footprint reduction.**
 
 The Repo is large in size. New Models are in "language"_latest folder.
 
@@ -54,7 +53,7 @@ The directory paths are Relative. ( But if needed, Make changes to **text_prepro…
 
 Use the inference file to synthesize speech from text inputs:
 ```shell
-python inference.py --sample_text "Your input text here" --language <language> --gender <gender> --alpha <alpha> --output_file <file_name.wav OR path/to/file_name.wav>
+python inference.py --sample_text "Your input text here" --language <language>_latest --gender <gender> --alpha <alpha> --output_file <file_name.wav OR path/to/file_name.wav>
 ```
 
 **Example:**
@@ -72,7 +71,7 @@ If you use this Fastspeech2 Model in your research or work, please consider citi…
 
 “
 COPYRIGHT
-
+2025, Speech Technology Consortium,
 
 Bhashini, MeiTY and by Hema A Murthy & S Umesh,
 
@@ -93,4 +92,4 @@ This work is licensed under a…
 
 [cc-by]: http://creativecommons.org/licenses/by/4.0/
 [cc-by-image]: https://i.creativecommons.org/l/by/4.0/88x31.png
-[cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg
+[cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg
````