A SpeechT5 model fine-tuned for Korean TTS (Text-to-Speech).
The full pipeline and inference code are available in the GitHub repository (https://github.com/hobi2k/SpeechT5_Korean).
To use the model, clone the repository with git.
- Base model: microsoft/speecht5_tts
- Vocoder: microsoft/speecht5_hifigan
- Speaker embedding model: microsoft/wavlm-base-plus-sv
- Language: Korean (ko)
- Author: 안호성 (GitHub: hobi2k, Hugging Face: ahnhs2k)

The training dataset was preprocessed from simon3000/genshin-voice.
- Training script: scripts/train.py
- Inference script: scripts/inference.py
- checkpoint_last.pt: saved every epoch
- checkpoints/epoch_XXXXXX/: saved periodically (up to 5 kept)

Inference with the project script:
uv run scripts/inference.py \
--model_dir /path/to/output_model \
--text "안녕하세요. 테스트 문장입니다." \
--out out.wav
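To synthesize several sentences in one go, the command above can be wrapped in a small Python helper. The sketch below only assembles the flags shown above (`--model_dir`, `--text`, `--out`) and runs one process per sentence. The helper names `build_inference_cmd` and `synthesize_all`, the `outputs/` directory, and the `utt_000.wav` naming are illustrative assumptions, not part of the repository. It also assumes it is run from the repository root so that `scripts/inference.py` resolves.

```python
import subprocess
from pathlib import Path


def build_inference_cmd(model_dir, text, out_path):
    """Assemble the argv list for scripts/inference.py (hypothetical helper)."""
    return [
        "uv", "run", "scripts/inference.py",
        "--model_dir", str(model_dir),
        "--text", text,
        "--out", str(out_path),
    ]


def synthesize_all(model_dir, sentences, out_dir="outputs"):
    """Run one inference call per sentence and return the output paths."""
    out_root = Path(out_dir)
    out_root.mkdir(parents=True, exist_ok=True)
    outputs = []
    for index, sentence in enumerate(sentences):
        out_path = out_root / f"utt_{index:03d}.wav"  # assumed naming scheme
        # Passing a list (no shell) keeps quotes and Korean text intact.
        subprocess.run(build_inference_cmd(model_dir, sentence, out_path), check=True)
        outputs.append(out_path)
    return outputs
```

Calling `synthesize_all("/path/to/output_model", ["안녕하세요.", "테스트 문장입니다."])` would start a separate process for each sentence, which is simple but slower than a single long-running process.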
Inference with a selected periodic checkpoint:
uv run scripts/inference.py \
--model_dir /path/to/output_model \
--checkpoint_epoch 40 \
--text "40 epoch 모델 테스트" \
--out out_epoch40.wav
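Because at most 5 periodic checkpoints are kept under `checkpoints/epoch_XXXXXX/`, it can help to check which epochs are still on disk before passing `--checkpoint_epoch`. The sketch below lists them and also illustrates one possible keep-the-latest-5 pruning rule. It assumes that `checkpoints/` sits directly under the model directory, that the directory name is the epoch number zero-padded to six digits, and that the most recent checkpoints are the ones retained. None of this is confirmed by the repository, and the function names are hypothetical.

```python
import re
import shutil
from pathlib import Path

# Assumed naming: "epoch_" followed by the epoch number zero-padded to six digits.
EPOCH_DIR = re.compile(r"epoch_(\d{6})")


def available_epochs(model_dir):
    """Return the sorted epoch numbers that have a periodic checkpoint directory."""
    ckpt_root = Path(model_dir) / "checkpoints"
    if not ckpt_root.is_dir():
        return []
    epochs = []
    for child in ckpt_root.iterdir():
        match = EPOCH_DIR.fullmatch(child.name)
        if child.is_dir() and match:
            epochs.append(int(match.group(1)))
    return sorted(epochs)


def prune_checkpoints(model_dir, keep=5):
    """Delete all but the `keep` most recent epoch directories; return removed epochs."""
    epochs = available_epochs(model_dir)
    # A slice with -0 would select nothing, so keep=0 is handled explicitly.
    stale = epochs[:-keep] if keep > 0 else epochs
    for epoch in stale:
        shutil.rmtree(Path(model_dir) / "checkpoints" / f"epoch_{epoch:06d}")
    return stale
```

For example, checking `40 in available_epochs("/path/to/output_model")` before running the command above avoids requesting a checkpoint that has already been rotated out.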
@misc{speecht5_korean,
title = {SpeechT5_Korean: Korean SpeechT5 Training and Inference Pipeline},
author = {안호성 (GitHub: hobi2k)},
year = {2026},
url = {https://github.com/hobi2k/SpeechT5_Korean},
note = {Hugging Face: https://huggingface.co/ahnhs2k}
}
- microsoft/speecht5_tts: https://huggingface.co/microsoft/speecht5_tts
- microsoft/speecht5_hifigan: https://huggingface.co/microsoft/speecht5_hifigan
- microsoft/wavlm-base-plus-sv: https://huggingface.co/microsoft/wavlm-base-plus-sv

Base model: microsoft/speecht5_tts