Korean SpeechT5 (Jamo Tokenizer, KSS)

If you use this model in research or production, please cite:

@misc{ahnhs2k_speecht5_korean,  
  author = {Ahn, Hosung},  
  title = {Korean SpeechT5 TTS Model},  
  year = {2025},  
  publisher = {Hugging Face},  
  url = {https://huggingface.co/ahnhs2k/...}  
}

Model Features

  • Base Model: microsoft/speecht5_tts
  • Dataset: Bingsu/KSS_Dataset
  • Tokenizer: Jamo-based Korean tokenizer (character-level, placeholder aware)
  • Speaker Embedding: microsoft/wavlm-base-plus-sv
  • Vocoder: microsoft/speecht5_hifigan
  • Sample Rate: 16 kHz
  • Single-speaker Korean TTS model
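
To illustrate what a Jamo-based, placeholder-aware tokenizer does, here is a minimal sketch that decomposes precomposed Hangul syllables with standard Unicode arithmetic. It is not this repository's tokenizer code: the function names `decompose_syllable_sketch` and `jamo_tokens_sketch`, the `<...>` placeholder syntax, and the choice of conjoining jamo (U+1100 block) as output are all assumptions made for illustration.

```python
import re

# Conjoining jamo tables in Unicode order (assumed output alphabet).
CHOSEONG = [chr(0x1100 + i) for i in range(19)]           # leading consonants
JUNGSEONG = [chr(0x1161 + i) for i in range(21)]          # vowels
JONGSEONG = [""] + [chr(0x11A8 + i) for i in range(27)]   # trailing consonants, index 0 = none

# Hypothetical placeholder syntax such as "<pause>"; the real tokens may differ.
PLACEHOLDER = re.compile(r"(<[^<>]+>)")


def decompose_syllable_sketch(ch):
    """Split one precomposed Hangul syllable into jamo; pass other characters through."""
    code = ord(ch) - 0xAC00
    if not 0 <= code < 11172:  # outside the Hangul Syllables block
        return [ch]
    lead, rest = divmod(code, 21 * 28)
    vowel, tail = divmod(rest, 28)
    jamo = [CHOSEONG[lead], JUNGSEONG[vowel]]
    if tail:
        jamo.append(JONGSEONG[tail])
    return jamo


def jamo_tokens_sketch(text):
    """Decompose text into jamo tokens while keeping placeholders as single tokens."""
    tokens = []
    for part in PLACEHOLDER.split(text):
        if not part:
            continue
        if PLACEHOLDER.fullmatch(part):
            tokens.append(part)  # placeholder survives untouched
        else:
            for ch in part:
                tokens.extend(decompose_syllable_sketch(ch))
    return tokens
```

For example, `jamo_tokens_sketch("한<pause>")` yields the three jamo of 한 followed by the single token `<pause>`. Keeping placeholders atomic is what lets control tokens share a vocabulary with character-level jamo.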

Text Utils (korean_text_utils.py)

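  • number_to_korean(): converts digits to their Korean reading
  • normalize_korean(): text normalization for inference
  • prosody_split(): punctuation-based segment splitting
  • inject_tokens_for_training(): placeholder preprocessing for training
  • decompose_jamo_with_placeholders(): jamo decomposition that preserves placeholders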
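
The behaviour of two of these helpers can be sketched as follows. This is a hedged illustration, not the contents of korean_text_utils.py: the names `sino_korean_sketch`, `digits_to_korean_sketch`, and `split_on_punctuation_sketch` are hypothetical stand-ins, the number reader only handles non-negative integers with Sino-Korean readings (native Korean numerals used with counters are not covered), and the set of punctuation marks is an assumption.

```python
import re

DIGITS = ["", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]
SMALL_UNITS = ["", "십", "백", "천"]
BIG_UNITS = ["", "만", "억", "조"]


def _read_group(n):
    """Read an integer in 1..9999 using Sino-Korean numerals."""
    out = []
    for pos in range(3, -1, -1):
        d = (n // 10 ** pos) % 10
        if d == 0:
            continue
        if d == 1 and pos > 0:
            out.append(SMALL_UNITS[pos])  # 10 is read "십", not "일십"
        else:
            out.append(DIGITS[d] + SMALL_UNITS[pos])
    return "".join(out)


def sino_korean_sketch(n):
    """Sino-Korean reading of a non-negative integer below 10**16."""
    if n < 0 or n >= 10 ** 16:
        raise ValueError("unsupported range for this sketch")
    if n == 0:
        return "영"
    parts = []
    group_idx = 0
    while n > 0:
        n, group = divmod(n, 10000)
        if group:
            word = _read_group(group)
            if group == 1 and group_idx == 1:
                word = ""  # 10000 is read "만", not "일만"
            parts.append(word + BIG_UNITS[group_idx])
        group_idx += 1
    return "".join(reversed(parts))


def digits_to_korean_sketch(text):
    """Replace every run of ASCII digits with its Sino-Korean reading."""
    return re.sub(r"[0-9]+", lambda m: sino_korean_sketch(int(m.group())), text)


def split_on_punctuation_sketch(text):
    """Split text into segments, keeping each punctuation mark with the segment it ends."""
    segments = re.findall(r"[^,.!?]+[,.!?]*", text)
    return [s.strip() for s in segments if s.strip()]
```

A typical inference flow under these assumptions would normalize first and split second, for example `split_on_punctuation_sketch(digits_to_korean_sketch("2025년입니다. 반갑습니다!"))`, then synthesize each segment separately.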

License

license: apache-2.0
datasets:
- Bingsu/KSS_Dataset
language:
- ko
base_model:
- microsoft/speecht5_tts
pipeline_tag: text-to-speech