F5-TTS Hebrew v2
Hebrew text-to-speech model fine-tuned from F5-TTS on 158 hours of Hebrew speech.
Model Details
- Base: F5-TTS v1 Base (non-autoregressive)
- Steps: 320,000
- Data: 68,569 segments (~158h) from SASPEECH Gold/Auto, FLEURS, Hebrew Speech Campus
- All data re-vocalized with Phonikud G2P (stress marks included)
- Hebrew-filtered: non-Hebrew segments removed
- Output: 24kHz mono WAV
- 58 built-in voices with voice cloning support
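To confirm that a generated file matches the output format listed above, the WAV header can be read with Python's standard `wave` module. This is a minimal sketch, not part of the project: `describe_wav` and `is_expected_format` are hypothetical helper names, and the code assumes the file is PCM-encoded, since `wave` may reject other encodings such as float WAV.

```python
import wave


def describe_wav(source):
    """Return (sample_rate, channels, duration_seconds) for a PCM WAV file.

    `source` may be a path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        seconds = wav.getnframes() / rate
    return rate, channels, seconds


def is_expected_format(source):
    """Check for the 24 kHz mono format stated in the model details."""
    rate, channels, _ = describe_wav(source)
    return rate == 24000 and channels == 1
```

Calling `is_expected_format("output.wav")` on a synthesized file (the filename is a placeholder) should return `True` if the output is 24 kHz mono.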
Critical Usage Notes
- Use `model_state_dict`, NOT `ema_model_state_dict`, when loading from .pt checkpoints
- Override `text_num_embeds` to match vocab_size (default 256 is wrong)
- Use the included `vocab.txt`; the char-to-index mapping must match exactly
- Call `model.sample()` directly; the F5TTS API/CLI are broken for fine-tuned models
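The first three notes can be sketched in plain Python. This is an illustration under stated assumptions, not the project's loading code: `build_vocab` and `pick_weights` are hypothetical helper names, vocab.txt is assumed to hold one token per line with the line number as its index, and the checkpoint is assumed to be a plain dict containing both weight sets.

```python
def build_vocab(lines):
    """Map each token to its line index (assumed layout: one token per line)."""
    vocab = {}
    for index, line in enumerate(lines):
        # Strip only the newline, because a space can itself be a token.
        vocab[line.rstrip("\n")] = index
    return vocab


def pick_weights(checkpoint):
    """Return the raw training weights and ignore the EMA copy."""
    if "model_state_dict" not in checkpoint:
        raise KeyError("checkpoint has no 'model_state_dict' entry")
    return checkpoint["model_state_dict"]


def text_embedding_size(vocab):
    """Value to use for text_num_embeds instead of the default 256."""
    return len(vocab)
```

In practice the lines would come from `open("vocab.txt", encoding="utf-8")`. With PyTorch installed, the checkpoint dict would typically come from `torch.load(path, map_location="cpu")`, and the selected weights would then be passed to the model's `load_state_dict`.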
Full Project
github.com/yzamari/hebAudio — complete Hebrew TTS system with Web UI, 58 voices, preprocessing pipeline, and training guide.