Mihai's F5-TTS Romanian

The second F5-TTS model fine-tuned for the Romanian language (and my first).

After someone else did the first F5-TTS fine-tune for Romanian (Cdorob's F5-TTS Romanian), I went with my own!

This model can also speak English, though not very well. Using checkpoint 375 or longer input text seems to improve English performance, but it still falls short of the original F5-TTS.

Using EMA gives better English output (it also improves speaker similarity).

For correct results, disable EMA (Exponential Moving Average) before using the model. The pruned SafeTensors checkpoints do not include EMA weights, so use those instead.
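As a rough illustration of why EMA and non-EMA weights can behave differently after a short fine-tune, here is a generic EMA sketch. It is not F5-TTS code: the fixed decay of 0.999 and the single scalar "weight" are assumptions made only for the example, and the actual trainer may use a different decay or a warmup schedule.

```shell
# Generic EMA sketch (not F5-TTS code): ema = decay * ema + (1 - decay) * weight.
# Assumptions: a fixed decay and one scalar weight that is 0 in the base model
# and 1 at every fine-tuning step.
ema_after() {
  # $1 = number of fine-tuning steps, $2 = decay
  awk -v steps="$1" -v decay="$2" 'BEGIN {
    ema = 0                        # EMA copy starts at the base-model value
    for (i = 1; i <= steps; i++)   # raw weight is 1 at every step
      ema = decay * ema + (1 - decay) * 1
    printf "%.3f\n", ema
  }'
}

# How far the EMA copy has moved toward the fine-tuned value at each checkpoint
for steps in 375 750 1125; do
  echo "after $steps steps: EMA weight = $(ema_after "$steps" 0.999)"
done
```

Under these assumptions the EMA copy lags behind the raw weights and stays partly anchored to the base model. That would be one plausible reading of why EMA helps English while the non-EMA weights are recommended here, but it is not confirmed by the training setup.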

Training Data

Common Voice 22.0 Romanian: just 1,200 clips, totaling 1.3 hours of Romanian speech, and nothing else!

Usage

pip install f5-tts
f5-tts_infer-cli \
  --model F5TTS_v1_Base \
  --ckpt_file model_750.pt \
  --vocab_file vocab.txt \
  --ref_audio your_voice.wav \
  --ref_text "Pune textul exact din clipul care ai încărcat" \
  --gen_text "Pune ce text vrei ca modelul să zică aici!" \
  --output_file output.wav
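
Since the checkpoints behave differently, it can help to run the same sentence through each of them and compare the results by ear. The sketch below reuses only the flags from the command above. The file names `model_375.pt` and `model_1125.pt` are assumptions based on the `model_750.pt` naming, and `run_checkpoint` and `RUNNER` are hypothetical names introduced for the example.

```shell
# Sketch: synthesize the same text with every saved checkpoint.
# Assumption: checkpoints follow the model_<step>.pt naming of model_750.pt.
REF_TEXT="Pune textul exact din clipul care ai încărcat"   # transcript of your_voice.wav
GEN_TEXT="Pune ce text vrei ca modelul să zică aici!"      # text to synthesize

run_checkpoint() {
  step="$1"
  # RUNNER unset            -> "echo": dry run that only prints the command.
  # RUNNER set to "" (empty) -> the command is executed for real.
  ${RUNNER-echo} f5-tts_infer-cli \
    --model F5TTS_v1_Base \
    --ckpt_file "model_${step}.pt" \
    --vocab_file vocab.txt \
    --ref_audio your_voice.wav \
    --ref_text "$REF_TEXT" \
    --gen_text "$GEN_TEXT" \
    --output_file "output_${step}.wav"
}

for step in 375 750 1125; do
  run_checkpoint "$step"
done
```

By default this only prints the three commands (without the original quoting). Saved under a hypothetical name such as `compare_checkpoints.sh`, it can be run for real with `RUNNER= sh compare_checkpoints.sh`.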

Model Details

Training: 1125 steps / 15 epochs (a checkpoint saved every 375 steps)
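
The figures above imply which epoch each checkpoint corresponds to. A small sketch of the arithmetic, assuming every epoch has the same number of steps:

```shell
# Map each checkpoint step to the epoch it ends, using the numbers above.
# Assumption: steps are spread evenly across epochs.
total_steps=1125
epochs=15
ckpt_every=375

steps_per_epoch=$((total_steps / epochs))
echo "steps per epoch: $steps_per_epoch"

step=$ckpt_every
while [ "$step" -le "$total_steps" ]; do
  echo "checkpoint at step $step = end of epoch $((step / steps_per_epoch))"
  step=$((step + ckpt_every))
done
```

This gives 75 steps per epoch, so the three checkpoints land at the end of epochs 5, 10 and 15.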

Base model: SWivid/F5-TTS (this model is a fine-tune of it)