Copied from https://huggingface.co/marduk-ra/F5-TTS-German, with an added duration model that was trained on the Emilia dataset using https://github.com/eamag/f5-tts-duration

Run inference with https://github.com/lucasnewman/f5-tts-mlx:

python -m f5_tts_mlx.generate --model "eamag/f5-tts-mlx-german" \
--text "The quick brown fox jumped over the lazy dog." \
--ref-audio /path/to/audio.wav \
--ref-text "This is the caption for the reference audio."
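
The same call can also be driven from a script. The following is a minimal sketch, not part of f5-tts-mlx itself: the helper names `build_command` and `run_generation` are hypothetical, and only the flags shown in the command above are used. It assumes the `f5_tts_mlx` package is installed in the current Python environment.

```python
import shlex
import subprocess
import sys


def build_command(text, ref_audio, ref_text, model="eamag/f5-tts-mlx-german"):
    """Return the argument list for the f5_tts_mlx.generate CLI call.

    Hypothetical helper: it only assembles the flags used in the
    command-line example (--model, --text, --ref-audio, --ref-text).
    """
    return [
        sys.executable, "-m", "f5_tts_mlx.generate",
        "--model", model,
        "--text", text,
        "--ref-audio", ref_audio,
        "--ref-text", ref_text,
    ]


def run_generation(text, ref_audio, ref_text):
    """Run the CLI; raises CalledProcessError if it exits non-zero."""
    cmd = build_command(text, ref_audio, ref_text)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)
```

Calling `run_generation("Hallo Welt.", "/path/to/audio.wav", "Transkript der Referenzaufnahme.")` would then launch the generator with a German prompt; the reference text should be the transcript of the reference audio.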

GitHub: https://github.com/SWivid/F5-TTS
Paper: F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching

NOTE: Setting the number of NFE steps to 64 can produce better-quality audio, at the cost of longer generation time.
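
NFE stands for number of function evaluations, i.e. how many times the model is called while the sampler integrates the flow-matching ODE. As a toy illustration only (this is not the model's sampler, and the function name `euler_decay` is made up for this sketch), the snippet below integrates the simple ODE dx/dt = -x with a fixed-step Euler method and shows that the error against the exact solution shrinks as the number of steps grows. The same trade-off, more evaluations for a more accurate trajectory, is why a higher step count can improve audio quality.

```python
import math


def euler_decay(n_steps):
    """Integrate dx/dt = -x from t=0 to t=1 with x(0)=1 using Euler steps."""
    x = 1.0
    dt = 1.0 / n_steps
    for _ in range(n_steps):
        x += dt * (-x)  # one function evaluation per step
    return x


exact = math.exp(-1.0)  # closed-form solution x(1) = e^-1
for n in (8, 32, 64):
    # Print the step count and the absolute error of the Euler estimate.
    print(n, abs(euler_decay(n) - exact))
```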

Paper reference: arXiv 2410.06885, published Oct 9, 2024