MegaTTS 3 but with voice cloning!
Generate realistic cloned speech from text and reference audio