Generate audio from text using a VITS model
Generate audio from text using voice synthesis
Generate and convert voice using text and audio inputs