A Family of Open-Source Music Foundation Models
Generate spoken audio from text in multiple languages
Generate Vietnamese speech from text and audio