FunCineForge: A Unified Dataset Toolkit and Model for Zero-Shot Movie Dubbing in Diverse Cinematic Scenes Paper • 2601.14777 • Published 7 days ago
DiffStyleTTS: Diffusion-based Hierarchical Prosody Modeling for Text-to-Speech with Diverse and Controllable Styles Paper • 2412.03388 • Published Dec 4, 2024
UDDETTS: Unifying Discrete and Dimensional Emotions for Controllable Emotional Text-to-Speech Paper • 2505.10599 • Published May 15, 2025