DMOSpeech 2: Reinforcement Learning for Duration Prediction in Metric-Optimized Speech Synthesis Paper • 2507.14988 • Published Jul 20, 2025 • 8
CoSTA*: Cost-Sensitive Toolpath Agent for Multi-turn Image Editing Paper • 2503.10613 • Published Mar 13, 2025 • 79
S2S-Arena, Evaluating Speech2Speech Protocols on Instruction Following with Paralinguistic Information Paper • 2503.05085 • Published Mar 7, 2025 • 47
Unified Reward Model for Multimodal Understanding and Generation Paper • 2503.05236 • Published Mar 7, 2025 • 123
Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis Paper • 2407.09732 • Published Jul 13, 2024 • 10
Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation Paper • 2408.11849 • Published Aug 13, 2024
Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue Paper • 2409.04927 • Published Sep 7, 2024
StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion Paper • 2409.10058 • Published Sep 16, 2024 • 2
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform Paper • 2309.09493 • Published Sep 18, 2023
AAD-LLM: Neural Attention-Driven Auditory Scene Understanding Paper • 2502.16794 • Published Feb 24, 2025 • 5
Slamming: Training a Speech Language Model on One GPU in a Day Paper • 2502.15814 • Published Feb 19, 2025 • 69
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published Jan 9, 2025 • 95