UAT: Unified Audio-Text Diffusion for Audio Generation, Editing, and Captioning Paper • 2606.04939 • Published 23 days ago
Evaluating the Expressive Appropriateness of Speech in Rich Contexts Paper • 2605.09413 • Published May 10 • 5
WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling Paper • 2605.06407 • Published May 7
SPEAR: A Unified SSL Framework for Learning Speech and Audio Representations Paper • 2510.25955 • Published Oct 29, 2025 • 1
OccuBench: Evaluating AI Agents on Real-World Professional Tasks via Language World Models Paper • 2604.10866 • Published Apr 13 • 68
Representation-Regularized Convolutional Audio Transformer for Audio Understanding Paper • 2601.21612 • Published Jan 29 • 1
Typhoon ASR Real-time: FastConformer-Transducer for Thai Automatic Speech Recognition Paper • 2601.13044 • Published Jan 19 • 12