Efficient Audio-Visual Speech Separation with Discrete Lip Semantics and Multi-Scale Global-Local Attention Paper • 2509.23610 • Published Sep 28, 2025 • 13