Nudging Hidden States: Training-Free Model Steering for Chain-of-Thought Reasoning in Large Audio-Language Models Paper • 2603.14636 • Published 17 days ago • 4
MUGEN: Evaluating and Improving Multi-audio Understanding of Large Audio-Language Models Paper • 2603.09714 • Published 23 days ago
On the Fallacy of Global Token Perplexity in Spoken Language Model Evaluation Paper • 2601.06329 • Published Jan 9 • 2
SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models Paper • 2510.16917 • Published Oct 19, 2025 • 20
Investigating Safety Vulnerabilities of Large Audio-Language Models Under Speaker Emotional Variations Paper • 2510.16893 • Published Oct 19, 2025 • 18
ML-SUPERB: Multilingual Speech Universal PERformance Benchmark Paper • 2305.10615 • Published May 18, 2023 • 1
REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR Paper • 2402.03988 • Published Feb 6, 2024
Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs Paper • 2301.12950 • Published Jan 30, 2023
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks Paper • 2411.05361 • Published Nov 8, 2024 • 5
BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights Paper • 2501.17790 • Published Jan 29, 2025 • 3
DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment Paper • 2507.02768 • Published Jul 3, 2025 • 19
Analyzing Mitigation Strategies for Catastrophic Forgetting in End-to-End Training of Spoken Language Models Paper • 2505.17496 • Published May 23, 2025 • 2
Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models Paper • 2408.07665 • Published Aug 14, 2024
Game-Time: Evaluating Temporal Dynamics in Spoken Language Models Paper • 2509.26388 • Published Sep 30, 2025 • 27
EMO-Debias: Benchmarking Gender Debiasing Techniques in Multi-Label Speech Emotion Recognition Paper • 2506.04652 • Published Jun 5, 2025 • 1