---
license: apache-2.0
pipeline_tag: audio-to-audio
---
LLaSE: Maximizing Acoustic Preservation for LLaMA-based Speech Enhancement
Demo Page: https://kevin-naticl.github.io/LLaSE-Demopage/
GitHub: https://github.com/Kevin-naticl/LLaSE
Abstract
Language Models (LMs) have shown strong capabilities in semantic understanding and contextual modeling, making them promising for speech enhancement.
SELM, our previous work, first introduced LMs to speech enhancement. However, SELM and other existing generative
speech enhancement approaches still face challenges, such as changes in timbre and content between the input and the enhanced output.
To address these limitations, we propose LLaSE, which utilizes continuous representations from WavLM and integrates a LLaMA
backbone combined with the more powerful Xcodec decoder, significantly improving contextual modeling capabilities and enabling
more accurate and stable enhancement. Experimental results demonstrate that LLaSE achieves state-of-the-art performance,
offering a robust and scalable solution for speech enhancement.
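
The data flow described in the abstract (continuous encoder features, a language-model backbone, then a codec decoder) can be traced at the shape level with the sketch below. It is only an illustration, not code from the LLaSE repository: the function `trace_pipeline`, the `StageOutput` type, and the constants (16 kHz audio, a 320-sample hop, 1024-dimensional features) are assumptions chosen for the example. It also assumes the backbone emits one discrete codec token per frame, which the abstract does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Tuple

# All constants are illustrative assumptions, not values taken from the LLaSE repo.
SAMPLE_RATE = 16_000  # samples per second
HOP_SAMPLES = 320     # 20 ms per frame, i.e. 50 frames per second
FEATURE_DIM = 1024    # width of the continuous encoder features


@dataclass(frozen=True)
class StageOutput:
    stage: str              # which component produced this output
    kind: str               # "waveform", "continuous", or "discrete"
    shape: Tuple[int, ...]  # shape of the output for a single utterance


def trace_pipeline(num_samples: int) -> List[StageOutput]:
    """Trace output shapes through an encoder -> LM -> codec-decoder pipeline."""
    frames = num_samples // HOP_SAMPLES
    return [
        StageOutput("noisy input", "waveform", (num_samples,)),
        # The encoder keeps continuous vectors instead of quantizing to tokens.
        StageOutput("WavLM-style encoder", "continuous", (frames, FEATURE_DIM)),
        # Assumed: the backbone predicts one codec token id per frame.
        StageOutput("LLaMA-style backbone", "discrete", (frames,)),
        # The codec decoder maps the tokens back to audio samples.
        StageOutput("Xcodec-style decoder", "waveform", (frames * HOP_SAMPLES,)),
    ]


if __name__ == "__main__":
    for out in trace_pipeline(2 * SAMPLE_RATE):  # two seconds of audio
        print(f"{out.stage:22s} {out.kind:11s} {out.shape}")
```

Feeding continuous features to the backbone, rather than discretized input tokens, is what the abstract credits for better preservation of timbre and content; in this sketch only the output side is discrete.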