---
license: cc-by-nc-sa-4.0
---
# DIFFA: Large Language Diffusion Models Can Listen and Understand
[Paper (arXiv)](https://arxiv.org/abs/2507.18452)
[Model (Hugging Face)](https://huggingface.co/zhoujiaming777/DIFFA)
[Code (GitHub)](https://github.com/NKU-HLT/DIFFA)
**DIFFA** is the first **diffusion-based large audio-language model** for spoken language understanding.
It combines a frozen diffusion LLM with **dual adapters** (semantic + acoustic) to enhance **audio perception and reasoning**.
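
To make the "frozen backbone plus dual adapters" idea concrete, here is a minimal PyTorch sketch. It is not DIFFA's actual implementation: the class names (`Adapter`, `DualAdapterSketch`), the MLP adapter shape, the tiny Transformer used as a stand-in backbone, and the choice to concatenate adapter outputs with text embeddings are all assumptions made for illustration. See the GitHub repository for the real code.

```python
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Hypothetical adapter: a small trainable MLP that maps audio features
    into the backbone's embedding space."""

    def __init__(self, in_dim: int, llm_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(in_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


class DualAdapterSketch(nn.Module):
    """Frozen backbone with two trainable adapters (semantic + acoustic)."""

    def __init__(self, backbone: nn.Module, audio_dim: int, llm_dim: int):
        super().__init__()
        self.backbone = backbone
        # Freeze every backbone parameter; only the adapters receive gradients.
        for p in self.backbone.parameters():
            p.requires_grad = False
        self.semantic_adapter = Adapter(audio_dim, llm_dim)
        self.acoustic_adapter = Adapter(audio_dim, llm_dim)

    def forward(self, audio_feats: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
        sem = self.semantic_adapter(audio_feats)  # (batch, T_audio, llm_dim)
        aco = self.acoustic_adapter(audio_feats)  # (batch, T_audio, llm_dim)
        # Assumption: adapter outputs are prepended to the text embeddings.
        seq = torch.cat([sem, aco, text_embeds], dim=1)
        return self.backbone(seq)


# Stand-in backbone (a tiny Transformer encoder), NOT a diffusion LLM.
layer = nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True)
backbone = nn.TransformerEncoder(layer, num_layers=1)

model = DualAdapterSketch(backbone, audio_dim=16, llm_dim=32)
out = model(torch.randn(2, 5, 16), torch.randn(2, 3, 32))
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
```

In this sketch `trainable` lists only adapter parameters, which is the practical meaning of "frozen": an optimizer built from `filter(lambda p: p.requires_grad, model.parameters())` would update the adapters and leave the backbone untouched.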