---
license: cc-by-nc-sa-4.0
---
# DIFFA: Large Language Diffusion Models Can Listen and Understand
[![arXiv](https://img.shields.io/badge/Paper-arXiv-red.svg)](https://arxiv.org/abs/2507.18452)
[![deploy](https://img.shields.io/badge/Hugging%20Face-DIFFA-FFEB3B)](https://huggingface.co/zhoujiaming777/DIFFA)
[![Github](https://img.shields.io/badge/Github-DIFFA-blue)](https://github.com/NKU-HLT/DIFFA)


**DIFFA** is the first **diffusion-based large audio-language model** for spoken language understanding.  
It combines a frozen diffusion LLM with **dual adapters** (semantic + acoustic) to enhance **audio perception and reasoning**.
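
To make the "frozen backbone plus dual adapters" idea concrete, here is a minimal PyTorch sketch. It is not the DIFFA implementation: the class names `Adapter` and `DualAdapterModel`, the feature dimensions, the use of a small `nn.TransformerEncoder` as a stand-in for the diffusion LLM, and the fusion of the two adapter outputs by concatenation along the sequence axis are all assumptions made for illustration. The one thing it tries to show is the training setup the description implies: the backbone's parameters are frozen and only the two adapters receive gradients.

```python
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Hypothetical adapter: projects audio features into the LLM embedding space."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(in_dim, out_dim),
            nn.GELU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


class DualAdapterModel(nn.Module):
    """Frozen backbone with two trainable adapters (semantic + acoustic)."""

    def __init__(self, backbone: nn.Module, audio_dim: int, hidden_dim: int):
        super().__init__()
        self.backbone = backbone
        # Freeze every backbone parameter so only the adapters are trained.
        for p in self.backbone.parameters():
            p.requires_grad = False
        self.semantic_adapter = Adapter(audio_dim, hidden_dim)
        self.acoustic_adapter = Adapter(audio_dim, hidden_dim)

    def forward(self, audio_feats: torch.Tensor, text_embeds: torch.Tensor) -> torch.Tensor:
        semantic = self.semantic_adapter(audio_feats)
        acoustic = self.acoustic_adapter(audio_feats)
        # Assumption: fuse by concatenating along the sequence axis.
        fused = torch.cat([semantic, acoustic, text_embeds], dim=1)
        return self.backbone(fused)


# Stand-in backbone for illustration only; the real model uses a diffusion LLM.
layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
backbone = nn.TransformerEncoder(layer, num_layers=2)

model = DualAdapterModel(backbone, audio_dim=32, hidden_dim=64)
audio = torch.randn(2, 10, 32)  # (batch, audio frames, audio feature dim)
text = torch.randn(2, 5, 64)    # (batch, text tokens, hidden dim)
out = model(audio, text)

# Names of the parameters that would be updated by an optimizer.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

In this sketch an optimizer built from `filter(lambda p: p.requires_grad, model.parameters())` would update only the adapter weights, which is the usual way to keep a large pretrained backbone fixed while training small add-on modules.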