
---
license: cc-by-nc-sa-4.0
---
# DIFFA: Large Language Diffusion Models Can Listen and Understand
[![arXiv](https://img.shields.io/badge/Paper-arXiv-red.svg)](https://arxiv.org/abs/2507.18452)
[![deploy](https://img.shields.io/badge/Hugging%20Face-DIFFA-FFEB3B)](https://huggingface.co/zhoujiaming777/DIFFA)
[![Github](https://img.shields.io/badge/Github-DIFFA-blue)](https://github.com/NKU-HLT/DIFFA)


DIFFA is the first diffusion-based large audio-language model for spoken language understanding.
It combines a frozen diffusion LLM with dual adapters (semantic + acoustic) to enhance audio perception and reasoning.
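
The "frozen backbone plus two trainable adapters" idea can be illustrated with a toy, dependency-free sketch. This is not the DIFFA implementation: the `Linear` class, the dimensions, and the choice to concatenate the two adapter outputs along the sequence axis are all assumptions made for illustration, and a single dense layer stands in for the diffusion LLM. See the GitHub repository for the actual code.

```python
import random


class Linear:
    """Tiny dense layer y = W x, used as a stand-in for real modules."""

    def __init__(self, in_dim, out_dim, trainable=True, seed=0):
        rng = random.Random(seed)
        self.in_dim, self.out_dim, self.trainable = in_dim, out_dim, trainable
        self.weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                       for _ in range(out_dim)]

    def __call__(self, x):
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight]

    def num_params(self):
        return self.in_dim * self.out_dim


AUDIO_DIM, LLM_DIM = 4, 6  # hypothetical sizes, chosen only for the demo

# Two trainable adapters map audio features into the LLM embedding space.
semantic_adapter = Linear(AUDIO_DIM, LLM_DIM, seed=1)
acoustic_adapter = Linear(AUDIO_DIM, LLM_DIM, seed=2)
# Stand-in for the frozen diffusion LLM: its weights are never updated.
frozen_llm = Linear(LLM_DIM, LLM_DIM, trainable=False, seed=3)


def encode_audio(frames):
    """Run every audio frame through both adapters."""
    semantic = [semantic_adapter(f) for f in frames]
    acoustic = [acoustic_adapter(f) for f in frames]
    # Assumption: the two streams are concatenated along the sequence axis.
    return semantic + acoustic


def forward(frames, text_embeddings):
    """Prepend adapter outputs to the text embeddings and run the backbone."""
    sequence = encode_audio(frames) + text_embeddings
    return [frozen_llm(token) for token in sequence]


def trainable_params(modules):
    """Count only the parameters that training would update."""
    return sum(m.num_params() for m in modules if m.trainable)


frames = [[1.0] * AUDIO_DIM, [0.5] * AUDIO_DIM]  # two dummy audio frames
text = [[0.0] * LLM_DIM]                         # one dummy text embedding
outputs = forward(frames, text)
modules = [semantic_adapter, acoustic_adapter, frozen_llm]
print(len(outputs), trainable_params(modules))
```

Only the adapter weights count as trainable here, which mirrors the point of keeping the backbone frozen: the language model's parameters stay fixed while the small adapters learn to present audio to it.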