	---
	license: mit
	tags:
	- respiratory-sound
	- medical-ai
	- audio-generation
	- audio-classification
	language:
	- en
	---

	# Resp-Agent Models

	Model weights for Resp-Agent, an intelligent system for respiratory sound analysis and generation.

	📦 GitHub Repository: [zpforlove/Resp-Agent](https://github.com/zpforlove/Resp-Agent)

	## 📁 Contents

	| Model | Size | Description |
	|-------|------|-------------|
	| Diagnoser/checkpoints/longformer | 952 MB | Fine-tuned Longformer for EHR + audio analysis |
	| Diagnoser/pretrained_models | 695 MB | BEATs & Tokenizer pretrained weights |
	| Generator/checkpoints/llm | 3.8 GB | Fine-tuned LLM for audio generation |
	| Generator/checkpoints/flow | 2.0 GB | CFM flow matching model |
	| Generator/pretrained_models | 695 MB | BEATs & Tokenizer pretrained weights |
	| audio_descriptions.jsonl | 87 MB × 2 | Audio description data |

	> Note: The DeepSeek-R1 model is NOT included in this repository. Please download it separately from:
	> 🔗 [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
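
Before downloading, it can help to confirm there is enough free disk space. The sketch below sums the sizes from the table above and compares the total with the free space at a target path. The helper names (`SIZES_MB`, `total_size_mb`, `has_enough_space`), the 1 GB = 1000 MB conversion, and the 10% safety margin are assumptions made for illustration. DeepSeek-R1 is not counted because it is not part of this repository.

```python
import shutil

# Sizes copied from the Contents table, in MB.
# Assumption: GB entries are converted at 1 GB = 1000 MB.
SIZES_MB = {
    "Diagnoser/checkpoints/longformer": 952,
    "Diagnoser/pretrained_models": 695,
    "Generator/checkpoints/llm": 3800,
    "Generator/checkpoints/flow": 2000,
    "Generator/pretrained_models": 695,
    "audio_descriptions.jsonl (x2)": 2 * 87,
}


def total_size_mb(sizes=SIZES_MB):
    """Return the combined size of all listed entries in MB."""
    return sum(sizes.values())


def has_enough_space(path=".", required_mb=None, margin=1.1):
    """Check whether `path` has at least `required_mb * margin` MB free.

    The margin is a hypothetical safety factor, not a documented requirement.
    """
    if required_mb is None:
        required_mb = total_size_mb()
    free_mb = shutil.disk_usage(path).free / 1e6
    return free_mb >= required_mb * margin
```

Calling `has_enough_space("./")` from the project root gives a quick yes/no answer for the weights in this repository only.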

	## 🚀 Quick Download

	```python
	from huggingface_hub import snapshot_download

	# Download all models from this repo into the current directory
	snapshot_download(
	    repo_id="AustinZhang/resp-agent-models",
	    local_dir="./",
	    ignore_patterns=["*.md", ".gitattributes"],
	)

	# Download DeepSeek-R1 separately (not included in this repo)
	snapshot_download(
	    repo_id="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
	    local_dir="./Diagnoser/checkpoints/deepseek-r1",
	)
	```
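
If only one half of the system is needed, the download can be narrowed with the `allow_patterns` argument of `snapshot_download`. This is a minimal sketch: the `COMPONENTS` mapping and the helper names are hypothetical, and the patterns assume that the repository's top-level folders match the paths in the Contents table.

```python
# Hypothetical mapping from a component name to glob patterns.
# Assumption: top-level folders in the repo match the Contents table.
COMPONENTS = {
    "diagnoser": ["Diagnoser/*"],
    "generator": ["Generator/*"],
}


def patterns_for(component):
    """Return the glob patterns for one component, or raise on an unknown name."""
    try:
        return list(COMPONENTS[component])
    except KeyError:
        raise ValueError(
            f"unknown component {component!r}; expected one of {sorted(COMPONENTS)}"
        )


def download_component(component, local_dir="./"):
    """Download only the files that belong to `component` into `local_dir`."""
    # Imported here so the helpers above work without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="AustinZhang/resp-agent-models",
        local_dir=local_dir,
        allow_patterns=patterns_for(component),
    )
```

For example, `download_component("diagnoser")` would fetch only files under `Diagnoser/`. DeepSeek-R1 still has to be downloaded separately.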

	## 📂 Expected Directory Structure

	After downloading, your project should look like:

	```
	Resp-Agent/
	├── Diagnoser/
	│   ├── checkpoints/
	│   │   ├── deepseek-r1/        # From deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
	│   │   └── longformer/         # From this repo
	│   └── pretrained_models/      # From this repo
	└── Generator/
	    ├── checkpoints/
	    │   ├── llm/                # From this repo
	    │   └── flow/               # From this repo
	    └── pretrained_models/      # From this repo
	```
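
The layout above can be checked with a few lines of Python after downloading. This is a sketch under the assumption that the six folders shown in the tree are the only ones that matter. `EXPECTED_DIRS` and `missing_dirs` are illustrative names and are not part of the Resp-Agent codebase.

```python
from pathlib import Path

# Directories taken from the expected structure shown above.
EXPECTED_DIRS = [
    "Diagnoser/checkpoints/deepseek-r1",
    "Diagnoser/checkpoints/longformer",
    "Diagnoser/pretrained_models",
    "Generator/checkpoints/llm",
    "Generator/checkpoints/flow",
    "Generator/pretrained_models",
]


def missing_dirs(root="."):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Running `missing_dirs("path/to/Resp-Agent")` returns an empty list when every folder is in place. Otherwise it lists the folders that still need to be downloaded.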

	## 📝 Paper

	[Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis](https://openreview.net/forum?id=ZkoojtEm3W) (ICLR 2026)

	If you find this work useful, please cite our paper:

	```bibtex
	@inproceedings{zhangresp,
	  title={Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis},
	  author={ZHANG, Pengfei and Xie, Tianxin and Yang, Minghao and Liu, Li},
	  booktitle={The Fourteenth International Conference on Learning Representations},
	  year={2026}
	}
	```

	## 🙏 Acknowledgements

	- [BEATs](https://github.com/microsoft/unilm/tree/master/beats) - Audio pre-training framework
	- [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) - Reasoning model
	- [Longformer](https://github.com/allenai/longformer) - Long document transformer

	## 📄 License

	These model weights are released for academic research purposes only.