---
license: mit
tags:
- respiratory-sound
- medical-ai
- audio-generation
- audio-classification
language:
- en
---

# Resp-Agent Models

Model weights for **Resp-Agent**, an intelligent respiratory sound analysis and generation system.

📦 **GitHub Repository**: [AustinZhang/Resp-Agent](https://github.com/zpforlove/Resp-Agent)

## 📁 Contents

| Model | Size | Description |
|-------|------|-------------|
| **Diagnoser/checkpoints/longformer** | 952 MB | Fine-tuned Longformer for EHR + audio analysis |
| **Diagnoser/pretrained_models** | 695 MB | BEATs & Tokenizer pretrained weights |
| **Generator/checkpoints/llm** | 3.8 GB | Fine-tuned LLM for audio generation |
| **Generator/checkpoints/flow** | 2.0 GB | CFM flow-matching model |
| **audio_descriptions.jsonl** | 87 MB × 2 | Audio description data |
| **Generator/pretrained_models** | 695 MB | BEATs & Tokenizer pretrained weights |

> **Note**: The DeepSeek-R1 model is **NOT** included here. Please download it separately from:
> 🔗 [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)

## 🚀 Quick Download

```python
from huggingface_hub import snapshot_download

# Download all models
snapshot_download(
    repo_id="AustinZhang/resp-agent-models",
    local_dir="./",
    ignore_patterns=["*.md", ".gitattributes"]
)

# Download DeepSeek-R1 separately
snapshot_download(
    repo_id="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    local_dir="./Diagnoser/checkpoints/deepseek-r1"
)
```

## 📂 Expected Directory Structure

After downloading, your project should look like:

```
Resp-Agent/
├── Diagnoser/
│   ├── checkpoints/
│   │   ├── deepseek-r1/       # From deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
│   │   └── longformer/        # From this repo
│   └── pretrained_models/     # From this repo
└── Generator/
    ├── checkpoints/
    │   ├── llm/               # From this repo
    │   └── flow/              # From this repo
    └── pretrained_models/     # From this repo
```

## 📝 Paper

**[Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis](https://openreview.net/forum?id=ZkoojtEm3W)** (ICLR 2026)

If you find this work useful, please cite our paper:

```bibtex
@inproceedings{zhangresp,
  title={Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis},
  author={ZHANG, Pengfei and Xie, Tianxin and Yang, Minghao and Liu, Li},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}
```

## 🙏 Acknowledgements

- [BEATs](https://github.com/microsoft/unilm/tree/master/beats) - Audio pre-training framework
- [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) - Reasoning model
- [Longformer](https://github.com/allenai/longformer) - Long-document transformer

## 📄 License

These model weights are released for academic research purposes only.
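## ✅ Verifying the Download

After running both `snapshot_download` calls, you can sanity-check that the directory layout above is in place. The sketch below is illustrative (the `missing_dirs` helper and the `Resp-Agent` root path are not part of the released code; adjust the root to wherever you downloaded):

```python
from pathlib import Path

# Expected subdirectories, taken from the "Expected Directory Structure" tree.
# deepseek-r1 comes from the separate deepseek-ai repo, the rest from this one.
REQUIRED = [
    "Diagnoser/checkpoints/deepseek-r1",
    "Diagnoser/checkpoints/longformer",
    "Diagnoser/pretrained_models",
    "Generator/checkpoints/llm",
    "Generator/checkpoints/flow",
    "Generator/pretrained_models",
]

def missing_dirs(root="."):
    """Return the expected model directories that are absent under `root`."""
    root = Path(root)
    return [p for p in REQUIRED if not (root / p).is_dir()]

if __name__ == "__main__":
    gaps = missing_dirs("Resp-Agent")  # point this at your project root
    if gaps:
        print("Missing directories:", gaps)
    else:
        print("All model directories present.")
```

If anything is reported missing, re-run the corresponding `snapshot_download` call with the matching `local_dir`.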