---
license: mit
tags:
- respiratory-sound
- medical-ai
- audio-generation
- audio-classification
language:
- en
---
# Resp-Agent Models
Model weights for **Resp-Agent**, an intelligent respiratory sound analysis and generation system.
📦 **GitHub Repository**: [AustinZhang/Resp-Agent](https://github.com/zpforlove/Resp-Agent)
## Contents
| Model | Size | Description |
|-------|------|-------------|
| **Diagnoser/checkpoints/longformer** | 952 MB | Fine-tuned Longformer for EHR + audio analysis |
| **Diagnoser/pretrained_models** | 695 MB | BEATs & Tokenizer pretrained weights |
| **Generator/checkpoints/llm** | 3.8 GB | Fine-tuned LLM for audio generation |
| **Generator/checkpoints/flow** | 2.0 GB | CFM flow matching model |
| **Generator/pretrained_models** | 695 MB | BEATs & Tokenizer pretrained weights |
| **audio_descriptions.jsonl** | 87 MB × 2 | Audio description data |
> **Note**: The DeepSeek-R1 model is **not** included in this repository. Download it separately from:
> [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B)
## Quick Download
```python
from huggingface_hub import snapshot_download

# Download all models from this repo into the current directory
snapshot_download(
    repo_id="AustinZhang/resp-agent-models",
    local_dir="./",
    ignore_patterns=["*.md", ".gitattributes"],
)

# Download DeepSeek-R1 separately
snapshot_download(
    repo_id="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    local_dir="./Diagnoser/checkpoints/deepseek-r1",
)
```
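If only one half of the system is needed, a partial download can save several gigabytes. The following is a minimal sketch, assuming the repository's top-level folders are named `Diagnoser/` and `Generator/` as listed in the table above. The helper names `patterns_for` and `download_component` are hypothetical and not part of Resp-Agent; `allow_patterns` is a standard `snapshot_download` parameter.

```python
# Hypothetical mapping from a component name to the glob patterns of its files.
# Folder names are taken from the Contents table; adjust if the repo layout differs.
COMPONENT_PATTERNS = {
    "diagnoser": ["Diagnoser/*"],
    "generator": ["Generator/*"],
}


def patterns_for(component):
    """Return the allow-list patterns for one component (case-insensitive)."""
    key = component.lower()
    if key not in COMPONENT_PATTERNS:
        raise ValueError(f"Unknown component: {component!r}")
    return COMPONENT_PATTERNS[key]


def download_component(component, local_dir="./"):
    """Download only the files belonging to one component."""
    # Imported lazily so the pattern helper can be used without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="AustinZhang/resp-agent-models",
        local_dir=local_dir,
        allow_patterns=patterns_for(component),
    )
```

For example, `download_component("diagnoser")` would fetch only the files under `Diagnoser/`; the DeepSeek-R1 weights still need to be downloaded separately.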
## Expected Directory Structure
After downloading, your project should look like:
```
Resp-Agent/
├── Diagnoser/
│   ├── checkpoints/
│   │   ├── deepseek-r1/       # From deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
│   │   └── longformer/        # From this repo
│   └── pretrained_models/     # From this repo
└── Generator/
    ├── checkpoints/
    │   ├── llm/               # From this repo
    │   └── flow/              # From this repo
    └── pretrained_models/     # From this repo
```
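To confirm that the downloads landed where the layout above expects them, a quick check along these lines may help. It is only a sketch: the function name `missing_dirs` is hypothetical, and it verifies that the directories exist, not that the weight files inside them are complete.

```python
from pathlib import Path

# Directories taken from the expected layout shown above.
EXPECTED_DIRS = [
    "Diagnoser/checkpoints/deepseek-r1",
    "Diagnoser/checkpoints/longformer",
    "Diagnoser/pretrained_models",
    "Generator/checkpoints/llm",
    "Generator/checkpoints/flow",
    "Generator/pretrained_models",
]


def missing_dirs(project_root="."):
    """Return the expected directories that do not exist under project_root."""
    root = Path(project_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:")
        for d in missing:
            print(f"  {d}")
    else:
        print("All expected directories are present.")
```

Run it from the `Resp-Agent/` project root; any path it lists indicates a download that was skipped or placed elsewhere.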
## Paper
**[Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis](https://openreview.net/forum?id=ZkoojtEm3W)** (ICLR 2026)
If you find this work useful, please cite our paper:
```bibtex
@inproceedings{zhangresp,
  title={Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis},
  author={ZHANG, Pengfei and Xie, Tianxin and Yang, Minghao and Liu, Li},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}
```
## Acknowledgements
- [BEATs](https://github.com/microsoft/unilm/tree/master/beats) - Audio pre-training framework
- [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1) - Reasoning model
- [Longformer](https://github.com/allenai/longformer) - Long document transformer
## License
These model weights are released for academic research purposes only.