Resp-Agent Models

Model weights for Resp-Agent, an intelligent system for respiratory sound analysis and generation.

📦 GitHub Repository: AustinZhang/Resp-Agent

📁 Contents

Model	Size	Description
Diagnoser/checkpoints/longformer	952 MB	Fine-tuned Longformer for EHR + audio analysis
Diagnoser/pretrained_models	695 MB	BEATs & Tokenizer pretrained weights
Generator/checkpoints/llm	3.8 GB	Fine-tuned LLM for audio generation
Generator/checkpoints/flow	2.0 GB	CFM flow matching model
Generator/pretrained_models	695 MB	BEATs & Tokenizer pretrained weights
audio_descriptions.jsonl	87 MB×2	Audio description data

Note: The DeepSeek-R1 model is NOT included in this repository. Download it separately from:
🔗 deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

🚀 Quick Download

from huggingface_hub import snapshot_download

# Download all models
snapshot_download(
    repo_id="AustinZhang/resp-agent-models",
    local_dir="./",
    ignore_patterns=["*.md", ".gitattributes"]
)

# Download the distilled DeepSeek-R1 model separately into the Diagnoser checkpoints
snapshot_download(
    repo_id="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    local_dir="./Diagnoser/checkpoints/deepseek-r1"
)
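If only one half of the system is needed, the download can be narrowed with the `allow_patterns` argument of `snapshot_download`. The sketch below is illustrative rather than part of the official instructions: it assumes the repository's top-level folders are named `Diagnoser/` and `Generator/` as in the Contents table, and the helper names `select_files` and `download_component` are hypothetical.

```python
from fnmatch import fnmatch

# Assumption: repository paths mirror the Contents table above.
COMPONENT_PATTERNS = {
    "diagnoser": ["Diagnoser/*"],
    "generator": ["Generator/*"],
}


def select_files(filenames, component):
    """Preview which repository files a component's patterns would keep."""
    patterns = COMPONENT_PATTERNS[component]
    return [f for f in filenames if any(fnmatch(f, p) for p in patterns)]


def download_component(component, local_dir="./"):
    """Download only the files belonging to one component of this repo."""
    # Imported lazily so the pattern helpers work without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="AustinZhang/resp-agent-models",
        local_dir=local_dir,
        allow_patterns=COMPONENT_PATTERNS[component],
    )
```

Calling `download_component("diagnoser")` would then fetch only the Diagnoser weights. DeepSeek-R1 still has to be downloaded separately as shown above.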

📂 Expected Directory Structure

After downloading, your project should look like:

Resp-Agent/
├── Diagnoser/
│   ├── checkpoints/
│   │   ├── deepseek-r1/          # From deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
│   │   └── longformer/           # From this repo
│   └── pretrained_models/        # From this repo
└── Generator/
    ├── checkpoints/
    │   ├── llm/                  # From this repo
    │   └── flow/                 # From this repo
    └── pretrained_models/        # From this repo
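A quick way to confirm the layout is to check that each expected folder exists and is non-empty. The snippet below is a minimal sketch, not an official tool: the function name `missing_dirs` is hypothetical, and it only checks the directories listed in the tree above, not the individual files inside them.

```python
from pathlib import Path

# Directories taken from the expected structure shown above.
EXPECTED_DIRS = [
    "Diagnoser/checkpoints/deepseek-r1",
    "Diagnoser/checkpoints/longformer",
    "Diagnoser/pretrained_models",
    "Generator/checkpoints/llm",
    "Generator/checkpoints/flow",
    "Generator/pretrained_models",
]


def missing_dirs(root="."):
    """Return expected sub-directories that are absent or empty under root."""
    root = Path(root)
    missing = []
    for rel in EXPECTED_DIRS:
        path = root / rel
        # A folder counts as present only if it exists and contains something.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing
```

Run from the `Resp-Agent/` project root, `missing_dirs(".")` returns an empty list when every folder is in place. Otherwise it lists the folders that still need to be downloaded.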

📝 Paper

Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis (ICLR 2026)

If you find this work useful, please cite our paper:

@inproceedings{zhangresp,
  title={Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis},
  author={ZHANG, Pengfei and Xie, Tianxin and Yang, Minghao and Liu, Li},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026}
}

🙏 Acknowledgements

BEATs - Audio pre-training framework
DeepSeek-R1 - Reasoning model
Longformer - Long document transformer

📄 License

These model weights are released for academic research purposes only.
