Resp-Agent Models
Model weights for Resp-Agent, an intelligent respiratory sound analysis and generation system.
GitHub Repository: AustinZhang/Resp-Agent
Contents
| Model | Size | Description |
|---|---|---|
| Diagnoser/checkpoints/longformer | 952 MB | Fine-tuned Longformer for EHR + audio analysis |
| Diagnoser/pretrained_models | 695 MB | BEATs & Tokenizer pretrained weights |
| Generator/checkpoints/llm | 3.8 GB | Fine-tuned LLM for audio generation |
| Generator/checkpoints/flow | 2.0 GB | CFM flow matching model |
| Generator/pretrained_models | 695 MB | BEATs & Tokenizer pretrained weights |
| audio_descriptions.jsonl | 87 MB × 2 | Audio description data |
Note: The DeepSeek-R1 model is NOT included here. Please download it separately from:
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
Quick Download
from huggingface_hub import snapshot_download
# Download all models
snapshot_download(
repo_id="AustinZhang/resp-agent-models",
local_dir="./",
ignore_patterns=["*.md", ".gitattributes"]
)
# Download DeepSeek-R1 separately
snapshot_download(
repo_id="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
local_dir="./Diagnoser/checkpoints/deepseek-r1"
)
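If only one half of the system is needed, the download can likely be narrowed with the `allow_patterns` argument of `snapshot_download`. The sketch below assumes the repository's top-level folders are named `Diagnoser` and `Generator`, as listed in the Contents table. The helper names `patterns_for` and `download_component` are illustrative and not part of Resp-Agent.

```python
from fnmatch import fnmatch

# Top-level folders listed in the Contents table (assumed to match the repo layout).
COMPONENTS = ("Diagnoser", "Generator")


def patterns_for(component: str) -> list[str]:
    """Return glob patterns selecting every file under one component folder."""
    if component not in COMPONENTS:
        raise ValueError(f"unknown component: {component!r}")
    return [f"{component}/*"]


def download_component(component: str, local_dir: str = "./") -> str:
    """Download only one component's weights and return the local path."""
    # Imported here so the pattern helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="AustinZhang/resp-agent-models",
        local_dir=local_dir,
        allow_patterns=patterns_for(component),
    )


if __name__ == "__main__":
    # Offline check that the pattern covers nested files (hypothetical file name).
    example = "Diagnoser/checkpoints/longformer/model.bin"
    assert fnmatch(example, patterns_for("Diagnoser")[0])
    # Uncomment to fetch only the Diagnoser weights (needs network access).
    # download_component("Diagnoser")
```

Even with the Diagnoser weights fetched this way, DeepSeek-R1 still has to be downloaded separately into `./Diagnoser/checkpoints/deepseek-r1`.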
Expected Directory Structure
After downloading, your project should look like:
Resp-Agent/
├── Diagnoser/
│   ├── checkpoints/
│   │   ├── deepseek-r1/       # From deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
│   │   └── longformer/        # From this repo
│   └── pretrained_models/     # From this repo
└── Generator/
    ├── checkpoints/
    │   ├── llm/               # From this repo
    │   └── flow/              # From this repo
    └── pretrained_models/     # From this repo
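A quick way to confirm that both downloads landed in the right place is to check for the directories shown in the tree above. The following is a minimal sketch, assuming it is run from the `Resp-Agent/` project root; `EXPECTED_DIRS` and `missing_dirs` are illustrative names, not part of Resp-Agent, and the check covers directories only, not the individual weight files.

```python
from pathlib import Path

# Directories taken from the expected structure, relative to the project root.
EXPECTED_DIRS = [
    "Diagnoser/checkpoints/deepseek-r1",
    "Diagnoser/checkpoints/longformer",
    "Diagnoser/pretrained_models",
    "Generator/checkpoints/llm",
    "Generator/checkpoints/flow",
    "Generator/pretrained_models",
]


def missing_dirs(root: str = ".") -> list[str]:
    """Return the expected directories that do not exist under root."""
    base = Path(root)
    return [d for d in EXPECTED_DIRS if not (base / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:")
        for d in missing:
            print(f"  {d}")
    else:
        print("All expected directories are present.")
```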
Paper
Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis (ICLR 2026)
If you find this work useful, please cite our paper:
@inproceedings{zhangresp,
title={Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis},
author={ZHANG, Pengfei and Xie, Tianxin and Yang, Minghao and Liu, Li},
booktitle={The Fourteenth International Conference on Learning Representations}
}
Acknowledgements
- BEATs - Audio pre-training framework
- DeepSeek-R1 - Reasoning model
- Longformer - Long document transformer
License
These model weights are released for academic research purposes only.