metadata
language:
  - en
license: mit
pipeline_tag: audio-text-to-text
tags:
  - respiratory-sound
  - medical-ai
  - audio-generation
  - audio-classification

Resp-Agent Models

This repository hosts the model weights for Resp-Agent, an autonomous multimodal system for respiratory sound generation and disease diagnosis, presented in the paper "Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis" (ICLR 2026).

📦 GitHub Repository: zpforlove/Resp-Agent

πŸ“ Contents

| Model | Size | Description |
|---|---|---|
| Diagnoser/checkpoints/longformer | 952 MB | Fine-tuned Longformer for EHR + audio analysis |
| Diagnoser/pretrained_models | 695 MB | BEATs & Tokenizer pretrained weights |
| Generator/checkpoints/llm | 3.8 GB | Fine-tuned LLM for audio generation |
| Generator/checkpoints/flow | 2.0 GB | CFM flow matching model |
| Generator/pretrained_models | 695 MB | BEATs & Tokenizer pretrained weights |
| audio_descriptions.jsonl | 87 MB × 2 | Audio description data |
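
Before downloading, it can help to confirm that there is enough free disk space. The sketch below sums the sizes listed above and compares the total with the free space reported by `shutil.disk_usage`. It assumes 1 GB equals 1000 MB and counts the 87 MB audio description file twice; `SIZES_MB`, `total_size_mb`, and `has_enough_space` are illustrative names, not part of Resp-Agent. The separately hosted DeepSeek-R1 model is not included in the total.

```python
import shutil

# Sizes in MB, copied from the contents listing (assumption: 1 GB = 1000 MB).
SIZES_MB = {
    "Diagnoser/checkpoints/longformer": 952,
    "Diagnoser/pretrained_models": 695,
    "Generator/checkpoints/llm": 3800,
    "Generator/checkpoints/flow": 2000,
    "Generator/pretrained_models": 695,
    "audio_descriptions.jsonl (x2)": 87 * 2,
}


def total_size_mb(sizes=SIZES_MB):
    """Sum the listed sizes in MB."""
    return sum(sizes.values())


def has_enough_space(path=".", required_mb=None, margin=1.1):
    """Return True if `path` has room for the weights plus a safety margin."""
    if required_mb is None:
        required_mb = total_size_mb()
    free_mb = shutil.disk_usage(path).free / 1_000_000
    return free_mb >= required_mb * margin


if __name__ == "__main__":
    print(f"Weights in this repo: about {total_size_mb() / 1000:.1f} GB")
    print("Enough free space:", has_enough_space("."))
```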

Note: The DeepSeek-R1 model is NOT included in this repository. Please download it separately from:
🔗 deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

🚀 Quick Download

from huggingface_hub import snapshot_download

# Download all models
snapshot_download(
    repo_id="AustinZhang/resp-agent-models",
    local_dir="./",
    ignore_patterns=["*.md", ".gitattributes"]
)

# Download DeepSeek-R1 separately
snapshot_download(
    repo_id="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    local_dir="./Diagnoser/checkpoints/deepseek-r1"
)
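
If only one component is needed, `snapshot_download` also accepts an `allow_patterns` argument that restricts the download to matching paths. The sketch below shows one way to fetch only the Diagnoser or only the Generator weights. `COMPONENT_PATTERNS`, `patterns_for`, and `download_component` are hypothetical helper names, and the glob patterns assume the repository layout matches the contents listing above.

```python
# Glob patterns per component (assumption: repo layout matches the contents listing).
COMPONENT_PATTERNS = {
    "diagnoser": ["Diagnoser/*"],
    "generator": ["Generator/*"],
}


def patterns_for(component):
    """Return the allow_patterns list for a component name (case-insensitive)."""
    key = component.lower()
    if key not in COMPONENT_PATTERNS:
        raise ValueError(
            f"Unknown component {component!r}; expected one of {sorted(COMPONENT_PATTERNS)}"
        )
    return COMPONENT_PATTERNS[key]


def download_component(component, local_dir="./"):
    """Download only the files that belong to one component."""
    # Imported lazily so patterns_for() works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="AustinZhang/resp-agent-models",
        local_dir=local_dir,
        allow_patterns=patterns_for(component),
    )
```

For example, `download_component("generator")` would fetch only files under `Generator/`. The DeepSeek-R1 distilled model still needs its own download as shown above.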

📂 Expected Directory Structure

After downloading, your project should look like:

Resp-Agent/
├── Diagnoser/
│   ├── checkpoints/
│   │   ├── deepseek-r1/          # From deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
│   │   └── longformer/           # From this repo
│   └── pretrained_models/        # From this repo
└── Generator/
    ├── checkpoints/
    │   ├── llm/                  # From this repo
    │   └── flow/                 # From this repo
    └── pretrained_models/        # From this repo
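
A quick way to confirm that both downloads landed in the right place is to check for the directories in the tree above. The following is a minimal sketch, run from the project root. `EXPECTED_DIRS` and `missing_dirs` are illustrative names, and the check only looks for directories, not for specific weight files.

```python
from pathlib import Path

# Directories from the expected layout, relative to the project root.
EXPECTED_DIRS = [
    "Diagnoser/checkpoints/deepseek-r1",
    "Diagnoser/checkpoints/longformer",
    "Diagnoser/pretrained_models",
    "Generator/checkpoints/llm",
    "Generator/checkpoints/flow",
    "Generator/pretrained_models",
]


def missing_dirs(root="."):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:")
        for d in missing:
            print(f"  {d}")
    else:
        print("All expected directories are present.")
```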

πŸ“ Citation

If you find this work useful, please cite our paper:

@inproceedings{
zhang2026respagent,
title={Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis},
author={Pengfei ZHANG and Tianxin Xie and Minghao Yang and Li Liu},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=ZkoojtEm3W}
}

πŸ™ Acknowledgements

📄 License

These model weights are released for academic research purposes only under the MIT License.