Model repository: PKUDigitalHealth/ECG-R1-8B-SFT
Improve model card metadata and links
Hi! I'm Niels from the community science team at Hugging Face.
This PR improves the model card metadata by adding the `arxiv` ID, which links the model to its corresponding paper, and the training dataset ID. I've also added `library_name: transformers` as the model architecture is compatible with the library according to the `config.json`.
Additionally, I've added some relevant tags like `ecg` and `reasoning` to improve discoverability.
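The metadata change described above can be checked with a small script. What follows is a minimal sketch, not the Hub's own tooling: `parse_front_matter` is a hypothetical helper written for this illustration, and it assumes the front matter uses only the two YAML shapes seen in this README (`key: value` scalars and `key:` followed by `- item` entries). The two strings are the front matter before and after this PR, copied from the diff.

```python
def parse_front_matter(readme: str) -> dict:
    """Parse the simple YAML front matter at the top of a model card.

    Hypothetical helper: handles only `key: value` scalars and `key:`
    followed by `- item` list entries. It is not a general YAML parser.
    """
    lines = readme.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None  # key whose list items are currently being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter of the front matter
            break
        if not stripped:
            continue
        if stripped.startswith("- "):
            if current_key is not None:
                meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        key, value = key.strip(), value.strip()
        if value:
            meta[key] = value  # scalar entry, kept as a string
            current_key = None
        else:
            meta[key] = []  # list entry, items follow on later lines
            current_key = key
    return meta


OLD = """---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen3-VL-8B-Instruct
pipeline_tag: image-text-to-text
tags:
- medical
---
"""

NEW = """---
base_model:
- Qwen/Qwen3-VL-8B-Instruct
language:
- en
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
arxiv: 2602.04279
datasets:
- PKUDigitalHealth/ECG-Protocol-Guided-Grounding-CoT
tags:
- medical
- ecg
- reasoning
---
"""

old, new = parse_front_matter(OLD), parse_front_matter(NEW)

# Keys that exist only after the PR, only before it, or in both with new values.
added_keys = sorted(set(new) - set(old))
removed_keys = sorted(set(old) - set(new))
changed = {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}

print("added:", added_keys)
print("removed:", removed_keys)
print("changed:", changed)
```

Under these assumptions the script reports `arxiv`, `datasets`, and `library_name` as added, nothing as removed, and `tags` as the only existing key whose value changed. The reordering of `license`, `language`, and `base_model` does not show up because key order carries no meaning in the parsed mapping.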
README.md
CHANGED
@@ -1,12 +1,18 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - Qwen/Qwen3-VL-8B-Instruct
+language:
+- en
+license: apache-2.0
 pipeline_tag: image-text-to-text
+library_name: transformers
+arxiv: 2602.04279
+datasets:
+- PKUDigitalHealth/ECG-Protocol-Guided-Grounding-CoT
 tags:
 - medical
+- ecg
+- reasoning
 ---
 
 <div align="center">
@@ -32,19 +38,15 @@ tags:
 
 ## Introduction
 
-Electrocardiography (ECG) serves as an indispensable diagnostic tool in clinical practice, yet existing multimodal large language models (MLLMs) remain unreliable for ECG interpretation, often producing plausible but clinically incorrect analyses. To address this, we propose ECG-R1, the first reasoning MLLM designed for reliable ECG interpretation via three innovations. First, we construct the interpretation corpus using Protocol-Guided Instruction Data Generation, grounding interpretation in measurable ECG features and monograph-defined quantitative thresholds and diagnostic logic. Second, we present a modality-decoupled architecture with Interleaved Modality Dropout to improve robustness and cross-modal consistency when either the ECG signal or ECG image is missing. Third, we present Reinforcement Learning with ECG Diagnostic Evidence Rewards to strengthen evidence-grounded ECG interpretation. Additionally, we systematically evaluate the ECG interpretation capabilities of proprietary, open-source, and medical MLLMs, and provide the first quantitative evidence that severe hallucinations are widespread, suggesting that the public should not directly trust these outputs without independent verification.
+Electrocardiography (ECG) serves as an indispensable diagnostic tool in clinical practice, yet existing multimodal large language models (MLLMs) remain unreliable for ECG interpretation, often producing plausible but clinically incorrect analyses. To address this, we propose ECG-R1, the first reasoning MLLM designed for reliable ECG interpretation via three innovations. First, we construct the interpretation corpus using Protocol-Guided Instruction Data Generation, grounding interpretation in measurable ECG features and monograph-defined quantitative thresholds and diagnostic logic. Second, we present a modality-decoupled architecture with Interleaved Modality Dropout to improve robustness and cross-modal consistency when either the ECG signal or ECG image is missing. Third, we present Reinforcement Learning with ECG Diagnostic Evidence Rewards to strengthen evidence-grounded ECG interpretation. Additionally, we systematically evaluate the ECG interpretation capabilities of proprietary, open-source, and medical MLLMs, and provide the first quantitative evidence that severe hallucinations are widespread, suggesting that the public should not directly trust these outputs without independent verification.
 
 ## Resource
 
-
-
-
-
-
-#### Model: 🤗 [ECG-R1-8B](https://huggingface.co/PKUDigitalHealth/ECG-R1-8B-RL)
-
-#### Data: 🤗 [ECG-Protocol-Guided-Grounding-CoT](https://huggingface.co/datasets/PKUDigitalHealth/ECG-Protocol-Guided-Grounding-CoT)
-
+- **Paper:** [ECG-R1: Protocol-Guided and Modality-Agnostic MLLM for Reliable ECG Interpretation](https://arxiv.org/abs/2602.04279)
+- **GitHub Repository:** [PKUDigitalHealth/ECG-R1](https://github.com/PKUDigitalHealth/ECG-R1)
+- **Online Platform:** [ECG-R1-Online-Platform](http://ai.heartvoice.com.cn/ECG-R1/)
+- **Model:** 🤗 [ECG-R1-8B](https://huggingface.co/PKUDigitalHealth/ECG-R1-8B-RL)
+- **Data:** 🤗 [ECG-Protocol-Guided-Grounding-CoT](https://huggingface.co/datasets/PKUDigitalHealth/ECG-Protocol-Guided-Grounding-CoT)
 
 ## Citation
 
@@ -63,4 +65,4 @@ If you find ECG-R1 helpful for your research and applications, please cite our p
 ```
 
 ## Acknowledgement
-We thank the authors of [PULSE](https://github.com/AIMedLab/PULSE/tree/dev), [ECG-Chat](https://github.com/YubaoZhao/ECG-Chat), [GEM](https://github.com/lanxiang1017/GEM), and [Swift](https://github.com/modelscope/ms-swift) for their publicly released models, datasets, and training codes.
+We thank the authors of [PULSE](https://github.com/AIMedLab/PULSE/tree/dev), [ECG-Chat](https://github.com/YubaoZhao/ECG-Chat), [GEM](https://github.com/lanxiang1017/GEM), and [Swift](https://github.com/modelscope/ms-swift) for their publicly released models, datasets, and training codes.