Constant8868 committed · Commit 3ce2701 · verified · 1 Parent(s): 140fbaf

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +105 -0
README.md ADDED
@@ -0,0 +1,105 @@
+ ---
+ license: cc-by-nc-sa-4.0
+ language:
+ - en
+ - zh
+ tags:
+ - medical
+ - radiology
+ - chest-xray
+ - report-generation
+ - discrete-diffusion
+ - vision-language
+ pipeline_tag: image-text-to-text
+ ---
+
+ <div align="center">
+ <img src="https://raw.githubusercontent.com/clf28/ECHO/main/assets/ECHO_repo_head.png" alt="ECHO" width="75%">
+ <br><br>
+ <a href="https://huggingface.co/collections/Midea-AIRC/echo"><img src="https://img.shields.io/badge/HuggingFace-ECHO-yellow.svg" alt="Hugging Face: ECHO"></a>
+ &nbsp;
+ <a href="https://echo-midea-airc.github.io/"><img src="https://img.shields.io/badge/Website-ECHO-blue.svg" alt="Website: ECHO"></a>
+ &nbsp;
+ <a href="https://arxiv.org/abs/2604.09450"><img src="https://img.shields.io/badge/Technical%20Report-arXiv-red.svg" alt="Technical Report: arXiv"></a>
+ </div>
+
+ # ECHO_Base_block4
+
+ **ECHO_Base_block4** is the **RAD (Response-Asymmetric Diffusion) stage** teacher model with a block length of 4. It performs multi-step block diffusion decoding and serves as the teacher for Direct Conditional Distillation (DCD) to produce the distilled `ECHO_block4` student.
+
+ ECHO (Efficient Chest X-ray Report Generation with One-step Block Diffusion) is a discrete diffusion vision–language model for automated chest X-ray report generation. It converts a pretrained autoregressive model into a one-step-per-block decoder via RAD adaptation and DCD.
+
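To make the difference between multi-step and one-step block decoding concrete, here is a toy sketch in plain Python. It is not the ECHO implementation: `toy_denoiser`, `decode_block`, and the fixed per-step commit quota are hypothetical, and the real model scores tokens with a vision-language network rather than random numbers. The loop follows the common confidence-based unmasking pattern for masked diffusion decoders, in which the least confident positions stay masked and are re-predicted on the next step.

```python
import math
import random

MASK = "<mask>"


def toy_denoiser(block, rng):
    """Hypothetical stand-in for the model.

    For every masked slot, propose a token and a confidence in [0, 1).
    """
    return {
        i: (f"tok{i}", rng.random())
        for i, tok in enumerate(block)
        if tok == MASK
    }


def decode_block(block_length, denoising_steps, rng):
    """Fill one block in at most `denoising_steps` rounds.

    Each round commits the most confident proposals; the rest stay masked.
    """
    block = [MASK] * block_length
    per_step = math.ceil(block_length / denoising_steps)  # assumed fixed quota
    for _ in range(denoising_steps):
        proposals = toy_denoiser(block, rng)
        if not proposals:  # block already complete
            break
        ranked = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)
        for i in ranked[:per_step]:
            block[i] = proposals[i][0]
        # Low-confidence positions remain MASK and are re-predicted next round.
    return block


rng = random.Random(0)
multi_step = decode_block(block_length=4, denoising_steps=4, rng=rng)  # teacher-like
one_step = decode_block(block_length=4, denoising_steps=1, rng=rng)    # student-like
print(multi_step)
print(one_step)
```

With `denoising_steps=1` the whole block is committed in a single round, which is the behaviour the distilled student is trained to handle; the multi-step setting commits one position per round for a block of 4.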
+ ## Model Details
+
+ | Property | Value |
+ |----------|-------|
+ | Stage | RAD (teacher) |
+ | Block Length | 4 |
+ | Decoding | Multi-step block diffusion |
+ | Architecture | `EchoForConditionalGeneration` (based on Qwen2.5-VL) |
+ | Hidden Size | 3584 |
+ | Languages | English, Chinese |
+ | License | [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) |
+
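As a rough worked example of what the teacher/student distinction means for compute, the sketch below counts denoising forward passes per report. It assumes each denoising step costs one forward pass and that every block uses its full step budget; a dynamic remasking strategy may finish a block earlier, so treat the teacher figure as an upper bound. The function name and the 120-token report length are hypothetical.

```python
import math


def denoising_passes(num_tokens, block_length, steps_per_block):
    """Upper bound on forward passes: one pass per denoising step per block."""
    num_blocks = math.ceil(num_tokens / block_length)
    return num_blocks * steps_per_block


report_tokens = 120  # hypothetical report length

# Teacher-style setting: block length 4, 4 denoising steps per block.
teacher = denoising_passes(report_tokens, block_length=4, steps_per_block=4)
# Student-style setting: block length 4, a single step per block.
student = denoising_passes(report_tokens, block_length=4, steps_per_block=1)

print(teacher, student)  # 120 30
```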
+ ## Usage
+
+ ```bash
+ git clone https://github.com/clf28/ECHO.git
+ cd ECHO
+ pip install transformers==4.55.4
+ ```
+
+ ```bash
+ # Multi-step inference with ECHO_Base_block4
+ python inference/generate_vl_block.py \
+ --model_dir Midea-AIRC/ECHO_Base_block4 \
+ --image_path /path/to/chest_xray.jpg \
+ --prompt_text "Review this chest X-ray and write a report. Use this format: Findings: {}, Impression: {}." \
+ --remasking_strategy "low_confidence_dynamic" \
+ --block_length 4 \
+ --denoising_steps 4
+ ```
+
+ For Chinese prompts:
+
+ ```bash
+ # English gloss of the prompt below: "This is a set of chest X-ray images. Please generate a medical report
+ # including findings and conclusion. Return the report in the following format: Findings: {} Conclusion: {}."
+ python inference/generate_vl_block.py \
+ --model_dir Midea-AIRC/ECHO_Base_block4 \
+ --image_path /path/to/chest_xray.jpg \
+ --prompt_text "这是一组胸部X光图像,请生成一份医学报告,包括所见和结论。以以下格式返回报告:所见:{} 结论:{}。" \
+ --remasking_strategy "low_confidence_dynamic" \
+ --block_length 4 \
+ --denoising_steps 4
+ ```
+
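For batch runs it can be convenient to assemble the same command from Python. The following is a minimal sketch, not part of the ECHO repository: the helper `build_inference_command` is hypothetical, and it only reuses the script path and flags shown in the commands above.

```python
import shlex


def build_inference_command(
    image_path,
    prompt_text,
    model_dir="Midea-AIRC/ECHO_Base_block4",
    block_length=4,
    denoising_steps=4,
    remasking_strategy="low_confidence_dynamic",
):
    """Return the argument list for inference/generate_vl_block.py (hypothetical helper)."""
    return [
        "python", "inference/generate_vl_block.py",
        "--model_dir", model_dir,
        "--image_path", str(image_path),
        "--prompt_text", prompt_text,
        "--remasking_strategy", remasking_strategy,
        "--block_length", str(block_length),
        "--denoising_steps", str(denoising_steps),
    ]


cmd = build_inference_command(
    "/path/to/chest_xray.jpg",
    "Review this chest X-ray and write a report. "
    "Use this format: Findings: {}, Impression: {}.",
)
# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` from inside the cloned `ECHO` directory would launch the script; using an argument list instead of a single shell string avoids quoting problems with the braces and non-ASCII characters in the prompts.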
+ ## Model Zoo
+
+ | Model | Stage | Description | Link |
+ |-------|-------|-------------|------|
+ | `ECHO_Base_block4` | RAD | Multi-step block diffusion (block length 4), teacher for distillation | [ECHO_Base_block4](https://huggingface.co/Midea-AIRC/ECHO_Base_block4) |
+ | `ECHO_Base_block8` | RAD | Multi-step block diffusion (block length 8), teacher for distillation | [ECHO_Base_block8](https://huggingface.co/Midea-AIRC/ECHO_Base_block8) |
+ | `ECHO_block4` | DCD | Single-step distilled student (block length 4) | [ECHO_block4](https://huggingface.co/Midea-AIRC/ECHO_block4) |
+ | `ECHO_block8` | DCD | Single-step distilled student (block length 8) | [ECHO_block8](https://huggingface.co/Midea-AIRC/ECHO_block8) |
+
+ ## License
+
+ The model weights are released under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). This model was trained on datasets that restrict commercial use, including MIMIC-CXR, CheXpert Plus, ReXGradient, and IU X-ray. Please use the weights in accordance with these restrictions.
+
+ ## Citation
+
+ ```bibtex
+ @misc{chen2026echoefficientchestxray,
+ title={ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion},
+ author={Lifeng Chen and Tianqi You and Hao Liu and Zhimin Bao and Jile Jiao and Xiao Han and Zhicai Ou and Tao Sun and Xiaofeng Mou and Xiaojie Jin and Yi Xu},
+ year={2026},
+ eprint={2604.09450},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG},
+ url={https://arxiv.org/abs/2604.09450},
+ }
+ ```
+
+ ## Contact
+
+ - **Lifeng Chen**, Beijing Jiaotong University (lfchen@bjtu.edu.cn)
+ - **Hao Liu** (Corresponding Author), AI Research Center, Midea Group (liuhao249@midea.com)