Constant8868 committed on
Commit 7209d51 · verified · 1 Parent(s): 4562f4a

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +103 -0
README.md ADDED
@@ -0,0 +1,103 @@
+ ---
+ license: cc-by-nc-sa-4.0
+ language:
+ - en
+ - zh
+ tags:
+ - medical
+ - radiology
+ - chest-xray
+ - report-generation
+ - discrete-diffusion
+ - vision-language
+ pipeline_tag: image-text-to-text
+ ---
+
+ <div align="center">
+ <img src="https://raw.githubusercontent.com/clf28/ECHO/main/assets/ECHO_repo_head.png" alt="ECHO" width="75%">
+ <br><br>
+ <a href="https://huggingface.co/collections/Midea-AIRC/echo"><img src="https://img.shields.io/badge/HuggingFace-ECHO-yellow.svg" alt="Hugging Face: ECHO"></a>
+ &nbsp;
+ <a href="https://echo-midea-airc.github.io/"><img src="https://img.shields.io/badge/Website-ECHO-blue.svg" alt="Website: ECHO"></a>
+ &nbsp;
+ <a href="https://arxiv.org/abs/2604.09450"><img src="https://img.shields.io/badge/Technical%20Report-arXiv-red.svg" alt="Technical Report: arXiv"></a>
+ </div>
+
+ # ECHO_block8
+
+ **ECHO_block8** is the distilled student model from the **DCD (Direct Conditional Distillation)** stage, with a block length of 8. It generates coherent reports with a **single forward pass per block**, giving up to **8× inference speedup** over multi-step baselines while maintaining high clinical quality.
+
+ ECHO (Efficient Chest X-ray Report Generation with One-step Block Diffusion) is a discrete diffusion vision–language model for automated chest X-ray report generation. DCD constructs non-factorized supervision from on-policy teacher trajectories, enabling coherent single-step decoding that was previously unachievable in discrete diffusion models.
+
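
To make the speedup figure above concrete, here is a back-of-the-envelope sketch that counts decoder forward passes. It rests on an assumption that is not taken from the technical report: the multi-step baseline is modelled as spending one denoising step per token position in a block (that is, `block_length` steps per block). The function name `forward_passes` and the 128-token report length are hypothetical, and the count ignores vision-encoder and caching costs, so it is an upper-bound illustration rather than a benchmark.

```python
import math


def forward_passes(num_tokens: int, block_length: int, denoising_steps: int) -> int:
    """Count decoder forward passes for block-wise decoding.

    The report is split into ceil(num_tokens / block_length) blocks and
    each block is refined for `denoising_steps` steps.
    """
    num_blocks = math.ceil(num_tokens / block_length)
    return num_blocks * denoising_steps


# Hypothetical report length, chosen only for illustration.
report_tokens = 128

# Distilled student: one step per block of 8 tokens.
single_step = forward_passes(report_tokens, block_length=8, denoising_steps=1)

# Assumed multi-step baseline: 8 steps per block of 8 tokens.
multi_step = forward_passes(report_tokens, block_length=8, denoising_steps=8)

print(single_step, multi_step, multi_step / single_step)
```

Under this assumption the ratio equals the block length, which is where an "up to 8×" bound for a block length of 8 would come from; measured wall-clock gains depend on everything the sketch leaves out.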
+ ## Model Details
+
+ | Property | Value |
+ |----------|-------|
+ | Stage | DCD (distilled student) |
+ | Block Length | 8 |
+ | Decoding | Single-step per block |
+ | Architecture | `EchoForConditionalGeneration` (based on Qwen2.5-VL) |
+ | Hidden Size | 3584 |
+ | Languages | English, Chinese |
+ | License | [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) |
+
+ ## Usage
+
+ ```bash
+ git clone https://github.com/clf28/ECHO.git
+ cd ECHO
+ pip install transformers==4.55.4
+ ```
+
+ ```bash
+ # Single-step inference with ECHO_block8 (distilled)
+ python inference/generate_echo.py \
+ --model_dir Midea-AIRC/ECHO_block8 \
+ --image_path /path/to/chest_xray.jpg \
+ --prompt_text "Review this chest X-ray and write a report. Use this format: Findings: {}, Impression: {}." \
+ --block_length 8 \
+ --denoising_steps 1
+ ```
+
+ For Chinese prompts (the prompt asks for a medical report with findings, 所见, and a conclusion, 结论, in the same fill-in format):
+
+ ```bash
+ python inference/generate_echo.py \
+ --model_dir Midea-AIRC/ECHO_block8 \
+ --image_path /path/to/chest_xray.jpg \
+ --prompt_text "这是一组胸部X光图像,请生成一份医学报告,包括所见和结论。以以下格式返回报告:所见:{} 结论:{}。" \
+ --block_length 8 \
+ --denoising_steps 1
+ ```
+
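
When the script has to be called from Python, for example over a folder of images, the commands above can be assembled as an argument list. The sketch below is only a thin wrapper around the CLI shown in this README: the flag names and the script path come from the commands above, while `build_echo_command` and `run_echo` are hypothetical helper names, and it assumes the process is started from the root of the cloned ECHO repository.

```python
import subprocess
import sys

# English prompt copied from the usage example in this README.
EN_PROMPT = (
    "Review this chest X-ray and write a report. "
    "Use this format: Findings: {}, Impression: {}."
)


def build_echo_command(
    image_path: str,
    prompt_text: str = EN_PROMPT,
    model_dir: str = "Midea-AIRC/ECHO_block8",
    block_length: int = 8,
    denoising_steps: int = 1,
) -> list[str]:
    """Build the argv list for inference/generate_echo.py (hypothetical helper)."""
    return [
        sys.executable,
        "inference/generate_echo.py",
        "--model_dir", model_dir,
        "--image_path", image_path,
        "--prompt_text", prompt_text,
        "--block_length", str(block_length),
        "--denoising_steps", str(denoising_steps),
    ]


def run_echo(image_path: str, **kwargs) -> None:
    """Run the script; assumes the current directory is the ECHO checkout."""
    subprocess.run(build_echo_command(image_path, **kwargs), check=True)


# Inspect the command without executing it.
print(build_echo_command("/path/to/chest_xray.jpg"))
```

Passing the arguments as a list avoids shell quoting problems with the braces and non-ASCII characters in the prompts.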
+ ## Model Zoo
+
+ | Model | Stage | Description | Link |
+ |-------|-------|-------------|------|
+ | `ECHO_Base_block4` | RAD | Multi-step block diffusion (block length 4), teacher for distillation | [ECHO_Base_block4](https://huggingface.co/Midea-AIRC/ECHO_Base_block4) |
+ | `ECHO_Base_block8` | RAD | Multi-step block diffusion (block length 8), teacher for distillation | [ECHO_Base_block8](https://huggingface.co/Midea-AIRC/ECHO_Base_block8) |
+ | `ECHO_block4` | DCD | Single-step distilled student (block length 4) | [ECHO_block4](https://huggingface.co/Midea-AIRC/ECHO_block4) |
+ | `ECHO_block8` | DCD | Single-step distilled student (block length 8) | [ECHO_block8](https://huggingface.co/Midea-AIRC/ECHO_block8) |
+
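
For scripts that switch between checkpoints, the table above can be mirrored as a small lookup from stage and block length to repository id. This is an illustrative sketch only: the repository ids are read off the links in the table, and `ECHO_MODEL_ZOO` and `repo_id_for` are hypothetical names, not part of the ECHO codebase.

```python
# (stage, block length) -> Hugging Face repository id, as listed in the table.
ECHO_MODEL_ZOO = {
    ("RAD", 4): "Midea-AIRC/ECHO_Base_block4",
    ("RAD", 8): "Midea-AIRC/ECHO_Base_block8",
    ("DCD", 4): "Midea-AIRC/ECHO_block4",
    ("DCD", 8): "Midea-AIRC/ECHO_block8",
}


def repo_id_for(stage: str, block_length: int) -> str:
    """Return the repository id for a stage ("RAD" or "DCD") and block length."""
    key = (stage.upper(), block_length)
    if key not in ECHO_MODEL_ZOO:
        raise KeyError(f"no ECHO checkpoint listed for stage={stage!r}, block_length={block_length}")
    return ECHO_MODEL_ZOO[key]


print(repo_id_for("DCD", 8))
```

The returned string is the value the usage commands pass as `--model_dir`.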
+ ## License
+
+ The model weights are released under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/). This model was trained on datasets that restrict commercial use, including MIMIC-CXR, CheXpert Plus, ReXGradient, and IU X-ray. Please use the weights in accordance with these restrictions.
+
+ ## Citation
+
+ ```bibtex
+ @misc{chen2026echoefficientchestxray,
+ title={ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion},
+ author={Lifeng Chen and Tianqi You and Hao Liu and Zhimin Bao and Jile Jiao and Xiao Han and Zhicai Ou and Tao Sun and Xiaofeng Mou and Xiaojie Jin and Yi Xu},
+ year={2026},
+ eprint={2604.09450},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG},
+ url={https://arxiv.org/abs/2604.09450},
+ }
+ ```
+
+ ## Contact
+
+ - **Lifeng Chen**, Beijing Jiaotong University (lfchen@bjtu.edu.cn)
+ - **Hao Liu** (Corresponding Author), AI Research Center, Midea Group (liuhao249@midea.com)