---
license: other
language:
  - en
  - zh
tags:
  - medical
  - radiology
  - chest-xray
  - report-generation
  - discrete-diffusion
  - vision-language
pipeline_tag: image-text-to-text
---

<div align="center">
  <img src="./ECHO_repo_head.png" alt="ECHO" width="75%">
  <br><br>
  <a href="https://huggingface.co/collections/Midea-AIRC/echo"><img src="https://img.shields.io/badge/HuggingFace-ECHO-yellow.svg" alt="Hugging Face: ECHO"></a>
  &nbsp;
  <a href="https://echo-midea-airc.github.io/"><img src="https://img.shields.io/badge/Website-ECHO-blue.svg" alt="Website: ECHO"></a>
  &nbsp;
  <a href="https://arxiv.org/abs/2604.09450"><img src="https://img.shields.io/badge/Technical%20Report-arXiv-red.svg" alt="Technical Report: arXiv"></a>
</div>

# ECHO_Base_block4

**ECHO_Base_block4** is the teacher model from the **RAD (Response-Asymmetric Diffusion)** stage, with a block length of 4. It decodes each block over multiple diffusion steps and serves as the teacher for Direct Conditional Distillation (DCD), which produces the distilled `ECHO_block4` student.

ECHO (Efficient Chest X-ray Report Generation with One-step Block Diffusion) is a discrete diffusion vision–language model for automated chest X-ray report generation. It converts a pretrained autoregressive model into a one-step-per-block decoder via RAD adaptation and DCD.

## Model Details

| Property | Value |
|----------|-------|
| Stage | RAD (teacher) |
| Block Length | 4 |
| Decoding | Multi-step block diffusion |
| Architecture | `EchoForConditionalGeneration` (based on Qwen2.5-VL) |
| Hidden Size | 3584 |
| Languages | English, Chinese |
| License | [Midea NC](LICENSE) |

## Usage

```bash
git clone https://github.com/midea-ai/ECHO.git
cd ECHO
pip install transformers==4.55.4
```

```bash
# Multi-step inference with ECHO_Base_block4
python inference/generate_vl_block.py \
  --model_dir Midea-AIRC/ECHO_Base_block4 \
  --image_path /path/to/chest_xray.jpg \
  --prompt_text "Review this chest X-ray and write a report. Use this format: Findings: {}, Impression: {}." \
  --remasking_strategy "low_confidence_dynamic" \
  --block_length 4 \
  --denoising_steps 4
```

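For Chinese prompts (the prompt below asks for a medical report with findings and a conclusion, returned in the format `所见:{} 结论:{}`):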

```bash
python inference/generate_vl_block.py \
  --model_dir Midea-AIRC/ECHO_Base_block4 \
  --image_path /path/to/chest_xray.jpg \
  --prompt_text "这是一组胸部X光图像,请生成一份医学报告,包括所见和结论。以以下格式返回报告:所见:{} 结论:{}。" \
  --remasking_strategy "low_confidence_dynamic" \
  --block_length 4 \
  --denoising_steps 4
```
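
To run the same command over several images, a small wrapper can build the argument list for each call. This is a minimal sketch, assuming it is run from the root of the cloned ECHO repository. `build_command` and `run_batch` are hypothetical helper names, not part of the repository, and only the flags shown in the commands above are used.

```python
import subprocess
import sys
from pathlib import Path

# Values copied from the commands above; nothing else about the script is assumed.
SCRIPT = "inference/generate_vl_block.py"
MODEL_DIR = "Midea-AIRC/ECHO_Base_block4"
PROMPT = (
    "Review this chest X-ray and write a report. "
    "Use this format: Findings: {}, Impression: {}."
)


def build_command(image_path, prompt=PROMPT, block_length=4, denoising_steps=4):
    """Return the argv list for one call to the script (hypothetical helper)."""
    return [
        sys.executable, SCRIPT,
        "--model_dir", MODEL_DIR,
        "--image_path", str(image_path),
        "--prompt_text", prompt,
        "--remasking_strategy", "low_confidence_dynamic",
        "--block_length", str(block_length),
        "--denoising_steps", str(denoising_steps),
    ]


def run_batch(image_dir, pattern="*.jpg"):
    """Run the inference script once per matching image (hypothetical helper)."""
    for image_path in sorted(Path(image_dir).glob(pattern)):
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(build_command(image_path), check=True)
```

Calling `run_batch("/path/to/chest_xrays")` would process every `.jpg` file in that directory. Each call starts a new process and therefore reloads the model, so this approach favors simplicity over throughput.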

## Model Zoo

| Model | Stage | Description | Link |
|-------|-------|-------------|------|
| `ECHO_Base_block4` | RAD | Multi-step block diffusion (block length 4), teacher for distillation | [ECHO_Base_block4](https://huggingface.co/Midea-AIRC/ECHO_Base_block4) |
| `ECHO_Base_block8` | RAD | Multi-step block diffusion (block length 8), teacher for distillation | [ECHO_Base_block8](https://huggingface.co/Midea-AIRC/ECHO_Base_block8) |
| `ECHO_block4` | DCD | Single-step distilled student (block length 4) | [ECHO_block4](https://huggingface.co/Midea-AIRC/ECHO_block4) |
| `ECHO_block8` | DCD | Single-step distilled student (block length 8) | [ECHO_block8](https://huggingface.co/Midea-AIRC/ECHO_block8) |
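
The practical difference between the RAD and DCD rows is the number of denoising steps spent on each block. The sketch below is a rough, back-of-the-envelope count, assuming blocks are decoded one after another and every block uses its full step budget. `decoder_passes` is a hypothetical helper and the 128-token report length is an illustrative value. The count ignores image and prompt encoding, and any early stopping the actual implementation may apply.

```python
import math


def decoder_passes(num_tokens, block_length, steps_per_block):
    """Upper bound on decoder forward passes for a response of num_tokens.

    Assumes sequential blocks and a fixed step budget per block (an assumption,
    not a description of the repository's scheduler).
    """
    if min(num_tokens, block_length, steps_per_block) < 1:
        raise ValueError("all arguments must be positive integers")
    num_blocks = math.ceil(num_tokens / block_length)
    return num_blocks * steps_per_block


report_tokens = 128  # illustrative report length, not a measured value

# Teacher settings from the usage example: block length 4, 4 denoising steps.
teacher = decoder_passes(report_tokens, block_length=4, steps_per_block=4)
# Distilled student: same block length, one step per block.
student = decoder_passes(report_tokens, block_length=4, steps_per_block=1)

print(teacher, student)  # 128 32
```

Under these assumptions the distilled student needs a quarter of the teacher's decoder passes at block length 4; measured speedups depend on the implementation and hardware.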

## License

This model is released under the [Midea Model License Agreement - Non-Commercial Use Version](LICENSE). It may be used only for research, study, and personal non-commercial purposes; commercial use is strictly prohibited.

## Citation

```bibtex
@misc{chen2026echoefficientchestxray,
      title={ECHO: Efficient Chest X-ray Report Generation with One-step Block Diffusion}, 
      author={Lifeng Chen and Tianqi You and Hao Liu and Zhimin Bao and Jile Jiao and Xiao Han and Zhicai Ou and Tao Sun and Xiaofeng Mou and Xiaojie Jin and Yi Xu},
      year={2026},
      eprint={2604.09450},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2604.09450}, 
}
```

## Contact

- **Lifeng Chen**, Beijing Jiaotong University (lfchen@bjtu.edu.cn)
- **Hao Liu** (Corresponding Author), AI Research Center, Midea Group (liuhao249@midea.com)