---
license: apache-2.0
language:
- en
tags:
- multimodal
- image-restoration
- unified-model
- BAGEL
- VLM
pipeline_tag: image-text-to-text
---
# CLEAR: Unlocking Generative Potential for Degraded Image Understanding
CLEAR is a unified multimodal model that uses its generative capability (image restoration) to improve visual understanding of degraded images. It introduces an **interleaved reasoning** paradigm in which the model adaptively decides whether to invoke image restoration before answering.
> [[Paper]](https://arxiv.org/abs/2604.04780) | [[Code]](https://github.com/haoxiangzhao12138/CLEAR) | [[Project Page]](https://haoxiangzhao12138.github.io/CLEAR/) | [[MMD-Bench]](https://huggingface.co/datasets/CUDAOUTOFMEMORY/MMD-Bench)
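The adaptive restore-then-answer decision described above can be illustrated with a minimal control-flow sketch. This is not the CLEAR API: in the model, the decision is made by the unified model itself, whereas here three hypothetical callables (`needs_restoration`, `restore`, `answer`) stand in for its behavior, and the toy images are plain dictionaries. Answering from the restored image alone is also an assumption made for simplicity.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class InterleavedResult:
    answer: str
    restoration_used: bool


def interleaved_answer(
    image: Any,
    question: str,
    needs_restoration: Callable[[Any, str], bool],  # hypothetical decision step
    restore: Callable[[Any], Any],                  # hypothetical restoration step
    answer: Callable[[Any, str], str],              # hypothetical answering step
) -> InterleavedResult:
    """Illustrative control flow only; all callables are placeholders."""
    if needs_restoration(image, question):
        # Degraded input: restore first, then answer from the restored image.
        restored = restore(image)
        return InterleavedResult(answer(restored, question), True)
    # Input judged usable as-is: answer directly and skip restoration.
    return InterleavedResult(answer(image, question), False)


# Toy stand-ins: an "image" is a dict with a label and a noise level.
def toy_needs_restoration(image, question):
    return image["noise"] > 0.5


def toy_restore(image):
    return {**image, "noise": 0.0}


def toy_answer(image, question):
    return image["label"] if image["noise"] <= 0.5 else "unsure"


question = "What animal is this?"
clean = {"label": "cat", "noise": 0.1}
noisy = {"label": "cat", "noise": 0.9}

clean_result = interleaved_answer(
    clean, question, toy_needs_restoration, toy_restore, toy_answer
)
noisy_result = interleaved_answer(
    noisy, question, toy_needs_restoration, toy_restore, toy_answer
)
print(clean_result)
print(noisy_result)
```

In this toy setup the clean image is answered directly, while the noisy image is routed through restoration first; skipping restoration when it is not needed avoids the extra generation cost.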
## Citation
```bibtex
@misc{hao2026clearunlockinggenerativepotential,
title={CLEAR: Unlocking Generative Potential for Degraded Image Understanding in Unified Multimodal Models},
author={Xiangzhao Hao and Zefeng Zhang and Zhenyu Zhang and Linhao Yu and Yao Chen and Yiqian Zhang and Haiyun Guo and Shuohuan Wang and Yu Sun},
year={2026},
eprint={2604.04780},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2604.04780},
}
```