---
license: apache-2.0
language:
- en
tags:
- multimodal
- image-restoration
- unified-model
- BAGEL
- VLM
pipeline_tag: image-text-to-text
---

# CLEAR: Unlocking Generative Potential for Degraded Image Understanding

CLEAR is a unified multimodal model that leverages generative capabilities (image restoration) to improve visual understanding of degraded images. It introduces an **interleaved reasoning** paradigm where the model adaptively decides whether to invoke image restoration before answering.

> [[Paper]](https://arxiv.org/abs/2604.04780) | [[Code]](https://github.com/haoxiangzhao12138/CLEAR) | [[Project Page]](https://haoxiangzhao12138.github.io/CLEAR/) | [[MMD-Bench]](https://huggingface.co/datasets/CUDAOUTOFMEMORY/MMD-Bench)
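
The adaptive decision described above can be illustrated with a minimal control-flow sketch. This is not CLEAR's actual API: `answer_adaptively`, `Response`, and the three callables (`needs_restoration`, `restore`, `answer`) are hypothetical names introduced here for illustration. In the model itself, the decision, the restoration, and the answer are presumably produced by one unified model inside its reasoning trace rather than by separate components.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Response:
    text: str       # final answer to the question
    restored: bool  # True if restoration ran before answering


def answer_adaptively(
    image: Any,
    question: str,
    needs_restoration: Callable[[Any, str], bool],  # hypothetical decision step
    restore: Callable[[Any], Any],                  # hypothetical restoration step
    answer: Callable[[Any, str], str],              # hypothetical answering step
) -> Response:
    """Answer a question, restoring the image first only when the decision step asks for it."""
    if needs_restoration(image, question):
        # Input judged degraded: generate a restored image, then answer from it.
        return Response(answer(restore(image), question), restored=True)
    # Input judged clean enough: answer directly from the original image.
    return Response(answer(image, question), restored=False)
```

Passing the three steps as callables keeps the sketch independent of any particular checkpoint or inference library; see the linked code repository for the real inference entry points.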

## Citation

```bibtex
@misc{hao2026clearunlockinggenerativepotential,
  title={CLEAR: Unlocking Generative Potential for Degraded Image Understanding in Unified Multimodal Models},
  author={Xiangzhao Hao and Zefeng Zhang and Zhenyu Zhang and Linhao Yu and Yao Chen and Yiqian Zhang and Haiyun Guo and Shuohuan Wang and Yu Sun},
  year={2026},
  eprint={2604.04780},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2604.04780},
}
```