	---
	license: apache-2.0
	language:
	- en
	tags:
	- multimodal
	- image-restoration
	- unified-model
	- BAGEL
	- VLM
	pipeline_tag: image-text-to-text
	---

	# CLEAR: Unlocking Generative Potential for Degraded Image Understanding

	CLEAR is a unified multimodal model that uses its generative capability (image restoration) to improve visual understanding of degraded images. It introduces an interleaved reasoning paradigm in which the model adaptively decides whether to invoke image restoration before answering.
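
The adaptive decision described above can be pictured as a small piece of control flow. The following is a minimal sketch, not the CLEAR API: `answer_with_optional_restoration`, `InterleavedResult`, and the three callables are hypothetical names introduced for illustration, and the toy stand-ins only mimic the behaviour with a made-up `blur` score. In the actual model the gating, restoration, and answering are presumably performed by the single unified model rather than by separate functions; see the linked code repository for the real inference entry points.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class InterleavedResult:
    answer: str
    restored: bool  # True if the restoration step was invoked


def answer_with_optional_restoration(
    image: Any,
    question: str,
    needs_restoration: Callable[[Any, str], bool],
    restore: Callable[[Any], Any],
    answer: Callable[[Any, str], str],
) -> InterleavedResult:
    """Hypothetical control flow: restore the image first only when the gate says so."""
    if needs_restoration(image, question):
        restored_image = restore(image)
        return InterleavedResult(answer(restored_image, question), restored=True)
    return InterleavedResult(answer(image, question), restored=False)


# Toy stand-ins so the sketch runs end to end; none of these are CLEAR components.
def toy_gate(image, question):
    return image["blur"] > 0.5


def toy_restore(image):
    return {**image, "blur": 0.0}


def toy_answer(image, question):
    return "legible" if image["blur"] <= 0.5 else "illegible"


question = "Is the sign legible?"
clean = answer_with_optional_restoration(
    {"blur": 0.1}, question, toy_gate, toy_restore, toy_answer
)
degraded = answer_with_optional_restoration(
    {"blur": 0.9}, question, toy_gate, toy_restore, toy_answer
)
print(clean.restored, degraded.restored)  # prints: False True
```

Passing the gate, the restorer, and the answerer as callables keeps the sketch independent of any particular checkpoint or loading code, which this model card does not specify.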

	> [[Paper]](https://arxiv.org/abs/2604.04780) \| [[Code]](https://github.com/haoxiangzhao12138/CLEAR) \| [[Project Page]](https://haoxiangzhao12138.github.io/CLEAR/) \| [[MMD-Bench]](https://huggingface.co/datasets/CUDAOUTOFMEMORY/MMD-Bench)

	## Citation

	```bibtex
	@misc{hao2026clearunlockinggenerativepotential,
	  title={CLEAR: Unlocking Generative Potential for Degraded Image Understanding in Unified Multimodal Models},
	  author={Xiangzhao Hao and Zefeng Zhang and Zhenyu Zhang and Linhao Yu and Yao Chen and Yiqian Zhang and Haiyun Guo and Shuohuan Wang and Yu Sun},
	  year={2026},
	  eprint={2604.04780},
	  archivePrefix={arXiv},
	  primaryClass={cs.CV},
	  url={https://arxiv.org/abs/2604.04780},
	}
	```