Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?

🏰 Pretrained checkpoints (reference)

| Checkpoint | Link | Note |
| --- | --- | --- |
| BAGEL-7B-MoT | ByteDance-Seed/BAGEL-7B-MoT | Used as initial weights for training. |
| Robust-U1 | Jiaqi-hkust/Robust-U1 | Final model for visual self-recovery and multimodal reasoning. |
| Robust-U1-RL | Jiaqi-hkust/Robust-U1-RL | Fine-tuned with reinforcement learning. |
| Robust-U1-SFT | Jiaqi-hkust/Robust-U1-SFT | Fine-tuned with supervised learning. |
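
The checkpoints listed above can be fetched from the Hugging Face Hub. The snippet below is a minimal sketch, not the repository's official download script: it assumes the `huggingface_hub` package is installed, and the helper names `resolve_repo_id` and `download_checkpoint` as well as the `checkpoints/<name>` directory layout are hypothetical choices made for illustration. Only the repo IDs come from the table.

```python
from pathlib import Path

# Repo IDs copied from the checkpoint table; the dictionary itself is illustrative.
CHECKPOINTS = {
    "BAGEL-7B-MoT": "ByteDance-Seed/BAGEL-7B-MoT",
    "Robust-U1": "Jiaqi-hkust/Robust-U1",
    "Robust-U1-RL": "Jiaqi-hkust/Robust-U1-RL",
    "Robust-U1-SFT": "Jiaqi-hkust/Robust-U1-SFT",
}


def resolve_repo_id(name: str) -> str:
    """Map a checkpoint name from the table to its Hub repo ID."""
    try:
        return CHECKPOINTS[name]
    except KeyError:
        options = ", ".join(sorted(CHECKPOINTS))
        raise ValueError(f"Unknown checkpoint {name!r}; choose from: {options}") from None


def download_checkpoint(name: str, root: str = "checkpoints") -> Path:
    """Download one checkpoint into <root>/<name> (assumed layout) and return the path."""
    # Imported lazily so the name lookup works even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    target = Path(root) / name
    snapshot_download(repo_id=resolve_repo_id(name), local_dir=str(target))
    return target


# Example (performs a network download, so it is left commented out):
# path = download_checkpoint("Robust-U1")
```

Downloading the full model pulls all weight shards, so make sure the target disk has enough free space before running the commented example.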

⭐️ Citation

If you find this repository useful, please cite our paper:

@inproceedings{2026robustu,
  title={Robust-U1: Can {MLLM}s Self-Recover Corrupted Visual Content for Robust Understanding?},
  author={Jiaqi Tang and Jianmin Chen and Youyang Zhai and Wei Wei and Runtao Liu and Mengjie Zhao and Xiangyu Wu and Qingfa Xiao and Qifeng Chen},
  booktitle={Forty-third International Conference on Machine Learning},
  year={2026},
  url={https://openreview.net/forum?id=I6W6cxVVts}
}

Model size: 15B params (Safetensors, tensor type BF16)