---
base_model:
- llava-hf/llava-1.5-7b-hf
- OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B
datasets:
- MLLM-CL/MLLM-CL
- MLLM-CL/MLLM-CL-ReplayData
language:
- en
library_name: transformers
license: apache-2.0
metrics:
- accuracy
pipeline_tag: image-text-to-text
tags:
- finance
- medical
- AD
- MLLM-CL
- Sci
- RS
- Math
- OCR
- Count
- GUI-Agent
- DCL
- ACL
- llava
- multimodal
- image-to-text
- text-generation
base_model_relation: adapter
---

## MLLM-CL Benchmark Description

MLLM-CL is a novel benchmark encompassing domain and ability continual learning: the former focuses on independently and identically distributed (IID) evaluation across evolving mainstream domains, while the latter evaluates non-IID scenarios with emerging model abilities. For more details, please refer to:

**MLLM-CL: Continual Learning for Multimodal Large Language Models** [[paper](https://arxiv.org/abs/2506.05453)] [[HF paper](https://huggingface.co/papers/2506.05453)] [[code](https://github.com/bjzhb666/MLLM-CL/)]

![MLLM-CL benchmark overview](MLLM-CL.png)

[Hongbo Zhao](https://scholar.google.com/citations?user=Gs22F0UAAAAJ&hl=zh-CN), [Fei Zhu](https://impression2805.github.io/), [Haiyang Guo](https://ghy0501.github.io/guohaiyang0501.github.io/), [Meng Wang](https://moenupa.github.io/), Rundong Wang, [Gaofeng Meng](https://scholar.google.com/citations?hl=zh-CN&user=5hti_r0AAAAJ), [Zhaoxiang Zhang](https://scholar.google.com/citations?hl=zh-CN&user=qxWfV6cAAAAJ)

## Usage

This repository open-sources all the expert models from the MLLM-CL experiments, organized into four branches: `DCL_InternVL`, `DCL_LLaVA`, `ACL_InternVL`, and `ACL_LLaVA`.

## Citation

```
@article{zhao2025mllm,
  title={MLLM-CL: Continual Learning for Multimodal Large Language Models},
  author={Zhao, Hongbo and Zhu, Fei and Guo, Haiyang and Wang, Meng and Wang, Rundong and Meng, Gaofeng and Zhang, Zhaoxiang},
  journal={arXiv preprint arXiv:2506.05453},
  year={2025}
}
```

## Contact

Please open an issue on our GitHub repository.
## About Us: MLLM-CL Community

We are members of MLLM-CL, an open-source community focused on the continual learning of multimodal large language models. If you are interested in our community, feel free to contact us on GitHub or by email.
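As a usage sketch for the four expert branches mentioned above, the snippet below builds a branch name from a continual-learning setting (`DCL`/`ACL`) and a backbone (`InternVL`/`LLaVA`), and shows one way to fetch that branch with `huggingface_hub`. The repo id `MLLM-CL/experts` and the helper names are hypothetical placeholders, not part of this repository; substitute the actual repository id.

```python
from typing import List

# The four expert branches listed in the Usage section.
BRANCHES: List[str] = ["DCL_InternVL", "DCL_LLaVA", "ACL_InternVL", "ACL_LLaVA"]


def branch_name(setting: str, backbone: str) -> str:
    """Combine a CL setting ('DCL' or 'ACL') with a backbone
    ('InternVL' or 'LLaVA') into one of the four branch names."""
    name = f"{setting}_{backbone}"
    if name not in BRANCHES:
        raise ValueError(f"unknown branch: {name}")
    return name


def fetch_branch(repo_id: str, setting: str, backbone: str) -> str:
    """Download one expert branch and return the local snapshot path.
    Requires `pip install huggingface_hub` and network access."""
    from huggingface_hub import snapshot_download

    # A branch is addressed via the `revision` argument.
    return snapshot_download(repo_id, revision=branch_name(setting, backbone))


if __name__ == "__main__":
    # Hypothetical repo id -- replace with this repository's actual id.
    print(fetch_branch("MLLM-CL/experts", "DCL", "LLaVA"))
```

Loading the downloaded adapter weights on top of the corresponding base model (`llava-hf/llava-1.5-7b-hf` or `OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-7B`) then follows the usual `transformers` workflow for that backbone.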