license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
FundusExpert
FundusExpert is an ophthalmology-specific Multimodal Large Language Model (MLLM) with integrated positioning-diagnosis reasoning capabilities, introduced in the paper "Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning".
FundusExpert aims to bridge the visual-language gap in specialized medical domains by integrating region-level localization with diagnostic reasoning chains.
Access
Our model weights and benchmarks are hosted on Hugging Face, and access requires an application. The model and its benchmarks are for academic research only. By applying for access, you agree to these terms.
To apply, send an email to liuxinyao@mail.ustc.edu.cn and songdiping@pjlab.org.cn that includes your Hugging Face username and a brief self-introduction. We will grant access as soon as possible.
Code and Usage
The official implementation is available in the GitHub repository.
Quick Start
After gaining access, clone the repository and install the dependencies. To build the environment, either follow the InternVL installation guide or install from src/internvl25_requirements.txt.
Inference with a single GPU:
python src/quick_start.py
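Because the weights are gated, a missing credential or an incomplete checkout is a likely cause of a failed first run. The following is a minimal pre-flight sketch, not part of the official repository: the helper names `has_hf_token`, `preflight`, and `run_quick_start` are hypothetical, and it assumes that the weights are fetched with a Hugging Face token and that the token, if stored by `huggingface-cli login`, lives at the default `$HF_HOME/token` location. Only the two `src/` paths come from the Quick Start text.

```python
import os
import subprocess
import sys
from pathlib import Path

# Paths named in the Quick Start section, relative to the repository root.
REQUIRED_FILES = ("src/quick_start.py", "src/internvl25_requirements.txt")


def has_hf_token() -> bool:
    """Best-effort check for a Hugging Face credential (assumed to be needed)."""
    # HF_TOKEN is the environment variable read by huggingface_hub.
    if os.environ.get("HF_TOKEN"):
        return True
    # Assumption: default token location written by `huggingface-cli login`.
    hf_home = Path(os.environ.get("HF_HOME", "~/.cache/huggingface")).expanduser()
    return (hf_home / "token").is_file()


def preflight(repo_root=".") -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    root = Path(repo_root)
    problems = [f"missing file: {rel}" for rel in REQUIRED_FILES
                if not (root / rel).is_file()]
    if not has_hf_token():
        problems.append("no Hugging Face token found (set HF_TOKEN or log in)")
    return problems


def run_quick_start(repo_root=".") -> int:
    """Run the Quick Start command only if the pre-flight checks pass."""
    problems = preflight(repo_root)
    for problem in problems:
        print(f"[preflight] {problem}", file=sys.stderr)
    if problems:
        return 1
    # Same command as above, executed from the repository root.
    return subprocess.call([sys.executable, "src/quick_start.py"], cwd=repo_root)
```

Calling `run_quick_start()` from the repository root prints any detected problems to stderr and otherwise returns the exit code of `src/quick_start.py`. The checks cannot confirm that access has actually been granted; they only catch the most common local omissions.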
Acknowledgements
Our model is based on OpenGVLab/InternVL. We would like to thank its authors for their excellent work and open-source contributions.
Citation
If you find our work helpful or inspiring, please cite it:
@misc{liuxinyao2025fundusexpert,
title={Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning},
author={Xinyao Liu and Diping Song and Yuxin Feng and Jiarui Zhang and Yiting Li and Jiaqi Li and Xintong Li and Yixuan Hu and Yueyang Li and Zhenyu Yuan and Yuying Tang and Yanhua Zhu and Chenchen Wang and Jing Zhang and Jun Zhu and Min Meng and Yao Zhang and Bo Li and Xinyuan Wang and Min Yang and Jinsong Wang and Yizhe Zhang and Shanshan Cui},
year={2025},
eprint={2507.17539},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2507.17539},
}