---
license: apache-2.0
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
- google/siglip-large-patch16-384
pipeline_tag: visual-question-answering
---
# Falcon-8B
## Description
\[[Paper](https://arxiv.org/abs/2501.16297)\] \[[GitHub](https://github.com/JiuTian-VL/FALCON)\] \[[Project Page](https://jiutian-vl.github.io/FALCON.github.io/)\]
These are the official model weights for *FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers*. In this work, we propose FALCON, a model that introduces a novel visual register technique to simultaneously address visual redundancy and fragmentation in the high-resolution visual encoding of multimodal large language models (MLLMs).
![image/png](https://jiutian-vl.github.io/FALCON.github.io/assets/images/FALCON_arch.png)
## How to Run?
Please refer to the instructions in the [GitHub repository](https://github.com/JiuTian-VL/FALCON).
## Citation
If you find this work useful for your research, please cite our paper:
```BibTeX
@InProceedings{zhang2025falcon,
author={Zhang, Renshan and Shao, Rui and Chen, Gongwei and Zhang, Miao and Zhou, Kaiwen and Guan, Weili and Nie, Liqiang},
title={FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month={October},
year={2025},
}
```