# FocusUI-3B
This model was introduced in the paper:
**FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection**
- ๐Ÿ–ผ๏ธ Project Page: https://showlab.github.io/FocusUI/
- ๐Ÿ  Github Repo: https://github.com/showlab/FocusUI
- ๐Ÿ“ Paper: https://arxiv.org/pdf/2601.03928
### Model Zoo
| Model | Backbone | 🤗 HuggingFace |
|-------|----------|-------------|
| FocusUI-3B | Qwen2.5-VL-3B | [https://huggingface.co/yyyang/FocusUI-3B](https://huggingface.co/yyyang/FocusUI-3B) |
| FocusUI-7B | Qwen2.5-VL-7B | [https://huggingface.co/yyyang/FocusUI-7B](https://huggingface.co/yyyang/FocusUI-7B) |
| FocusUI-2B | Qwen3-VL-2B | [https://huggingface.co/yyyang/FocusUI-Qwen3-VL-2B](https://huggingface.co/yyyang/FocusUI-Qwen3-VL-2B) |
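
To fetch one of the checkpoints listed above, a minimal sketch along the following lines may help. It only downloads the repository files with `huggingface_hub.snapshot_download`; it does not load the model, since the loading code lives in the GitHub repo. The names `FOCUSUI_CHECKPOINTS` and `download_checkpoint` are hypothetical helpers introduced here for illustration, not part of any FocusUI API.

```python
from typing import Optional

# Repo ids copied from the Model Zoo table (keys are the table's model names).
FOCUSUI_CHECKPOINTS = {
    "FocusUI-3B": "yyyang/FocusUI-3B",
    "FocusUI-7B": "yyyang/FocusUI-7B",
    "FocusUI-2B": "yyyang/FocusUI-Qwen3-VL-2B",
}


def download_checkpoint(name: str, local_dir: Optional[str] = None) -> str:
    """Download all files of one checkpoint and return the local directory.

    Hypothetical helper: it fetches files only and does not build a model.
    """
    if name not in FOCUSUI_CHECKPOINTS:
        raise KeyError(
            f"unknown checkpoint {name!r}; choose from {sorted(FOCUSUI_CHECKPOINTS)}"
        )
    # Imported lazily so the table above is usable without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=FOCUSUI_CHECKPOINTS[name], local_dir=local_dir)
```

For example, `download_checkpoint("FocusUI-3B")` would place the files in the local Hugging Face cache and return that path; this requires `pip install huggingface_hub` and network access.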
### Dataset & Benchmarks
For the training and evaluation data, see [FocusUI-Training-Data](https://huggingface.co/datasets/yyyang/FocusUI-Training-Data) and [UI-Grounding-Benchmarks](https://huggingface.co/datasets/yyyang/UI-Grounding-Benchmarks/).
### Citation
```
@article{ouyang2025focusui,
title = {FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection},
author = {Ouyang, Mingyu and Lin, Kevin Qinghong and Shou, Mike Zheng and Ng, Hwee Tou},
year = {2025},
journal = {arXiv preprint arXiv:2601.03928},
}
```