# FocusUI-3B
This model was introduced in the paper:
**FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection**
- 🖼️ Project Page: https://showlab.github.io/FocusUI/
- 🏠 Github Repo: https://github.com/showlab/FocusUI
- 📝 Paper: https://arxiv.org/pdf/2601.03928
### Model Zoo
| Model | Backbone | 🤗 HuggingFace |
|-------|----------|-------------|
| FocusUI-3B | Qwen2.5-VL-3B | [https://huggingface.co/yyyang/FocusUI-3B](https://huggingface.co/yyyang/FocusUI-3B) |
| FocusUI-7B | Qwen2.5-VL-7B | [https://huggingface.co/yyyang/FocusUI-7B](https://huggingface.co/yyyang/FocusUI-7B) |
| FocusUI-2B | Qwen3-VL-2B | [https://huggingface.co/yyyang/FocusUI-Qwen3-VL-2B](https://huggingface.co/yyyang/FocusUI-Qwen3-VL-2B) |
### Dataset & Benchmarks
For the training and evaluation data, see [FocusUI-Training-Data](https://huggingface.co/datasets/yyyang/FocusUI-Training-Data) and [UI-Grounding-Benchmarks](https://huggingface.co/datasets/yyyang/UI-Grounding-Benchmarks/).
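
As a rough sketch of how the checkpoints and datasets linked above could be fetched locally, the snippet below uses `snapshot_download` from the `huggingface_hub` package. The repo IDs are copied from the links in this card; the `fetch` helper and the dictionary names are hypothetical and are not part of the FocusUI codebase. See the GitHub repo for the authors' own setup and inference instructions.

```python
# Repo IDs taken from the Model Zoo table and the dataset links in this card.
MODEL_REPOS = {
    "FocusUI-3B": "yyyang/FocusUI-3B",
    "FocusUI-7B": "yyyang/FocusUI-7B",
    "FocusUI-2B": "yyyang/FocusUI-Qwen3-VL-2B",
}
DATASET_REPOS = {
    "training": "yyyang/FocusUI-Training-Data",
    "benchmarks": "yyyang/UI-Grounding-Benchmarks",
}


def fetch(repo_id: str, repo_type: str = "model") -> str:
    """Download a full repo snapshot and return its local cache path.

    Hypothetical helper for illustration; requires `pip install huggingface_hub`.
    """
    # Imported lazily so the repo ID tables can be used without the dependency.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, repo_type=repo_type)
```

For example, `fetch(MODEL_REPOS["FocusUI-3B"])` would download the 3B checkpoint, and `fetch(DATASET_REPOS["benchmarks"], repo_type="dataset")` would download the evaluation data. Both calls need network access and enough disk space for the full snapshot.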
### Citation
```
@article{ouyang2025focusui,
title = {FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection},
author = {Ouyang, Mingyu and Lin, Kevin Qinghong and Shou, Mike Zheng and Ng, Hwee Tou},
year = {2025},
journal = {arXiv preprint},
}
```