FocusUI-7B

This model was introduced in the paper:

FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection

Model Zoo

| Model | Backbone | 🤗 HuggingFace |
| --- | --- | --- |
| FocusUI-3B | Qwen2.5-VL-3B | https://huggingface.co/yyyang/FocusUI-3B |
| FocusUI-7B | Qwen2.5-VL-7B | https://huggingface.co/yyyang/FocusUI-7B |
| FocusUI-2B | Qwen3-VL-2B | https://huggingface.co/yyyang/FocusUI-Qwen3-VL-2B |
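
The checkpoints listed above can be fetched with the `huggingface_hub` client. The following is a minimal sketch, not an official loading recipe: the `MODEL_ZOO` dictionary and the `repo_id_for` and `download` helpers are hypothetical names introduced here for illustration, and the repository IDs are read off the URLs in the table. It only downloads the files; how to load the weights for inference is not covered by this README and is not assumed here.

```python
# Hypothetical helper names; repo IDs are taken from the Model Zoo URLs.
MODEL_ZOO = {
    "FocusUI-3B": "yyyang/FocusUI-3B",
    "FocusUI-7B": "yyyang/FocusUI-7B",
    "FocusUI-2B": "yyyang/FocusUI-Qwen3-VL-2B",
}


def repo_id_for(name: str) -> str:
    """Return the Hub repository ID for a Model Zoo entry."""
    try:
        return MODEL_ZOO[name]
    except KeyError:
        choices = ", ".join(sorted(MODEL_ZOO))
        raise KeyError(f"unknown model {name!r}; choose from: {choices}") from None


def download(name: str) -> str:
    """Download all files of the chosen checkpoint; return the local path."""
    # Imported lazily so the lookup above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id_for(name))
```

Calling `download("FocusUI-7B")` should place the repository files in the local Hugging Face cache and return that directory, assuming `huggingface_hub` is installed and the repository is publicly accessible.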

Dataset & Benchmarks

For the training and evaluation data, see FocusUI-Training-Data and UI-Grounding-Benchmarks.

Citation

@article{ouyang2025focusui,
  title   = {FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection},
  author  = {Ouyang, Mingyu and Lin, Kevin Qinghong and Shou, Mike Zheng and Ng, Hwee Tou},
  year    = {2025},
  journal = {arXiv preprint},
}