---
language:
- en
license: mit
pipeline_tag: image-text-to-text
---

# Mobile-Agent-v3.5 (GUI-Owl-1.5)

**Mobile-Agent-v3.5** introduces **GUI-Owl-1.5**, a family of native multi-platform GUI agent foundation models. Built on the Qwen3-VL architecture, GUI-Owl-1.5 supports automation across desktop, mobile, and browser platforms and achieves state-of-the-art results on more than 20 GUI benchmarks, including OSWorld, AndroidWorld, and WebArena.

GUI-Owl-1.5 unifies perception, grounding, reasoning, planning, and action execution within a single policy network and comes in both "Instruct" and "Thinking" variants.

- **Paper:** [Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents](https://huggingface.co/papers/2602.16855)
- **Repository:** [https://github.com/X-PLUG/MobileAgent](https://github.com/X-PLUG/MobileAgent)
- **Project Page:** [Mobile-Agent Family](https://github.com/X-PLUG/MobileAgent)

## Citation

If you find this model useful, please cite our paper:

```bibtex
@article{MobileAgentv3.5,
  title={Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents},
  author={Haiyang Xu and Xi Zhang and Haowei Liu and Junyang Wang and Zhaozai Zhu and Shengjie Zhou and Xuhao Hu and Feiyu Gao and Junjie Cao and Zihua Wang and Zhiyuan Chen and Jitong Liao and Qi Zheng and Jiahui Zeng and Ze Xu and Shuai Bai and Junyang Lin and Jingren Zhou and Ming Yan},
  journal={arXiv preprint arXiv:2602.16855},
  year={2026}
}
```