---
language:
- en
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
---

# Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents (GUI-Owl-1.5)

Mobile-Agent-v3.5 (also known as **GUI-Owl-1.5**) is a family of native multi-platform GUI agent foundation models. It supports automation across desktop, mobile, and browser environments, enabling cloud-edge collaboration and real-time interaction. The model is built on the Qwen3-VL architecture and achieves state-of-the-art results on over 20 GUI benchmarks, covering GUI automation (OSWorld, AndroidWorld, WebArena), grounding (ScreenSpotPro), and tool calling.

- **Paper:** [Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents](https://huggingface.co/papers/2602.16855)
- **Repository:** [GitHub - X-PLUG/MobileAgent](https://github.com/X-PLUG/MobileAgent)
- **Demo:** [ModelScope online demo](http://modelscope.cn/studios/MobileAgentTest/computer_use)

## Key Features

- **Multi-platform Support:** Native support for desktop, mobile, and browser automation.
- **Unified Capability:** Combines UI understanding, reasoning, and trajectory generation.
- **Enhanced Reasoning:** Incorporates a thought-synthesis pipeline to improve decision-making and memory.

## Citation

If you find this model useful, please cite the paper:

```bibtex
@article{MobileAgentv3.5,
  title={Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents},
  author={Haiyang Xu and Xi Zhang and Haowei Liu and Junyang Wang and Zhaozai Zhu and Shengjie Zhou and Xuhao Hu and Feiyu Gao and Junjie Cao and Zihua Wang and Zhiyuan Chen and Jitong Liao and Qi Zheng and Jiahui Zeng and Ze Xu and Shuai Bai and Junyang Lin and Jingren Zhou and Ming Yan},
  journal={arXiv preprint arXiv:2602.16855},
  year={2026}
}
```