--- language: - en license: mit pipeline_tag: image-text-to-text library_name: transformers --- # Mobile-Agent-v3.5 (GUI-Owl-1.5) This repository contains the model weights for **Mobile-Agent-v3.5** (also known as **GUI-Owl-1.5**), a native multi-platform GUI agent foundation model. - **Paper:** [Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents](https://huggingface.co/papers/2602.16855) - **Repository:** [GitHub - X-PLUG/MobileAgent](https://github.com/X-PLUG/MobileAgent) - **Demos:** [ModelScope Online Demo](http://modelscope.cn/studios/MobileAgentTest/computer_use) | [Bailian Online Demo](https://bailian.console.aliyun.com/next?tab=demohouse#/experience/adk-computer-use/pc) ## Introduction GUI-Owl-1.5 is a native multi-platform GUI agent model family featuring instruct and thinking variants. It supports a wide range of platforms, including desktop, mobile, and browser environments, to enable cloud-edge collaboration and real-time interaction. The model unifies perception, grounding, reasoning, planning, and action execution within a single policy network. Key features include: - **Hybrid Data Flywheel:** A data pipeline for UI understanding and trajectory generation based on simulated and cloud-based sandbox environments. - **Unified Enhancement of Agent Capabilities:** A unified thought-synthesis pipeline to enhance reasoning, memory, and Tool/MCP (Model Context Protocol) usage. - **Multi-platform Environment RL Scaling:** Uses a new environment RL algorithm, MRPO, to address challenges in multi-platform interaction and long-horizon task training. ## Benchmarks GUI-Owl-1.5 achieves state-of-the-art results on more than 20 GUI benchmarks: - **GUI Automation:** OSWorld (56.5), AndroidWorld (71.6), and WebArena (48.4). - **Grounding:** ScreenSpotPro (80.3). - **Tool-calling:** OSWorld-MCP (47.6) and MobileWorld (46.8). - **Memory & Knowledge:** GUI-Knowledge Bench (75.5). ## Citation If you find this model useful, please cite our paper: ```bibtex @article{MobileAgentv3.5, title={Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents}, author={Haiyang Xu, Xi Zhang, Haowei Liu, Junyang Wang, Zhaozai Zhu, Shengjie Zhou, Xuhao Hu, Feiyu Gao, Junjie Cao, Zihua Wang, Zhiyuan Chen, Jitong Liao, Qi Zheng, Jiahui Zeng, Ze Xu, Shuai Bai, Junyang Lin, Jingren Zhou, Ming Yan}, journal={arXiv preprint arXiv:2602.16855}, year={2026} } ```