metadata
language:
- en
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
GUI-Owl-1.5 (Mobile-Agent-v3.5)
GUI-Owl-1.5 is a family of native multi-platform GUI agent foundation models built on Qwen3-VL. It supports automation across desktop, mobile, and browser environments, achieving state-of-the-art results on more than 20 GUI benchmarks.
- Paper: Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents
- Code: GitHub - X-PLUG/MobileAgent
- Demo: ModelScope Online Demo
Model Description
GUI-Owl-1.5 incorporates several key innovations:
- Hybrid Data Flywheel: A data pipeline for UI understanding and trajectory generation based on simulated and cloud-based sandbox environments.
- Unified Enhancement: A thought-synthesis pipeline to enhance reasoning capabilities, including Tool/MCP use and memory.
- Multi-platform Environment RL: The MRPO algorithm to address multi-platform conflicts and improve training efficiency for long-horizon tasks.
Citation
If you find this model useful, please cite our paper:
@article{MobileAgentv3.5,
title={Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents},
author={Haiyang Xu, Xi Zhang, Haowei Liu, Junyang Wang, Zhaozai Zhu, Shengjie Zhou, Xuhao Hu, Feiyu Gao, Junjie Cao, Zihua Wang, Zhiyuan Chen, Jitong Liao, Qi Zheng, Jiahui Zeng, Ze Xu, Shuai Bai, Junyang Lin, Jingren Zhou, Ming Yan},
journal={arXiv preprint arXiv:2602.16855},
year={2026}
}