nielsr's picture
nielsr HF Staff
Improve model card: add metadata and resources
8c785b6 verified
|
raw
history blame
1.66 kB
metadata
language:
  - en
license: mit
pipeline_tag: image-text-to-text
library_name: transformers

Mobile-Agent-v3.5 (GUI-Owl-1.5)

Mobile-Agent-v3.5 (also known as GUI-Owl-1.5) is a family of native multi-platform GUI agent foundation models. Built on the Qwen3-VL architecture, it is designed for native automation across various platforms including desktop (Windows/macOS), mobile (Android), and web browsers.

Key Features

  • Multi-platform GUI Automation: Unifies perception, grounding, and action execution across mobile, desktop, and web environments.
  • Enhanced Agent Capabilities: Improved reasoning through a unified thought-synthesis pipeline, with a specific focus on Tool/MCP use and long-horizon memory.
  • State-of-the-Art Performance: Achieves leading results on benchmarks such as OSWorld, AndroidWorld, and WebArena among open-source models.

Citation

If you find this model useful, please cite the paper:

@article{MobileAgentv3.5,
  title={Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents},
  author={Haiyang Xu, Xi Zhang, Haowei Liu, Junyang Wang, Zhaozai Zhu, Shengjie Zhou, Xuhao Hu, Feiyu Gao, Junjie Cao, Zihua Wang, Zhiyuan Chen, Jitong Liao, Qi Zheng, Jiahui Zeng, Ze Xu, Shuai Bai, Junyang Lin, Jingren Zhou, Ming Yan},
  journal={arXiv preprint arXiv:2602.16855},
  year={2026}
}