nielsr's picture
nielsr HF Staff
Improve model card: add metadata and documentation
156b78f verified
|
raw
history blame
1.62 kB
metadata
language:
  - en
license: mit
pipeline_tag: image-text-to-text
library_name: transformers

GUI-Owl-1.5 (Mobile-Agent-v3.5)

GUI-Owl-1.5 is a family of native multi-platform GUI agent foundation models built on Qwen3-VL. It supports automation across desktop, mobile, and browser environments, achieving state-of-the-art results on more than 20 GUI benchmarks.

Model Description

GUI-Owl-1.5 incorporates several key innovations:

  1. Hybrid Data Flywheel: A data pipeline for UI understanding and trajectory generation based on simulated and cloud-based sandbox environments.
  2. Unified Enhancement: A thought-synthesis pipeline to enhance reasoning capabilities, including Tool/MCP use and memory.
  3. Multi-platform Environment RL: The MRPO algorithm to address multi-platform conflicts and improve training efficiency for long-horizon tasks.

Citation

If you find this model useful, please cite our paper:

@article{MobileAgentv3.5,
  title={Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents},
  author={Haiyang Xu, Xi Zhang, Haowei Liu, Junyang Wang, Zhaozai Zhu, Shengjie Zhou, Xuhao Hu, Feiyu Gao, Junjie Cao, Zihua Wang, Zhiyuan Chen, Jitong Liao, Qi Zheng, Jiahui Zeng, Ze Xu, Shuai Bai, Junyang Lin, Jingren Zhou, Ming Yan},
  journal={arXiv preprint arXiv:2602.16855},
  year={2026}
}