nielsr HF Staff committed on
Commit 9a838e7 · verified · 1 Parent(s): ffc0dfb

Improve model card: add metadata and links to paper/code

Hi! I'm Niels from the community science team at Hugging Face.

This PR improves the model card for **Mobile-Agent-v3.5** (GUI-Owl-1.5) by:
- Adding the `image-text-to-text` pipeline tag for better discoverability.
- Specifying the `library_name: transformers` metadata based on the model configuration.
- Removing the `arxiv` tag from the YAML metadata to follow standard documentation practices, while ensuring the paper is linked in the Markdown section.
- Adding direct links to the paper ([Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents](https://huggingface.co/papers/2602.16855)) and the [official GitHub repository](https://github.com/X-PLUG/MobileAgent).
- Providing a summary of the model's capabilities and linking to the available online demos.

These updates help users better understand the model's purpose and how to use it.
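
As a rough illustration of the metadata changes listed above, the sketch below parses a card's YAML front matter and reports which keys are missing. It is a minimal, stdlib-only sketch: the helper names (`parse_front_matter`, `missing_keys`) and the `REQUIRED_KEYS` tuple are hypothetical and not part of any Hub tooling, and the hand-rolled parser only handles the `key: value` and `- item` forms that appear in this card. The two card strings mirror the front matter before and after this PR.

```python
# Hypothetical check: which metadata keys does a model card's front matter lack?
# Assumption: front matter is delimited by `---` lines and uses only
# `key: value` pairs and `- item` lists (true for this card, not for YAML in general).

REQUIRED_KEYS = ("license", "pipeline_tag", "library_name")  # illustrative choice

OLD_CARD = """---
license: mit
language:
- en
tags:
- arxiv:2602.16855
---
"""

NEW_CARD = """---
language:
- en
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
---
"""


def parse_front_matter(text):
    """Return the front matter as a dict, or {} if none is found."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter: front matter is complete
            return meta
        if not stripped:
            continue
        if stripped.startswith("- "):
            # List item: attach it to the most recent key that opened a list.
            if isinstance(meta.get(current_key), list):
                meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        current_key = key.strip()
        value = value.strip()
        # A key with no inline value is assumed to open a list.
        meta[current_key] = value if value else []
    return {}  # no closing delimiter: treat as having no front matter


def missing_keys(meta, required=REQUIRED_KEYS):
    """List the required keys that are absent from the parsed metadata."""
    return [key for key in required if key not in meta]


print(missing_keys(parse_front_matter(OLD_CARD)))  # ['pipeline_tag', 'library_name']
print(missing_keys(parse_front_matter(NEW_CARD)))  # []
```

For anything beyond a quick sanity check, a full YAML parser would be the safer choice, since real front matter can contain nested mappings and quoted strings that this sketch ignores.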

Files changed (1)
  1. README.md +28 -3
README.md CHANGED
@@ -1,11 +1,35 @@
  ---
- license: mit
  language:
  - en
- tags:
- - arxiv:2602.16855
+ license: mit
+ pipeline_tag: image-text-to-text
+ library_name: transformers
  ---

+ # Mobile-Agent-v3.5 (GUI-Owl-1.5)
+
+ This repository contains the model weights for **Mobile-Agent-v3.5** (also known as **GUI-Owl-1.5**), a native multi-platform GUI agent foundation model.
+
+ - **Paper:** [Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents](https://huggingface.co/papers/2602.16855)
+ - **Repository:** [GitHub - X-PLUG/MobileAgent](https://github.com/X-PLUG/MobileAgent)
+ - **Demos:** [ModelScope Online Demo](http://modelscope.cn/studios/MobileAgentTest/computer_use) | [Bailian Online Demo](https://bailian.console.aliyun.com/next?tab=demohouse#/experience/adk-computer-use/pc)
+
+ ## Introduction
+
+ GUI-Owl-1.5 is a native multi-platform GUI agent model family featuring instruct and thinking variants. It supports a wide range of platforms, including desktop, mobile, and browser environments, to enable cloud-edge collaboration and real-time interaction. The model unifies perception, grounding, reasoning, planning, and action execution within a single policy network.
+
+ Key features include:
+ - **Hybrid Data Flywheel:** A data pipeline for UI understanding and trajectory generation based on simulated and cloud-based sandbox environments.
+ - **Unified Enhancement of Agent Capabilities:** A unified thought-synthesis pipeline to enhance reasoning, memory, and Tool/MCP (Model Context Protocol) usage.
+ - **Multi-platform Environment RL Scaling:** Uses a new environment RL algorithm, MRPO, to address challenges in multi-platform interaction and long-horizon task training.
+
+ ## Benchmarks
+
+ GUI-Owl-1.5 achieves state-of-the-art results on more than 20 GUI benchmarks:
+ - **GUI Automation:** OSWorld (56.5), AndroidWorld (71.6), and WebArena (48.4).
+ - **Grounding:** ScreenSpotPro (80.3).
+ - **Tool-calling:** OSWorld-MCP (47.6) and MobileWorld (46.8).
+ - **Memory & Knowledge:** GUI-Knowledge Bench (75.5).

  ## Citation

@@ -18,3 +42,4 @@ If you find this model useful, please cite our paper:
  journal={arXiv preprint arXiv:2602.16855},
  year={2026}
  }
+ ```