nielsr HF Staff committed on
Commit 33458c9 · verified · 1 Parent(s): a0e121a

Improve model card metadata and content

Hi! I'm Niels from the community science team at Hugging Face. I'm opening this PR to improve the model card for Mobile-Agent-v3.5 (GUI-Owl-1.5).

The improvements include:
- Adding the `image-text-to-text` pipeline tag for better discoverability.
- Adding `library_name: transformers` as the model architecture is compatible with the Transformers library.
- Moving the arXiv reference from the YAML metadata to the markdown section as per our best practices.
- Adding a model description and links to the paper and code.

This helps users understand the model's capabilities and find the associated research and code.
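
To sanity-check the metadata changes listed above, the following is a minimal Python sketch, not part of the PR itself. It reads the YAML front matter of the updated card and reports anything that contradicts the list: a missing `pipeline_tag`, a missing `library_name`, or an arXiv reference still sitting in the YAML `tags`. The helpers `parse_front_matter` and `check_card` are hypothetical names introduced here for illustration. The parser only handles the flat `key: value` and `- item` shapes that appear in this card, so a real check would more likely rely on a proper YAML library.

```python
# Minimal sketch: parse the flat YAML front matter of a model card and check
# the metadata keys this PR touches. This is NOT a general YAML parser.

NEW_README = """---
language:
- en
license: mit
pipeline_tag: image-text-to-text
library_name: transformers
---

# Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents (GUI-Owl-1.5)
"""


def parse_front_matter(text):
    """Return the front matter as a dict (hypothetical helper).

    Handles only top-level `key: value` pairs and `- item` lists.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta, current_key = {}, None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the metadata block
            break
        if not stripped:
            continue
        if stripped.startswith("- "):
            # List item: attach it to the most recent key that opened a list.
            if isinstance(meta.get(current_key), list):
                meta[current_key].append(stripped[2:].strip())
            continue
        key, _, value = stripped.partition(":")
        current_key = key.strip()
        value = value.strip()
        # A key with no inline value is assumed to open a list.
        meta[current_key] = value if value else []
    return meta


def check_card(meta):
    """Return a list of problems relative to the changes described in this PR."""
    problems = []
    for key in ("pipeline_tag", "library_name"):
        if not meta.get(key):
            problems.append(f"missing {key}")
    tags = meta.get("tags", [])
    if isinstance(tags, str):
        tags = [tags]
    if any(tag.startswith("arxiv:") for tag in tags):
        problems.append("arXiv reference still in YAML tags")
    return problems


meta = parse_front_matter(NEW_README)
print(check_card(meta))
```

Running the same check against the previous front matter (with `tags: - arxiv:2602.16855` and no `pipeline_tag` or `library_name`) should report all three problems, which matches the changes described in the list.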

Files changed (1)
  1. README.md +19 -4
README.md CHANGED
@@ -1,15 +1,29 @@
  ---
- license: mit
  language:
  - en
- tags:
- - arxiv:2602.16855
+ license: mit
+ pipeline_tag: image-text-to-text
+ library_name: transformers
  ---

+ # Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents (GUI-Owl-1.5)
+
+ Mobile-Agent-v3.5 (also known as **GUI-Owl-1.5**) is a family of native multi-platform GUI agent foundation models. It supports automation across desktop, mobile, and browser environments, enabling cloud-edge collaboration and real-time interaction.
+
+ The model is built on the Qwen3-VL architecture and achieves state-of-the-art results on over 20 GUI benchmarks, excelling in tasks such as GUI automation (OSWorld, AndroidWorld, WebArena), grounding (ScreenSpotPro), and tool-calling.
+
+ - **Paper:** [Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents](https://huggingface.co/papers/2602.16855)
+ - **Repository:** [GitHub - X-PLUG/MobileAgent](https://github.com/X-PLUG/MobileAgent)
+ - **Demo:** [ModelScope online demo](http://modelscope.cn/studios/MobileAgentTest/computer_use)
+
+ ## Key Features
+ - **Multi-platform Support:** Native support for desktop, mobile, and browser automation.
+ - **Unified Capability:** Combines UI understanding, reasoning, and trajectory generation.
+ - **Enhanced Reasoning:** Incorporates a thought-synthesis pipeline to improve decision-making and memory.

  ## Citation

- If you find this model useful, please cite our paper:
+ If you find this model useful, please cite the paper:

  ```bibtex
  @article{MobileAgentv3.5,
@@ -18,3 +32,4 @@ If you find this model useful, please cite our paper:
  journal={arXiv preprint arXiv:2602.16855},
  year={2026}
  }
+ ```