nielsr (HF Staff) committed
Commit 8f38d44 · verified · 1 Parent(s): d4e8297

Improve model card: Add pipeline tag, library, abstract, and overview visuals

This PR significantly enhances the model card for UI-S1-7B by:

- Adding `pipeline_tag: image-text-to-text` to enable better discoverability on the Hugging Face Hub (https://huggingface.co/models?pipeline_tag=image-text-to-text), reflecting its multimodal GUI automation capabilities.
- Adding `library_name: transformers` to declare compatibility with the Hugging Face Transformers library, which lets the Hub show users an automatically generated code snippet.
- Incorporating the paper abstract, an overview of the methodology, and detailed results from the GitHub repository to provide a comprehensive understanding of the model's capabilities and performance.
- Including relevant images from the GitHub repository for better visualization of the method and results.
- Ensuring prominent links to the paper and the GitHub repository.

Please review and merge if these improvements align with your expectations.
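
As a rough illustration of the metadata change described above, the sketch below checks a model card's front matter for the two new keys. It is not part of this PR or of any Hub tooling: `parse_front_matter`, `missing_keys`, and `REQUIRED_KEYS` are hypothetical names, and the parser only handles flat `key: value` pairs as a simplified stand-in for real YAML parsing.

```python
# Hypothetical helper, not Hub tooling: checks that a model card's
# front matter declares the keys this PR adds.
REQUIRED_KEYS = ("license", "pipeline_tag", "library_name")


def parse_front_matter(readme_text: str) -> dict:
    """Parse flat `key: value` pairs from a leading `---` block.

    Simplified assumption: no nesting, lists, or quoting, which is
    enough for the three-line front matter in this model card.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    metadata = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return metadata
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat as no front matter


def missing_keys(readme_text: str) -> list:
    """Return the required keys that the front matter does not declare."""
    metadata = parse_front_matter(readme_text)
    return [key for key in REQUIRED_KEYS if key not in metadata]


# Front matter before and after this PR, copied from the diff.
README_BEFORE = """---
license: apache-2.0
---

## Introduction
"""

README_AFTER = """---
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
---

# UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning
"""

if __name__ == "__main__":
    print("missing before:", missing_keys(README_BEFORE))
    print("missing after:", missing_keys(README_AFTER))
```

A real check would more likely use a proper YAML parser; the flat parser here just keeps the sketch dependency-free.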

Files changed (1)
  1. README.md +29 -3
README.md CHANGED
@@ -1,8 +1,34 @@
  ---
  license: apache-2.0
+ pipeline_tag: image-text-to-text
+ library_name: transformers
  ---

- ## Introduction
- This repository contains the efficient GUI grounding model, **UI-S1-7B**, presented in [UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning](https://huggingface.co/papers/2509.11543).
+ # UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning

- Project page: https://github.com/X-PLUG/MobileAgent/tree/main/UI-S1
+ This repository contains the efficient GUI grounding model, **UI-S1-7B**, presented in the paper [UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning](https://huggingface.co/papers/2509.11543).
+
+ Project page / Code: [https://github.com/X-PLUG/MobileAgent/tree/main/UI-S1](https://github.com/X-PLUG/MobileAgent/tree/main/UI-S1)
+
+ ## Paper Abstract
+ Graphical User Interface (GUI) agents have demonstrated remarkable progress in automating complex user interface interactions through reinforcement learning. However, current approaches face a fundamental dilemma: offline RL enables stable training on pre-collected trajectories, but struggles with multi-step task execution for lack of trajectory-level reward signals; online RL captures these signals through environment interaction, but suffers from sparse rewards and prohibitive deployment costs. To address it, we present Semi-online Reinforcement Learning, a novel paradigm that simulates online RL on offline trajectories. During each rollout process, we preserve the original model output within the multi-turn dialogue, where a Patch Module adaptively recovers the divergence between rollout and expert trajectories. To capture long-term training signals, Semi-online RL introduces discounted future returns into the reward computation and optimizes the policy with weighted step-level and episode-level advantages. We further introduce Semi-Online Performance (SOP), a metric that aligns better with true online performance, serving as a practical and effective proxy for real-world evaluation. Experiments show that ours Semi-online RL achieves SOTA performance among 7B models across four dynamic benchmarks, with significant gains over the base model (e.g., +12.0% on AndroidWorld, +23.8% on AITW), demonstrating significant progress in bridging the gap between offline training efficiency and online multi-turn reasoning.
+
+ ## Overview
+
+ We present **Semi-online RL**, a novel paradigm that simulates online reinforcement learning using offline trajectories, thereby enabling the efficient training of MLLM-based GUI agents with enhanced multi-turn interaction capabilities.
+
+ <div align="center">
+ <img src="https://github.com/X-PLUG/MobileAgent/raw/main/UI-S1/assets/method_comparison.png" alt="Method Comparison" style="width:80%;">
+ </div>
+
+ Ours **UI-S1-7B** achieves SOTA performance on both semi-online metric (SOP) and online metric (AndroidWorld) among open-source 7B models.
+
+ <div align="center">
+ <img src="https://github.com/X-PLUG/MobileAgent/raw/main/UI-S1/assets/metric.png" alt="Metrics" style="width:80%;">
+ </div>
+
+ ## Detailed results
+
+ <div align="center">
+ <img src="https://github.com/X-PLUG/MobileAgent/raw/main/UI-S1/assets/result.png" alt="Results" style="width:80%;">
+ </div>