Add missing metadata and link to paper

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +8 -7
README.md CHANGED
@@ -3,11 +3,17 @@ base_model:
 - Qwen/Qwen2.5-VL-7B-Instruct
 datasets:
 - HanXiao1999/UI-Genie-Agent-5k
+pipeline_tag: image-text-to-text
+library_name: transformers
+license: mit
 ---
 
+# UI-Genie-Agent-7B
 
+This model is presented in [UI-Genie: A Self-Improving Approach for Iteratively Boosting MLLM-based
+Mobile GUI Agents](https://huggingface.co/papers/2505.21496).
 
-# UI-Genie-Agent-7B
+Code: https://github.com/Euphoria16/UI-Genie
 
 ## Model Description
 
@@ -15,8 +21,6 @@ datasets:
 
 This model achieves state-of-the-art performance on mobile GUI benchmarks by eliminating the need for manual annotation through synthetic trajectory generation guided by our specialized reward model UI-Genie-RM.
 
-
-
 ## Model Architecture
 
 - **Base Model**: [Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct)
@@ -53,7 +57,6 @@ Our model is trained on a combination of:
 - [**AndroidLab**](https://github.com/THUDM/Android-Lab): 726 trajectories (high-level tasks)
 - [**UI-Genie-Agent-16k**]((https://huggingface.co/datasets/HanXiao1999/UI-Genie-Agent-5k)): 2.2K synthetic trajectories (our generated data)
 
-
 ## Action Space
 
 The model supports a comprehensive action space for mobile interactions:
@@ -69,7 +72,6 @@ The model supports a comprehensive action space for mobile interactions:
 | `wait` | time, action_desc | Wait operations |
 | `terminate` | status, action_desc | Task completion |
 
-
 ## Citation
 
 ```bibtex
@@ -82,5 +84,4 @@ The model supports a comprehensive action space for mobile interactions:
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2505.21496},
 }
-```
-
+```