BigTaige committed on
Commit f9fe443 · verified · 1 Parent(s): 56253ee

Update README.md

Files changed (1):
  1. README.md +32 -1
README.md CHANGED
@@ -1,3 +1,34 @@
  import requests
  import json
  from tqdm import tqdm
@@ -49,7 +80,6 @@ def act2sum_fn(meta_data):
      return pred
  #############################################################################################

-
  url = "http://localhost:8000/v1/chat/completions"
  headers = {
      "Content-Type": "application/json"
@@ -143,3 +173,4 @@ if __name__ == "__main__":
      # evaluate(inference_data)
      with open("your_saving_path.json", "w") as f:
          f.write(json.dumps(inference_data, indent=4))
 
 
+ ---
+ license: mit
+ language:
+ - en
+ - zh
+ base_model:
+ - Qwen/Qwen2.5-VL-3B-Instruct
+ pipeline_tag: image-text-to-text
+ tags:
+ - GUI-Agent
+ - GUI-Perception
+ - Screen-Understanding
+ ---
+
+ ## Introduction
+
+ **HAR-GUI-3B** is a GUI-tailored native end-to-end GUI agent built on Qwen2.5-VL-3B-Instruct and developed with our HAR Framework, which applies a series of tailored training strategies. HAR-GUI-3B integrates a stable short-term memory for episodic reasoning: it flexibly perceives the sequential cues of an episode and puts them to effective use. This enhanced reasoning helps the agent execute long-horizon interactions and achieve consistent, sustained improvement across GUI-oriented tasks. Further details can be found in our article.
+
+ ## Quick Start
+ The following Python script demonstrates how to use HAR-GUI-3B for GUI automation. The example assumes a local vLLM server is running the model; adapt the code to your specific needs.
+ ```bash
+ # Start the vLLM service
+ nohup python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2.5-VL-3B-Instruct --model ./HAR-GUI-3B -tp 4 > log.txt &
+ # nohup python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2.5-VL-72B-Instruct --model ./Qwen2.5-VL-72B-Instruct -tp 8 > log.txt &
+
+ # Serve your local directory over HTTP (for screenshot URLs)
+ cd ./your_directory/
+ python3 -m http.server 6666
+ ```
+
+ ```python
  import requests
  import json
  from tqdm import tqdm

      return pred
  #############################################################################################

  url = "http://localhost:8000/v1/chat/completions"
  headers = {
      "Content-Type": "application/json"

      # evaluate(inference_data)
      with open("your_saving_path.json", "w") as f:
          f.write(json.dumps(inference_data, indent=4))
+ ```
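
The diff above only shows fragments of the Quick Start client. As a minimal sketch of how such a client could talk to the vLLM endpoint started above (the `build_payload` helper, the screenshot filename, and the instruction text are illustrative assumptions, not part of the original script):

```python
import json

import requests

URL = "http://localhost:8000/v1/chat/completions"  # vLLM OpenAI-compatible endpoint
HEADERS = {"Content-Type": "application/json"}


def build_payload(image_url, instruction, model="Qwen2.5-VL-3B-Instruct"):
    """Build a chat-completions payload with one screenshot and one instruction.

    `model` must match the --served-model-name passed to the vLLM server.
    """
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": instruction},
                ],
            }
        ],
        "temperature": 0.0,  # deterministic decoding for reproducible actions
    }


def query(image_url, instruction):
    """POST one request and return the model's text reply."""
    resp = requests.post(URL, headers=HEADERS, data=json.dumps(build_payload(image_url, instruction)))
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]


# Example (requires both the vLLM server and the local http.server to be running;
# screenshot.png is a hypothetical file in the directory served on port 6666):
# answer = query("http://localhost:6666/screenshot.png", "Describe the current screen.")
```

The image is passed by URL here, which is why the Quick Start also starts `python3 -m http.server 6666`: it makes local screenshots reachable by the server process.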