Add pipeline tag and library name

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +5 -10
README.md CHANGED
@@ -1,23 +1,22 @@
 ---
-license: apache-2.0
+base_model:
+- meta-llama/Llama-3.1-8B-Instruct
 datasets:
 - yolay/SmartSnap-FT
 - yolay/SmartSnap-RL
 language:
 - en
+license: apache-2.0
 metrics:
 - accuracy
-base_model:
-- meta-llama/Llama-3.1-8B-Instruct
+pipeline_tag: image-text-to-text
+library_name: transformers
 tags:
 - agent
 - mobile
 - gui
 ---
 
-
-
-
 <div align="center">
 <img src="https://raw.githubusercontent.com/yuleiqin/images/master/SmartSnap/mascot_smartsnap.png" width="400"/>
 </div>
@@ -28,7 +27,6 @@ tags:
 &nbsp;
 </p>
 
-
 We introduce **SmartSnap**, a paradigm shift that transforms GUI agents📱💻🤖 from passive task executors into proactive self-verifiers. By empowering agents to curate their own evidence of success through the **3C Principles** (Completeness, Conciseness, Creativity), we eliminate the bottleneck of expensive post-hoc verification while boosting reliability and performance on complex mobile tasks.
 
 # 📖 Overview
@@ -116,13 +114,10 @@ We release the following resources to accelerate research in self-verifying agen
 | **FT (ours)** | Qwen3-32B-Instruct | 28.98<sup>(+10.86%)</sup> | 35.92 | 97.79 | 97.33 |
 | **RL (ours)** | Qwen3-32B-Instruct | <u>34.78</u><sup>(+16.66%)</sup> | 40.26 | 89.47 | 93.67 |
 
-
-
 *<sup>*</sup> LLaMA3.1 models only natively support tool calling w/o reasoning.*
 *<sup>†</sup> The Android Instruct dataset is used for fine-tuning where self-verification is not performed.*
 *<sup>‡</sup> The official results are cited here for comparison.*
 
-
 ---
 
 - **Performance gains**: All model families achieve >16% improvement over prompting baselines, reaching competitive performance with models 10-30× larger.
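
For reviewers who want to check the metadata change mechanically, the following is a minimal sketch, not part of this PR. It parses the README front matter before and after the change and reports which of the two fields named in the PR title are missing. The helpers `parse_front_matter` and `missing_hub_fields` are hypothetical names introduced here for illustration. The parser is an assumption-laden stand-in that handles only the flat `key: value` and `key:` plus `- item` shapes seen in this diff, not general YAML.

```python
def parse_front_matter(readme_text):
    """Parse the flat front matter shape used in this README.

    Hypothetical helper: supports only top-level `key: value` pairs and
    `key:` followed by `- item` lines. It is not a general YAML parser.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    metadata = {}
    current_key = None  # key of the list currently being filled, if any
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the front matter
            break
        if not stripped:
            continue
        if stripped.startswith("- ") and current_key is not None:
            metadata[current_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                metadata[key] = value
                current_key = None
            else:
                metadata[key] = []
                current_key = key
    return metadata


def missing_hub_fields(metadata, required=("pipeline_tag", "library_name")):
    """Return the required fields that are absent from the metadata."""
    return [field for field in required if field not in metadata]


# Front matter before the change, copied from the removed/context lines.
OLD_FRONT_MATTER = """---
license: apache-2.0
datasets:
- yolay/SmartSnap-FT
- yolay/SmartSnap-RL
language:
- en
metrics:
- accuracy
base_model:
- meta-llama/Llama-3.1-8B-Instruct
tags:
- agent
- mobile
- gui
---
"""

# Front matter after the change, copied from the added/context lines.
NEW_FRONT_MATTER = """---
base_model:
- meta-llama/Llama-3.1-8B-Instruct
datasets:
- yolay/SmartSnap-FT
- yolay/SmartSnap-RL
language:
- en
license: apache-2.0
metrics:
- accuracy
pipeline_tag: image-text-to-text
library_name: transformers
tags:
- agent
- mobile
- gui
---
"""

old_meta = parse_front_matter(OLD_FRONT_MATTER)
new_meta = parse_front_matter(NEW_FRONT_MATTER)

print("missing before:", missing_hub_fields(old_meta))
print("missing after:", missing_hub_fields(new_meta))
```

Apart from the two new keys, the existing entries are only reordered, so the parsed values for `license`, `datasets`, `base_model`, and `tags` should be identical on both sides.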