This model was converted from HelloKKMe/GTA1-7B using AutoAWQ.
Model Description
GTA1-7B is an agent-grounding model that achieves state-of-the-art performance across a variety of GUI benchmarks.
Paper: GTA1: GUI Test-time Scaling Agent
GitHub: https://github.com/Yan98/GTA1
| Model | Size | Open Source | ScreenSpot-V2 | ScreenSpot-Pro | OSWorld-G |
|---|---|---|---|---|---|
| OpenAI CUA | - | No | 87.9 | 23.4 | - |
| Claude 3.7 | - | No | 87.6 | 27.7 | - |
| JEDI-7B | 7B | Yes | 91.7 | 39.5 | 54.1 |
| SE-GUI | 7B | Yes | 90.3 | 47.0 | - |
| UI-TARS | 7B | Yes | 91.6 | 35.7 | 47.5 |
| UI-TARS-1.5* | 7B | Yes | 89.7* | 42.0* | 64.2* |
| UGround-v1-7B | 7B | Yes | - | 31.1 | 36.4 |
| Qwen2.5-VL-32B-Instruct | 32B | Yes | 91.9* | 48.0 | 59.6* |
| UGround-v1-72B | 72B | Yes | - | 34.5 | - |
| Qwen2.5-VL-72B-Instruct | 72B | Yes | 94.0* | 53.3 | 62.2* |
| UI-TARS | 72B | Yes | 90.3 | 38.1 | - |
| GTA1 (Ours) | 7B | Yes | 92.4 (Δ +2.7) | 50.1 (Δ +8.1) | 67.7 (Δ +3.5) |
| GTA1 (Ours) | 32B | Yes | 93.2 (Δ +1.3) | 53.6 (Δ +5.6) | 61.9 (Δ +2.3) |
| GTA1 (Ours) | 72B | Yes | 94.8 (Δ +0.8) | 58.4 (Δ +5.1) | 66.7 (Δ +4.5) |
Note:
- Model size is indicated in billions (B) of parameters.
- A dash (-) denotes results that are currently unavailable.
- A superscript asterisk (*) denotes a result from our own evaluation.
- UI-TARS-1.5 7B, Qwen2.5-VL-32B-Instruct, and Qwen2.5-VL-72B-Instruct serve as the baselines for the 7B, 32B, and 72B GTA1 models, respectively.
- Δ indicates the performance improvement (↑) of our model over its baseline.
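
The Δ values in the table can be reproduced by subtracting each baseline score from the corresponding GTA1 score. The snippet below is a minimal sketch of that arithmetic; the scores are transcribed from the table above, while the names `BENCHMARKS`, `BASELINES`, `GTA1`, and `improvement` are illustrative and not part of any GTA1 code.

```python
# Benchmark order used for every score tuple below.
BENCHMARKS = ("ScreenSpot-V2", "ScreenSpot-Pro", "OSWorld-G")

# Baseline model and scores per GTA1 size, transcribed from the table.
BASELINES = {
    "7B": ("UI-TARS-1.5", (89.7, 42.0, 64.2)),
    "32B": ("Qwen2.5-VL-32B-Instruct", (91.9, 48.0, 59.6)),
    "72B": ("Qwen2.5-VL-72B-Instruct", (94.0, 53.3, 62.2)),
}

# GTA1 scores per size, transcribed from the table.
GTA1 = {
    "7B": (92.4, 50.1, 67.7),
    "32B": (93.2, 53.6, 61.9),
    "72B": (94.8, 58.4, 66.7),
}


def improvement(size):
    """Return the per-benchmark gain of GTA1 over its baseline (hypothetical helper)."""
    _, baseline_scores = BASELINES[size]
    return {
        name: round(ours - ref, 1)  # round away floating-point noise
        for name, ours, ref in zip(BENCHMARKS, GTA1[size], baseline_scores)
    }


if __name__ == "__main__":
    for size in GTA1:
        baseline_name, _ = BASELINES[size]
        print(f"GTA1-{size} vs {baseline_name}: {improvement(size)}")
```

Rounding to one decimal place matches the precision reported in the table.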
Model tree for flin775/GTA1-7B-AWQ
Base model
HelloKKMe/GTA1-7B