This is the AWQ model of GroundNext-7B-V0, which runs on lower VRAM(e.g. 12G). GroundNext-7B-V0 is the GUI agent model trained on large-scaled, human-expert-annoated desktop application dataset.
Perforamnce
Desktop Grounding Benchmarks
| Qwen2.5-VL-7B | UI-TARS-72B | GroundNext-7B-V0 | |
|---|---|---|---|
| ScreenSpot-Pro | 29.7 | 38.1 | 52.9 |
| OSWorld-G | 42.7 | 57.1 | 67.7 |
| UI-Vision | 16.5 | 25.5 | 60.3 |
| Avg (Desktop) | 29.6 | 40.2 | 60.3 |
Cross-Platform Generalization (Desktop, Mobile & Web)
| Qwen2.5-VL-7B | UI-TARS-72B | GroundNext-7B-V0 | |
|---|---|---|---|
| MMBench-GUI | 33.9 | 74.3 | 81.1 |
| ScreenSpot-v2 | 88.8 | 90.3 | 90.4 |
| Avg (Mobile/Web) | 61.4 | 82.3 | 85.8 |
Agentic Performance on OSWorld
When combined with OpenAI o3 for reasoning, GroundNext-7B-V0 demonstrates strong end-to-end computer use capabilities:
| Model | OS | Office | Daily | Pro | Workflow | Overall |
|---|---|---|---|---|---|---|
| OpenAI o3 | 62.5 | 14.5 | 21.4 | 38.8 | 16.5 | 23.0 |
| CUA | 23.9 | 34.6 | 55.1 | 18.3 | 18.3 | 31.4 |
| OpenCUA-72B | 58.3 | 47.0 | 53.8 | 73.5 | 20.4 | 46.1 |
| UI-TARS-1.5-7B | 33.3 | 29.9 | 37.9 | 53.1 | 9.1 | 29.6 |
| JEDI-7B w/ o3 | 50.0 | 46.1 | 61.9 | 75.5 | 35.3 | 51.0 |
| GroundNext-3B w/ o3 | 62.5 | 47.0 | 55.0 | 73.5 | 36.5 | 50.6 |
Deployment with VLLM
vllm serve flin775/GroundNext-7B-V0-AWQ \
--max-num-seqs 8 \
--max_model_len 20608 \
--gpu-memory-utilization 0.85
- Downloads last month
- 2
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
馃檵
Ask for provider support
Model tree for flin775/GroundNext-7B-V0-AWQ
Base model
Qwen/Qwen2.5-VL-7B-Instruct
Finetuned
ServiceNow/GroundNext-7B-V0