This model was converted to MLX format with mlx_vlm from mPLUG/GUI-Owl-7B.

Model Description

GUI-Owl is a model series developed as part of the Mobile-Agent-V3 project. It achieves state-of-the-art performance across a range of GUI automation benchmarks, including ScreenSpot-V2, ScreenSpot-Pro, OSWorld-G, MMBench-GUI, Android Control, Android World, and OSWorld.

ScreenSpot-V2, ScreenSpot-Pro and OSWorld-G

Android World and OSWorld-Verified

Quick Start

mlx_vlm.generate --model mlx-community/GUI-Owl-7B-4bit \
  --max-tokens 1024 \
  --temperature 0.0 \
  --prompt "List all contacts’ names and their corresponding grounding boxes([x1, y1, x2, y2]) from the left sidebar of the IM chat interface, return the results in JSON format." \
  --image https://wechat.qpic.cn/uploads/2016/05/WeChat-Windows-2.11.jpg
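
The prompt above asks for JSON, but the reply is plain text and may contain extra prose around the JSON. As a minimal sketch of post-processing, the Python helper below pulls the first JSON value out of a reply and keeps only entries with a well-formed [x1, y1, x2, y2] box. The key names "name" and "bbox" and the function names are assumptions made for illustration; the model card does not specify an output schema, so adjust them to whatever the model actually returns.

```python
import json
from typing import Any, List, Tuple

Box = Tuple[float, float, float, float]


def extract_json(reply: str) -> Any:
    """Return the first JSON array or object embedded in a model reply."""
    decoder = json.JSONDecoder()
    # Try every '[' or '{' as a possible start; raw_decode finds where the value ends.
    for i, ch in enumerate(reply):
        if ch in "[{":
            try:
                value, _end = decoder.raw_decode(reply, i)
                return value
            except json.JSONDecodeError:
                continue
    raise ValueError("no JSON value found in reply")


def is_box(value: Any) -> bool:
    """True for a list of four numbers with x1 <= x2 and y1 <= y2."""
    return (
        isinstance(value, list)
        and len(value) == 4
        and all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value)
        and value[0] <= value[2]
        and value[1] <= value[3]
    )


def parse_contacts(reply: str) -> List[Tuple[str, Box]]:
    """Collect (name, box) pairs; 'name' and 'bbox' are assumed key names."""
    data = extract_json(reply)
    items = data if isinstance(data, list) else [data]
    contacts = []
    for item in items:
        if (
            isinstance(item, dict)
            and isinstance(item.get("name"), str)
            and is_box(item.get("bbox"))
        ):
            contacts.append((item["name"], tuple(item["bbox"])))
    return contacts
```

Calling parse_contacts on the text printed by the command returns a list of (name, box) tuples and silently drops entries whose box is missing or malformed. It raises ValueError when the reply contains no JSON at all, which is a cue to inspect the raw output.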