---
license: apache-2.0
pipeline_tag: image-text-to-text
tags:
- grounding
- agent
---

GUI_Spotlight is a `think-with-image` GUI visual grounding model. At each step, it first calls a tool to crop the image according to its own prediction, and then returns an exact coordinate location.

For evaluation and inference details, please refer to [the GUI_Spotlight repository](https://github.com/bin123apple/GUI_Spotlight).
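
The crop-then-locate behavior described above can be illustrated with a minimal sketch. This is not the model's actual tool interface or the repository's code: the names `crop_around`, `to_full_image`, `ground`, and `toy_predict`, the fixed number of steps, and the `shrink` factor are all assumptions made for illustration. It only shows how a point predicted inside a crop can be mapped back to full-image coordinates.

```python
from typing import Callable, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in full-image pixels
Point = Tuple[int, int]          # (x, y)


def crop_around(point: Point, view: Box, shrink: float = 0.5) -> Box:
    """Return a smaller box inside `view`, centered on `point` where possible."""
    left, top, right, bottom = view
    w = max(1, int((right - left) * shrink))
    h = max(1, int((bottom - top) * shrink))
    # Clamp the new box so it never leaves the current view.
    new_left = min(max(point[0] - w // 2, left), right - w)
    new_top = min(max(point[1] - h // 2, top), bottom - h)
    return (new_left, new_top, new_left + w, new_top + h)


def to_full_image(local: Point, view: Box) -> Point:
    """Map a point expressed relative to `view` back to full-image coordinates."""
    return (view[0] + local[0], view[1] + local[1])


def ground(predict: Callable[[Box], Point], size: Tuple[int, int], steps: int = 2) -> Point:
    """Refine a prediction by cropping around it `steps` times (hypothetical loop)."""
    view: Box = (0, 0, size[0], size[1])
    for _ in range(steps):
        guess = to_full_image(predict(view), view)
        view = crop_around(guess, view)
    # The final prediction is made on the tightest crop.
    return to_full_image(predict(view), view)


# Hypothetical stand-in for the model: it "knows" the target and reports it
# relative to whatever crop it is shown.
TARGET: Point = (730, 410)


def toy_predict(view: Box) -> Point:
    return (TARGET[0] - view[0], TARGET[1] - view[1])
```

In this sketch, `ground(toy_predict, (1000, 800))` returns the target in full-image coordinates even though every intermediate prediction is made relative to a crop. For the real inference pipeline, refer to the repository linked above.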