UI-TARS-1.5-7B-GUI-Perturbed

Introduction

This checkpoint was produced as part of a study on GUI grounding robustness. The study investigates whether synthetically perturbed training data, generated by applying the GUI-DR data augmentation pipeline to the Mind2Web training set, can improve model performance on visually diverse web UIs.

We release this checkpoint to support further research into synthetic data strategies and LoRA-based post-training for GUI grounding models. See our technical report for the full experimental discussion.

Training Configuration

Training config	Value
Base model	`ByteDance-Seed/UI-TARS-1.5-7B`
Fine-tuning method	LoRA (PEFT)
Training infrastructure	Qwen-VL-Series-Finetune
LoRA rank	8
Training epochs	1
Training samples	24,935
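
To give a rough sense of what a LoRA rank of 8 means in trainable parameters, the sketch below counts what LoRA adds to a single linear layer: a rank-r adapter on a d_out x d_in weight introduces two small matrices, A (r x d_in) and B (d_out x r). The layer size used here is an illustrative assumption, not a value taken from the base model's configuration, and the set of layers actually adapted in this run is not stated in this card.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in linear layer."""
    # A has shape (rank, d_in); B has shape (d_out, rank).
    return rank * d_in + d_out * rank


def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in the frozen weight matrix itself (bias ignored)."""
    return d_in * d_out


if __name__ == "__main__":
    RANK = 8  # from the training configuration table
    D_IN = D_OUT = 4096  # hypothetical layer size, for illustration only
    added = lora_param_count(D_IN, D_OUT, RANK)
    frozen = full_param_count(D_IN, D_OUT)
    print(f"LoRA adds {added:,} parameters ({added / frozen:.2%} of the layer)")
```

Because the added parameter count grows linearly with the rank, a low rank such as 8 keeps the trainable footprint small relative to the frozen weights.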

Training Data

The training data was generated from the Mind2Web training set with the GUI-DR data augmentation pipeline, then quality-filtered with Holo2-30B-A3B (state of the art on ScreenSpot-Pro, 66.1% accuracy).

Perturbation Type	Variants	Description
Style	5	Visual domain randomization (colors, themes, fonts, element orders)
Text Shrink	1	Reduced font sizes
Precision	1	Changed page zoom level to 0.7
Combined	1	1 original + 5 style + 1 precision + 1 text shrink
Total	8	~4,319 steps per variant
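
The variant counts in the table can be checked with simple arithmetic. The sketch below assumes the "Combined" row describes the full mix (the original page plus each perturbed variant) and that every variant contributes roughly 4,319 steps; the dictionary keys and function names are hypothetical. How the resulting figure relates to the 24,935 training samples listed above is not stated in this card.

```python
# Variant counts as listed in the perturbation table.
PERTURBED_VARIANTS = {"style": 5, "text_shrink": 1, "precision": 1}
ORIGINAL_VARIANTS = 1  # the unperturbed page
STEPS_PER_VARIANT = 4319  # approximate, from the "Total" row


def total_variants() -> int:
    """Original page plus all perturbed variants."""
    return ORIGINAL_VARIANTS + sum(PERTURBED_VARIANTS.values())


def approx_total_steps() -> int:
    """Approximate generated steps if each variant has the same step count."""
    return total_variants() * STEPS_PER_VARIANT


if __name__ == "__main__":
    print(f"variants: {total_variants()}")
    print(f"approximate generated steps: {approx_total_steps():,}")
```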

Results

Fine-tuning on this dataset did not improve GUI grounding performance over the base model, UI-TARS-1.5-7B, on either ScreenSpot-V2 or GUI-Perturbed. See the technical report for full benchmark results and comparison experiments.
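
For readers unfamiliar with how grounding is scored, ScreenSpot-style benchmarks typically count a prediction as correct when the predicted click point falls inside the target element's bounding box. The following is a minimal sketch of that metric, not the evaluation code used for this checkpoint: the function names and the (x_min, y_min, x_max, y_max) box format are assumptions, and the exact protocol is described in the technical report.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # assumed (x_min, y_min, x_max, y_max)


def point_in_box(point: Point, box: Box) -> bool:
    """Return True if the point lies inside the box (edges inclusive)."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def grounding_accuracy(preds: Sequence[Point], boxes: Sequence[Box]) -> float:
    """Fraction of predicted points that land inside their target box."""
    if len(preds) != len(boxes):
        raise ValueError("preds and boxes must have the same length")
    if not preds:
        raise ValueError("at least one prediction is required")
    hits = sum(point_in_box(p, b) for p, b in zip(preds, boxes))
    return hits / len(preds)


if __name__ == "__main__":
    predictions = [(5.0, 5.0), (50.0, 50.0)]
    targets = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 10.0)]
    print(f"accuracy: {grounding_accuracy(predictions, targets):.2%}")
```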

Citation

If you find this model helpful, please cite our technical report and paper:

@misc{wang2026guiperturbeddomainrandomizationreveals,
      title={GUI-Perturbed: Domain Randomization Reveals Systematic Brittleness in GUI Grounding Models},
      author={Yangyue Wang and Harshvardhan Sikka and Yash Mathur and Tony Zhou and Jinu Nyachhyon and Pranav Guruprasad},
      year={2026},
      eprint={2604.14262},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2604.14262},
}

@online{training_on_gui_perturbed_technical_report_2026,
  title   = {Training on GUI-Perturbed: Why More Data Isn’t Enough},
  author  = {Wang, Yangyue and Sikka, Harsh and Mathur, Yash and Zhou, Tony and Nyachhyon, Jinu and Guruprasad, Pranav},
  year    = {2026},
  url     = {https://blog.fig.inc/training-on-gui-perturbed-why-more-data-isnt-enough},
  note    = {Part 3: Finetuning Experiments}
}

Acknowledgements

Base model: ByteDance-Seed/UI-TARS-1.5-7B
Training infrastructure: Qwen-VL-Series-Finetune
Quality filtering: Holo2-30B-A3B
Training data source: Mind2Web

Safetensors

Model size: 8B params
Tensor type: F16

Evaluation results

Benchmark	Score
figai/GUI-Perturbed	83.61
os-copilot/ScreenSpot-v2	82.1