UI-TARS-1.5-7B-GUI-Perturbed

Introduction

This checkpoint was produced as part of a study on GUI grounding robustness. The study asks whether synthetically perturbed training data, generated by applying the GUI-DR data augmentation pipeline to the Mind2Web training set, can improve model performance on visually diverse web UIs.

We release this checkpoint to support further research into synthetic data strategies and LoRA-based post-training for GUI grounding models. See our technical report for the full experimental discussion.

Training Configuration

| Training config | Value |
| --- | --- |
| Base model | ByteDance-Seed/UI-TARS-1.5-7B |
| Fine-tuning method | LoRA (PEFT) |
| Training infrastructure | Qwen-VL-Series-Finetune |
| LoRA rank | 8 |
| Training epochs | 1 |
| Training samples | 24,935 |
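
To give a sense of how small a rank-8 adapter is, the sketch below counts the parameters LoRA adds to a single linear layer. LoRA learns two low-rank factors per adapted layer, so a layer with `d_in` inputs and `d_out` outputs gains `rank * (d_in + d_out)` trainable parameters. The 4096 x 4096 layer shape is a hypothetical value chosen for illustration, not the model's actual dimensions; only the values in `TrainingConfig` come from the configuration listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Values copied from the training configuration listed above."""
    base_model: str = "ByteDance-Seed/UI-TARS-1.5-7B"
    method: str = "LoRA (PEFT)"
    lora_rank: int = 8
    epochs: int = 1
    samples: int = 24_935


def lora_added_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two LoRA factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


cfg = TrainingConfig()

# Hypothetical layer shape, for illustration only (not read from the model).
d_in, d_out = 4096, 4096

added = lora_added_params(d_in, d_out, cfg.lora_rank)
full = d_in * d_out
print(f"LoRA adds {added:,} params next to {full:,} frozen ones ({added / full:.2%})")
```

Which layers receive adapters, and the LoRA alpha and dropout values, are not stated here, so the sketch leaves them out rather than guessing.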

Training Data

The training data was generated from the Mind2Web training set with the GUI-DR data augmentation pipeline, then quality-filtered with Holo2-30B-A3B (ScreenSpot-Pro SOTA, 66.1% accuracy).

| Perturbation Type | Variants | Description |
| --- | --- | --- |
| Style | 5 | Visual domain randomization (colors, themes, fonts, element orders) |
| Text Shrink | 1 | Reduced font sizes |
| Precision | 1 | Changed page zoom level to 0.7 |
| Combined | 1 | 1 original + 5 style + 1 precision + 1 text shrink |
| Total | 8 | ~4,319 steps per variant |
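
The variant counts above can be written out as a small specification list. The sketch below is illustrative only: the `Variant` type, the `style_seed` and `font_scale` fields, and the 0.8 font scale are hypothetical and are not taken from the GUI-DR pipeline. Only the variant counts and the 0.7 zoom level come from this card. The `zoom_bbox` helper shows how a ground-truth bounding box would shift under a uniform page zoom, assuming the page scales about its top-left origin.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    kind: str     # perturbation type as named in this card
    params: dict  # hypothetical knobs, not GUI-DR's real options


def build_variants() -> list:
    """Enumerate the 8 variants: 5 style, 1 text shrink, 1 precision, 1 combined."""
    variants = [Variant("style", {"style_seed": seed}) for seed in range(5)]
    variants.append(Variant("text_shrink", {"font_scale": 0.8}))  # scale is assumed
    variants.append(Variant("precision", {"zoom": 0.7}))          # zoom from this card
    variants.append(Variant("combined", {}))                      # composition not modeled here
    return variants


def zoom_bbox(bbox, zoom):
    """Scale an (x1, y1, x2, y2) box about the top-left origin."""
    return tuple(round(v * zoom, 2) for v in bbox)


variants = build_variants()
print(len(variants), "variants")
print(zoom_bbox((100, 200, 300, 260), 0.7))
```

If the real pipeline re-renders the page and reads element boxes back from the DOM, no manual coordinate scaling is needed; the helper only illustrates why a zoomed page yields smaller targets.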

Results

Fine-tuning on this dataset did not improve GUI grounding performance over the base model, UI-TARS-1.5-7B, on ScreenSpot-V2 or GUI-Perturbed. See the technical report for full benchmark results and comparison experiments.
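
For context, grounding benchmarks in the ScreenSpot family typically count a prediction as correct when the predicted click point falls inside the target element's bounding box. The sketch below implements that point-in-box accuracy on made-up data. It is not the evaluation code used in the technical report, and the function names and sample coordinates are invented for illustration.

```python
def point_in_box(point, bbox):
    """Return True if (x, y) lies inside or on the edge of (x1, y1, x2, y2)."""
    x, y = point
    x1, y1, x2, y2 = bbox
    return x1 <= x <= x2 and y1 <= y <= y2


def grounding_accuracy(predictions, targets):
    """Fraction of predicted click points that land inside their target box."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("cannot compute accuracy on an empty set")
    hits = sum(point_in_box(p, b) for p, b in zip(predictions, targets))
    return hits / len(targets)


# Invented example data: two hits and one miss.
preds = [(120, 40), (300, 500), (15, 15)]
boxes = [(100, 20, 200, 60), (0, 0, 50, 50), (10, 10, 30, 30)]

accuracy = grounding_accuracy(preds, boxes)
print(f"accuracy = {accuracy:.3f}")
```

Comparing a fine-tuned checkpoint with its base model then amounts to running both over the same targets and comparing the two accuracy values.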

Citation

If you find this model helpful, please cite our technical report and paper:

@misc{wang2026guiperturbeddomainrandomizationreveals,
      title={GUI-Perturbed: Domain Randomization Reveals Systematic Brittleness in GUI Grounding Models},
      author={Yangyue Wang and Harshvardhan Sikka and Yash Mathur and Tony Zhou and Jinu Nyachhyon and Pranav Guruprasad},
      year={2026},
      eprint={2604.14262},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2604.14262},
}

@online{training_on_gui_perturbed_technical_report_2026,
  title   = {Training on GUI-Perturbed: Why More Data Isn’t Enough},
  author  = {Wang, Yangyue and Sikka, Harsh and Mathur, Yash and Zhou, Tony and Nyachhyon, Jinu and Guruprasad, Pranav},
  year    = {2026},
  url     = {https://blog.fig.inc/training-on-gui-perturbed-why-more-data-isnt-enough},
  note    = {Part 3: Finetuning Experiments}
}

Acknowledgements
