UI-TARS-1.5-7B-GUI-Perturbed
Introduction
This checkpoint was produced as part of a study on GUI grounding robustness. We investigate whether synthetically perturbed training data, generated by applying the GUI-DR data augmentation pipeline to the Mind2Web training set, can improve model performance on visually diverse web UIs.
We release this checkpoint to support further research into synthetic data strategies and LoRA-based post-training for GUI grounding models. See our technical report for the full experimental discussion.
Training Configuration
| Training config | Value |
|---|---|
| Base model | ByteDance-Seed/UI-TARS-1.5-7B |
| Fine-tuning method | LoRA (PEFT) |
| Training infrastructure | Qwen-VL-Series-Finetune |
| LoRA rank | 8 |
| Training epochs | 1 |
| Training samples | 24,935 |
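Because the checkpoint was trained with LoRA through PEFT, one plausible way to use it is to attach the adapter to the base model. The sketch below rests on two assumptions that this card does not confirm: that the repository hosts adapter weights rather than merged weights, and that the base model loads with the Qwen2.5-VL classes in `transformers`. `load_model` is a hypothetical helper name.

```python
BASE_MODEL_ID = "ByteDance-Seed/UI-TARS-1.5-7B"
ADAPTER_ID = "figai/UI-TARS-1.5-7B-GUI-Perturbed"


def load_model(device_map="auto"):
    """Attach the LoRA adapter to the base model (hypothetical helper)."""
    # Imports are local so this module can be imported without
    # torch, transformers, or peft installed.
    import torch
    from peft import PeftModel
    from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

    # Assumption: the base checkpoint uses the Qwen2.5-VL architecture.
    # device_map="auto" additionally requires the accelerate package.
    base = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        BASE_MODEL_ID, torch_dtype=torch.bfloat16, device_map=device_map
    )
    processor = AutoProcessor.from_pretrained(BASE_MODEL_ID)

    # Assumption: the repository contains PEFT adapter weights.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()
    return model, processor
```

Calling `load_model()` downloads the full 7B base model, so it is best run on a machine with a suitable GPU.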
Training Data
Data was generated from the Mind2Web training set using the GUI-DR data augmentation pipeline and quality-filtered using Holo2-30B-A3B (ScreenSpot-Pro SOTA, 66.1% accuracy).
| Perturbation Type | Variants | Description |
|---|---|---|
| Style | 5 | Visual domain randomization (colors, themes, fonts, element orders) |
| Text Shrink | 1 | Reduced font sizes |
| Precision | 1 | Changed page zoom level to 0.7 |
| Combined | 1 | 1 original + 5 style + 1 precision + 1 text shrink |
| Total | 8 | ~4,319 steps per variant |
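As a rough illustration of why the Precision perturbation demands finer grounding: at zoom 0.7, an element's rendered width and height shrink to 70%, so its clickable area falls to about 49%. The sketch assumes uniform scaling about the top-left page origin with no layout reflow, which may not match how GUI-DR actually renders or relabels pages. `BBox` and `apply_zoom` are hypothetical names, and the example box is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box in pixels: (left, top) to (right, bottom)."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def area(self) -> float:
        return (self.right - self.left) * (self.bottom - self.top)


def apply_zoom(box: BBox, zoom: float) -> BBox:
    """Scale a box about the page origin (assumes no layout reflow)."""
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    return BBox(box.left * zoom, box.top * zoom, box.right * zoom, box.bottom * zoom)


# Made-up button location, for illustration only.
button = BBox(left=100, top=200, right=300, bottom=260)
zoomed = apply_zoom(button, 0.7)

# Both sides shrink by 0.7, so the area shrinks by 0.7 ** 2.
area_ratio = zoomed.area / button.area
```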
Results
Fine-tuning on this dataset did not improve GUI grounding performance over the base model, UI-TARS-1.5-7B, on either ScreenSpot-V2 or GUI-Perturbed. See the technical report for full benchmark results and comparison experiments.
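For context on what a grounding score measures: ScreenSpot-style benchmarks commonly count a prediction as correct when the predicted click point lies inside the target element's bounding box. The sketch below illustrates that criterion on made-up data. It is an assumption that the evaluation in the technical report uses exactly this rule, and `point_in_box` and `grounding_accuracy` are hypothetical helpers.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]              # (x, y) in pixels
Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def point_in_box(point: Point, box: Box) -> bool:
    """Return True if the point lies inside the box (edges included)."""
    x, y = point
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom


def grounding_accuracy(preds: Sequence[Point], targets: Sequence[Box]) -> float:
    """Fraction of predicted click points that land inside their target box."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        return 0.0
    hits = sum(point_in_box(p, b) for p, b in zip(preds, targets))
    return hits / len(preds)


# Made-up predictions and targets, for illustration only.
example_preds = [(50, 50), (10, 10), (205, 95)]
example_targets = [(0, 0, 100, 100), (20, 20, 40, 40), (200, 90, 220, 110)]

# The first and third points fall inside their boxes; the second does not.
example_accuracy = grounding_accuracy(example_preds, example_targets)
```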
Citation
If you find this model helpful, please cite our technical report and paper:
@misc{wang2026guiperturbeddomainrandomizationreveals,
title={GUI-Perturbed: Domain Randomization Reveals Systematic Brittleness in GUI Grounding Models},
author={Yangyue Wang and Harshvardhan Sikka and Yash Mathur and Tony Zhou and Jinu Nyachhyon and Pranav Guruprasad},
year={2026},
eprint={2604.14262},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2604.14262},
}
@online{training_on_gui_perturbed_technical_report_2026,
title = {Training on GUI-Perturbed: Why More Data Isn’t Enough},
author = {Wang, Yangyue and Sikka, Harsh and Mathur, Yash and Zhou, Tony and Nyachhyon, Jinu and Guruprasad, Pranav},
year = {2026},
url = {https://blog.fig.inc/training-on-gui-perturbed-why-more-data-isnt-enough},
note = {Part 3: Finetuning Experiments}
}
Acknowledgements
- Base model: ByteDance-Seed/UI-TARS-1.5-7B
- Training infrastructure: Qwen-VL-Series-Finetune
- Quality filtering: Holo2-30B-A3B
- Training data source: Mind2Web
Evaluation results
- figai/GUI-Perturbed: 83.61
- os-copilot/ScreenSpot-v2: 82.1