---
language:
- en
library_name: lerobot
pipeline_tag: robotics
tags:
- vision-language-action
- gui-agent
- flow-matching
- drag-and-drop
- lerobot
inference: false
---
# ShowUI-π
ShowUI-π is a Vision-Language-Action model for GUI drag-and-drop, built on [SmolVLA](https://huggingface.co/lerobot/smolvla_base) (500M parameters). It uses a flow-matching action head to predict a drag trajectory from a single screenshot and a natural-language instruction.
**Paper:** [ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands](https://arxiv.org/abs/2512.24965)
**Code:** [https://github.com/showlab/showui-pi](https://github.com/showlab/showui-pi)
**Training Data:** [showlab/ShowUI-pi-data](https://huggingface.co/datasets/showlab/ShowUI-pi-data)
**Evaluation Benchmark:** [h-siyuan/ScreenDrag](https://huggingface.co/datasets/h-siyuan/ScreenDrag)
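
To make the flow-matching idea concrete, here is a minimal, dependency-free sketch of how sampling from a velocity field can work. It is not the ShowUI-π implementation: `toy_velocity`, `sample_trajectory`, the waypoint layout, and the noise-at-`t=0` time convention are all assumptions made for illustration, and the real model replaces the toy field with a learned network conditioned on the screenshot and instruction.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a learned velocity network.

    Points from the current sample toward a fixed target, scaled so that
    integrating from t=0 to t=1 ends on the target.
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_trajectory(velocity_fn, dim, num_steps=10, seed=0):
    """Euler-integrate dx/dt = v(x, t) from Gaussian noise (t=0) to t=1."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from pure noise
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays below 1, so toy_velocity never divides by zero
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Made-up 3-waypoint drag path, flattened as [x0, y0, x1, y1, x2, y2]
# in normalized screen coordinates (an assumption, not the model's format).
target = [0.20, 0.30, 0.45, 0.40, 0.70, 0.50]
traj = sample_trajectory(lambda x, t: toy_velocity(x, t, target), dim=len(target))
waypoints = list(zip(traj[0::2], traj[1::2]))
print(waypoints)
```

With this toy field the integration lands on `target` up to floating-point error, which makes the loop easy to check. A learned field would instead produce a trajectory that depends on the input screenshot and instruction.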
## Quick start
```bash
git clone https://github.com/showlab/showui-pi.git
cd showui-pi
pip install -e .
```
### Inference
```python
import torch
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy
from lerobot.policies.factory import make_pre_post_processors

# Load the pretrained policy onto the GPU in eval mode.
policy = SmolVLAPolicy.from_pretrained("showlab/ShowUI-pi").to("cuda").eval()

# Build the matching pre- and post-processing pipelines for this checkpoint.
preprocessor, postprocessor = make_pre_post_processors(
    policy.config,
    "showlab/ShowUI-pi",
    preprocessor_overrides={"device_processor": {"device": "cuda"}},
)
```
## Training
```bash
bash scripts/train_showui_pi.sh
```
See the [training script](https://github.com/showlab/showui-pi/blob/main/scripts/train_showui_pi.sh) for all flags and defaults.
## Evaluation
### DEX Benchmark
```bash
PYTHONPATH=lerobot/src \
python scripts/eval_dex.py \
  --ckpt <path/to/checkpoint> \
  --output_dir outputs/eval_dex
```
### ScreenSpot-Pro
```bash
PYTHONPATH=lerobot/src \
python scripts/eval_screenspot_pro.py \
  --ckpt <path/to/checkpoint> \
  --annotations_root <path/to/ScreenSpot-Pro/annotations> \
  --images_root <path/to/ScreenSpot-Pro/images>
```
## Citation
```bibtex
@article{hu2025showui,
  title={ShowUI-$\pi$: Flow-based Generative Models as GUI Dexterous Hands},
  author={Hu, Siyuan and Lin, Kevin Qinghong and Shou, Mike Zheng},
  journal={arXiv preprint arXiv:2512.24965},
  year={2025}
}
```