---
language:
- en
library_name: lerobot
pipeline_tag: robotics
tags:
- vision-language-action
- gui-agent
- flow-matching
- drag-and-drop
- lerobot
inference: false
---

# ShowUI-π

ShowUI-π is a Vision-Language-Action model for GUI drag-and-drop, built on [SmolVLA](https://huggingface.co/lerobot/smolvla_base) (500M). It uses a flow-matching action head to predict drag trajectories from a single screenshot and a natural-language instruction.

**Paper:** [ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands](https://arxiv.org/abs/2512.24965)

**Code:** [https://github.com/showlab/showui-pi](https://github.com/showlab/showui-pi)

**Training Data:** [showlab/ShowUI-pi-data](https://huggingface.co/datasets/showlab/ShowUI-pi-data)

**Evaluation Benchmark:** [h-siyuan/ScreenDrag](https://huggingface.co/datasets/h-siyuan/ScreenDrag)

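
To give a feel for what a flow-matching action head does at inference time, here is a minimal sketch of the general technique: start from Gaussian noise and integrate a velocity field with Euler steps until the sample becomes a trajectory. It is not the ShowUI-π implementation. The names `sample_trajectory` and `toy_velocity`, the time convention (noise at t=0, actions at t=1), the step count, and the 8-waypoint (x, y) layout are all assumptions made for illustration, and the toy velocity field stands in for the learned network.

```python
import numpy as np


def sample_trajectory(velocity_fn, horizon, action_dim, num_steps=10, seed=0):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 with Euler steps."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((horizon, action_dim))  # start from Gaussian noise
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t)  # one Euler step along the flow
    return x


# Hypothetical target: a straight drag made of 8 normalized (x, y) waypoints.
target = np.linspace([0.2, 0.3], [0.8, 0.6], num=8)


def toy_velocity(x, t):
    # Stand-in for the learned network: the velocity of the straight-line
    # path that carries the current sample to `target` at t=1.
    return (target - x) / (1.0 - t)


trajectory = sample_trajectory(toy_velocity, horizon=8, action_dim=2)
print(trajectory.round(3))
```

In a real model the velocity function would be a network conditioned on the screenshot and the instruction; only the integration loop carries over from this sketch.
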
## Quick start

```bash
git clone https://github.com/showlab/showui-pi.git
cd showui-pi
pip install -e .
```

### Inference

```python
import torch

from lerobot.policies.factory import make_pre_post_processors
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

# Use the GPU when one is available; otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the pretrained policy and switch it to evaluation mode.
policy = SmolVLAPolicy.from_pretrained("showlab/ShowUI-pi").to(device).eval()

# Build the matching pre- and post-processors from the same checkpoint.
preprocessor, postprocessor = make_pre_post_processors(
    policy.config,
    "showlab/ShowUI-pi",
    preprocessor_overrides={"device_processor": {"device": device}},
)
```

## Training

```bash
bash scripts/train_showui_pi.sh
```

See the [training script](https://github.com/showlab/showui-pi/blob/main/scripts/train_showui_pi.sh) for all flags and defaults.

## Evaluation

### DEX Benchmark

```bash
PYTHONPATH=lerobot/src \
python scripts/eval_dex.py \
    --ckpt <path/to/checkpoint> \
    --output_dir outputs/eval_dex
```

### ScreenSpot-Pro

```bash
PYTHONPATH=lerobot/src \
python scripts/eval_screenspot_pro.py \
    --ckpt <path/to/checkpoint> \
    --annotations_root <path/to/ScreenSpot-Pro/annotations> \
    --images_root <path/to/ScreenSpot-Pro/images>
```

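
For orientation, ScreenSpot-style grounding benchmarks are commonly scored by checking whether a predicted point falls inside the target element's bounding box. The sketch below illustrates that metric only. It is not taken from `eval_screenspot_pro.py`, whose internals are not shown here, and the helper names, the `(x1, y1, x2, y2)` pixel box layout, and the sample data are hypothetical.

```python
def point_in_box(point, box):
    """Return True if (x, y) lies inside box = (x1, y1, x2, y2), edges inclusive."""
    x, y = point
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2


def grounding_accuracy(points, boxes):
    """Fraction of predicted points that land inside their target boxes."""
    if not points:
        return 0.0
    hits = sum(point_in_box(p, b) for p, b in zip(points, boxes))
    return hits / len(points)


# Hypothetical predictions and ground-truth boxes in pixel coordinates.
preds = [(120, 45), (900, 610), (33, 400)]
boxes = [(100, 30, 140, 60), (0, 0, 50, 50), (20, 380, 60, 420)]
accuracy = grounding_accuracy(preds, boxes)
print(f"accuracy = {accuracy:.3f}")
```
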
## Citation

```bibtex
@article{hu2025showui,
  title={ShowUI-$\pi$: Flow-based Generative Models as GUI Dexterous Hands},
  author={Hu, Siyuan and Lin, Kevin Qinghong and Shou, Mike Zheng},
  journal={arXiv preprint arXiv:2512.24965},
  year={2025}
}
```