SABER: A Stealthy Agentic Black-Box Attack Framework for Vision-Language-Action Models
Paper: arXiv:2603.24935
LoRA adapter (rank 8) for Qwen/Qwen2.5-3B-Instruct, trained with GRPO to generate adversarial instruction perturbations that induce task failure in victim VLA models.
Part of the SABER framework: Paper | GitHub
| Field | Value |
|---|---|
| Type | LoRA adapter (`adapter_model.safetensors`) |
| Base model | Qwen/Qwen2.5-3B-Instruct |
| Attack objective | `task_failure` |
| Training | Cold-start SFT → GRPO (step 150) on LIBERO |
| LoRA config | r=8, alpha=16, all attention + MLP projections |
| Victim VLA (training) | Pi0.5 (OpenPI) |
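The rank-8 LoRA configuration above can be illustrated numerically. A minimal sketch with toy dimensions and random matrices (illustrative values only, not the actual adapter weights):

```python
import numpy as np

d, r, alpha = 64, 8, 16                  # toy hidden size, LoRA rank, LoRA alpha
rng = np.random.default_rng(0)

W = rng.standard_normal((d, d))          # frozen base projection weight
A = rng.standard_normal((r, d)) * 0.01   # LoRA "down" matrix
B = np.zeros((d, r))                     # LoRA "up" matrix, initialized to zero

delta_W = (alpha / r) * (B @ A)          # scaled low-rank update added to W
W_adapted = W + delta_W

# Before training, B is zero, so the adapter is a no-op:
assert np.allclose(W_adapted, W)

# After training, the update has rank at most r = 8, regardless of d:
B = rng.standard_normal((d, r)) * 0.01
delta_W = (alpha / r) * (B @ A)
print(np.linalg.matrix_rank(delta_W))
```

Only A and B (16 small matrices per adapted projection) are trained and shipped in `adapter_model.safetensors`; the base weights W stay frozen.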
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the frozen base model, then attach the SABER LoRA adapter on top.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct", torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
model = PeftModel.from_pretrained(base, "IntelligenceLab/saber-attack-agent-task-failure")
```
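Once loaded, the adapter can be queried like any chat model. A minimal sketch; the helper names and the system prompt below are illustrative, not SABER's actual attack prompt (see the repo for the real ReAct pipeline):

```python
def build_attack_messages(instruction: str) -> list[dict]:
    """Wrap a benign task instruction in a chat prompt for the attack agent.
    Hypothetical system prompt, for illustration only."""
    return [
        {"role": "system",
         "content": "Rewrite the robot task instruction into a subtly "
                    "perturbed variant."},
        {"role": "user", "content": instruction},
    ]

def generate_perturbation(model, tokenizer, instruction: str) -> str:
    """Run the LoRA-adapted model on one instruction (assumes the model
    and tokenizer loaded above)."""
    prompt = tokenizer.apply_chat_template(
        build_attack_messages(instruction),
        tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=64, do_sample=True)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

messages = build_attack_messages(
    "pick up the black bowl and place it on the plate")
```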
For the complete attack pipeline (ReAct tool-calling, VLA rollouts, LIBERO evaluation):
```bash
git clone https://github.com/wuxiyang1996/SABER && cd SABER && bash install.sh
python eval_attack_vla.py \
    --victim openpi_pi05 \
    --objective task_failure \
    --attack_gpus 2,3 --vla_gpu 0
```
See the GitHub repo for training, evaluation, and cross-model transfer instructions.
```bibtex
@misc{wu2026saber,
  title={SABER: A Stealthy Agentic Black-Box Attack Framework for Vision-Language-Action Models},
  author={Xiyang Wu and Guangyao Shi and Qingzi Wang and Zongxia Li and Amrit Singh Bedi and Dinesh Manocha},
  year={2026},
  eprint={2603.24935},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
}
```
BSD 3-Clause — see https://github.com/wuxiyang1996/SABER/blob/main/LICENSE.