SABER Attack Agent — Constraint Violation

LoRA adapter (rank 8) for Qwen/Qwen2.5-3B-Instruct, trained with GRPO to generate adversarial instruction perturbations that increase constraint violations (collisions, boundary violations) in victim VLA policies.

Part of the SABER framework: Paper | GitHub

Details

Type: LoRA adapter (adapter_model.safetensors)
Base model: Qwen/Qwen2.5-3B-Instruct
Attack objective: constraint_violation
Training: Cold-start SFT → GRPO (step 150) on LIBERO
LoRA config: r=8, alpha=16, all attention + MLP projections
Victim VLA (training): Pi0.5 (OpenPI)
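The LoRA configuration above can be sketched with PEFT's `LoraConfig`. This is a reconstruction from the listed hyperparameters (r=8, alpha=16, all attention and MLP projections); the exact target-module list used in training is an assumption based on Qwen2.5's layer names, not taken from the repo.

```python
from peft import LoraConfig

# Sketch of the adapter's config as described in the Details table.
# Target modules follow Qwen2.5's standard projection names (assumed).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    task_type="CAUSAL_LM",
)
```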

Quick Start

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct", torch_dtype="bfloat16", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
model = PeftModel.from_pretrained(
    base, "IntelligenceLab/saber-attack-agent-constraint-violation"
)
```

Full Pipeline

For the complete attack pipeline (ReAct tool-calling, VLA rollouts, LIBERO evaluation):

```shell
git clone https://github.com/wuxiyang1996/SABER && cd SABER && bash install.sh

python eval_attack_vla.py \
    --victim openpi_pi05 \
    --objective constraint_violation \
    --attack_gpus 2,3 --vla_gpu 0
```

See the GitHub repo for training, evaluation, and cross-model transfer instructions.

Citation

```bibtex
@misc{wu2026saber,
      title={SABER: A Stealthy Agentic Black-Box Attack Framework for Vision-Language-Action Models},
      author={Xiyang Wu and Guangyao Shi and Qingzi Wang and Zongxia Li and Amrit Singh Bedi and Dinesh Manocha},
      year={2026},
      eprint={2603.24935},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
}
```

License

BSD 3-Clause — see https://github.com/wuxiyang1996/SABER/blob/main/LICENSE.

