wuxiyang's picture
Simplify model card: brief, clear LoRA description
0881e86 verified
metadata
library_name: peft
base_model: Qwen/Qwen2.5-3B-Instruct
tags:
  - saber
  - adversarial-attack
  - vla
  - robotics
  - lora
  - grpo
  - qwen2.5
  - libero
license: bsd-3-clause

SABER Attack Agent — Action Inflation

LoRA adapter (rank 8) for Qwen/Qwen2.5-3B-Instruct, trained with GRPO to generate adversarial instruction perturbations targeting inflating action sequences (victim VLA takes unnecessarily many steps).

Part of the SABER framework: Paper | GitHub

Details

Type LoRA adapter (adapter_model.safetensors)
Base model Qwen/Qwen2.5-3B-Instruct
Attack objective action_inflation
Training Cold-start SFT → GRPO (step 50) on LIBERO
LoRA config r=8, alpha=16, all attn + MLP projections
Victim VLA (training) Pi0.5 (OpenPI)

Quick Start

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B-Instruct", torch_dtype="bfloat16", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
model = PeftModel.from_pretrained(base, "IntelligenceLab/saber-attack-agent-action-inflation")

Full Pipeline

For the complete attack pipeline (ReAct tool-calling, VLA rollouts, LIBERO evaluation):

git clone https://github.com/wuxiyang1996/SABER && cd SABER && bash install.sh

python eval_attack_vla.py \
    --victim openpi_pi05 \
    --objective action_inflation \
    --attack_gpus 2,3 --vla_gpu 0

See the GitHub repo for training, evaluation, and cross-model transfer instructions.

Citation

@misc{wu2026saber,
      title={SABER: A Stealthy Agentic Black-Box Attack Framework for Vision-Language-Action Models},
      author={Xiyang Wu and Guangyao Shi and Qingzi Wang and Zongxia Li and Amrit Singh Bedi and Dinesh Manocha},
      year={2026},
      eprint={2603.24935},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
}

License

BSD 3-Clause — see https://github.com/wuxiyang1996/SABER/blob/main/LICENSE.