AttackVLA: Benchmarking Adversarial and Backdoor Attacks on Vision-Language-Action Models
Paper • 2511.12149 • Published
A CUDA-accelerated adversarial defense model that detects and blocks attacks on Vision-Language-Action (VLA) robot models.
DefenseNet (0.63M parameters) is a lightweight defense guard that sits in front of any VLA model:
- `local_tv_map` kernel extracts per-pixel total-variation features to detect adversarial patches
- `fused_dual_normalize` kernel performs VLA-compatible DINOv2+SigLIP dual normalization, then classifies clean vs. adversarial
- `fused_smooth_clamp` kernel applies inference-time noise + clamp in a single pass

| Task Suite | Accuracy | TPR | FPR |
|---|---|---|---|
| libero_long (tasks 0-9) | 97.4% | 95.4% | 0.6% |
| libero_object (tasks 10-19) | 97.6% | 95.8% | 0.6% |
| libero_spatial (tasks 20-29) | 97.8% | 96.2% | 0.6% |
| libero_goal (tasks 30-39) | 97.6% | 95.9% | 0.6% |
| Overall | 97.6% | 95.8% | 0.6% |
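The total-variation feature behind the `local_tv_map` kernel can be illustrated with a small NumPy sketch (a hypothetical re-implementation of the kernel's semantics, not the CUDA code itself): each pixel gets the summed absolute differences to its right and bottom neighbours, averaged over channels, so high-frequency adversarial patches stand out against smooth natural regions.

```python
import numpy as np

def local_tv_map(img: np.ndarray) -> np.ndarray:
    """Per-pixel total variation for an (H, W, C) image in [0, 1]:
    absolute differences to the right and bottom neighbours,
    summed and averaged over channels."""
    dx = np.abs(np.diff(img, axis=1, append=img[:, -1:, :]))
    dy = np.abs(np.diff(img, axis=0, append=img[-1:, :, :]))
    return (dx + dy).mean(axis=2)

# A smooth gradient has low TV; high-frequency noise (as in many
# adversarial patches) has much higher TV.
smooth = np.tile(np.linspace(0, 1, 32)[None, :, None], (32, 1, 3))
noisy = np.random.default_rng(0).random((32, 32, 3))
print(local_tv_map(smooth).mean() < local_tv_map(noisy).mean())  # True
```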
| Model | Type | Params |
|---|---|---|
| OpenVLA-7B | openvla | 7B |
| Pi0-Fast | pi0_fast | 3B |
| Pi0.5 | pi0 | 3B |
| SmolVLA | smolvla | 400M |
| BitVLA | bitvla_llava | 3B |
| X-VLA-Pt | xvla | 1.5B |
| File | Format | Size |
|---|---|---|
| defense_net.safetensors | SafeTensors | 2.5MB |
| defense_net.pth | PyTorch | 2.5MB |
| defense_net.onnx | ONNX | 2.5MB |
| libero_eval_report.json | Eval metrics | 9KB |
| train_real.toml | Training config | <1KB |
| anima_module.yaml | Module manifest | <1KB |
```python
import torch

from anima_def_attackvla.models.defense_net import DefenseNet

# Load the checkpoint and move the guard to the GPU.
model = DefenseNet()
ckpt = torch.load("defense_net.pth", weights_only=True)
model.load_state_dict(ckpt["model"])
model.eval().cuda()

# Run detection + sanitization on a (1, 3, 224, 224) image batch.
image = torch.rand(1, 3, 224, 224, device="cuda")
blocked, score, sanitized = model.detect_and_sanitize(image, threshold=0.5)
```
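The sanitization step performed by the `fused_smooth_clamp` kernel (add small noise, then clamp, in one pass) can be sketched in plain NumPy; this is an illustrative approximation of its semantics, with `noise_std` as an assumed parameter name:

```python
import numpy as np

def smooth_clamp(img: np.ndarray, noise_std: float = 0.01,
                 seed: int = 0) -> np.ndarray:
    """Add small Gaussian noise and clamp to [0, 1] in a single pass,
    mimicking an inference-time noise + clamp sanitization step."""
    rng = np.random.default_rng(seed)
    return np.clip(img + rng.normal(0.0, noise_std, img.shape), 0.0, 1.0)

image = np.random.default_rng(1).random((224, 224, 3))
sanitized = smooth_clamp(image)
print(sanitized.min() >= 0.0 and sanitized.max() <= 1.0)  # True
```

The small noise perturbs any carefully optimized adversarial pattern, while the clamp keeps the result in the valid input range the VLA model expects.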
License: MIT