DEF-roboticattack — Adversarial Patch Detector for VLA Robotic Systems

Part of the ANIMA Perception Suite by Robot Flow Labs.

Paper

Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics (ICCV 2025)

Taowen Wang et al., arXiv:2411.13587

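This module provides the defense counterpart to the paper's adversarial attacks on VLA models such as OpenVLA: the Untargeted Action Discrepancy Attack (UADA), the Untargeted Position-aware Attack (UPA), and the Targeted Manipulation Attack (TMA).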

Architecture

PatchDetectorNet — Lightweight multi-branch CNN (19,841 parameters) that detects adversarial patches in VLA image inputs through three parallel analysis branches:

Branch	Purpose	Filters	Output
Frequency	High-pass anomaly detection	16	16-d
Edge	Multi-scale edge energy (3×3 + 5×5)	8+8	16-d
Spatial	Patch boundary consistency (7→3→3)	16→32→32	32-d
Classifier	Binary patch detection	64→32→1	logit

Input: [B, 3, 224, 224] float32 — Output: [B, 1] (sigmoid → patch probability)
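
The branch outputs concatenate to 16 + 16 + 32 = 64 features, which matches the classifier input width. The 19,841 parameter figure can also be reproduced from the table, but only under assumptions the card does not confirm: every branch reads the 3-channel RGB input, the frequency branch uses a 3×3 kernel, all convolutions and linear layers carry a bias, and batch normalization (2 learnable parameters per channel) covers 96 channels in total. The helper names below are illustrative and are not part of the package; the real layer layout is in src/def_roboticattack/models/patch_detector.py.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Learnable parameters in a k x k 2-D convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def linear_params(n_in, n_out, bias=True):
    """Learnable parameters in a fully connected layer."""
    return n_in * n_out + (n_out if bias else 0)


# Assumption: a single 3x3 convolution with 16 filters on RGB input
frequency = conv_params(3, 16, 3)
# 3x3 and 5x5 edge filters, 8 each, on RGB input
edge = conv_params(3, 8, 3) + conv_params(3, 8, 5)
# 7x7 -> 3x3 -> 3x3 convolutions with 16 -> 32 -> 32 channels
spatial = conv_params(3, 16, 7) + conv_params(16, 32, 3) + conv_params(32, 32, 3)
# 64 -> 32 -> 1 classifier head
classifier = linear_params(64, 32) + linear_params(32, 1)
# Assumption: batch norm weight + bias on 96 channels in total
batch_norm = 2 * 96

total = frequency + edge + spatial + classifier + batch_norm
print(total)  # 19841
```

Other layouts could reach the same total, so this is a consistency check on the published numbers rather than a description of the actual network.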

Results

Trained on 85,000 real images (80K LIBERO robot manipulation frames + 5K COCO val2017) with paper-accurate UADA/UPA/TMA adversarial patches including geometric transforms (rotation + shear).

Metric	Value
Accuracy	98.6%
Precision	98.7%
Recall	98.5%
F1 Score	0.986
TP / FP / TN / FN	2939 / 38 / 2979 / 44
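
The headline metrics follow directly from the confusion-matrix counts in the last row. A short check, using only the numbers in the table:

```python
tp, fp, tn, fn = 2939, 38, 2979, 44

total = tp + fp + tn + fn                 # evaluation samples
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"n={total} acc={accuracy:.3f} prec={precision:.3f} rec={recall:.3f} f1={f1:.3f}")
# n=6000 acc=0.986 prec=0.987 rec=0.985 f1=0.986
```

The counts sum to 6,000 evaluation samples, split almost evenly between patched (2,983) and clean (3,017) images.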

Inference Performance (NVIDIA L4)

Metric	Model Only	Full Pipeline
Latency (mean)	1.50 ms	3.13 ms
Throughput	5,351 samples/s	2,557 samples/s
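
The table does not state the batch size used for benchmarking. If the latency is measured per batch and throughput equals batch size divided by latency (both are assumptions, not stated in the card), the two columns are mutually consistent with a batch of about 8 samples:

```python
# Figures from the table above; latency in seconds, throughput in samples/s
benchmarks = {
    "model_only": (1.50e-3, 5351),
    "full_pipeline": (3.13e-3, 2557),
}

# Under the stated assumption: batch size = latency per batch * throughput
implied_batch = {
    name: latency_s * throughput
    for name, (latency_s, throughput) in benchmarks.items()
}

for name, batch in implied_batch.items():
    print(f"{name}: about {batch:.1f} samples per batch")
```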

CUDA Kernels

Two custom CUDA kernels compiled for sm_89 (L4/Ada):

fused_patch_apply: 25.5× speedup over CPU for adversarial patch application
fused_action_perturb: PGD action-space perturbation with sign-correct zero-gradient handling
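
The kernel sources are not reproduced here, so the following is only a CPU reference sketch of what a sign-based PGD step with zero-gradient handling usually means: the sign function returns 0 for a zero gradient, so those action dimensions are left unperturbed instead of being pushed in an arbitrary direction. The function name `pgd_step` and its parameters are hypothetical and do not reflect the kernel's real signature.

```python
def sign(x):
    """Return -1, 0, or +1; a zero gradient maps to 0, not to +1 or -1."""
    return (x > 0) - (x < 0)


def pgd_step(action, grad, original, step_size, epsilon):
    """One L-infinity PGD ascent step on an action vector.

    action:   current (possibly already perturbed) action values
    grad:     loss gradient with respect to each action value
    original: unperturbed action values, the center of the epsilon-ball
    """
    perturbed = []
    for a, g, a0 in zip(action, grad, original):
        a_new = a + step_size * sign(g)
        # Project back into the epsilon-ball around the original action
        a_new = min(max(a_new, a0 - epsilon), a0 + epsilon)
        perturbed.append(a_new)
    return perturbed


original = [0.0, 0.5, -0.2]
grad = [0.3, 0.0, -1.2]      # the middle component has zero gradient
perturbed = pgd_step(original, grad, original, step_size=0.05, epsilon=0.1)
print(perturbed)             # the middle component stays at 0.5
```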

Exported Formats

Format	File	Size	Use Case
PyTorch (.pth)	`pytorch/def_roboticattack_v1.pth`	91.7 KB	Training, fine-tuning
SafeTensors	`pytorch/def_roboticattack_v1.safetensors`	81.7 KB	Fast loading, safe deployment
ONNX	`onnx/def_roboticattack_v1.onnx`	81.4 KB	Cross-platform inference
TensorRT FP16	`tensorrt/def_roboticattack_v1_fp16.trt`	264 KB	Edge deployment (Jetson/L4)
TensorRT FP32	`tensorrt/def_roboticattack_v1_fp32.trt`	305 KB	Full precision inference

Usage

PyTorch

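import torch
from safetensors.torch import load_file

# Import path assumed from src/def_roboticattack/models/patch_detector.py
from def_roboticattack.models.patch_detector import PatchDetectorNet

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load weights (check patch_detector.py for any constructor arguments)
state = load_file("pytorch/def_roboticattack_v1.safetensors")
model = PatchDetectorNet()
model.load_state_dict(state)
model.to(device).eval()

# Inference on a dummy batch of shape [B, 3, 224, 224]
images = torch.rand(1, 3, 224, 224, device=device)
with torch.no_grad():
    prob = torch.sigmoid(model(images))  # [B, 1] probability of adversarial patch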

Full Defense Pipeline

from def_roboticattack.pipeline.runtime import DefenseRuntime

runtime = DefenseRuntime(backend="cuda")
runtime.load_model("pytorch/def_roboticattack_v1.pth")

# `images` is a [B, 3, 224, 224] float32 tensor, as in the PyTorch example
result = runtime.full_defense(images)
# result["combined_risk"]: float in [0, 1]
# result["neural"]["flagged"]: list of booleans, one per image

ONNX Runtime

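import onnxruntime as ort
import numpy as np

session = ort.InferenceSession("onnx/def_roboticattack_v1.onnx")
# Dummy input in [0, 1), matching the range used in the PyTorch example
image = np.random.rand(1, 3, 224, 224).astype(np.float32)
logit = session.run(None, {"image": image})[0]  # [B, 1] raw logit
prob = 1.0 / (1.0 + np.exp(-logit))  # sigmoid -> patch probability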

Training

Hardware: NVIDIA L4 (24 GB VRAM, about 23 GB usable)
Framework: PyTorch 2.11 + CUDA 12.8
Optimizer: AdamW (lr=5e-4, weight_decay=0.01)
Scheduler: Warmup cosine (500 warmup steps)
Precision: FP16 mixed precision
Batch size: 512 (auto-detected)
Epochs: 30 (3,420 seconds)
Config: See configs/training.toml
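
The exact scheduler lives in the training code and configs/training.toml. As a rough illustration, a warmup cosine schedule with the listed peak rate (5e-4) and 500 warmup steps typically looks like the sketch below. The linear warmup shape, the total step count of 5,000, and the floor of 0 are assumptions chosen for the example.

```python
import math


def warmup_cosine_lr(step, base_lr=5e-4, warmup_steps=500,
                     total_steps=5000, min_lr=0.0):
    """Learning rate at a given optimizer step (0-indexed).

    Linear ramp to base_lr over warmup_steps, then cosine decay to min_lr.
    total_steps and min_lr are illustrative values, not taken from the card.
    """
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for step in (0, 249, 499, 2750, 5000):
    print(step, f"{warmup_cosine_lr(step):.2e}")
```

With PyTorch, the same function can be wrapped in `torch.optim.lr_scheduler.LambdaLR` by returning the ratio to `base_lr` instead of the absolute rate.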

Attack Types Used for Training

Attack	Description	Paper Section
UADA	Universal adversarial patch with geometric jitter	§3.2
UPA	Untargeted high-frequency checkerboard patch	§3.3
TMA	Targeted manipulation with directional gradients	§3.4

All patches are applied with a random position, rotation (±30°), shear (±0.2), and alpha blending (0.7–1.0).
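
A minimal NumPy sketch of this augmentation, assuming CHW float images in [0, 1] and a 32×32 patch (the patch size is not given in the card). It samples the listed parameter ranges, builds the rotation-plus-shear matrix, and alpha-blends an unwarped patch at a random position. Warping the patch with the matrix needs an interpolation routine and is left out; `apply_patch` is a hypothetical helper, not the package API.

```python
import numpy as np


def apply_patch(image, patch, top, left, alpha):
    """Alpha-blend a [C, h, w] patch onto a [C, H, W] image at (top, left)."""
    _, ph, pw = patch.shape
    out = image.copy()
    region = out[:, top:top + ph, left:left + pw]
    out[:, top:top + ph, left:left + pw] = alpha * patch + (1.0 - alpha) * region
    return out


rng = np.random.default_rng(0)
image = rng.random((3, 224, 224), dtype=np.float32)
patch = rng.random((3, 32, 32), dtype=np.float32)   # assumed patch size

# Sample the ranges listed above
top = int(rng.integers(0, 224 - 32 + 1))
left = int(rng.integers(0, 224 - 32 + 1))
alpha = float(rng.uniform(0.7, 1.0))
theta = np.deg2rad(rng.uniform(-30.0, 30.0))
shear = float(rng.uniform(-0.2, 0.2))

# 2x2 rotation-then-shear matrix; a warping routine would consume this
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
shear_matrix = np.array([[1.0, shear],
                         [0.0, 1.0]])
affine = rotation @ shear_matrix

patched = apply_patch(image, patch, top, left, alpha)
print(patched.shape, round(alpha, 2))
```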

Defense Pipeline

The full defense stack includes:

Input sanitization: intensity clamping + Gaussian blur to suppress patch edges
Heuristic detection: Sobel edge energy with fixed threshold
Neural detection: PatchDetectorNet binary classifier
Risk aggregation: max(heuristic, neural) → combined risk score in [0, 1]
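
The stages above can be sketched in NumPy as follows. This is an illustration under assumptions, not the package implementation: the 3×3 Gaussian kernel, the threshold value of 0.25, and the way edge energy is mapped to a score in [0, 1] are all hypothetical, and the neural score is passed in as a plain number.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
SOBEL_Y = SOBEL_X.T
GAUSS_3X3 = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=np.float32) / 16.0


def filter3x3(plane, kernel):
    """Valid-mode 3x3 cross-correlation on an [H, W] array."""
    h, w = plane.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float32)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * plane[i:i + h - 2, j:j + w - 2]
    return out


def sanitize(image):
    """Clamp a [C, H, W] image to [0, 1], then apply a small Gaussian blur."""
    clamped = np.clip(image, 0.0, 1.0)
    padded = np.pad(clamped, ((0, 0), (1, 1), (1, 1)), mode="edge")
    return np.stack([filter3x3(channel, GAUSS_3X3) for channel in padded])


def sobel_edge_energy(image):
    """Mean Sobel gradient magnitude of the channel-averaged image."""
    gray = image.mean(axis=0)
    gx = filter3x3(gray, SOBEL_X)
    gy = filter3x3(gray, SOBEL_Y)
    return float(np.sqrt(gx ** 2 + gy ** 2).mean())


def heuristic_risk(image, threshold=0.25):
    """Map edge energy to [0, 1], saturating at a fixed (hypothetical) threshold."""
    return min(sobel_edge_energy(image) / threshold, 1.0)


def combined_risk(heuristic, neural):
    """Risk aggregation as described above: the maximum of the two scores."""
    return max(heuristic, neural)


flat = np.full((3, 16, 16), 0.5, dtype=np.float32)
clean = sanitize(flat)
print(combined_risk(heuristic_risk(clean), neural=0.9))
```

Taking the maximum means either detector can raise the alarm on its own, which favors recall over precision.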

Repository

GitHub: RobotFlow-Labs/DEF-roboticattack
Paper: arXiv:2411.13587
Wave: 8 (Defense-Only)

License

Apache 2.0 — Robot Flow Labs / AIFLOW LABS LIMITED

Paper for ilessio-aiflowlab/DEF-roboticattack: Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics (arXiv:2411.13587, published Aug 1, 2025)

Evaluation results (self-reported): accuracy 0.986, F1 0.986, precision 0.987, recall 0.985