DEF-roboticattack: Adversarial Patch Detector for VLA Robotic Systems
Part of the ANIMA Perception Suite by Robot Flow Labs.
Paper
On the Adversarial Vulnerability of Vision-Language-Action Models for Robotic Manipulation (ICCV 2025)
William Wang et al., arXiv:2411.13587
This module provides the defense counterpart to the paper's UADA/UPA/TMA adversarial attacks on VLA models like OpenVLA.
Architecture
PatchDetectorNet is a lightweight multi-branch CNN (19,841 parameters) that detects adversarial patches in VLA image inputs. Three parallel analysis branches produce 16 + 16 + 32 = 64 features, which feed a small binary classifier head:
| Branch | Purpose | Filters | Output |
|---|---|---|---|
| Frequency | High-pass anomaly detection | 16 | 16-d |
| Edge | Multi-scale edge energy (3×3 + 5×5) | 8+8 | 16-d |
| Spatial | Patch boundary consistency (7→3→3) | 16→32→32 | 32-d |
| Classifier | Binary patch detection | 64→32→1 | logit |
Input: [B, 3, 224, 224] float32 → Output: [B, 1] logit (apply sigmoid to get the patch probability)
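Since the classifier emits a raw logit, turning it into a decision takes a sigmoid and a threshold. The following is a minimal sketch in plain Python; the function names and the 0.5 threshold are illustrative assumptions, as the source does not state the decision threshold it uses.

```python
import math


def logit_to_probability(logit: float) -> float:
    """Numerically stable sigmoid: maps a raw detector logit into [0, 1]."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def flag_patches(logits, threshold: float = 0.5):
    """Return (probability, flagged) per logit.

    `threshold` is a hypothetical value chosen for illustration only.
    """
    probs = [logit_to_probability(z) for z in logits]
    return [(p, p >= threshold) for p in probs]


# Example logits for three images in a batch
results = flag_patches([-4.0, 0.0, 3.5])
for prob, flagged in results:
    print(f"p={prob:.3f} flagged={flagged}")
```

In a deployment the threshold would normally be tuned on held-out data to trade false positives against missed patches.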
Results
Trained on 85,000 real images (80K LIBERO robot manipulation frames + 5K COCO val2017) with paper-accurate UADA/UPA/TMA adversarial patches including geometric transforms (rotation + shear).
| Metric | Value |
|---|---|
| Accuracy | 98.6% |
| Precision | 98.7% |
| Recall | 98.5% |
| F1 Score | 0.986 |
| TP / FP / TN / FN | 2939 / 38 / 2979 / 44 |
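The headline metrics follow directly from the confusion-matrix counts in the last row. A short sketch that recomputes them (the helper name `detection_metrics` is made up for this example):

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Counts taken from the results table above
metrics = detection_metrics(tp=2939, fp=38, tn=2979, fn=44)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The counts sum to 6,000 evaluation samples, and the recomputed values round to the figures reported in the table.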
Inference Performance (NVIDIA L4)
| Metric | Model Only | Full Pipeline |
|---|---|---|
| Latency (mean) | 1.50 ms | 3.13 ms |
| Throughput | 5,351 samples/s | 2,557 samples/s |
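Latency and throughput are linked by the number of samples processed per call. The arithmetic below is a quick consistency check, assuming the latency figures are measured per forward call; the source does not state the benchmark batch size.

```python
def samples_per_call(throughput_per_s: float, latency_ms: float) -> float:
    """Samples handled in one call = throughput (samples/s) x latency (s)."""
    return throughput_per_s * latency_ms / 1000.0


# Figures from the inference performance table
model_only = samples_per_call(5351, 1.50)
full_pipeline = samples_per_call(2557, 3.13)
print(f"model only: {model_only:.2f}, full pipeline: {full_pipeline:.2f}")
```

Both values come out close to 8, which would be consistent with a benchmark batch of 8 images. That is an inference from the numbers, not something the source confirms.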
CUDA Kernels
Two custom CUDA kernels compiled for sm_89 (L4/Ada):
- fused_patch_apply: 25.5× speedup over CPU for adversarial patch application
- fused_action_perturb: PGD action-space perturbation with sign-correct zero-gradient handling
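For orientation, a plain-Python reference for a single PGD step in action space might look like the sketch below. It is not the CUDA kernel itself. It assumes that "sign-correct zero-gradient handling" means a zero gradient component leaves the action unchanged, and that actions are normalized to [-1, 1]; the function names and parameters are illustrative.

```python
def sign(x: float) -> float:
    """Return -1.0, 0.0, or 1.0; a zero gradient yields no movement."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0


def pgd_action_step(action, grad, original, step_size, epsilon, low=-1.0, high=1.0):
    """One signed-gradient step, projected onto the epsilon-ball and action range.

    `low`/`high` are assumed action bounds, not values taken from the source.
    """
    updated = []
    for a, g, a0 in zip(action, grad, original):
        a_new = a + step_size * sign(g)
        a_new = max(a0 - epsilon, min(a0 + epsilon, a_new))  # stay within epsilon of the clean action
        a_new = max(low, min(high, a_new))                   # stay within the valid action range
        updated.append(a_new)
    return updated


original = [0.10, -0.20, 0.95]
stepped = pgd_action_step(original, grad=[0.3, 0.0, 2.0], original=original,
                          step_size=0.05, epsilon=0.1)
print(stepped)
```

The second component has a zero gradient and stays at its original value, which is the behavior a naive `+1 if g >= 0 else -1` sign function would get wrong.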
Exported Formats
| Format | File | Size | Use Case |
|---|---|---|---|
| PyTorch (.pth) | pytorch/def_roboticattack_v1.pth | 91.7 KB | Training, fine-tuning |
| SafeTensors | pytorch/def_roboticattack_v1.safetensors | 81.7 KB | Fast loading, safe deployment |
| ONNX | onnx/def_roboticattack_v1.onnx | 81.4 KB | Cross-platform inference |
| TensorRT FP16 | tensorrt/def_roboticattack_v1_fp16.trt | 264 KB | Edge deployment (Jetson/L4) |
| TensorRT FP32 | tensorrt/def_roboticattack_v1_fp32.trt | 305 KB | Full precision inference |
Usage
PyTorch
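import torch
from safetensors.torch import load_file
# PatchDetectorNet is defined in src/def_roboticattack/models/patch_detector.py
from def_roboticattack.models.patch_detector import PatchDetectorNet
# Build the model (assumes the default constructor takes no arguments) and load weights
device = "cuda" if torch.cuda.is_available() else "cpu"
model = PatchDetectorNet().to(device)
state = load_file("pytorch/def_roboticattack_v1.safetensors")
model.load_state_dict(state)
model.eval()
# Inference on a dummy batch; replace with real images shaped [B, 3, 224, 224]
images = torch.rand(1, 3, 224, 224, device=device)
with torch.no_grad():
    prob = torch.sigmoid(model(images))  # probability of adversarial patch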
Full Defense Pipeline
from def_roboticattack.pipeline.runtime import DefenseRuntime
runtime = DefenseRuntime(backend="cuda")
runtime.load_model("pytorch/def_roboticattack_v1.pth")
# images: float32 tensor shaped [B, 3, 224, 224], as in the PyTorch example
result = runtime.full_defense(images)
# result["combined_risk"] → float in [0, 1]
# result["neural"]["flagged"] → list of booleans per image
ONNX Runtime
import onnxruntime as ort
import numpy as np
session = ort.InferenceSession("onnx/def_roboticattack_v1.onnx")
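image = np.random.rand(1, 3, 224, 224).astype(np.float32)  # dummy input in [0, 1)
logit = session.run(None, {"image": image})[0]  # raw logit, shape [1, 1]
prob = 1.0 / (1.0 + np.exp(-logit))  # sigmoid gives the patch probability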
Training
- Hardware: NVIDIA L4 (23 GB VRAM)
- Framework: PyTorch 2.11 + CUDA 12.8
- Optimizer: AdamW (lr=5e-4, weight_decay=0.01)
- Scheduler: Warmup cosine (500 warmup steps)
- Precision: FP16 mixed precision
- Batch size: 512 (auto-detected)
- Epochs: 30 (3,420 seconds)
- Config: See configs/training.toml
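The warmup cosine schedule listed above can be sketched as a pure function of the step index. This is an illustration under assumptions: linear warmup, decay to a floor of zero, and a total step count of 5,000 are hypothetical choices, and the actual settings live in the training config.

```python
import math


def warmup_cosine_lr(step: int, total_steps: int, base_lr: float = 5e-4,
                     warmup_steps: int = 500, min_lr: float = 0.0) -> float:
    """Linear warmup to `base_lr`, then cosine decay to `min_lr`.

    `total_steps` and `min_lr` are assumptions; only lr=5e-4 and the
    500 warmup steps come from the training summary.
    """
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


total = 5000  # hypothetical number of optimizer steps
for s in (0, 250, 499, 500, 2750, 5000):
    print(s, f"{warmup_cosine_lr(s, total):.6f}")
```

The function returns an absolute learning rate. Dividing by `base_lr` gives the multiplicative factor that `torch.optim.lr_scheduler.LambdaLR` expects.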
Attack Types Used for Training
| Attack | Description | Paper Section |
|---|---|---|
| UADA | Universal adversarial patch with geometric jitter | §3.2 |
| UPA | Untargeted high-frequency checkerboard patch | §3.3 |
| TMA | Targeted manipulation with directional gradients | §3.4 |
All patches are applied with random position, rotation (±30°), shear (±0.2), and alpha blending (0.7 to 1.0).
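The augmentation ranges above can be illustrated with a small sampler. This is a sketch, not the repository's training code: the 50-pixel patch size, the composition order (rotation applied after shear), and all function names are assumptions made for the example.

```python
import math
import random


def sample_patch_transform(rng: random.Random, image_size: int = 224,
                           patch_size: int = 50) -> dict:
    """Sample placement, rotation, shear, and alpha within the stated ranges.

    `patch_size` is a hypothetical value; the source does not give one.
    """
    angle_deg = rng.uniform(-30.0, 30.0)
    shear = rng.uniform(-0.2, 0.2)
    alpha = rng.uniform(0.7, 1.0)
    top = rng.randint(0, image_size - patch_size)
    left = rng.randint(0, image_size - patch_size)
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # 2x2 matrix for rotation @ shear, where shear = [[1, k], [0, 1]]
    matrix = [
        [cos_t, cos_t * shear - sin_t],
        [sin_t, sin_t * shear + cos_t],
    ]
    return {"angle_deg": angle_deg, "shear": shear, "alpha": alpha,
            "top": top, "left": left, "matrix": matrix}


def blend_pixel(image_value: float, patch_value: float, alpha: float) -> float:
    """Alpha-blend one patch pixel over one image pixel."""
    return alpha * patch_value + (1.0 - alpha) * image_value


params = sample_patch_transform(random.Random(0))
print(params)
```

Rotation and shear are both area-preserving, so the sampled matrix always has determinant 1; only the alpha value changes how strongly the patch overrides the underlying pixels.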
Defense Pipeline
The full defense stack includes:
- Input sanitization: intensity clamping + Gaussian blur to suppress patch edges
- Heuristic detection: Sobel edge energy with fixed threshold
- Neural detection: PatchDetectorNet binary classifier
- Risk aggregation: max(heuristic, neural) → combined risk score in [0, 1]
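The heuristic and aggregation stages can be sketched in plain Python as follows. The sketch assumes the heuristic contributes a binary score (1.0 when mean Sobel edge energy exceeds the threshold, else 0.0) and uses a made-up threshold of 1.0; the source specifies neither detail.

```python
import math


def sobel_edge_energy(gray) -> float:
    """Mean Sobel gradient magnitude over the interior pixels of a 2-D list."""
    h, w = len(gray), len(gray[0])
    total, count = 0.0, 0
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = (gray[r - 1][c + 1] - gray[r - 1][c - 1]
                  + 2 * (gray[r][c + 1] - gray[r][c - 1])
                  + gray[r + 1][c + 1] - gray[r + 1][c - 1])
            gy = (gray[r + 1][c - 1] - gray[r - 1][c - 1]
                  + 2 * (gray[r + 1][c] - gray[r - 1][c])
                  + gray[r + 1][c + 1] - gray[r - 1][c + 1])
            total += math.hypot(gx, gy)
            count += 1
    return total / count if count else 0.0


def combined_risk(gray, neural_prob: float, edge_threshold: float = 1.0) -> float:
    """Risk = max(heuristic, neural); `edge_threshold` is a hypothetical value."""
    heuristic = 1.0 if sobel_edge_energy(gray) > edge_threshold else 0.0
    return max(heuristic, neural_prob)


flat = [[0.5] * 4 for _ in range(4)]                  # no edges at all
edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]       # sharp vertical boundary
print(combined_risk(flat, 0.3), combined_risk(edge, 0.3))
```

Taking the maximum means either detector can raise the alarm on its own, which favors recall over precision.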
Repository
- GitHub: RobotFlow-Labs/DEF-roboticattack
- Paper: arXiv:2411.13587
- Wave: 8 (Defense-Only)
License
Apache 2.0, Robot Flow Labs / AIFLOW LABS LIMITED