# DEF-sariad -- SAR Anomaly Detection

Wave 8 Defense Module | ANIMA Framework | Apache 2.0

Two-phase SAR anomaly detection system with 8 custom CUDA kernels for real-time defense applications. Designed for NEMESIS (terrain navigation) and ATLAS (fleet autonomy) stacks.

## Architecture

### Phase 1: Pixel-Space Reconstruction

```
Input (256x256 RGB) -> CUDA Median Filter (101x speedup)
  -> Convolutional AE (22.4M params) -> Reconstruction Error -> Anomaly Map
```

### Phase 2: Feature-Space Detection (recommended for deployment)

```
Input (256x256 RGB) -> DINOv2 ViT-B/14 (frozen, 86.6M params)
  -> 768-dim features -> KNN/Gaussian/MLP detector -> Anomaly Score
```

## Models

### Phase 1: Pixel-Space Autoencoders

| Model | Params | Latent | Val Loss | Throughput | Files |
|---|---|---|---|---|---|
| Combined AE (primary) | 22.4M | 256 | 2.600 | 877 img/s | model.*, combined/ |
| Deep AE | 39.2M | 512 | 2.600 | 902 img/s | combined/ variant |
| VAE | 55.9M | 512 | 2.603 | 878 img/s | combined/ variant |
| SAR-only AE | 22.4M | 256 | 2.808 | 330 img/s | best.pth |

Trained on 62.8K combined SAR ship + VIVID++ thermal images. All architectures plateau at val_loss ~ 2.600 (pixel reconstruction ceiling).

### Phase 2: DINOv2 Feature-Space Detectors

| Detector | Method | CUDA Speedup | Test Anomaly Rate | Files |
|---|---|---|---|---|
| KNN (recommended) | k=5 NN in 10K bank | 81.4x | 4.9% | feature_detector/knn_detector.pth |
| Gaussian | Mahalanobis distance | 2.2x | 38.0% | feature_detector/gaussian_detector.pth |
| MLP head | 768->256->128->1 | -- | learned | feature_detector/mlp_head.* |

Features are extracted with a frozen DINOv2 ViT-B/14 backbone: 71,917 feature vectors across 24 VIVID++ scenes. The KNN detector, with its 81.4x CUDA speedup, is recommended for real-time deployment.

## CUDA Kernels (8 ops, sm_89)

Custom CUDA kernels compiled for NVIDIA L4 (compute 8.9), CUDA 12, torch cu128:

| Kernel | Speedup | Use Case |
|---|---|---|
| fused_median_filter_3x3 | 101x | SAR speckle denoising |
| fused_median_filter_5x5 | 14x | Heavy speckle denoising |
| sar_log_normalize | 4.5x | SAR amplitude preprocessing |
| fused_reconstruct_error_map | 1.9x | Pixel anomaly scoring (4D) |
| fused_sar_nlm_denoise | -- | Non-local means denoising |
| anomaly_score | -- | Per-pixel reconstruction error (3D) |
| fused_mahalanobis_distance | 2.2x | Gaussian feature-space scoring |
| fused_knn_distance | 81.4x | KNN feature-space scoring |

All kernels have automatic PyTorch fallbacks when custom CUDA extensions aren't available.
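As an illustration of the fallback path, a 3x3 median filter can be written in pure PyTorch with `unfold`. The `cuda_median_filter` import below matches the usage example later in this card, but the fallback body itself is a hypothetical sketch, not the module's actual implementation:

```python
import torch
import torch.nn.functional as F

def median_filter_3x3_torch(x: torch.Tensor) -> torch.Tensor:
    """Pure-PyTorch 3x3 median filter over [N, C, H, W] (hypothetical fallback)."""
    padded = F.pad(x, (1, 1, 1, 1), mode="reflect")
    # Extract all 3x3 neighborhoods as a [N, C, H, W, 3, 3] view
    patches = padded.unfold(2, 3, 1).unfold(3, 3, 1)
    # Median over the 9 neighborhood values per pixel
    return patches.reshape(*x.shape, 9).median(dim=-1).values

try:
    # Prefer the custom CUDA kernel when the extension is built
    from def_sariad.backends.cuda_ops import cuda_median_filter

    def median_filter(x: torch.Tensor) -> torch.Tensor:
        return cuda_median_filter(x, kernel_size=3)
except ImportError:
    median_filter = median_filter_3x3_torch
```

The dispatch keeps call sites identical whether or not the sm_89 extension is present.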

## Export Formats

### Phase 1 (Combined AE)

| Format | File | Size |
|---|---|---|
| PyTorch | model.pth | 86MB |
| SafeTensors | model.safetensors | 86MB |
| ONNX | model.onnx | 86MB |
| TensorRT FP16 | model_fp16.engine | 44MB |
| TensorRT FP32 | model_fp32.engine | 86MB |

### Phase 2 (MLP Head)

| Format | File | Size |
|---|---|---|
| PyTorch | feature_detector/mlp_head.pth | 0.9MB |
| SafeTensors | feature_detector/mlp_head.safetensors | 0.9MB |
| ONNX | feature_detector/mlp_head.onnx | 921KB |
| TensorRT FP16 | feature_detector/mlp_head_fp16.engine | 965KB |
| TensorRT FP32 | feature_detector/mlp_head_fp32.engine | 965KB |

## Usage

### Phase 1: Pixel-Space Anomaly Detection

```python
import torch
from def_sariad.models import SARAutoencoder
from def_sariad.backends.cuda_ops import cuda_median_filter

# Load model
model = SARAutoencoder(in_channels=3, latent_dim=256).cuda()
state = torch.load("model.pth", map_location="cuda")
model.load_state_dict(state["model"] if "model" in state else state)
model.eval()

# Preprocess with CUDA median filter (101x speedup)
sar_image = torch.randn(1, 3, 256, 256).cuda()
denoised = cuda_median_filter(sar_image, kernel_size=3)

# Detect anomalies
with torch.no_grad():
    anomaly_map = model.compute_anomaly_score(denoised)
# anomaly_map shape: [1, 256, 256] -- higher values = more anomalous
```
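To turn the continuous anomaly map into a binary detection mask, a common post-processing step is percentile thresholding. The sketch below uses a random stand-in map and an illustrative 99th-percentile cutoff, not a calibrated threshold from this model's training:

```python
import torch

torch.manual_seed(0)

# Stand-in for the [1, 256, 256] anomaly map produced above
anomaly_map = torch.rand(1, 256, 256)

# Flag the top 1% most anomalous pixels (99th percentile is an arbitrary example)
threshold = torch.quantile(anomaly_map.flatten(), 0.99)
mask = anomaly_map > threshold
print(mask.float().mean().item())  # fraction of pixels flagged, close to 0.01
```

In practice the cutoff would be tuned on a validation split rather than fixed per image.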

### Phase 2: Feature-Space Anomaly Detection (Recommended)

```python
import torch

# Load pre-extracted DINOv2 features (768-dim)
features = torch.load("vivid_dinov2_features/scene_features.pt")

# KNN detector (81x CUDA speedup)
knn_state = torch.load("feature_detector/knn_detector.pth")
feature_bank = knn_state["feature_bank"].cuda()  # [10000, 768]
threshold = knn_state["threshold"]               # 23.94

# Score new features
query = features[:100].cuda()  # [100, 768]
dists = torch.cdist(query, feature_bank)  # [100, 10000]
knn_dists, _ = dists.topk(5, largest=False, dim=1)  # k=5
scores = knn_dists.mean(dim=1)  # [100]
anomalies = scores > threshold
```
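For comparison, the Gaussian detector's Mahalanobis scoring can be sketched as below. The `"mean"`/`"cov_inv"` key names for `gaussian_detector.pth` are assumptions, not confirmed by this card, so check the checkpoint's actual keys before relying on them:

```python
import torch

def mahalanobis_scores(query: torch.Tensor,
                       mean: torch.Tensor,
                       cov_inv: torch.Tensor) -> torch.Tensor:
    """Mahalanobis distance of each row of query [N, D] to a Gaussian
    with mean [D] and inverse covariance cov_inv [D, D]."""
    delta = query - mean
    return torch.sqrt(((delta @ cov_inv) * delta).sum(dim=1))

# Hypothetical loading -- key names are assumptions:
# state = torch.load("feature_detector/gaussian_detector.pth")
# scores = mahalanobis_scores(query, state["mean"].cuda(), state["cov_inv"].cuda())
```

With an identity covariance this reduces to plain Euclidean distance, which is a quick sanity check for the implementation.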

### ONNX Inference

```python
import onnxruntime as ort
import numpy as np

session = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
input_data = np.random.randn(1, 3, 256, 256).astype(np.float32)
outputs = session.run(None, {"input": input_data})
```

### TensorRT Inference

```python
import tensorrt as trt

# model_fp16.engine gives the fastest inference; requires the tensorrt package
runtime = trt.Runtime(trt.Logger(trt.Logger.WARNING))
with open("model_fp16.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()
# Allocate device buffers, then run inference with context.execute_v2(bindings)
```

## Training Details

| Parameter | Phase 1 | Phase 2 |
|---|---|---|
| Dataset | 62.8K SAR+thermal | 71.9K DINOv2 features |
| Input | 256x256 RGB | 768-dim vectors |
| Optimizer | AdamW | -- (KNN/Gaussian are non-parametric) |
| Learning rate | 3e-4 | -- |
| Scheduler | Warmup (5%) + Cosine | -- |
| Epochs | 50-100 | 1 (feature extraction) |
| GPU | NVIDIA L4 (23GB) | NVIDIA L4 (23GB) |
| Precision | FP32 | FP32 |
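The Phase 1 warmup + cosine schedule can be sketched with a `LambdaLR`; the toy linear model and step count below are stand-ins for illustration, not the actual SARAutoencoder training loop:

```python
import math
import torch

model = torch.nn.Linear(8, 8)          # stand-in for the SAR autoencoder
total_steps, warmup_frac = 1000, 0.05  # 5% linear warmup, as in the table
warmup_steps = int(warmup_frac * total_steps)
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

def warmup_cosine(step: int) -> float:
    if step < warmup_steps:
        return step / max(1, warmup_steps)  # linear warmup from 0 to 1
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay to 0

sched = torch.optim.lr_scheduler.LambdaLR(opt, warmup_cosine)
```

The learning rate ramps linearly to 3e-4 over the first 5% of steps, then decays to zero along a half-cosine.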

## File Structure

```
DEF-sariad/
+-- model.pth                          # Phase 1: Combined AE weights
+-- model.safetensors                  # Phase 1: SafeTensors format
+-- model.onnx                         # Phase 1: ONNX export
+-- model_fp16.engine                  # Phase 1: TensorRT FP16
+-- model_fp32.engine                  # Phase 1: TensorRT FP32
+-- best.pth                           # Phase 1: SAR-only AE weights
+-- combined/                          # Phase 1: Combined AE full export set
+-- feature_detector/
|   +-- gaussian_detector.pth          # Phase 2: Mahalanobis detector
|   +-- knn_detector.pth               # Phase 2: KNN detector (recommended)
|   +-- mlp_head.pth                   # Phase 2: Learned MLP head
|   +-- mlp_head.safetensors           # Phase 2: MLP SafeTensors
|   +-- mlp_head.onnx                  # Phase 2: MLP ONNX
|   +-- mlp_head_fp16.engine           # Phase 2: MLP TRT FP16
|   +-- mlp_head_fp32.engine           # Phase 2: MLP TRT FP32
|   +-- training_report.json           # Phase 2: Detector metrics
+-- config/
|   +-- anima_module.yaml              # ANIMA module manifest
|   +-- autoencoder.toml               # AE training config
|   +-- paper.toml                     # Paper reproduction config
+-- export_manifest.json               # Export metadata
+-- training_report.json               # Phase 1 training metrics
+-- TRAINING_REPORT.md                 # Full training report
+-- README.md                          # This model card
```

## Citation

```bibtex
@article{sariad2025,
  title={SARIAD: A Comprehensive Benchmark for SAR Image Anomaly Detection},
  year={2025},
  note={arXiv:2504.08115}
}
```

## License

Apache 2.0. Part of the ANIMA Framework by Robot Flow Labs.
