---
license: mit
pipeline_tag: image-segmentation
tags:
- anomaly-detection
- image-anomaly-detection
- explainable-ai
- vision-transformer
- attention-mechanism
- digital-forensics
- image-forensics
- weakly-supervised
- localization
---

[DOI](https://doi.org/10.5281/zenodo.18064355)
[arXiv](https://arxiv.org/abs/2512.15512)
[GitHub](https://github.com/OBA-Research/VAAS)
[License](https://github.com/OBA-Research/VAAS/blob/main/LICENSE)
[Open in Colab](https://colab.research.google.com/drive/1aGQc_ZpPhDOEf7G4_-p03_Fmtdg_QcGd)
# VAAS: Vision-Attention Anomaly Scoring
## Model Summary
VAAS (Vision-Attention Anomaly Scoring) is a dual-module vision framework for **image anomaly detection and localisation**.
It combines **global attention-based reasoning** with **patch-level self-consistency analysis** to produce a **continuous, interpretable anomaly score** alongside dense spatial anomaly maps.
Rather than making binary decisions, VAAS estimates **where anomalies occur** and **how strongly they deviate from learned visual regularities**, enabling explainable assessment of image integrity.
The framework is further extended with a cross-attention fusion mechanism that enables global representations to directly guide patch-level anomaly reasoning.
---
## Examples of detection and scoring

---
## Read Research Paper
- [Journal version (FSIDI / DFRWS EU 2026)](https://www.sciencedirect.com/science/article/pii/S266628172600020X)
- [Arxiv version](https://arxiv.org/abs/2512.15512)
- [Presentation Slides](https://opeyemibami.github.io/slides/vaas)
---
## Architecture Overview

VAAS consists of two complementary components:
- **Global Attention Module (Fx)**
  A Vision Transformer backbone that captures global semantic and structural irregularities through attention distributions.
- **Patch-Level Module (Px)**
  A SegFormer-based segmentation model that identifies local inconsistencies in texture, boundaries, and regions.

The framework is further extended with a cross-attention fusion mechanism, enabling global representations from Fx to guide patch-level anomaly reasoning within Px.
These components are combined via a hybrid scoring mechanism:
- `S_F`: Global attention fidelity score
- `S_P`: Patch-level plausibility score
- `S_H`: Final hybrid anomaly score
`S_H` provides a continuous measure of anomaly intensity rather than a binary decision.
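The exact hybrid formulation is defined in the paper; as an illustrative sketch (an assumption, not the paper's verbatim formula), `S_H` can be viewed as a convex combination of `S_F` and `S_P`, weighted by the same `alpha` parameter that the inference pipeline exposes:

```python
def hybrid_score(s_f: float, s_p: float, alpha: float = 0.5) -> float:
    """Sketch of a hybrid anomaly score as a convex combination.

    NOTE: illustrative assumption, not the exact formula from the VAAS
    paper. `alpha` mirrors the pipeline parameter and trades off global
    (S_F) against patch-level (S_P) evidence.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * s_f + (1.0 - alpha) * s_p

# alpha=1.0 relies purely on the global score, alpha=0.0 purely on patches.
print(hybrid_score(0.8, 0.4, alpha=0.5))
```

With `alpha=0.5` (the default used in the quick-start example below), both modules contribute equally.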
---
## Installation
VAAS is distributed as a **lightweight inference library** with minimal install-time dependencies.
PyTorch is **only required when running inference or loading pretrained VAAS models**, so you can inspect, install, and integrate VAAS without pulling in heavy dependencies up front.
---
### 1. Install PyTorch
To run inference or load pretrained VAAS models, install PyTorch and torchvision for your system (CPU or GPU).
Follow the official PyTorch installation guide:
https://pytorch.org/get-started/locally/
**Quick installation (CPU)**
```sh
pip install torch torchvision
```
---
### 2. Install VAAS
```sh
pip install vaas
```
VAAS will automatically detect PyTorch at runtime and raise a clear error if it is missing.
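This runtime-detection behaviour can be sketched with a lazy-import guard. The helper below is an illustrative pattern, not VAAS's actual internal code:

```python
import importlib.util


def require_torch():
    """Raise a clear, actionable error if PyTorch is absent.

    Illustrative sketch of the lazy-dependency pattern described above;
    the real check inside VAAS may differ.
    """
    if importlib.util.find_spec("torch") is None:
        raise ImportError(
            "VAAS inference requires PyTorch. Install it with "
            "`pip install torch torchvision` or see "
            "https://pytorch.org/get-started/locally/"
        )
    import torch  # deferred import: only paid when inference actually runs
    return torch
```

Because the import happens inside the function, simply importing the package stays cheap; the cost (and the error, if any) is deferred until inference is requested.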
---
# Usage
## Documentation and Examples
- [APIs and Usage Documentation](https://github.com/OBA-Research/VAAS/blob/main/docs/usage/api_doc.md)
- [Colab notebooks](https://drive.google.com/drive/folders/1xA0OdPgz9C8OL63nfl_nlUcZ-wWeRz84?usp=sharing)
- [v2 notebooks](https://github.com/OBA-Research/VAAS/tree/main/examples/notebooks/vaas_v2/)
- [v1 notebooks](https://github.com/OBA-Research/VAAS/tree/main/examples/notebooks/vaas_v017)
---
**Try VAAS instantly on Google Colab (no setup required):**
The notebooks cover:
- [01_detecting_image_manipulation_quick_start.ipynb](https://colab.research.google.com/drive/1tBZIMXjDLwjrbnHGNdtVgsyXoaQ2q6KK?usp=sharing)
- [02_where_was_the_image_manipulated.ipynb](https://colab.research.google.com/drive/1EBZYx56DQcTaxPlP_hWCnXaVDzjcv_TV?usp=sharing)
- [03_understanding_vaas_scores_sf_sp_sh.ipynb](https://colab.research.google.com/drive/1yNKrlwue9BItzqmhZUZ4-3d5kBAm9qys?usp=sharing)
- [04_effect_of_alpha_on_anomaly_scoring.ipynb](https://colab.research.google.com/drive/1IlBhIOzUEqaeqJnPJ6bWfjw0nv6BBATe?usp=sharing)
- [05_running_vaas_on_cpu_cuda_mps.ipynb](https://colab.research.google.com/drive/1XeQjEdlWtisZoDDPp6WxwbNxoYC43wyk?usp=sharing)
- [06_loading_vaas_models_from_huggingface.ipynb](https://colab.research.google.com/drive/16X5S_aarUKGktMYlW2bo2Fp4p5VX5p85?usp=sharing)
- [07_batch_analysis_with_vaas_folder_workflow.ipynb](https://colab.research.google.com/drive/1RBoG70bH9k3YceU0VdyfewlrDgjOOaom?usp=sharing)
- [08_ranking_images_by_visual_suspicion.ipynb](https://colab.research.google.com/drive/18D4eV_fgomOIrxsyP_U__HYrTl-ZtC8e?usp=sharing)
- [09_using_vaas_outputs_in_downstream_research.ipynb](https://colab.research.google.com/drive/1AiciR4GcXimFgr7M8Q8fXFCTekpmXN_X?usp=sharing)
- [10_known_limitations_and_future_research_directions.ipynb](https://colab.research.google.com/drive/1Vr2ufQp-pWwMh6tQt84DilYu6ESm-ZP2?usp=sharing)
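As a taste of the batch and ranking workflows (notebooks 07 and 08), images can be sorted by their hybrid score `S_H`, most suspicious first. The scores below are placeholders standing in for real `pipeline(image)["S_H"]` outputs:

```python
# Hypothetical S_H scores, standing in for pipeline(image)["S_H"] calls
# over a folder of images.
scores = {
    "holiday.jpg": 0.12,
    "receipt_scan.jpg": 0.81,
    "profile_photo.jpg": 0.47,
}

# Rank most-suspicious first: a higher S_H means a stronger deviation
# from learned visual regularities.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
for name, s_h in ranked:
    print(f"{s_h:.2f}  {name}")
```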
---
### 1. Quick start: run VAAS and get a visual result
```python
from vaas.inference.pipeline import VAASPipeline
from PIL import Image
import requests
from io import BytesIO
pipeline = VAASPipeline.from_pretrained(
    repo_id="OBA-Research/vaas",
    device="cpu",
    alpha=0.5,
    model_variant="v2-base-df2023",  # v2-medium-df2023 and v2-large-df2023 are also available
)

url = "https://raw.githubusercontent.com/OBA-Research/VAAS/main/examples/images/COCO_DF_C110B00000_00539519.jpg"
image = Image.open(BytesIO(requests.get(url).content)).convert("RGB")

pipeline.visualize(
    image=image,
    save_path="vaas_visualization.png",
    mode="all",
    threshold=0.5,
)
```
---
### 2. Programmatic inference (scores + anomaly map)
```python
result = pipeline(image)
print(result)
anomaly_map = result["anomaly_map"]
```
#### Output format
```python
{
    "S_F": float,           # global attention fidelity score
    "S_P": float,           # patch-level plausibility score
    "S_H": float,           # final hybrid anomaly score
    "anomaly_map": ndarray  # dense spatial anomaly map
}
```
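A common follow-up is to threshold the anomaly map into a binary localisation mask. The sketch below assumes `anomaly_map` is a 2-D float array of per-pixel scores in [0, 1] (an assumption; check the API documentation for the exact range) and uses a synthetic array in its place:

```python
import numpy as np

# Synthetic stand-in for result["anomaly_map"]: per-pixel anomaly scores.
anomaly_map = np.array([
    [0.05, 0.10, 0.90],
    [0.08, 0.85, 0.95],
    [0.02, 0.07, 0.12],
])

threshold = 0.5                   # same knob as pipeline.visualize(threshold=...)
mask = anomaly_map >= threshold   # binary localisation mask
flagged = float(mask.mean())      # fraction of pixels flagged anomalous

print(mask.astype(int))
print(f"flagged fraction: {flagged:.2f}")  # 3 of 9 pixels -> 0.33
```

The flagged fraction gives a quick image-level summary, while the mask itself can be overlaid on the input for inspection.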
---
## Model Variants
### v2 (Cross-Attention VAAS)
| Models | Training Data | Description | Hugging Face Model |
|--------|----------------|-------------|--------------------|
| vaas-v2-base-df2023 | DF2023 (10%) | Lightweight inference with cross-attention fusion | [vaas-v2-base-df2023](https://huggingface.co/OBA-Research/vaas/tree/v2-base-df2023) |
| vaas-v2-medium-df2023 | DF2023 (≈50%) | Balanced anomaly reasoning with improved localisation | [vaas-v2-medium-df2023](https://huggingface.co/OBA-Research/vaas/tree/v2-medium-df2023) |
| vaas-v2-large-df2023 | DF2023 (100%) | Full-scale training with highest sensitivity and interpretability | [vaas-v2-large-df2023](https://huggingface.co/OBA-Research/vaas/tree/v2-large-df2023) |
---
### V1 Model Variants
| Models | Training Data | Description | Reported Evaluation (Paper) | Hugging Face Model |
| --------------------- | ------------- | -------------------------------- | --------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| vaas-v1-base-df2023 | DF2023 (10%) | Initial public inference release | [F1 and IoU reported in the research paper](https://arxiv.org/pdf/2512.15512) | [vaas-v1-base-df2023](https://huggingface.co/OBA-Research/vaas) |
| vaas-v1-medium-df2023 | DF2023 (≈50%) | Scale-up experiment | 5% better than base | [vaas-v1-medium-df2023](https://huggingface.co/OBA-Research/vaas/tree/v1-medium-df2023) |
| vaas-v1-large-df2023 | DF2023 (100%) | Full-dataset training | 9% better than medium | [vaas-v1-large-df2023](https://huggingface.co/OBA-Research/vaas/tree/v1-large-df2023) |
---
## Notes
- VAAS supports both local and online images
- PyTorch is loaded lazily and only required at runtime
- CPU inference is supported; GPU accelerates execution but is optional
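The device choice noted above (CPU by default, GPU or Apple Silicon when available) can be automated with a small helper. This is an illustrative sketch, not part of the VAAS API, which takes `device=` explicitly:

```python
def pick_device() -> str:
    """Return the best available torch device string, defaulting to CPU.

    Illustrative helper; pass the result to VAASPipeline.from_pretrained(device=...).
    """
    try:
        import torch
    except ImportError:
        return "cpu"  # no torch installed: degrade gracefully
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"  # Apple Silicon
    return "cpu"

print(pick_device())
```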
---
## Intended Use
- Image anomaly detection
- Visual integrity assessment
- Explainable inspection of irregular regions
- Research on attention-based anomaly scoring
- Prototyping anomaly-aware vision systems
---
## Limitations
- Trained on a single dataset
- Does not classify anomaly types
- Performance may degrade on out-of-distribution imagery
---
## Ethical Considerations
VAAS is intended for research and inspection purposes.
It should not be used as a standalone decision-making system in high-stakes or sensitive applications without human oversight.
---
## Citation
If you use VAAS, please cite both the software and the associated paper.
```bibtex
@software{vaas,
title = {VAAS: Vision-Attention Anomaly Scoring},
author = {Bamigbade, Opeyemi and Scanlon, Mark and Sheppard, John},
year = {2025},
publisher = {Zenodo},
doi = {10.5281/zenodo.18064355},
url = {https://doi.org/10.5281/zenodo.18064355}
}
```
```bibtex
@article{BAMIGBADE2026302063,
title = {VAAS: Vision-Attention Anomaly Scoring for image manipulation detection in digital forensics},
journal = {Forensic Science International: Digital Investigation},
volume = {56},
pages = {302063},
year = {2026},
note = {DFRWS EU 2026 - Selected Papers from the 13th Annual Digital Forensics Research Conference Europe},
issn = {2666-2817},
doi = {10.1016/j.fsidi.2026.302063},
url = {https://www.sciencedirect.com/science/article/pii/S266628172600020X},
author = {Opeyemi Bamigbade and Mark Scanlon and John Sheppard},
keywords = {Digital forensics, Image manipulation detection, Tamper localisation, Explainable AI, Vision transformers, Segmentation, Attention mechanisms, Anomaly scoring},
abstract = {Recent advances in AI-driven image generation have introduced new challenges for verifying the authenticity of digital evidence in forensic investigations. Modern generative models can produce visually consistent forgeries that evade traditional detectors based on pixel or compression artefacts. Most existing approaches also lack an explicit measure of anomaly intensity, which limits their ability to quantify the severity of manipulation. This paper introduces Vision-Attention Anomaly Scoring (VAAS), a novel dual-module framework that integrates global attention-based anomaly estimation using Vision Transformers (ViT) with patch-level self-consistency scoring derived from segmentation embeddings. The hybrid formulation provides a continuous and interpretable anomaly score that reflects both the location and degree of manipulation. Evaluations on the DF2023 and CASIA v2.0 datasets demonstrate that vaas achieve competitive F1 and IoU performance, while enhancing visual explainability through attention-guided anomaly maps. The framework bridges quantitative detection with human-understandable reasoning, supporting transparent and reliable image integrity assessment. The source code for all experiments and corresponding materials for reproducing the results are available open source.}
}
```
---
## Contributing
We welcome contributions that improve the usability, robustness, and extensibility of VAAS.
See:
https://github.com/OBA-Research/VAAS/blob/main/CONTRIBUTING.md
---
## License
MIT License
---
## Maintainers
**OBA-Research**
- https://github.com/OBA-Research
- https://huggingface.co/OBA-Research