Spaces:

tomofi
/

NDLOCR

Build error

App Files Files Community

NDLOCR / src /ndl_layout /mmdetection /docs /robustness_benchmarking.md

3v324v23

Add files

c9019cd almost 4 years ago

preview code

raw

history blame contribute delete

5.63 kB

	# Corruption Benchmarking

	## Introduction

	We provide tools to test object detection and instance segmentation models on the image corruption benchmark defined in [Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming](https://arxiv.org/abs/1907.07484).
	This page provides basic tutorials how to use the benchmark.

	```latex
	@article{michaelis2019winter,
	title={Benchmarking Robustness in Object Detection:
	Autonomous Driving when Winter is Coming},
	author={Michaelis, Claudio and Mitzkus, Benjamin and
	Geirhos, Robert and Rusak, Evgenia and
	Bringmann, Oliver and Ecker, Alexander S. and
	Bethge, Matthias and Brendel, Wieland},
	journal={arXiv:1907.07484},
	year={2019}
	}
	```

	![image corruption example](../resources/corruptions_sev_3.png)

	## About the benchmark

	To submit results to the benchmark please visit the [benchmark homepage](https://github.com/bethgelab/robust-detection-benchmark)

	The benchmark is modelled after the [imagenet-c benchmark](https://github.com/hendrycks/robustness) which was originally
	published in [Benchmarking Neural Network Robustness to Common Corruptions and Perturbations](https://arxiv.org/abs/1903.12261) (ICLR 2019) by Dan Hendrycks and Thomas Dietterich.

	The image corruption functions are included in this library but can be installed separately using:

	```shell
	pip install imagecorruptions
	```

	Compared to imagenet-c a few changes had to be made to handle images of arbitrary size and greyscale images.
	We also modfied the 'motion blur' and 'snow' corruptions to remove dependency from a linux specific library,
	which would have to be installed separately otherwise. For details please refer to the [imagecorruptions repository](https://github.com/bethgelab/imagecorruptions).

	## Inference with pretrained models

	We provide a testing script to evaluate a models performance on any combination of the corruptions provided in the benchmark.

	### Test a dataset

	- [x] single GPU testing
	- [ ] multiple GPU testing
	- [ ] visualize detection results

	You can use the following commands to test a models performance under the 15 corruptions used in the benchmark.

	```shell
	# single-gpu testing
	python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
	```

	Alternatively different group of corruptions can be selected.

	```shell
	# noise
	python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions noise

	# blur
	python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions blur

	# wetaher
	python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions weather

	# digital
	python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions digital
	```

	Or a costom set of corruptions e.g.:

	```shell
	# gaussian noise, zoom blur and snow
	python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --corruptions gaussian_noise zoom_blur snow
	```

	Finally the corruption severities to evaluate can be chosen.
	Severity 0 corresponds to clean data and the effect increases from 1 to 5.

	```shell
	# severity 1
	python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --severities 1

	# severities 0,2,4
	python tools/analysis_tools/test_robustness.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] --severities 0 2 4
	```

	## Results for modelzoo models

	The results on COCO 2017val are shown in the below table.

	Model \| Backbone \| Style \| Lr schd \| box AP clean \| box AP corr. \| box % \| mask AP clean \| mask AP corr. \| mask % \|
	:-----:\|:---------:\|:-------:\|:-------:\|:------------:\|:------------:\|:-----:\|:-------------:\|:-------------:\|:------:\|
	Faster R-CNN \| R-50-FPN \| pytorch \| 1x \| 36.3 \| 18.2 \| 50.2 \| - \| - \| - \|
	Faster R-CNN \| R-101-FPN \| pytorch \| 1x \| 38.5 \| 20.9 \| 54.2 \| - \| - \| - \|
	Faster R-CNN \| X-101-32x4d-FPN \| pytorch \|1x \| 40.1 \| 22.3 \| 55.5 \| - \| - \| - \|
	Faster R-CNN \| X-101-64x4d-FPN \| pytorch \|1x \| 41.3 \| 23.4 \| 56.6 \| - \| - \| - \|
	Faster R-CNN \| R-50-FPN-DCN \| pytorch \| 1x \| 40.0 \| 22.4 \| 56.1 \| - \| - \| - \|
	Faster R-CNN \| X-101-32x4d-FPN-DCN \| pytorch \| 1x \| 43.4 \| 26.7 \| 61.6 \| - \| - \| - \|
	Mask R-CNN \| R-50-FPN \| pytorch \| 1x \| 37.3 \| 18.7 \| 50.1 \| 34.2 \| 16.8 \| 49.1 \|
	Mask R-CNN \| R-50-FPN-DCN \| pytorch \| 1x \| 41.1 \| 23.3 \| 56.7 \| 37.2 \| 20.7 \| 55.7 \|
	Cascade R-CNN \| R-50-FPN \| pytorch \| 1x \| 40.4 \| 20.1 \| 49.7 \| - \| - \| - \|
	Cascade Mask R-CNN \| R-50-FPN \| pytorch \| 1x\| 41.2 \| 20.7 \| 50.2 \| 35.7 \| 17.6 \| 49.3 \|
	RetinaNet \| R-50-FPN \| pytorch \| 1x \| 35.6 \| 17.8 \| 50.1 \| - \| - \| - \|
	Hybrid Task Cascade \| X-101-64x4d-FPN-DCN \| pytorch \| 1x \| 50.6 \| 32.7 \| 64.7 \| 43.8 \| 28.1 \| 64.0 \|

	Results may vary slightly due to the stochastic application of the corruptions.