# Prepare Datasets for FrozenSeg

A dataset can be used by accessing [DatasetCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.DatasetCatalog)
for its data, or [MetadataCatalog](https://detectron2.readthedocs.io/modules/data.html#detectron2.data.MetadataCatalog) for its metadata (class names, etc.).
This document explains how to set up the builtin datasets so they can be used by the above APIs.
[Use Custom Datasets](https://detectron2.readthedocs.io/tutorials/datasets.html) gives a deeper dive on how to use `DatasetCatalog` and `MetadataCatalog`,
and how to add new datasets to them.

FrozenSeg has builtin support for a few datasets.
The datasets are assumed to exist in a directory specified by the environment variable
`DETECTRON2_DATASETS`.
Under this directory, detectron2 will look for datasets in the structure described below, if needed.
```
$DETECTRON2_DATASETS/
  # panoptic datasets
  ADEChallengeData2016/
  coco/
  cityscapes/
  mapillary_vistas/
  bdd100k/
  # semantic datasets
  VOCdevkit/
  ADE20K_2021_17_01/
  pascal_ctx_d2/
  pascal_voc_d2/
```
You can set the location for builtin datasets by `export DETECTRON2_DATASETS=/path/to/datasets`.
If left unset, the default is `./datasets` relative to your current working directory.
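The root-resolution logic described above can be sketched as follows (a hedged sketch under the stated defaults, not detectron2's actual code; `dataset_root` is a hypothetical helper):

```python
import os
from pathlib import Path


def dataset_root() -> Path:
    # Builtin dataset registration reads DETECTRON2_DATASETS,
    # falling back to ./datasets when the variable is unset.
    return Path(os.environ.get("DETECTRON2_DATASETS", "./datasets"))


# e.g. the COCO root would then resolve to $DETECTRON2_DATASETS/coco
coco_root = dataset_root() / "coco"
```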
## Expected dataset structure for [COCO](https://cocodataset.org/#download):
```
coco/
  annotations/
    instances_{train,val}2017.json
    panoptic_{train,val}2017.json
  {train,val}2017/
    # image files that are mentioned in the corresponding json
  panoptic_{train,val}2017/  # png annotations
  panoptic_semseg_{train,val}2017/  # generated by the script mentioned below
```
Install panopticapi by:
```bash
pip install git+https://github.com/cocodataset/panopticapi.git
```
Then run `python datasets/prepare_coco_semantic_annos_from_panoptic_annos.py` to extract semantic annotations from panoptic annotations (used only for evaluation).
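The panoptic PNG annotations above encode each segment id in the pixel's RGB channels; panopticapi decodes them in base 256 as `id = R + 256*G + 256²*B`. A minimal sketch of that mapping (`rgb2id`/`id2rgb` here mirror panopticapi's helpers of the same names):

```python
def rgb2id(color):
    """Decode a panoptic PNG pixel (R, G, B) into a segment id.

    panopticapi stores segment ids in base 256 across the three
    channels: id = R + 256 * G + 256**2 * B.
    """
    r, g, b = color
    return r + 256 * g + 256 ** 2 * b


def id2rgb(segment_id):
    """Inverse mapping: segment id back to an (R, G, B) triple."""
    return (
        segment_id % 256,
        (segment_id // 256) % 256,
        (segment_id // 256 ** 2) % 256,
    )
```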
## Expected dataset structure for [cityscapes](https://www.cityscapes-dataset.com/downloads/):
```
cityscapes/
  gtFine/
    train/
      aachen/
        color.png, instanceIds.png, labelIds.png, polygons.json,
        labelTrainIds.png
      ...
    val/
    test/
    # below are generated Cityscapes panoptic annotations
    cityscapes_panoptic_train.json
    cityscapes_panoptic_train/
    cityscapes_panoptic_val.json
    cityscapes_panoptic_val/
    cityscapes_panoptic_test.json
    cityscapes_panoptic_test/
  leftImg8bit/
    train/
    val/
    test/
```
Install cityscapesScripts by:
```bash
pip install git+https://github.com/mcordts/cityscapesScripts.git
```
Note: to create `labelTrainIds.png`, first prepare the above structure, then run cityscapesScripts with:
```bash
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createTrainIdLabelImgs.py
```
These files are not needed for instance segmentation.
Note: to generate the Cityscapes panoptic dataset, run cityscapesScripts with:
```bash
CITYSCAPES_DATASET=/path/to/abovementioned/cityscapes python cityscapesscripts/preparation/createPanopticImgs.py
```
These files are not needed for semantic and instance segmentation.
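The cityscapesScripts tools above rely on the dataset's fixed naming convention: every image `<city>_<seq>_<frame>_leftImg8bit.png` pairs with annotation files that share the `<city>_<seq>_<frame>` stem under `gtFine/`. A small sketch of that pairing (`gtfine_name` is a hypothetical helper, pure string manipulation):

```python
def gtfine_name(left_img_name: str, suffix: str = "labelTrainIds") -> str:
    """Map a leftImg8bit filename to its gtFine annotation filename.

    Cityscapes names images <city>_<seq>_<frame>_leftImg8bit.png; the
    matching annotation is <city>_<seq>_<frame>_gtFine_<suffix>.png.
    """
    stem = left_img_name.replace("_leftImg8bit.png", "")
    return f"{stem}_gtFine_{suffix}.png"
```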
## Expected dataset structure for [ADE20k (A-150)](http://sceneparsing.csail.mit.edu/):
```
ADEChallengeData2016/
  images/
  annotations/
  objectInfo150.txt
  # download instance annotation
  annotations_instance/
  # generated by prepare_ade20k_sem_seg.py
  annotations_detectron2/
  # below are generated by prepare_ade20k_pan_seg.py
  ade20k_panoptic_{train,val}.json
  ade20k_panoptic_{train,val}/
  # below are generated by prepare_ade20k_ins_seg.py
  ade20k_instance_{train,val}.json
```
The directory `annotations_detectron2` is generated by running `python datasets/prepare_ade20k_sem_seg.py`.

Install panopticapi by:
```bash
pip install git+https://github.com/cocodataset/panopticapi.git
```
Download the instance annotation from http://sceneparsing.csail.mit.edu/:
```bash
wget http://sceneparsing.csail.mit.edu/data/ChallengeData2017/annotations_instance.tar
```
Then run `python datasets/prepare_ade20k_pan_seg.py` to combine semantic and instance annotations into panoptic annotations,
and run `python datasets/prepare_ade20k_ins_seg.py` to extract instance annotations in COCO format.
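The "COCO format" produced by the instance-extraction step is the standard COCO detection JSON: top-level `images`, `annotations`, and `categories` lists. A minimal illustrative skeleton (hedged sketch; all field values here are placeholders, not real ADE20k entries):

```python
import json

# Minimal COCO-style instance annotation skeleton. Real files contain
# one entry per image and per instance; values below are made up.
coco_skeleton = {
    "images": [
        {"id": 1, "file_name": "ADE_val_00000001.jpg", "height": 512, "width": 683},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [10, 20, 100, 80],  # [x, y, width, height]
            "area": 8000,
            "iscrowd": 0,
        },
    ],
    "categories": [
        {"id": 1, "name": "bed"},
    ],
}

encoded = json.dumps(coco_skeleton)
```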
## Expected dataset structure for [Mapillary Vistas](https://www.mapillary.com/dataset/vistas):
```
mapillary_vistas/
  training/
    images/
    instances/
    labels/
    panoptic/
  validation/
    images/
    instances/
    labels/
    panoptic/
```
No preprocessing is needed for Mapillary Vistas on semantic and panoptic segmentation.

## Expected dataset structure for [BDD100K](https://doc.bdd100k.com/download.html#id1):
```
bdd100k/
  images/
    10k/
      train/
      val/
      test/
  json
  labels/
    pan_seg/
    sem_seg/
```
COCO-format annotations are obtained by running:
```bash
cd $DETECTRON2_DATASETS
wget https://github.com/chenxi52/FrozenSeg/releases/download/latest/bdd100k_json.zip
unzip bdd100k_json.zip
```
## Expected dataset structure for [ADE20k-Full (A-847)](https://groups.csail.mit.edu/vision/datasets/ADE20K/):
```
ADE20K_2021_17_01/
  images/
  index_ade20k.pkl
  objects.txt
  # generated by prepare_ade20k_full_sem_seg.py
  images_detectron2/
  annotations_detectron2/
```
Register and download the dataset from https://groups.csail.mit.edu/vision/datasets/ADE20K/:
```bash
cd $DETECTRON2_DATASETS
wget your/personal/download/link/{username}_{hash}.zip
unzip {username}_{hash}.zip
```
Generate the directories `ADE20K_2021_17_01/images_detectron2` and `ADE20K_2021_17_01/annotations_detectron2` by running:
```bash
python datasets/prepare_ade20k_full_sem_seg.py
```
## Expected dataset structure for [PASCAL Context Full (PC-459)](https://www.cs.stanford.edu/~roozbeh/pascal-context/) and [PASCAL VOC (PAS-21)](http://host.robots.ox.ac.uk/pascal/VOC/):
```
VOCdevkit/
  VOC2012/
    Annotations/
    JPEGImages/
    ImageSets/
      Segmentation/
  VOC2010/
    JPEGImages/
    trainval/
    trainval_merged.json
# generated by prepare_pascal_voc_sem_seg.py
pascal_voc_d2/
  images/
  annotations_pascal21/
  # pascal 20 excludes the background class
  annotations_pascal20/
# generated by prepare_pascal_ctx_sem_seg.py
pascal_ctx_d2/
  images/
  annotations_ctx59/
  # generated by prepare_pascal_ctx_full_sem_seg.py
  annotations_ctx459/
```
### PASCAL VOC (PAS-21)

Download the dataset from http://host.robots.ox.ac.uk/pascal/VOC/:
```bash
cd $DETECTRON2_DATASETS
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
# generates the folder VOCdevkit/VOC2012
tar -xvf VOCtrainval_11-May-2012.tar
```
Generate the directory `pascal_voc_d2` by running:
```bash
python datasets/prepare_pascal_voc_sem_seg.py
```
### PASCAL Context Full (PC-459)

Download the dataset from http://host.robots.ox.ac.uk/pascal/VOC/ and the annotations from https://www.cs.stanford.edu/~roozbeh/pascal-context/:
```bash
cd $DETECTRON2_DATASETS
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2010/VOCtrainval_03-May-2010.tar
# generates the folder VOCdevkit/VOC2010
tar -xvf VOCtrainval_03-May-2010.tar
wget https://www.cs.stanford.edu/~roozbeh/pascal-context/trainval.tar.gz
# generates the folder VOCdevkit/VOC2010/trainval
tar -xvzf trainval.tar.gz -C VOCdevkit/VOC2010
wget https://codalabuser.blob.core.windows.net/public/trainval_merged.json -P VOCdevkit/VOC2010/
```
Install the [Detail API](https://github.com/zhanghang1989/detail-api) by:
```bash
git clone https://github.com/zhanghang1989/detail-api.git
rm detail-api/PythonAPI/detail/_mask.c
pip install -e detail-api/PythonAPI/
```
Generate the directory `pascal_ctx_d2/images` by running:
```bash
python datasets/prepare_pascal_ctx_sem_seg.py
```
Generate the directory `pascal_ctx_d2/annotations_ctx459` by running:
```bash
python datasets/prepare_pascal_ctx_full_sem_seg.py
```