| | --- |
| | license: apache-2.0 |
| | tags: |
| | - RyzenAI |
| | - Image Segmentation |
| | - Pytorch |
| | - Vision |
| | datasets: |
| | - cityscape |
| | language: |
| | - en |
| | metrics: |
| | - mIoU |
| | --- |
| | |
| | # SemanticFPN model trained on Cityscapes |
| |
|
| | SemanticFPN is a conceptually simple yet effective baseline for panoptic segmentation, trained here on Cityscapes. The method starts from Mask R-CNN with FPN and adds a lightweight semantic segmentation branch for dense per-pixel prediction. It was introduced in 2019 in the paper [Panoptic Feature Pyramid Networks](https://arxiv.org/pdf/1901.02446.pdf) by Alexander Kirillov et al. |
| |
|
| | We developed a modified version that is supported by [AMD Ryzen AI](https://ryzenai.docs.amd.com). |
| |
|
| |
|
| | ## Model description |
| |
|
| | SemanticFPN is a single network that unifies the tasks of instance segmentation and semantic segmentation. The network is designed by endowing Mask R-CNN, a popular instance segmentation method, with a semantic segmentation branch that shares a Feature Pyramid Network (FPN) backbone. This simple design not only remains effective for instance segmentation, but also yields a lightweight, top-performing method for semantic segmentation. It is robust and accurate on both tasks and can serve as a strong baseline for future research in panoptic segmentation. |
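
To make the semantic branch more concrete, the sketch below works out the feature-map sizes involved for the 256x512 input used by this model. It follows the description in the Panoptic FPN paper, where each FPN level is upsampled in 2x stages to 1/4 scale before the levels are summed. The strides `(4, 8, 16, 32)`, the 1/4-scale merge point, and the helper name `fpn_branch_plan` are assumptions for illustration; the modified Ryzen AI model may differ.

```python
import math


def fpn_branch_plan(height, width, strides=(4, 8, 16, 32), target_stride=4):
    """For each FPN level, return its feature-map size and the number of
    2x upsampling stages needed to reach the target (1/4) scale."""
    plan = []
    for stride in strides:
        size = (height // stride, width // stride)
        # Each stage doubles the resolution, so the count is log2 of the ratio.
        stages = int(math.log2(stride // target_stride))
        plan.append({"stride": stride, "size": size, "upsample_stages": stages})
    return plan


for level in fpn_branch_plan(256, 512):
    print(level)
```

Under these assumptions the deepest level (stride 32) needs three upsampling stages and the shallowest (stride 4) needs none.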
| |
|
| |
|
| | ## Intended uses & limitations |
| |
|
| | You can use the raw model for image segmentation. See the [model hub](https://huggingface.co/models?sort=trending&search=amd%2FSemanticFPN) to find all available SemanticFPN models. |
| |
|
| |
|
| | ## How to use |
| |
|
| | ### Installation |
| |
|
| | Follow the [Ryzen AI Installation](https://ryzenai.docs.amd.com/en/latest/inst.html) guide to prepare the environment for Ryzen AI. |
| | Then run the following command to install the prerequisites for this model. |
| | ```bash |
| | pip install -r requirements.txt |
| | ``` |
| |
|
| |
|
| | ### Data Preparation (optional: for accuracy evaluation) |
| |
|
| | 1. Download the Cityscapes dataset (https://www.cityscapes-dataset.com/downloads) |
| | - ground truth folder: gtFine_trainvaltest.zip [241MB] |
| | - image folder: leftImg8bit_trainvaltest.zip [11GB] |
| | 2. Organize the dataset directory as follows: |
| | ```Plain |
| | └── data |
| |     └── cityscapes |
| |         ├── leftImg8bit |
| |         |   ├── train |
| |         |   └── val |
| |         └── gtFine |
| |             ├── train |
| |             └── val |
| | ``` |
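
Before running the evaluation, it can help to confirm that the layout above is in place. The following is a minimal sketch; the helper name `missing_cityscapes_dirs` and the default root `data/cityscapes` are assumptions for illustration and are not part of the repository scripts.

```python
from pathlib import Path

# Sub-directories expected by the layout shown above.
EXPECTED_SUBDIRS = [
    "leftImg8bit/train",
    "leftImg8bit/val",
    "gtFine/train",
    "gtFine/val",
]


def missing_cityscapes_dirs(root):
    """Return the expected sub-directories that are absent under `root`."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    missing = missing_cityscapes_dirs("data/cityscapes")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```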
| |
|
| | ### Test & Evaluation |
| |
|
| | - Code snippet from [`infer_onnx.py`](infer_onnx.py) showing how to run inference with the ONNX model |
| | ```python |
| | import argparse |
| | |
| | import numpy as np |
| | import onnxruntime |
| | |
| | # build_img and colorize_mask are helper functions defined in infer_onnx.py. |
| | parser = argparse.ArgumentParser(description='SemanticFPN model') |
| | parser.add_argument('--onnx_path', type=str, default='FPN_int.onnx') |
| | parser.add_argument('--save_path', type=str, default='./data/demo_results/senmatic_results.png') |
| | parser.add_argument('--input_path', type=str, default='data/cityscapes/cityscapes/leftImg8bit/test/bonn/bonn_000000_000019_leftImg8bit.png') |
| | parser.add_argument('--ipu', action='store_true', |
| |                     help='use ipu') |
| | parser.add_argument('--provider_config', type=str, default=None, |
| |                     help='provider config path') |
| | args = parser.parse_args() |
| | |
| | # Run on the Ryzen AI IPU through the Vitis AI execution provider, or fall back to the CPU. |
| | if args.ipu: |
| |     providers = ["VitisAIExecutionProvider"] |
| |     provider_options = [{"config_file": args.provider_config}] |
| | else: |
| |     providers = ['CPUExecutionProvider'] |
| |     provider_options = None |
| | |
| | onnx_path = args.onnx_path |
| | input_img = build_img(args)  # preprocessed input tensor |
| | session = onnxruntime.InferenceSession(onnx_path, providers=providers, provider_options=provider_options) |
| | ort_input = {session.get_inputs()[0].name: input_img.cpu().numpy()} |
| | ort_output = session.run(None, ort_input)[0] |
| | if isinstance(ort_output, (tuple, list)): |
| |     ort_output = ort_output[0] |
| | |
| | # Drop the batch dimension and move channels last: (C, H, W) -> (H, W, C). |
| | output = ort_output[0].transpose(1, 2, 0) |
| | # The predicted class of each pixel is the channel with the highest score. |
| | seg_pred = np.asarray(np.argmax(output, axis=2), dtype=np.uint8) |
| | color_mask = colorize_mask(seg_pred) |
| | color_mask.save(args.save_path) |
| | ``` |
| |
|
| | - Run inference for a single image |
| | ```bash |
| | python infer_onnx.py --onnx_path FPN_int.onnx --input_path /Path/To/Your/Image --ipu --provider_config Path/To/vaip_config.json |
| | ``` |
| |
|
| | - Test accuracy of the quantized model |
| | ```bash |
| | python test_onnx.py --onnx_path FPN_int.onnx --dataset citys --test-folder ./data/cityscapes --crop-size 256 --ipu --provider_config Path/To/vaip_config.json |
| | ``` |
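
In the `infer_onnx.py` snippet above, passing `--ipu` without `--provider_config` leaves `config_file` set to `None`, which may not be what the execution provider expects. The sketch below isolates the provider selection into a small function and fails early in that case. The function name `select_providers` and the early check are assumptions for illustration, not part of the repository scripts.

```python
def select_providers(use_ipu, provider_config=None):
    """Mirror the provider selection from the snippet, with one extra check."""
    if use_ipu:
        if provider_config is None:
            # Assumption: failing early is preferable to passing config_file=None.
            raise ValueError("--ipu requires --provider_config")
        return ["VitisAIExecutionProvider"], [{"config_file": provider_config}]
    return ["CPUExecutionProvider"], None
```

It could then be used as `providers, provider_options = select_providers(args.ipu, args.provider_config)` before creating the `onnxruntime.InferenceSession`.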
| | ### Performance |
| |
|
| | | Model | Input size | FLOPs | mIoU on Cityscapes validation | |
| | |-------|------------|-------|-------------------------------| |
| | | SemanticFPN (ResNet18) | 256x512 | 10G | 62.9% | |
| |
|
| | | Model | Input size | FLOPs | INT8 mIoU on Cityscapes validation | |
| | |-------|------------|-------|------------------------------------| |
| | | SemanticFPN (ResNet18) | 256x512 | 10G | 62.5% | |
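
For readers unfamiliar with the metric, the sketch below shows one common way to compute mIoU from per-pixel labels: accumulate a confusion matrix, compute intersection over union per class, and average over the classes that occur. The function name `mean_iou`, the flat label sequences, and `ignore_index=255` (a common convention for ignored Cityscapes pixels) are assumptions; the evaluation in `test_onnx.py` may differ in detail.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    # conf[t][p] counts pixels with ground-truth class t predicted as class p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels without a valid ground-truth label
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(conf[r][c] for r in range(num_classes)) - tp
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with two classes and one ignored pixel.
print(mean_iou(pred=[0, 0, 1, 1, 0], target=[0, 1, 1, 1, 255], num_classes=2))
```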
| |
|
| | ```bibtex |
| | @inproceedings{kirillov2019panoptic, |
| | title={Panoptic feature pyramid networks}, |
| | author={Kirillov, Alexander and Girshick, Ross and He, Kaiming and Doll{\'a}r, Piotr}, |
| | booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition}, |
| | pages={6399--6408}, |
| | year={2019} |
| | } |
| | ``` |