---
license: mit
---

# HIPIE: Hierarchical Open-vocabulary Universal Image Segmentation

PyTorch implementation of HIPIE from ["Hierarchical Open-vocabulary Universal Image Segmentation"](https://arxiv.org/abs/2307.00764) (Wang et al., NeurIPS 2023).

## Pretrained Weights

We provide ViT-H and ResNet-50 weights for hierarchical and part-aware image segmentation across multiple datasets:

| Model | Filename | Description |
|-------|----------|-------------|
| ViT-H (O365, COCO, RefCOCO, PACO) | `vit_h_cloud.pth` | Pretrained on O365, COCO, RefCOCO, and PACO |
| ViT-H (COCO, RefCOCO, Pascal-Parts) | `vit_h_cloud_parts.pth` | Finetuned on COCO, RefCOCO, and Pascal-Parts |
| ResNet-50 (Pascal-Parts) | `r50_parts.pth` | Pretrained on O365, COCO, RefCOCO, and Pascal Panoptic Parts |

## Usage

For demo notebooks, model configs, and inference scripts, see the [GitHub repository](https://github.com/berkeley-hipie/HIPIE).

## Citation

```
@inproceedings{wang2023hierarchical,
  title={Hierarchical Open-vocabulary Universal Image Segmentation},
  author={Wang, Xudong and Li, Shufan and Kallidromitis, Konstantinos and Kato, Yusuke and Kozuka, Kazuki and Darrell, Trevor},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023}
}
```
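
## Inspecting a Checkpoint

Before wiring one of the `.pth` files listed under Pretrained Weights into the repository's configs, it can help to check what the file contains. The following is a minimal sketch, not part of the HIPIE codebase: the helper names `summarize_checkpoint` and `load_and_summarize` are hypothetical, and the assumption that the state dict is either flat or nested under a `"model"` key is a guess about the checkpoint layout that is not confirmed by this card.

```python
from collections.abc import Mapping


def summarize_checkpoint(ckpt, max_entries=5):
    """Return (name, shape) pairs for the first few entries of a checkpoint.

    Assumption: the checkpoint is either a flat state dict or a dict that
    nests the state dict under a "model" key. Neither layout is confirmed
    for the HIPIE weights.
    """
    if isinstance(ckpt, Mapping) and isinstance(ckpt.get("model"), Mapping):
        state = ckpt["model"]
    else:
        state = ckpt
    summary = []
    for name, value in list(state.items())[:max_entries]:
        # Tensors and arrays expose .shape; anything else is reported as None.
        shape = tuple(value.shape) if hasattr(value, "shape") else None
        summary.append((name, shape))
    return summary


def load_and_summarize(path, max_entries=5):
    """Load a local .pth file on the CPU and summarize its first entries."""
    import torch  # imported lazily so summarize_checkpoint works without PyTorch

    ckpt = torch.load(path, map_location="cpu")
    return summarize_checkpoint(ckpt, max_entries)
```

With a weight file downloaded to the working directory, `load_and_summarize("vit_h_cloud.pth")` would list the first few parameter names and shapes. Depending on the installed PyTorch version, `torch.load` may need additional arguments for checkpoints that store non-tensor objects. For actual inference, the configs and scripts in the GitHub repository remain the reference.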