metadata
library_name: transformers
license: mit
tags:
- vision
- image-segmentation
- instance-segmentation
- pytorch
pipeline_tag: image-segmentation
datasets:
- coco
base_model:
- tue-mps/coco_instance_pmt_large_640_dinov3
PMT-DINOv3 (Large, 640px) for COCO Instance Segmentation
Overview
This is the large variant of PMT-DINOv3, a Plain Mask Transformer (PMT) built on a DINOv3 ViT-L/16 backbone and trained for instance segmentation on COCO at 640x640 input resolution.
Model Details
| Property | Value |
|---|---|
| Backbone | DINOv3 ViT-L/16 |
| Input Resolution | 640x640 |
| Task | Instance Segmentation |
| Dataset | COCO |
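
As a rough illustration of what the backbone and resolution in the table imply, the sketch below computes the patch grid a ViT with 16x16 patches would see for a 640x640 input. The helper name `patch_grid` is hypothetical and not part of this model's code, and the count covers patch tokens only; any special tokens the backbone adds are not included.

```python
def patch_grid(height: int, width: int, patch_size: int = 16) -> tuple[int, int, int]:
    """Return (rows, cols, total) of non-overlapping patches for a ViT input.

    Hypothetical helper for illustration; assumes the input size is an
    exact multiple of the patch size, as 640 is of 16.
    """
    if height % patch_size or width % patch_size:
        raise ValueError("input size must be divisible by the patch size")
    rows = height // patch_size
    cols = width // patch_size
    return rows, cols, rows * cols


rows, cols, total = patch_grid(640, 640)
print(rows, cols, total)  # 40 40 1600
```

Under these assumptions, each 640x640 image maps to a 40x40 grid, or 1600 patch tokens.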
Citation
@inproceedings{cavagnero2026pmt,
author = {Cavagnero, Niccolò and Norouzi, Narges and Dubbelman, Gijs and de Geus, Daan},
title = {PMT: Plain Mask Transformer for Image and Video Segmentation with Frozen Vision Encoders},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
year = {2026},
}
Acknowledgements
- Original implementation: tue-mps/pmt
- Paper: arXiv:2503.19108