LibreEoMTl-sem
EoMT-L semantic segmentation weights for ADE20K, converted to the LibreYOLO checkpoint format.
Source
Derived from tue-mps/ade20k_semantic_eomt_large_512, the DINOv2-based ADE20K semantic checkpoint for EoMT.
EoMT was introduced in "Your ViT is Secretly an Image Segmentation Model" by Tommie Kerssies, Niccolo Cavagnero, Alexander Hermans, Narges Norouzi, Giuseppe Averta, Bastian Leibe, Gijs Dubbelman, and Daan de Geus. The official implementation is at tue-mps/eomt and is licensed under the MIT License.
The DINOv2 backbone is Apache-2.0. LibreYOLO ships only the DINOv2-based ADE20K EoMT-L semantic checkpoint here. DINOv3 EoMT variants are intentionally excluded because they depend on gated non-commercial DINOv3 weights.
Modifications
The only changes are state-dict normalization and wrapping in LibreYOLO v1.0
metadata. Learned parameters are unchanged. See weights/convert_eomt_weights.py
in the LibreYOLO source repository.
The checkpoint metadata is:
model_family: eomt
size: l
task: semantic
nc: 150
imgsz: 512
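A minimal sketch of how the metadata fields listed above could be checked once they are available as a plain dictionary. The helper name check_eomt_metadata and the dictionary layout are assumptions made for illustration; they are not part of the LibreYOLO API, and how the metadata is actually stored inside the checkpoint is not specified here.

```python
# Expected values, copied from the metadata listed in this README.
EXPECTED_METADATA = {
    "model_family": "eomt",
    "size": "l",
    "task": "semantic",
    "nc": 150,
    "imgsz": 512,
}


def check_eomt_metadata(metadata):
    """Return a list of mismatch descriptions (empty if every field matches).

    `metadata` is assumed to be a plain dict; this helper is hypothetical.
    """
    problems = []
    for key, expected in EXPECTED_METADATA.items():
        if key not in metadata:
            problems.append(f"missing field: {key}")
        elif metadata[key] != expected:
            problems.append(f"{key}: expected {expected!r}, got {metadata[key]!r}")
    return problems
```

An empty list means the dictionary agrees with the values above; any other result names the fields that differ.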
Usage
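from libreyolo import LibreYOLO

# Load the converted EoMT-L semantic checkpoint.
model = LibreYOLO("LibreEoMTl-sem.pt")
# Run inference on a single image.
result = model("image.jpg")
# Semantic segmentation output for the image.
semantic_mask = result.semantic_mask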
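As a hedged illustration of what can be done with the mask, the sketch below summarizes how much of an image each class covers. It assumes the mask can be converted to nested lists of integer class indices (rows of pixels); the actual type and layout of result.semantic_mask are not documented here, and class_pixel_fractions is a hypothetical helper, not a LibreYOLO function.

```python
from collections import Counter


def class_pixel_fractions(mask_rows):
    """Map each class index to the fraction of pixels assigned to it.

    `mask_rows` is assumed to be an iterable of rows, each row an iterable
    of integer class indices (an H x W label map as nested lists).
    """
    counts = Counter()
    for row in mask_rows:
        counts.update(int(value) for value in row)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in sorted(counts.items())}
```

If the mask turns out to be a NumPy array or a PyTorch tensor, calling .tolist() on it first would produce the nested-list form this sketch expects.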
Install the EoMT runtime dependency with:
pip install "libreyolo[eomt]"
Validation
The parity target for this checkpoint is ADE20K val mIoU 58.4 +/- 0.5. ADE20K is not redistributed in this repository. Users are responsible for complying with ADE20K dataset terms when validating or fine-tuning.
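For readers running their own validation, the following is a minimal sketch of mean IoU and of the parity check against 58.4 +/- 0.5. It is not the evaluation code used for this checkpoint. It assumes predictions and ground-truth labels have been flattened into sequences of class indices accumulated over the whole validation set, that unlabeled pixels are marked with 255 (a common convention, assumed here), and that classes absent from both predictions and ground truth are skipped. The function names are hypothetical.

```python
def mean_iou(pred_labels, true_labels, num_classes=150, ignore_index=255):
    """Return mean IoU in percent over classes that occur at least once.

    Both inputs are flat sequences of integer class indices of equal length.
    Pixels whose ground-truth label equals `ignore_index` are skipped
    (255 is an assumed convention, not taken from this README).
    """
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    true_count = [0] * num_classes
    for p, t in zip(pred_labels, true_labels):
        if t == ignore_index:
            continue
        pred_count[p] += 1
        true_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_count[c] + true_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and truth
            ious.append(intersection[c] / union)
    if not ious:
        return float("nan")
    return 100.0 * sum(ious) / len(ious)


def within_parity(miou, target=58.4, tolerance=0.5):
    """True if `miou` lies within `tolerance` of the parity target."""
    return abs(miou - target) <= tolerance
```

Accumulating counts over the full validation set before dividing matters: averaging per-image IoU values generally gives a different number than the dataset-level mIoU.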
License
MIT License for EoMT weights. See LICENSE. DINOv2 and the
Hugging Face Transformers EoMT runtime are Apache-2.0; see NOTICE
for attribution.