What is the mIoU of this model? Is it better, worse, or identical to the official ConvNeXt + UPerNet weights?

by danjacobellis - opened Feb 10

Feb 10

Sorry if this is listed somewhere else; I looked and could not find it.

The official ConvNeXt repo lists 46.0 mIoU for this configuration (ConvNeXt-Tiny + UPerNet).

However, this smp version appears to use a different encoder (convnext_tiny.in12k_ft_in1k from timm) than the official one. Presumably, it also uses different weights for the UPerNet decoder.

When I evaluated the mIoU at 512*512 resolution, I got 44.5 for the smp-hub version. Is this the expected value for this model?
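For reference, here is a minimal sketch of one common way to compute mIoU, by accumulating a confusion matrix over the whole validation set and averaging per-class IoU. It is not necessarily the exact script behind the 44.5 figure. The function names and the `ignore_index=255` convention are assumptions for illustration.

```python
import numpy as np

def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Return a (num_classes, num_classes) matrix; rows are targets, columns are predictions."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Assumption: pixels labelled ignore_index are excluded from the metric.
    keep = target != ignore_index
    idx = target[keep].astype(np.int64) * num_classes + pred[keep].astype(np.int64)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def mean_iou(conf):
    """Average per-class IoU over the classes that appear in targets or predictions."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = union > 0
    return float((tp[valid] / union[valid]).mean())
```

Summing the confusion matrices of all images before calling `mean_iou` gives a dataset-level score, which can differ from the average of per-image scores.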

I've trained my own decoder (43.3 mIoU, SegFormer instead of UPerNet) that is much faster (~100 megapixels per second on my CPU instead of 33). I can share more details of my training configuration if interested.

I would be interested to know what the training configuration for this upernet decoder is if you're able to share it.

danjacobellis

Feb 11

The mIoU of this version goes up to 45.96 if I use sliding-window inference:

1. Resize the short edge to 512.
2. Apply the model to overlapping 512*512 pixel crops with a stride of 341.
3. Aggregate predictions for overlapping regions.
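The crop-and-aggregate steps can be sketched roughly as follows, assuming the image has already been resized so its short edge is 512. `predict_fn` is a hypothetical callable that maps a `(C, h, w)` crop to `(num_classes, h, w)` logits. Averaging logits over overlapping crops is an assumption here; the post does not say which aggregation was used.

```python
import numpy as np

def window_starts(length, crop=512, stride=341):
    """Start offsets along one axis; the last window is shifted back to end at the border."""
    n = max(length - crop + stride - 1, 0) // stride + 1
    starts = []
    for i in range(n):
        end = min(i * stride + crop, length)
        starts.append(max(end - crop, 0))
    return starts

def sliding_window_predict(image, predict_fn, num_classes, crop=512, stride=341):
    """Average per-crop logits over overlapping windows, then take the argmax per pixel."""
    _, h, w = image.shape
    logits = np.zeros((num_classes, h, w), dtype=np.float64)
    counts = np.zeros((1, h, w), dtype=np.float64)
    for y in window_starts(h, crop, stride):
        for x in window_starts(w, crop, stride):
            patch = image[:, y:y + crop, x:x + crop]
            # predict_fn is a placeholder for the real model's forward pass.
            logits[:, y:y + crop, x:x + crop] += predict_fn(patch)
            counts[:, y:y + crop, x:x + crop] += 1
    return (logits / counts).argmax(axis=0)
```

With a stride of 341 and a crop of 512, neighbouring windows overlap by at least 171 pixels, so pixels in the overlap get two or more predictions averaged together.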

So, I think these weights match the mIoU reported in the paper.

danjacobellis changed discussion status to closed Feb 11
