UPerNet with ALiBi-ViT Backbone for Semantic Segmentation

This model is a UPerNet semantic segmentation model with an ALiBi-ViT backbone (a Vision Transformer that uses Attention with Linear Biases in place of learned position embeddings), trained on the ADE20K dataset.

Model Description

  • Architecture: UPerNet
  • Backbone: ALiBi-ViT Tiny
  • Dataset: ADE20K
  • Task: Semantic Segmentation
  • Framework: MMSegmentation
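The core idea behind the ALiBi backbone is that each attention head adds a fixed, distance-proportional penalty to the query-key logits instead of relying on position embeddings. The sketch below illustrates this for a 2D patch grid; it is an assumption for illustration only (2D ALiBi variants for ViTs differ in the distance they penalize; Euclidean distance between patch coordinates and the simple geometric slope formula from the original ALiBi paper are used here), not the exact bias used by this backbone.

```python
import numpy as np

def alibi_slopes(num_heads: int) -> np.ndarray:
    # Geometric head slopes from the ALiBi paper: 2^(-8/H), 2^(-16/H), ...
    return np.array([2 ** (-8 * (h + 1) / num_heads) for h in range(num_heads)])

def alibi_bias_2d(grid: int, num_heads: int) -> np.ndarray:
    """Per-head additive attention bias for a grid x grid patch layout.

    Assumption for illustration: the penalty is the Euclidean distance
    between patch coordinates, scaled by a per-head slope."""
    ys, xs = np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)      # (N, 2)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)  # (N, N)
    slopes = alibi_slopes(num_heads)
    # Broadcast to (heads, N, N); this tensor is added to the QK^T logits.
    return -slopes[:, None, None] * dist[None]

bias = alibi_bias_2d(grid=4, num_heads=3)
```

Because the bias is zero on the diagonal and grows more negative with distance, nearby patches attend to each other more easily than distant ones, with each head applying a different decay rate.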

Training Results

Metric   Value
mIoU     23.49%
mAcc     32.01%
aAcc     70.84%
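For reference, the three metrics are all derived from the per-class confusion matrix: aAcc is overall pixel accuracy, mAcc averages per-class recall, and mIoU averages per-class intersection-over-union. A minimal NumPy sketch (illustrative only; MMSegmentation computes these internally during evaluation):

```python
import numpy as np

def segmentation_metrics(conf: np.ndarray):
    """Compute (aAcc, mAcc, mIoU) from a (C, C) confusion matrix whose
    rows are ground-truth classes and columns are predicted classes."""
    tp = np.diag(conf).astype(float)
    gt = conf.sum(axis=1).astype(float)    # pixels per ground-truth class
    pred = conf.sum(axis=0).astype(float)  # pixels per predicted class

    a_acc = tp.sum() / conf.sum()          # overall pixel accuracy
    acc = np.divide(tp, gt, out=np.zeros_like(tp), where=gt > 0)
    union = gt + pred - tp
    iou = np.divide(tp, union, out=np.zeros_like(tp), where=union > 0)
    return a_acc, acc.mean(), iou.mean()

# Toy 2-class example
conf = np.array([[8, 2],
                 [1, 9]])
a_acc, m_acc, m_iou = segmentation_metrics(conf)  # 0.85, 0.85, ~0.739
```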

Usage

from mmseg.apis import init_model, inference_model

config_file = 'upernet_alibi_vit_tiny_512x512_ade20k.py'
checkpoint_file = 'best_mIoU_iter_40000.pth'

# Initialize the model
model = init_model(config_file, checkpoint_file, device='cuda:0')

# Inference on an image
result = inference_model(model, 'demo.jpg')

Training Configuration

The model was trained with the following configuration:

  • Input size: 512x512
  • Training iterations: 40,000
  • Optimizer: AdamW
  • Learning rate scheduler: Polynomial decay
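Polynomial decay shrinks the learning rate from its base value toward a minimum as training progresses. A sketch of the schedule as commonly configured in MMSegmentation's PolyLR; the base learning rate of 6e-5 and power of 1.0 are assumptions for illustration, so check the actual config file for the values used here:

```python
def poly_lr(base_lr: float, it: int, max_iters: int,
            power: float = 1.0, min_lr: float = 0.0) -> float:
    """Polynomial learning-rate decay:
    lr falls from base_lr toward min_lr as (1 - it/max_iters)**power."""
    factor = (1 - it / max_iters) ** power
    return (base_lr - min_lr) * factor + min_lr

# Example over the 40,000-iteration schedule above (assumed base_lr=6e-5)
lrs = [poly_lr(6e-5, it, 40_000) for it in (0, 20_000, 40_000)]
# start: 6e-5, halfway: 3e-5, end: 0.0
```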

Citation

If you use this model, please cite:

@misc{alibi-vit-segmentation,
  author = {VLG IITR},
  title = {UPerNet with ALiBi-ViT for Semantic Segmentation},
  year = {2026},
  publisher = {Hugging Face},
}

License

This model is released under the Apache 2.0 license.
