# UPerNet with ALiBi-ViT Backbone for Semantic Segmentation
This is a UPerNet semantic segmentation model with an ALiBi-ViT (Vision Transformer with Attention with Linear Biases) backbone, trained on the ADE20K dataset.
## Model Description
- Architecture: UPerNet
- Backbone: ALiBi-ViT Tiny
- Dataset: ADE20K
- Task: Semantic Segmentation
- Framework: MMSegmentation
## Training Results
| Metric | Value |
|---|---|
| mIoU | 23.49% |
| mAcc | 32.01% |
| aAcc | 70.84% |
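For reference, the three reported metrics can be derived from a per-class confusion matrix. The sketch below uses a toy 3-class matrix (illustrative values, not from the actual ADE20K evaluation) to show how mIoU, mAcc, and aAcc are computed:

```python
import numpy as np

# cm[i, j] counts pixels of ground-truth class i predicted as class j.
# Toy 3-class matrix for illustration only.
cm = np.array([
    [50,  5,  5],
    [10, 30, 10],
    [ 5,  5, 40],
])

tp = np.diag(cm).astype(float)        # true positives per class
gt = cm.sum(axis=1).astype(float)     # ground-truth pixels per class
pred = cm.sum(axis=0).astype(float)   # predicted pixels per class

iou = tp / (gt + pred - tp)           # per-class intersection-over-union
acc = tp / gt                         # per-class pixel accuracy

miou = iou.mean()                     # mIoU: mean IoU over classes
macc = acc.mean()                     # mAcc: mean per-class accuracy
aacc = tp.sum() / cm.sum()            # aAcc: overall pixel accuracy
```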
## Usage
```python
from mmseg.apis import init_model, inference_model

config_file = 'upernet_alibi_vit_tiny_512x512_ade20k.py'
checkpoint_file = 'best_mIoU_iter_40000.pth'

# Initialize the model
model = init_model(config_file, checkpoint_file, device='cuda:0')

# Run inference on an image
result = inference_model(model, 'demo.jpg')
```
## Training Configuration
The model was trained with the following configuration:
- Input size: 512x512
- Training iterations: 40,000
- Optimizer: AdamW
- Learning rate scheduler: Polynomial decay
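The settings above correspond to the optimizer/schedule portion of an MMSegmentation (mmengine-style) config. The sketch below matches the stated AdamW optimizer, polynomial decay, 40,000 iterations, and 512x512 input; the specific `lr`, `weight_decay`, `eta_min`, and `val_interval` values are assumptions, not taken from the actual config file:

```python
# Hedged sketch of the optimizer/LR-schedule part of an mmengine-style
# MMSegmentation config. lr, weight_decay, eta_min, and val_interval
# are assumed values for illustration.
optimizer = dict(type='AdamW', lr=6e-5, weight_decay=0.01)
optim_wrapper = dict(type='OptimWrapper', optimizer=optimizer)

param_scheduler = [
    dict(
        type='PolyLR',      # polynomial learning-rate decay
        power=1.0,
        begin=0,
        end=40000,          # matches the 40,000 training iterations
        eta_min=0.0,
        by_epoch=False,
    ),
]

train_cfg = dict(type='IterBasedTrainLoop', max_iters=40000, val_interval=4000)
crop_size = (512, 512)      # input size from the card
```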
## Citation
If you use this model, please cite:
```bibtex
@misc{rope-vit-segmentation,
  author    = {VLG IITR},
  title     = {UPerNet with ALiBi-ViT for Semantic Segmentation},
  year      = {2026},
  publisher = {Hugging Face},
}
```
## License
This model is released under the Apache 2.0 license.