VOC Semantic Segmentation — EfficientSegNet (MobileNetV3-Large + LR-ASPP)

Semantic segmentation model trained on Pascal VOC 2012.
Architecture: MobileNetV3-Large backbone with an LR-ASPP head (COCO-pretrained backbone).
Optimised with Optuna TPE-sampler HPO.

Metrics

Metric	Value
Macro Dice	0.7645
FLOPs/image	7.3561e+08
Dice / GFLOPs	1.0392
Parameters	3.22M
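
The efficiency figure in the table above is the macro Dice score divided by the per-image compute in GFLOPs. The sketch below shows that arithmetic together with one common way to compute a macro-averaged Dice score. The function names are hypothetical, and the averaging convention (skipping classes absent from both prediction and ground truth) is an assumption, not something this model card specifies. Because the reported inputs are rounded, the recomputed ratio agrees with the table only to about three decimal places.

```python
def macro_dice(pred, target, num_classes):
    """Mean of per-class Dice scores over two flat sequences of integer labels."""
    scores = []
    for c in range(num_classes):
        p = sum(1 for x in pred if x == c)      # pixels predicted as class c
        t = sum(1 for y in target if y == c)    # pixels labelled as class c
        if p + t == 0:
            continue  # assumption: classes absent from both are skipped
        inter = sum(1 for x, y in zip(pred, target) if x == c and y == c)
        scores.append(2.0 * inter / (p + t))
    return sum(scores) / len(scores) if scores else 0.0


def dice_per_gflop(dice, flops_per_image):
    """Dice score divided by compute per image, expressed in GFLOPs."""
    return dice / (flops_per_image / 1e9)


# Reported values from the metrics table.
ratio = dice_per_gflop(0.7645, 7.3561e8)
```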

Best Hyperparameters (from Optuna HPO)

Hyperparameter	Value
—	N/A (direct training)

Training Details

Dataset: Pascal VOC 2012 train split, divided 80/20 into training and validation subsets
Image size: 300×300
Loss: Combined CrossEntropy + Dice Loss
Optimizer: AdamW + CosineAnnealingLR
AMP: Automatic mixed precision (PyTorch 2.x)
Augmentation: horizontal flip, rotation, Gaussian noise, blur, salt-and-pepper noise, brightness adjustment
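
The training details name a combined cross-entropy and Dice loss without giving its form. The following is a minimal sketch of one common formulation, not the training code behind this checkpoint. The class name `CombinedLoss`, the weighting `dice_weight`, the smoothing term `eps`, and the use of 255 as the ignored label (the void label in VOC annotations) are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CombinedLoss(nn.Module):
    """Cross-entropy plus a soft Dice term (hypothetical formulation)."""

    def __init__(self, num_classes=21, dice_weight=0.5, ignore_index=255, eps=1e-6):
        super().__init__()
        self.num_classes = num_classes
        self.dice_weight = dice_weight    # assumed weighting between the two terms
        self.ignore_index = ignore_index  # assumed: VOC void pixels are excluded
        self.eps = eps

    def forward(self, logits, target):
        # logits: (N, C, H, W) raw scores; target: (N, H, W) integer labels.
        ce = F.cross_entropy(logits, target, ignore_index=self.ignore_index)

        # Mask out ignored pixels before building one-hot targets.
        valid = target != self.ignore_index
        safe_target = torch.where(valid, target, torch.zeros_like(target))
        one_hot = F.one_hot(safe_target, self.num_classes).permute(0, 3, 1, 2).float()
        mask = valid.unsqueeze(1).float()

        probs = logits.softmax(dim=1) * mask
        one_hot = one_hot * mask

        # Soft Dice per class, accumulated over batch and spatial dimensions.
        dims = (0, 2, 3)
        inter = (probs * one_hot).sum(dims)
        denom = probs.sum(dims) + one_hot.sum(dims)
        dice = (2 * inter + self.eps) / (denom + self.eps)

        return ce + self.dice_weight * (1 - dice.mean())
```

Under these assumptions, `CombinedLoss()(logits, target)` returns a scalar that can be passed to `backward()` as usual.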

VOC Classes

background, aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor

Usage

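import torch
from model import EfficientSegNet
from config import Config

# Skip pretrained initialisation; the checkpoint supplies the trained weights.
model = EfficientSegNet(pretrained=False).to(Config.DEVICE)
# weights_only=True restricts unpickling to tensors and plain containers.
ckpt = torch.load("best_model.pth", map_location=Config.DEVICE, weights_only=True)
model.load_state_dict(ckpt["model_state"])
model.eval()  # inference mode: disables dropout, uses BatchNorm running statistics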
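
Once the model is loaded, its output still has to be turned into a label map. The sketch below shows one way to do that, assuming the forward pass returns raw logits of shape (N, 21, H, W) in the class order listed under VOC Classes. The helper name `summarize_prediction` is hypothetical and not part of this repository.

```python
import torch

VOC_CLASSES = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
    "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
    "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def summarize_prediction(logits):
    """Return the per-pixel class-index mask and the names of classes present."""
    mask = logits.argmax(dim=1)            # (N, H, W) class indices
    present = torch.unique(mask).tolist()  # sorted indices that occur in the mask
    return mask, [VOC_CLASSES[i] for i in present]
```

A typical call would wrap the forward pass in `torch.no_grad()` and pass the result to `summarize_prediction`. If the model instead returns a dictionary, as torchvision's segmentation models do under the key `"out"`, select the logits tensor first.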

Trained: 2026-03-07 19:25
