TUNI: Real-Time RGB-T Semantic Segmentation

Paper: TUNI: Real-time RGB-T Semantic Segmentation with Unified Multi-Modal Feature Extraction and Cross-Modal Feature Fusion
Venue: ICRA 2026
Authors: Xiaodong Guo et al.

Model

  • Architecture: TUNI (Conv + Attention hybrid) with unified RGB-T encoder
  • Variant: 384_2242 (dims=[48,96,192,384], depths=[2,2,4,2])
  • Parameters: 10.6M
  • Input: RGB (3×480×640) + Thermal (3×480×640)
  • Output: Segmentation mask (480×640)
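As a rough sanity check on the figures above, the following sketch estimates how much storage the raw weights need at each precision and how many pixels one output mask contains. It assumes "10.6M" means 10.6 × 10^6 parameters and counts weights only; actual checkpoint, ONNX, or TensorRT file sizes will differ because those files also carry metadata, graph structure, or engine-specific data. The helper name weight_megabytes is illustrative and not part of this repository.

```python
# Back-of-the-envelope arithmetic from the Model section (weights only).
N_PARAMS = 10_600_000  # 10.6M parameters, assumed to mean 10.6 * 10^6


def weight_megabytes(n_params: int, bytes_per_param: int) -> float:
    """Storage for the raw weights in megabytes (1 MB = 10^6 bytes)."""
    return n_params * bytes_per_param / 1e6


fp32_mb = weight_megabytes(N_PARAMS, 4)  # 4 bytes per FP32 value
fp16_mb = weight_megabytes(N_PARAMS, 2)  # 2 bytes per FP16 value
pixels_per_mask = 480 * 640              # one class index per output pixel

print(f"FP32 weights: ~{fp32_mb:.1f} MB")
print(f"FP16 weights: ~{fp16_mb:.1f} MB")
print(f"Pixels per mask: {pixels_per_mask}")
```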

Performance

Dataset   Classes   mIoU    Pixel Acc
FMB       15        62.4%   91.4%
PST900    5         TBD     TBD
CART      12        TBD     TBD
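For readers unfamiliar with the two metrics in the table, here is a minimal sketch of how mIoU and pixel accuracy are commonly computed from a confusion matrix. It is a generic illustration, not the evaluation code behind the numbers above; the exact protocol used for FMB (for example, whether an unlabeled class is ignored) is not stated here and may differ. The function names and the toy data are hypothetical.

```python
def confusion_matrix(pred, target, n_classes):
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def pixel_accuracy(cm):
    """Fraction of pixels whose predicted class matches the ground truth."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def mean_iou(cm):
    """Average of per-class IoU = TP / (TP + FP + FN)."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example: six flattened pixels, three classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(pred, target, n_classes=3)
print(cm, pixel_accuracy(cm), mean_iou(cm))
```

In the toy example, four of six pixels are correct, and the per-class IoUs are 1/3, 2/3, and 1/2.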

Available Formats

  • tuni_fmb.pth: PyTorch state dict
  • tuni_fmb.safetensors: Safetensors format
  • tuni_fmb.onnx + tuni_fmb.onnx.data: ONNX (opset 18)
  • tuni_fmb_fp16.trt: TensorRT FP16 engine
  • tuni_fmb_fp32.trt: TensorRT FP32 engine

Usage

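import torch
from def_tuni.model import TUNIModel

# Build the FMB variant (15 classes) and load the PyTorch state dict.
model = TUNIModel(variant="384_2242", n_classes=15)
model.load_checkpoint("tuni_fmb.pth")
model.eval().cuda()

# Inference: rgb_tensor and thermal_tensor are tensors of shape
# (N, 3, 480, 640) that must already be on the GPU.
with torch.no_grad():
    pred = model(rgb_tensor, thermal_tensor)
segmentation = pred.argmax(dim=1)  # (N, 480, 640) class indices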
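The final argmax step can be illustrated without PyTorch. The sketch below assumes the model returns per-class scores laid out as (classes, height, width) for a single image, which is what argmax over dim=1 on a batched output implies; it then picks the highest-scoring class at each pixel. The function name and the tiny 3-class, 2×2 example are hypothetical and chosen only for readability.

```python
def argmax_over_classes(scores):
    """scores: nested list [C][H][W] for one image; returns [H][W] class indices."""
    n_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    return [
        [max(range(n_classes), key=lambda c: scores[c][y][x]) for x in range(width)]
        for y in range(height)
    ]


# Toy scores for 3 classes on a 2x2 image.
scores = [
    [[2.0, 0.1], [0.3, 0.2]],  # class 0
    [[0.5, 1.5], [0.1, 0.9]],  # class 1
    [[0.1, 0.2], [1.2, 0.4]],  # class 2
]
mask = argmax_over_classes(scores)
print(mask)
```

On real outputs, pred.argmax(dim=1) performs the same selection for every image in the batch at once.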

ANIMA Module

Part of the ANIMA Defense Module ecosystem. Built with ANIMA by Robot Flow Labs.
