DEF-rgbtcc: RGB-T Crowd Counting

Dual-Modulation Framework for RGB-T Crowd Counting via Spatially Modulated Attention and Adaptive Fusion.

Paper: arXiv:2509.17079

Architecture

  • Backbone: Shared VGG-19
  • Encoder: Spatially Modulated Attention (SMA) Transformer
  • Fusion: Adaptive Cross-Modal Fusion (ACMF)
  • Output: Density map regression
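Because the output is a density map, the crowd count is conventionally obtained by summing the map. The following is a minimal sketch in plain Python, assuming the map is a 2D grid of non-negative per-cell densities. The helper name `count_from_density` and the toy values are illustrative and not part of this repository.

```python
def count_from_density(density_map):
    """Return the estimated count as the sum over all cells of a 2D density map."""
    return sum(sum(row) for row in density_map)


# Toy 3x4 density map (hypothetical values). Each cell holds a fractional
# share of a person, so the total need not be an integer.
toy_map = [
    [0.0, 0.25, 0.25, 0.0],
    [0.5, 1.0, 0.5, 0.0],
    [0.0, 0.25, 0.25, 0.0],
]

estimated_count = count_from_density(toy_map)
print(f"Count: {estimated_count:.1f}")
```

A count produced this way is generally fractional, which is why it is usually reported with a decimal place rather than as an integer.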

Available Formats

  • model.pth: PyTorch state dict
  • model.safetensors: SafeTensors format
  • model.onnx: ONNX (opset 17)
  • model_fp16.trt: TensorRT FP16
  • model_fp32.trt: TensorRT FP32
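Each file above targets a different runtime, so a deployment script may want to pick the runtime from the file suffix. The sketch below uses only the standard library. The mapping and the `runtime_for` helper are hypothetical illustrations, not an API of this repository.

```python
from pathlib import Path

# Runtime implied by each file suffix listed above. This mapping is an
# assumption made for illustration; it is not shipped with the model.
RUNTIME_BY_SUFFIX = {
    ".pth": "pytorch",
    ".safetensors": "safetensors",
    ".onnx": "onnx",
    ".trt": "tensorrt",
}


def runtime_for(filename):
    """Return the runtime name for a model file, based on its suffix."""
    suffix = Path(filename).suffix.lower()
    try:
        return RUNTIME_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"Unsupported model format: {filename!r}") from None


for name in ["model.pth", "model.safetensors", "model.onnx", "model_fp16.trt"]:
    print(f"{name} -> {runtime_for(name)}")
```

Raising on an unknown suffix, rather than falling back to a default runtime, keeps a mistyped filename from silently loading with the wrong backend.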

Usage

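from def_rgbtcc.serve import RGBTCCInference

# Load the PyTorch checkpoint.
model = RGBTCCInference("model.pth")

# rgb_image and thermal_image are the paired RGB and thermal frames of the
# same scene; load them before calling predict().
result = model.predict(rgb_image, thermal_image)
print(f"Count: {result['count']:.1f}")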

ANIMA Module

Part of the ANIMA Defense Module ecosystem (Wave 8). Products: ORACLE, ATLAS, NEMESIS.

