---
license: apache-2.0
library_name: pytorch
pipeline_tag: object-detection
tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
  - object-detection
  - detrs
  - dinov3
---

# DEIMv2: Real-Time Object Detection Meets DINOv3

DEIMv2 is an evolution of the DEIM framework that leverages features from DINOv3. It spans eight model sizes (from Atto to X), covering GPU, edge, and mobile deployment scenarios. DEIMv2 achieves state-of-the-art results by combining DINOv3-pretrained backbones with a Spatial Tuning Adapter (STA) for larger models, and using pruned HGNetv2 for ultra-lightweight variants.

## Model Zoo (COCO)

| Model | AP   | #Params | GFLOPs |
|-------|------|---------|--------|
| Atto  | 23.8 | 0.5M    | 0.8    |
| Femto | 31.0 | 1.0M    | 1.7    |
| Pico  | 38.5 | 1.5M    | 5.2    |
| N     | 43.0 | 3.6M    | 6.8    |
| S     | 50.9 | 9.7M    | 25.6   |
| M     | 53.0 | 18.1M   | 52.2   |
| L     | 56.0 | 32.2M   | 96.7   |
| X     | 57.8 | 50.3M   | 151.6  |

## Usage

This model can be loaded through the `PyTorchModelHubMixin` integration. Make sure the necessary components from the official DEIMv2 repository are on your Python path.

```python
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

from engine.backbone import HGNetv2, DINOv3STAs
from engine.deim import HybridEncoder, LiteEncoder
from engine.deim import DFINETransformer, DEIMTransformer
from engine.deim.postprocessor import PostProcessor

class DEIMv2(nn.Module, PyTorchModelHubMixin):
    def __init__(self, config):
        super().__init__()
        # Select the backbone based on the configuration: pruned HGNetv2
        # for the ultra-lightweight variants, a DINOv3 backbone with
        # Spatial Tuning Adapters otherwise.
        if "HGNetv2" in config:
            self.backbone = HGNetv2(**config["HGNetv2"])
        else:
            self.backbone = DINOv3STAs(**config["DINOv3STAs"])

        # Likewise, instantiate whichever encoder/decoder variant
        # the configuration names.
        if "LiteEncoder" in config:
            self.encoder = LiteEncoder(**config["LiteEncoder"])
        else:
            self.encoder = HybridEncoder(**config["HybridEncoder"])

        if "DFINETransformer" in config:
            self.decoder = DFINETransformer(**config["DFINETransformer"])
        else:
            self.decoder = DEIMTransformer(**config["DEIMTransformer"])

        self.postprocessor = PostProcessor(**config["PostProcessor"])

    def forward(self, x, orig_target_sizes):
        x = self.backbone(x)
        x = self.encoder(x)
        x = self.decoder(x)
        # Map raw predictions back to the original image sizes.
        return self.postprocessor(x, orig_target_sizes)

# Load the model from the Hub.
# Replace the model ID with the specific variant you wish to use.
model = DEIMv2.from_pretrained("Intellindust/DEIMv2_DINOv3_S_COCO")
```

## Citation

```bibtex
@article{huang2025deimv2,
  title={Real-Time Object Detection Meets DINOv3},
  author={Huang, Shihua and Hou, Yongjie and Liu, Longfei and Yu, Xuanlong and Shen, Xi},
  journal={arXiv preprint arXiv:2509.20787},
  year={2025}
}
```