---
license: apache-2.0
library_name: pytorch
pipeline_tag: object-detection
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
- object-detection
- detrs
- dinov3
---
DEIMv2: Real-Time Object Detection Meets DINOv3
DEIMv2 extends the DEIM framework with features from DINOv3. It comes in eight model sizes, from Atto to X, covering GPU, edge, and mobile deployment scenarios. The larger models combine DINOv3-pretrained backbones with a Spatial Tuning Adapter (STA), while the ultra-lightweight variants use a pruned HGNetv2 backbone. With this design DEIMv2 achieves state-of-the-art results: on COCO, the family ranges from 23.8 AP with 0.5M parameters (Atto) to 57.8 AP with 50.3M parameters (X).
- Paper: Real-Time Object Detection Meets DINOv3
- Repository: GitHub - DEIMv2
- Project Page: DEIMv2 Project
Model Zoo (COCO)
| Model | AP | #Params | GFLOPs |
|---|---|---|---|
| Atto | 23.8 | 0.5M | 0.8 |
| Femto | 31.0 | 1.0M | 1.7 |
| Pico | 38.5 | 1.5M | 5.2 |
| N | 43.0 | 3.6M | 6.8 |
| S | 50.9 | 9.7M | 25.6 |
| M | 53.0 | 18.1M | 52.2 |
| L | 56.0 | 32.2M | 96.7 |
| X | 57.8 | 50.3M | 151.6 |
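When choosing a variant for a given deployment target, the table can be treated as a small lookup. The sketch below is illustrative only: `MODEL_ZOO` and `best_variant` are hypothetical names that are not part of the DEIMv2 repository, and the numbers are copied from the table above.

```python
# Each row: (name, COCO AP, parameters in millions, GFLOPs),
# copied from the Model Zoo table. The names below are illustrative only.
MODEL_ZOO = [
    ("Atto", 23.8, 0.5, 0.8),
    ("Femto", 31.0, 1.0, 1.7),
    ("Pico", 38.5, 1.5, 5.2),
    ("N", 43.0, 3.6, 6.8),
    ("S", 50.9, 9.7, 25.6),
    ("M", 53.0, 18.1, 52.2),
    ("L", 56.0, 32.2, 96.7),
    ("X", 57.8, 50.3, 151.6),
]


def best_variant(max_params_m=None, max_gflops=None):
    """Return the highest-AP row that fits both budgets, or None if none fits."""
    candidates = [
        row
        for row in MODEL_ZOO
        if (max_params_m is None or row[2] <= max_params_m)
        and (max_gflops is None or row[3] <= max_gflops)
    ]
    return max(candidates, key=lambda row: row[1], default=None)


# Largest-AP model with at most 10M parameters.
print(best_variant(max_params_m=10))
# Largest-AP model within a 6 GFLOPs compute budget.
print(best_variant(max_gflops=6.0))
```

Parameter count and GFLOPs are only proxies for latency, so a choice made this way is best confirmed by measuring on the target hardware.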
Usage
This model can be loaded through the PyTorchModelHubMixin integration in `huggingface_hub`. The `engine` package from the official DEIMv2 repository must be importable, so make sure the repository is on your Python path.
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

# These modules come from the official DEIMv2 repository
from engine.backbone import HGNetv2, DINOv3STAs
from engine.deim import HybridEncoder, LiteEncoder
from engine.deim import DFINETransformer, DEIMTransformer
from engine.deim.postprocessor import PostProcessor


class DEIMv2(nn.Module, PyTorchModelHubMixin):
    def __init__(self, config):
        super().__init__()
        # Select backbone based on the configuration
        if "HGNetv2" in config:
            self.backbone = HGNetv2(**config["HGNetv2"])
        else:
            self.backbone = DINOv3STAs(**config["DINOv3STAs"])
        self.encoder = HybridEncoder(**config["HybridEncoder"])
        self.decoder = DEIMTransformer(**config["DEIMTransformer"])
        self.postprocessor = PostProcessor(**config["PostProcessor"])

    def forward(self, x, orig_target_sizes):
        # Backbone -> encoder -> decoder -> post-processing to the original image sizes
        x = self.backbone(x)
        x = self.encoder(x)
        x = self.decoder(x)
        x = self.postprocessor(x, orig_target_sizes)
        return x


# Load the model from the Hub
# Replace the model ID with the specific variant you wish to use
model = DEIMv2.from_pretrained("Intellindust/DEIMv2_DINOv3_S_COCO")
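The `engine` imports in the snippet above only resolve if a checkout of the DEIMv2 repository is on the Python path. One possible way to arrange this, shown as a minimal sketch, is to prepend the checkout to `sys.path` before running those imports. The helper name `add_repo_to_path` and the `./DEIMv2` location are assumptions for illustration; point the path at wherever you cloned the repository.

```python
import sys
from pathlib import Path


def add_repo_to_path(repo_dir: str) -> str:
    """Prepend a local DEIMv2 checkout to sys.path so `import engine` can resolve.

    `repo_dir` is a hypothetical location; replace it with your own clone.
    """
    resolved = str(Path(repo_dir).expanduser().resolve())
    # Avoid adding the same entry twice if the helper is called repeatedly.
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Assumed clone location, used purely for illustration.
repo_path = add_repo_to_path("./DEIMv2")
```

Setting the `PYTHONPATH` environment variable to the same directory before starting Python has an equivalent effect.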
Citation
@article{huang2025deimv2,
title={Real-Time Object Detection Meets DINOv3},
author={Huang, Shihua and Hou, Yongjie and Liu, Longfei and Yu, Xuanlong and Shen, Xi},
journal={arXiv preprint arXiv:2509.20787},
year={2025}
}