---
license: apache-2.0
tags:
- object-detection
- tensorrt
- onnx
- pytorch
- real-time
datasets:
- coco
library_name: transformers
pipeline_tag: object-detection
---

# DEIMv2 - Real-Time Object Detection Meets DINOv3

Pre-trained DEIMv2 models with PyTorch checkpoints, ONNX exports, and TensorRT FP16 engines.

## Model Zoo

| Model | AP | Params | GFLOPs | Checkpoint | ONNX | TensorRT |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| **Atto** | 23.8 | 0.5M | 0.8 | ✅ | ✅ | ✅ |
| **Femto** | 31.0 | 1.0M | 1.7 | ✅ | ✅ | ✅ |
| **Pico** | 38.5 | 1.5M | 5.2 | ✅ | ✅ | ✅ |
| **N** | 43.0 | 3.6M | 6.8 | ✅ | ✅ | ✅ |
| **S** | 50.9 | 9.7M | 25.6 | ✅ | ✅ | ✅ |
| **M** | 53.0 | 18.1M | 52.2 | ✅ | ✅ | ✅ |
| **L** | 56.0 | 32.2M | 96.7 | ✅ | ✅ | ✅ |
| **X** | 57.8 | 50.3M | 151.6 | ✅ | ✅ | ✅ |

## Files

- `*.pth` - PyTorch checkpoints (EMA weights)
- `*.onnx` - ONNX models (opset 17, dynamic batch)
- `*.engine` - TensorRT FP16 engines (built on RTX 4090, TensorRT 10.14)

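
The three formats appear to share a file stem and differ only in extension, as the `deimv2_dinov3_s_coco` examples in the Usage section suggest. A minimal sketch of a helper that derives all three names from one stem; `artifact_names` is a hypothetical name, and stems for other model sizes should be checked against the repository's file list rather than guessed.

```python
# Hypothetical helper: map one file stem to the three published formats.
EXTENSIONS = {"pytorch": ".pth", "onnx": ".onnx", "tensorrt": ".engine"}


def artifact_names(stem):
    """Return {format: filename} for a given checkpoint stem."""
    return {fmt: stem + ext for fmt, ext in EXTENSIONS.items()}


names = artifact_names("deimv2_dinov3_s_coco")
print(names["onnx"])  # deimv2_dinov3_s_coco.onnx
```
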
## Input Shapes

| Model | Input Size |
|:---:|:---:|
| Atto | 320x320 |
| Femto | 416x416 |
| Pico, N, S, M, L, X | 640x640 |

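
The input sizes in the table above can be kept in a small lookup so that preprocessing code picks the right resolution per model. A minimal sketch, assuming a 3-channel image in batch-first `(N, C, H, W)` layout; `INPUT_SIZES` and `input_shape` are hypothetical names, not part of the DEIMv2 code base.

```python
# Square input side length per model size, copied from the Input Shapes table.
INPUT_SIZES = {
    "atto": 320,
    "femto": 416,
    "pico": 640, "n": 640, "s": 640, "m": 640, "l": 640, "x": 640,
}


def input_shape(model, batch=1):
    """Return the assumed (batch, channels, height, width) shape for a model size."""
    side = INPUT_SIZES[model.lower()]  # raises KeyError for unknown sizes
    return (batch, 3, side, side)


print(input_shape("Femto"))       # (1, 3, 416, 416)
print(input_shape("S", batch=4))  # (4, 3, 640, 640)
```
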
## Usage

### PyTorch
```python
from huggingface_hub import hf_hub_download
import torch

# Download checkpoint
ckpt_path = hf_hub_download("carpedm20/DEIMv2", "deimv2_dinov3_s_coco.pth")
checkpoint = torch.load(ckpt_path, map_location='cpu')
# The EMA weights are stored under the 'ema' -> 'module' keys
state_dict = checkpoint['ema']['module']
```

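
One quick sanity check after loading is to count the elements in the state dict and compare the total against the Params column in the Model Zoo table. A minimal sketch, assuming `state_dict` maps names to tensors; `count_elements` and `format_millions` are hypothetical helpers. Because a state dict also stores buffers (for example batch-norm running statistics), the total may come out slightly above the published figure.

```python
def count_elements(state_dict):
    """Sum the element counts of every tensor-like value in a state dict."""
    return sum(v.numel() for v in state_dict.values() if hasattr(v, "numel"))


def format_millions(n):
    """Format a raw count in the style of the Model Zoo table, e.g. '9.7M'."""
    return f"{n / 1e6:.1f}M"
```

With the checkpoint loaded as shown in the PyTorch snippet, `format_millions(count_elements(state_dict))` gives a figure that can be compared with the table.
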
### ONNX Runtime
```python
import onnxruntime as ort
from huggingface_hub import hf_hub_download

onnx_path = hf_hub_download("carpedm20/DEIMv2", "deimv2_dinov3_s_coco.onnx")
# CPU provider shown; with onnxruntime-gpu, list "CUDAExecutionProvider" first
session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
```

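
Before running inference it helps to inspect what the exported graph expects, since the ONNX files are exported with a dynamic batch dimension. A minimal sketch that turns the shapes reported by `session.get_inputs()` into concrete ones; `concrete_shape` and `describe_inputs` are hypothetical helpers, and the input names and count are read from the session rather than assumed.

```python
def concrete_shape(shape, batch=1, fallback=640):
    """Replace symbolic or unknown dims with concrete integers.

    ONNX Runtime reports dynamic dims as strings or None. The first dim is
    treated as the batch; any other dynamic dim is set to `fallback`.
    """
    dims = []
    for index, dim in enumerate(shape):
        if isinstance(dim, int) and dim > 0:
            dims.append(dim)
        else:
            dims.append(batch if index == 0 else fallback)
    return tuple(dims)


def describe_inputs(session, batch=1, fallback=640):
    """Map each input name to its (element type, concrete shape)."""
    return {
        inp.name: (inp.type, concrete_shape(inp.shape, batch, fallback))
        for inp in session.get_inputs()
    }
```

Calling `describe_inputs(session)` on an `InferenceSession` lists every input; arrays with those shapes and element types can then be passed as the feed dictionary to `session.run(None, feeds)`.
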
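### TensorRT
```python
import tensorrt as trt
from huggingface_hub import hf_hub_download

engine_path = hf_hub_download("carpedm20/DEIMv2", "deimv2_dinov3_s_coco.engine")

# Load engine with TensorRT runtime (needs a compatible TensorRT version and GPU)
logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open(engine_path, "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
```
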
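
Once an engine file has been deserialized with `trt.Runtime(...).deserialize_cuda_engine(...)`, its input and output tensors can be listed before any buffers are allocated. A minimal sketch using the name-based tensor API found in recent TensorRT releases; `list_io_tensors` is a hypothetical helper and `engine` is assumed to be an `ICudaEngine`.

```python
def list_io_tensors(engine):
    """Return (name, mode, shape) for every input/output tensor of an engine."""
    rows = []
    for index in range(engine.num_io_tensors):
        name = engine.get_tensor_name(index)
        mode = str(engine.get_tensor_mode(name))      # input or output
        shape = tuple(engine.get_tensor_shape(name))  # dynamic dims show as -1
        rows.append((name, mode, shape))
    return rows
```

Printing the rows is a quick way to confirm that the engine's expected input resolution matches the Input Shapes table for the chosen model.
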
## Citation

```bibtex
@article{huang2025deimv2,
  title={Real-Time Object Detection Meets DINOv3},
  author={Huang, Shihua and Hou, Yongjie and Liu, Longfei and Yu, Xuanlong and Shen, Xi},
  journal={arXiv},
  year={2025}
}
```

## License

Apache 2.0 - See [DEIMv2 GitHub](https://github.com/Intellindust-AI-Lab/DEIMv2) for details.