Model Card for RFDETR-Medium - Face Detection Finetune

This model is a fine-tuned version of RFDETR-Medium (Real-time Detection Transformer) specifically optimized for the detection of human faces. It leverages a DINOv2 backbone for high-quality feature extraction and a transformer-based head for NMS-free, end-to-end detection.

Model Details

Model Description

Model type: Object Detection
Task: Human Face Detection
Finetuned from model: RFDETR-Medium
Language(s): N/A (Computer Vision)
License: Apache 2.0

Training Details

Training Hyperparameters

The following hyperparameters were used during the fine-tuning process:

Epochs: 20
Batch Size: 16
Learning Rate: $1 \times 10^{-4}$
Input Image Size: $576 \times 576$
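
As a rough illustration, the settings above could be passed to the `rfdetr` training entry point as sketched below. This is a minimal sketch, not the exact script used for this model: the function name `finetune` and both path arguments are hypothetical, it assumes the dataset has already been converted to a layout the library accepts, and any setting not listed in this card (for example gradient accumulation) is left at the library default.

```python
# Hyperparameters exactly as listed in this card.
TRAIN_KWARGS = {"epochs": 20, "batch_size": 16, "lr": 1e-4}
INPUT_RESOLUTION = 576  # square input, 576 x 576


def finetune(dataset_dir: str, output_dir: str) -> None:
    """Fine-tune RFDETR-Medium with the card's hyperparameters.

    `dataset_dir` and `output_dir` are hypothetical paths supplied by the caller.
    """
    # Imported lazily so this file can be read and checked without rfdetr installed.
    from rfdetr import RFDETRMedium

    model = RFDETRMedium(resolution=INPUT_RESOLUTION)
    model.train(dataset_dir=dataset_dir, output_dir=output_dir, **TRAIN_KWARGS)
```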

Evaluation Results

mAP@50: 0.9
mAP@50:95: 0.6
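
For readers unfamiliar with these metrics: mAP@50 counts a predicted box as correct when its intersection over union (IoU) with a ground-truth face is at least 0.50, while the COCO-style 50:95 variant averages over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, so it rewards tighter boxes. The sketch below only illustrates the IoU test behind those thresholds; it is not the evaluation code used to produce the numbers above, and the helper names are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# COCO-style thresholds: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, truth_box):
    """Return the IoU thresholds at which the prediction counts as a match."""
    score = iou(pred_box, truth_box)
    return [t for t in IOU_THRESHOLDS if score >= t]
```

A prediction that overlaps a face loosely may pass the 0.50 threshold yet fail the stricter ones, which is why the 50:95 score is lower than the 50 score.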

Training Data

The model was trained on the Face Detection Dataset (Kaggle), which contains approximately 16,700 images with bounding box annotations for human faces.

Model Sources

Repository: Roboflow RF-DETR GitHub
Dataset: Face Detection Dataset (Kaggle)

Uses

Direct Use

This model is designed for real-time human face detection in images and video streams.

How to Get Started with the Model

from rfdetr import RFDETRMedium

# Load your fine-tuned weights
model = RFDETRMedium(device="cuda", pretrain_weights="rfdetr_medium_face.pth")
model.optimize_for_inference()

# Run inference
results = model.predict("input_image.jpg", threshold=0.5)
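
To hand the detections to downstream code, the boxes can be flattened into plain dictionaries. The helper below is a sketch under one assumption: that `predict` returns a `supervision.Detections` object whose `xyxy` and `confidence` attributes hold corner-format boxes and scores. The name `to_records` is hypothetical and not part of `rfdetr`.

```python
def to_records(xyxy, confidence):
    """Convert corner-format boxes and scores into a list of plain dicts.

    `xyxy` is an iterable of (x_min, y_min, x_max, y_max) rows and
    `confidence` an iterable of scores of the same length.
    """
    records = []
    for (x1, y1, x2, y2), score in zip(xyxy, confidence):
        records.append({
            "x": float(x1),
            "y": float(y1),
            "width": float(x2) - float(x1),
            "height": float(y2) - float(y1),
            "score": float(score),
        })
    return records
```

With the snippet above, this would be called as `to_records(results.xyxy, results.confidence)`.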
