metadata
license: apache-2.0
tags:
- computer-vision
- object-detection
- rf-detr
- face-detection
datasets:
- fareselmenshawii/face-detection-dataset
metrics:
- mAP
Model Card for RFDETR-Medium - Face Detection Finetune
This model is a fine-tuned version of RFDETR-Medium, a real-time detection transformer from Roboflow, optimized for detecting human faces. It uses a DINOv2 backbone for feature extraction and a transformer-based detection head that predicts boxes end to end, so no non-maximum suppression (NMS) step is required.
Model Details
Model Description
- Model type: Object Detection
- Task: Human Face Detection
- Finetuned from model: RFDETR-Medium
- Language(s): N/A (Computer Vision)
- License: Apache 2.0
Training Details
Training Hyperparameters
The following hyperparameters were used during the fine-tuning process:
- Epochs: 20
- Batch Size: 16
- Learning Rate: $1 \times 10^{-4}$
- Input Image Size: $576 \times 576$
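A fine-tuning run with these settings might look roughly like the sketch below. It assumes that the `train` method of the `rfdetr` package accepts `dataset_dir`, `epochs`, `batch_size`, `lr`, and `output_dir`, and that the model constructor accepts `resolution`. The helper names and directory paths are hypothetical, and the dataset is assumed to have already been converted to a format the library can read.

```python
# Hyperparameters as listed in this model card.
HYPERPARAMS = {
    "epochs": 20,
    "batch_size": 16,
    "lr": 1e-4,
}
RESOLUTION = 576  # square input size in pixels


def build_train_kwargs(dataset_dir, output_dir):
    """Collect keyword arguments for a fine-tuning call (paths are placeholders)."""
    return {"dataset_dir": dataset_dir, "output_dir": output_dir, **HYPERPARAMS}


def finetune(dataset_dir, output_dir):
    """Hypothetical wrapper that starts a fine-tuning run."""
    # Imported lazily so the helpers above work without rfdetr installed.
    from rfdetr import RFDETRMedium

    # Assumption: `resolution` is accepted by the model constructor.
    model = RFDETRMedium(resolution=RESOLUTION)
    model.train(**build_train_kwargs(dataset_dir, output_dir))
```

Calling `finetune("face_dataset", "runs/rfdetr_medium_face")` with real paths would start training; both directory names here are placeholders.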
Evaluation Results
- mAP@50: 0.9
- mAP@50:95: 0.6
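The first metric counts a prediction as correct when its intersection over union (IoU) with a ground-truth face is at least 0.5. The second follows the COCO convention of averaging over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below illustrates only that matching criterion; it is not the evaluation code used for this model, and the example boxes are made up.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# COCO-style thresholds: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]

# Made-up example: a prediction shifted 10 px right of a 100x100 ground-truth box.
ground_truth = (0, 0, 100, 100)
prediction = (10, 0, 110, 100)

overlap = iou(prediction, ground_truth)
matched_at = [t for t in IOU_THRESHOLDS if overlap >= t]
print(f"IoU = {overlap:.3f}, counts as a match at thresholds {matched_at}")
```

In this example the IoU is about 0.818, so the prediction matches at the seven thresholds up to 0.80 and fails at 0.85 and above. Small localization errors like this are why the averaged metric is typically lower than the one at a single 0.5 threshold.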
Training Data
The model was trained on the Face Detection Dataset (Kaggle), which contains approximately 16,700 images with bounding box annotations for human faces.
Model Sources
- Repository: Roboflow RF-DETR GitHub
- Dataset: Face Detection Dataset (Kaggle)
Uses
Direct Use
This model is designed for real-time human face detection in images and video streams.
How to Get Started with the Model
from rfdetr import RFDETRMedium
# Load your fine-tuned weights
model = RFDETRMedium(device="cuda", pretrain_weights="rfdetr_medium_face.pth")
model.optimize_for_inference()
# Run inference
results = model.predict("input_image.jpg", threshold=0.5)
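For downstream use such as cropping or blurring faces, the raw output usually needs to be turned into integer pixel boxes. The sketch below assumes that `predict` returns a `supervision.Detections`-style object exposing `xyxy` and `confidence` arrays. The helper `to_face_boxes` is hypothetical, and the example values are stand-ins so the snippet runs without a model.

```python
def to_face_boxes(xyxy, confidence, image_width, image_height, min_confidence=0.5):
    """Convert raw detections to integer pixel boxes clipped to the image.

    xyxy: iterable of (x1, y1, x2, y2) floats; confidence: iterable of floats.
    Returns a list of dicts sorted by descending confidence.
    """
    faces = []
    for (x1, y1, x2, y2), score in zip(xyxy, confidence):
        if score < min_confidence:
            continue
        # Clip to image bounds so the box can be used directly for cropping.
        left = max(0, min(int(round(x1)), image_width))
        top = max(0, min(int(round(y1)), image_height))
        right = max(0, min(int(round(x2)), image_width))
        bottom = max(0, min(int(round(y2)), image_height))
        # Skip boxes that collapse to zero area after clipping.
        if right > left and bottom > top:
            faces.append({"box": (left, top, right, bottom), "score": float(score)})
    return sorted(faces, key=lambda f: f["score"], reverse=True)


# Stand-in values for results.xyxy and results.confidence (hypothetical numbers).
example_xyxy = [(12.4, 30.7, 96.2, 140.1), (600.0, 10.0, 700.0, 90.0), (5.0, 5.0, 20.0, 20.0)]
example_conf = [0.91, 0.75, 0.32]
faces = to_face_boxes(example_xyxy, example_conf, image_width=640, image_height=480)
for face in faces:
    print(face["box"], face["score"])
```

With real output, the call would be along the lines of `to_face_boxes(results.xyxy, results.confidence, width, height)`, where `width` and `height` are the dimensions of the input image.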