| --- |
| license: mit |
| language: |
| - en |
| base_model: |
| - qualcomm/RF-DETR |
| pipeline_tag: object-detection |
| tags: |
| - surveillance |
| - Threat_detection |
| --- |
| |
| # RF-DETR based Threat Detection Model |
|
|
| <a href="https://opensource.org/licenses/MIT"> |
| <img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License"> |
| </a> |
| <a href="https://github.com/roboflow/rf-detr"> |
| <img src="https://img.shields.io/badge/RF--DETR-Nano-purple?logo=roboflow&logoColor=white" alt="Model"> |
| </a> |
| <a href="#performance-metrics"> |
| <img src="https://img.shields.io/badge/mAP%4050-84.8%25-darkgreen?style=flat" alt="mAP"> |
| </a> |
| <a href="https://github.com/subh-775/Threat_Detection_YOLO-vs-RF-DETR"> |
| <img src="https://img.shields.io/badge/-code-black?logo=github" alt="Code"> |
| </a> |
| |
|
|
| ## Transformers for Object Detection |
|
|
| CNNs have traditionally dominated real-time object detection because of their fast inference. **RF-DETR** (Roboflow's Detection Transformer) is a transformer-based architecture that targets the same real-time setting, and Roboflow reports that it matches or exceeds comparable CNN detectors in accuracy while keeping inference fast. |
|
|
| This repository contains a **fine-tuned RF-DETR Nano model** specifically trained for **threat detection**, capable of identifying four critical threat categories with high precision and speed. |
|
|
| ## Predicted Results |
|  |
|
|
| ### Video Inference |
| <video muted autoplay loop controls src="https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/5Kt3KghZaanzOVaVB6JS9.mp4" width=800></video> |
|
|
|
|
| ## Model Overview |
|
|
| **RF-DETR Threat Detection** is a specialized computer vision model designed for security and surveillance applications. Built on Roboflow's cutting-edge RF-DETR architecture, this model can accurately detect and classify potential threats in real-time scenarios. |
|
|
| The model detects four threat categories: |
|
|
| | Class ID | Threat Type | Description | |
| |----------|-------------|-------------| |
| | 1 | **Gun** | Firearms of any type, including pistols and rifles | |
| | 2 | **Explosive** | Fire, explosion scenarios, and explosive devices | |
| | 3 | **Grenade** | Hand grenades and similar explosive devices | |
| | 4 | **Knife** | Bladed weapons including knives, daggers, and sharp objects | |
|
|
| ## Training Dataset |
|
|
| Our custom threat detection dataset was meticulously curated and annotated to ensure robust model performance across diverse scenarios. |
|
|
| ### Class Distribution |
|  |
|
|
| ### Sample Annotations (Actual) |
|  |
|
|
| The model is trained to detect threats across various scales, from small concealed weapons to larger explosive devices. |
|
|
| ## Performance Metrics |
|
|
| ### Training Performance |
|  |
|
|
| The training process demonstrates excellent convergence with: |
| - **Consistent loss reduction** over 50 epochs |
| - **Stable validation performance** indicating good generalization |
| - **Similar overall precision and recall** (83.8% vs. 85.0% on validation), although Explosive precision is much lower (54.6%) |
|
|
| ### Validation Results |
|
|
| | Metric | Gun | Explosive | Grenade | Knife | **Overall** | |
| |--------|-----|-----------|---------|-------|-------------| |
| | **mAP@50:95** | 62.3% | 47.2% | 80.5% | 54.4% | **61.1%** | |
| | **mAP@50** | 90.1% | 69.6% | 93.7% | 85.8% | **84.8%** | |
| | **Precision** | 92.4% | 54.6% | 97.2% | 91.1% | **83.8%** | |
| | **Recall** | 85.0% | 85.0% | 85.0% | 85.0% | **85.0%** | |
|
|
| ### Test Results |
|
|
| | Metric | Gun | Explosive | Grenade | Knife | **Overall** | |
| |--------|-----|-----------|---------|-------|-------------| |
| | **mAP@50:95** | 65.3% | 35.7% | 83.2% | 49.8% | **58.5%** | |
| | **mAP@50** | 93.1% | 60.5% | 91.1% | 79.7% | **81.1%** | |
| | **Precision** | 96.7% | 49.7% | 93.1% | 86.5% | **81.5%** | |
| | **Recall** | 83.0% | 83.0% | 83.0% | 83.0% | **83.0%** | |
|
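The Overall column in both tables appears to be an unweighted (macro) average of the four per-class values. The short check below recomputes it from the numbers in the tables; the dictionary keys and the helper name `macro_average` are illustrative only and not part of the training code.

```python
# Per-class values copied from the Validation and Test tables
# (order: Gun, Explosive, Grenade, Knife), paired with the reported Overall value.
reported = {
    "val mAP@50:95": ([62.3, 47.2, 80.5, 54.4], 61.1),
    "val mAP@50": ([90.1, 69.6, 93.7, 85.8], 84.8),
    "val precision": ([92.4, 54.6, 97.2, 91.1], 83.8),
    "test mAP@50:95": ([65.3, 35.7, 83.2, 49.8], 58.5),
    "test mAP@50": ([93.1, 60.5, 91.1, 79.7], 81.1),
    "test precision": ([96.7, 49.7, 93.1, 86.5], 81.5),
}


def macro_average(values):
    """Unweighted mean over classes, rounded to one decimal like the tables."""
    return round(sum(values) / len(values), 1)


for name, (per_class, overall) in reported.items():
    print(f"{name}: computed {macro_average(per_class)}, reported {overall}")
```

Because the average is unweighted, a weak class such as Explosive pulls the overall score down regardless of how many Explosive samples the dataset contains.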
|
| ### Key Performance Highlights |
|
|
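| - **84.8% mAP@50** on the validation set (81.1% on the test set) |
| - **Fast inference** with the RF-DETR Nano architecture |
| - **High precision** for Gun (96.7%) and Grenade (93.1%) on the test set; Explosive is the weakest class (49.7% test precision) |
| - **Recall** of 85.0% on validation and 83.0% on test, reported identically for every class |
| - **Modest validation-to-test drop** in overall mAP@50 (84.8% to 81.1%), with the largest drop for Explosive (69.6% to 60.5%) |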
|
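To make the validation-to-test comparison concrete, the per-class change in mAP@50 can be computed directly from the two results tables. This is a small sketch using only the reported numbers; the variable names are illustrative.

```python
# mAP@50 values copied from the Validation and Test tables.
classes = ["Gun", "Explosive", "Grenade", "Knife", "Overall"]
val_map50 = [90.1, 69.6, 93.7, 85.8, 84.8]
test_map50 = [93.1, 60.5, 91.1, 79.7, 81.1]

# Positive gap: the test score is higher than the validation score.
gaps = {
    name: round(test - val, 1)
    for name, val, test in zip(classes, val_map50, test_map50)
}

for name, gap in gaps.items():
    print(f"{name:<10} {gap:+.1f} points")
```

Gun improves slightly on the test set, while Explosive shows the largest drop, so the overall figure hides noticeable per-class variation.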
|
| ## Model Architecture |
|
|
| - **Base Architecture**: RF-DETR Nano |
| - **Input Resolution**: 640×640 pixels |
| - **Backbone**: Optimized transformer encoder |
| - **Detection Head**: Custom 4-class threat detection |
| - **Inference Speed**: ~50ms per image (GPU) |
| - **Model Size**: Lightweight for edge deployment |
|
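The ~50 ms figure corresponds to roughly 20 frames per second. A minimal sketch for measuring per-image latency on your own hardware is shown below. `mean_latency_ms`, `fps_from_latency`, and `dummy_predict` are hypothetical helpers, not part of `rfdetr`; in practice you would pass a function that wraps the model's prediction call, and on a GPU you would also need to synchronize the device before reading the timer, since GPU work can run asynchronously.

```python
import time


def mean_latency_ms(predict_fn, sample, warmup=3, runs=20):
    """Average wall-clock time of predict_fn(sample) in milliseconds."""
    # Warm-up calls are excluded from timing (caches, lazy initialization).
    for _ in range(warmup):
        predict_fn(sample)
    start = time.perf_counter()
    for _ in range(runs):
        predict_fn(sample)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


def fps_from_latency(latency_ms):
    """Convert per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


# Stand-in workload so the sketch runs without a GPU or model weights.
def dummy_predict(values):
    return sum(values)


latency = mean_latency_ms(dummy_predict, list(range(1000)))
print(f"dummy latency: {latency:.3f} ms")
print(f"50 ms per image is about {fps_from_latency(50.0):.0f} frames per second")
```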
|
| ## Training Details |
|
|
| ### Training Configuration |
| - **Epochs**: 50 |
| - **Batch Size**: Optimized for available GPU memory |
| - **Optimizer**: AdamW with learning rate scheduling |
| - **Data Augmentation**: Advanced augmentation pipeline for robust training |
| - **Loss Function**: Multi-scale detection loss with class balancing |
|
|
| ### Training Strategy |
| 1. **Progressive Training**: Started with lower resolution, gradually increased |
| 2. **Class Balancing**: Weighted loss to handle class imbalance |
| 3. **Data Augmentation**: Extensive augmentation to improve generalization |
| 4. **Early Stopping**: Monitored validation mAP to prevent overfitting |
|
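The early-stopping step can be sketched as a small patience counter over validation mAP. This is a generic illustration, not the code used to train this model: the `EarlyStopping` class, the `patience` value, and the mAP curve below are all made up for the example.

```python
class EarlyStopping:
    """Stop when validation mAP has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch, val_map):
        """Record this epoch's validation mAP; return True when training should stop."""
        if val_map > self.best + self.min_delta:
            self.best = val_map
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation mAP curve, for illustration only
# (not taken from this model's training run).
history = [0.40, 0.55, 0.61, 0.60, 0.61, 0.59, 0.60]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_map in enumerate(history):
    if stopper.step(epoch, val_map):
        stopped_at = epoch
        break

print(f"best epoch: {stopper.best_epoch}, best mAP: {stopper.best}, stopped at: {stopped_at}")
```

Keeping the checkpoint from `best_epoch` rather than the last epoch is what makes this useful against overfitting.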
|
| ## Model Files |
|
|
| - `checkpoint_best_total.pth` - Main model weights |
|
|
| ### Inference Instructions |
|
|
| ```bash |
| pip install -q rfdetr==1.2.1 supervision==0.26.1 |
| ``` |
| - To process large videos, use [video_processing.py](https://huggingface.co/Subh775/Threat-Detection-RFDETR/blob/main/video_processing.py). |
|
|
| - To process a single image, use the script below: |
|
|
| ```python |
| import os |
| |
| import requests |
| import supervision as sv |
| from PIL import Image |
| from rfdetr import RFDETRNano |
| |
| # Class IDs as listed in the threat category table |
| THREAT_CLASSES = { |
|     1: "Gun", |
|     2: "Explosive", |
|     3: "Grenade", |
|     4: "Knife", |
| } |
| |
| # Replace with the path to your image |
| image = Image.open("Path_to_image") |
| |
| # Fine-tuned weights hosted in this repository |
| weights_url = "https://huggingface.co/Subh775/Threat-Detection-RFDETR/resolve/main/checkpoint_best_total.pth" |
| weights_filename = "checkpoint_best_total.pth" |
| |
| # Download weights if not already present |
| if not os.path.exists(weights_filename): |
|     print(f"Downloading weights from {weights_url}") |
|     response = requests.get(weights_url, stream=True) |
|     response.raise_for_status() |
|     with open(weights_filename, "wb") as f: |
|         for chunk in response.iter_content(chunk_size=8192): |
|             f.write(chunk) |
|     print("Download complete.") |
| |
| # Load the model at the training resolution and prepare it for inference |
| model = RFDETRNano(resolution=640, pretrain_weights=weights_filename) |
| model.optimize_for_inference() |
| |
| # Keep detections with confidence of at least 0.5 |
| detections = model.predict(image, threshold=0.5) |
| |
| color = sv.ColorPalette.from_hex([ |
|     "#1E90FF", "#32CD32", "#FF0000", "#FF8C00" |
| ]) |
| |
| # Scale label text and box thickness to the image size |
| text_scale = sv.calculate_optimal_text_scale(resolution_wh=image.size) |
| thickness = sv.calculate_optimal_line_thickness(resolution_wh=image.size) |
| |
| bbox_annotator = sv.BoxAnnotator(color=color, thickness=thickness) |
| label_annotator = sv.LabelAnnotator( |
|     color=color, |
|     text_color=sv.Color.BLACK, |
|     text_scale=text_scale, |
|     smart_position=True, |
| ) |
| |
| # Build one "<class name> <confidence>" label per detection |
| labels = [] |
| for class_id, confidence in zip(detections.class_id, detections.confidence): |
|     class_name = THREAT_CLASSES.get(class_id, f"unknown_class_{class_id}") |
|     labels.append(f"{class_name} {confidence:.2f}") |
| |
| annotated_image = image.copy() |
| annotated_image = bbox_annotator.annotate(annotated_image, detections) |
| annotated_image = label_annotator.annotate(annotated_image, detections, labels) |
| annotated_image.thumbnail((800, 800)) |
| |
| # Displays inline in a notebook; in a script, save it with annotated_image.save(...) |
| annotated_image |
| ``` |
|
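For downstream logic such as alerting, it can help to reduce the detections to a per-class count. The sketch below works on plain sequences of class IDs and confidences, such as the `detections.class_id` and `detections.confidence` arrays used in the script. The `summarize` helper and the example values are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Same ID-to-name mapping as in the inference script.
THREAT_CLASSES = {1: "Gun", 2: "Explosive", 3: "Grenade", 4: "Knife"}


def summarize(class_ids, confidences, min_confidence=0.5):
    """Count detections per class name, ignoring low-confidence ones."""
    counts = Counter()
    for class_id, confidence in zip(class_ids, confidences):
        if confidence >= min_confidence:
            name = THREAT_CLASSES.get(int(class_id), f"unknown_class_{class_id}")
            counts[name] += 1
    return dict(counts)


# Hypothetical values, for illustration only.
example_ids = [1, 4, 1, 7]
example_conf = [0.91, 0.62, 0.48, 0.80]
print(summarize(example_ids, example_conf))
```

IDs outside the four known classes are kept under an `unknown_class_<id>` label rather than dropped, which mirrors the fallback in the inference script.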
|
| ## Acknowledgments |
|
|
| - **Roboflow** for the RF-DETR architecture |
| - **Hugging Face** for model hosting and distribution |
| - **PyTorch** for the deep learning framework |
| - **Supervision** library for computer vision utilities |
|
|
| **Disclaimer**: This model is intended for research purposes only. Its predictions should not be relied on in deployment at this stage. |
|
|