---
license: mit
language:
- en
base_model:
- qualcomm/RF-DETR
pipeline_tag: object-detection
tags:
- surveillance
- Threat_detection
---

# RF-DETR based Threat Detection Model

## Transformers for Object Detection

CNNs have traditionally dominated real-time object detection because of their fast inference. **RF-DETR** (Roboflow's Detection Transformer) is a transformer-based architecture built for the same real-time setting, and Roboflow reports that it **outperforms comparable CNN detectors** in accuracy at similar or lower latency.

This repository contains a **fine-tuned RF-DETR Nano model** trained for **threat detection**. It identifies four threat categories.

## Predicted Results

![predictions](https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/MDRT7LUt1RQE60CGW8to4.jpeg)

## Model Overview

**RF-DETR Threat Detection** is a specialized computer vision model for security and surveillance applications. It is built on Roboflow's RF-DETR architecture and detects and classifies potential threats in images and video.

The model detects four threat categories:

| Class ID | Threat Type | Description |
|----------|-------------|-------------|
| 1 | **Gun** | Firearms of any type, including pistols and rifles |
| 2 | **Explosive** | Fire, explosion scenarios, and explosive devices |
| 3 | **Grenade** | Hand grenades and similar explosive devices |
| 4 | **Knife** | Bladed weapons including knives, daggers, and sharp objects |

## Training Dataset

Our custom threat detection dataset was curated and annotated to cover diverse scenarios and support robust model performance.

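
A per-class annotation count for a dataset like this can be computed in a few lines. The sketch below is illustrative only: it assumes the annotations are stored in COCO-style JSON (the card does not state the annotation format), and the function name `class_distribution` and the tiny in-memory example are made up for the demonstration.

```python
from collections import Counter


def class_distribution(coco: dict) -> dict[str, int]:
    """Count annotations per category name in a COCO-style annotation dict."""
    # Map category id -> name, e.g. {1: "Gun", 2: "Explosive", ...}
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])
    # Report every category, including those with zero annotations
    return {name: counts.get(name, 0) for name in id_to_name.values()}


# Tiny in-memory example (made-up annotations, not the real dataset)
example = {
    "categories": [
        {"id": 1, "name": "Gun"},
        {"id": 2, "name": "Explosive"},
        {"id": 3, "name": "Grenade"},
        {"id": 4, "name": "Knife"},
    ],
    "annotations": [
        {"image_id": 0, "category_id": 1},
        {"image_id": 0, "category_id": 1},
        {"image_id": 1, "category_id": 4},
        {"image_id": 2, "category_id": 2},
    ],
}

print(class_distribution(example))
# {'Gun': 2, 'Explosive': 1, 'Grenade': 0, 'Knife': 1}
```

For a real dataset, the dict would come from `json.load` on the annotation file (for example a hypothetical `train/_annotations.coco.json`).
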
### Class Distribution

![class_distribution](https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/5t7k-SJfuZWXJTek_RPWh.png)

### Sample Annotations (Actual)

![sample_images_annotated](https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/Mf65kxTEwfq9HPMlzwO5y.png)

The model is trained to detect threats across various scales, from small concealed weapons to larger explosive devices.

## Performance Metrics

### Training Performance

![Training Metrics](metrics_plot.png)

The training curves show:

- **Consistent loss reduction** over 50 epochs
- **Stable validation performance**, suggesting good generalization
- **Balanced precision and recall** overall (83.8% and 85.0% on validation), with Explosive precision lagging the other classes

### Validation Results

| Metric | Gun | Explosive | Grenade | Knife | **Overall** |
|--------|-----|-----------|---------|-------|-------------|
| **mAP@50:95** | 62.3% | 47.2% | 80.5% | 54.4% | **61.1%** |
| **mAP@50** | 90.1% | 69.6% | 93.7% | 85.8% | **84.8%** |
| **Precision** | 92.4% | 54.6% | 97.2% | 91.1% | **83.8%** |
| **Recall** | 85.0% | 85.0% | 85.0% | 85.0% | **85.0%** |

### Test Results

| Metric | Gun | Explosive | Grenade | Knife | **Overall** |
|--------|-----|-----------|---------|-------|-------------|
| **mAP@50:95** | 65.3% | 35.7% | 83.2% | 49.8% | **58.5%** |
| **mAP@50** | 93.1% | 60.5% | 91.1% | 79.7% | **81.1%** |
| **Precision** | 96.7% | 49.7% | 93.1% | 86.5% | **81.5%** |
| **Recall** | 83.0% | 83.0% | 83.0% | 83.0% | **83.0%** |

### Key Performance Highlights

- **84.8% mAP@50** on the validation set (81.1% on the test set)
- **Fast inference** with the RF-DETR Nano architecture
- **High test-set precision** for Gun (96.7%) and Grenade (93.1%); Explosive is the weakest class (49.7%)
- **Recall** of 85.0% on validation and 83.0% on test, reported identically for every class
- **Modest validation-to-test drop**: overall mAP@50:95 falls from 61.1% to 58.5%

## Model Architecture

- **Base Architecture**: RF-DETR Nano
- **Input Resolution**: 640×640 pixels
- **Backbone**: Optimized transformer encoder
- **Detection Head**: Custom 4-class threat detection
- **Inference Speed**: ~50ms per image (GPU)
- **Model Size**: Lightweight for edge deployment

## Training Details

### Training Configuration

- **Epochs**: 50
- **Batch Size**: Optimized for available GPU memory
- **Optimizer**: AdamW with learning rate scheduling
- **Data Augmentation**: Advanced augmentation pipeline for robust training
- **Loss Function**: Multi-scale detection loss with class balancing

### Training Strategy

1. **Progressive Training**: Started at a lower resolution and gradually increased it
2. **Class Balancing**: Weighted loss to handle class imbalance
3. **Data Augmentation**: Extensive augmentation to improve generalization
4. **Early Stopping**: Monitored validation mAP to prevent overfitting

## Model Files

- `checkpoint_best_total.pth` - Main model weights

### Inference Instructions

```bash
pip install -q rfdetr==1.2.1 supervision==0.26.1
```

- To process large videos, use [video_processing.py](https://huggingface.co/Subh775/Threat-Detection-RF-DETR/blob/main/video_processing.py).
- The script below processes a single image.

```python
import os

import requests
import supervision as sv
from PIL import Image
from rfdetr import RFDETRNano

# Class IDs as listed in the threat category table
THREAT_CLASSES = {
    1: "Gun",
    2: "Explosive",
    3: "Grenade",
    4: "Knife",
}

# Replace with the path to your own image
image = Image.open("Path_to_image")

# Fine-tuned weights hosted in this repository
weights_url = "https://huggingface.co/Subh775/Threat-Detection-RF-DETR/resolve/main/checkpoint_best_total.pth"
weights_filename = "checkpoint_best_total.pth"

# Download weights if not already present
if not os.path.exists(weights_filename):
    print(f"Downloading weights from {weights_url}")
    response = requests.get(weights_url, stream=True)
    response.raise_for_status()
    with open(weights_filename, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    print("Download complete.")

# Load the model and run detection
model = RFDETRNano(resolution=640, pretrain_weights=weights_filename)
model.optimize_for_inference()

detections = model.predict(image, threshold=0.5)

# One color per class
color = sv.ColorPalette.from_hex([
    "#1E90FF", "#32CD32", "#FF0000", "#FF8C00"
])

# Scale label text and box thickness to the image size
text_scale = sv.calculate_optimal_text_scale(resolution_wh=image.size)
thickness = sv.calculate_optimal_line_thickness(resolution_wh=image.size)

bbox_annotator = sv.BoxAnnotator(color=color, thickness=thickness)
label_annotator = sv.LabelAnnotator(
    color=color,
    text_color=sv.Color.BLACK,
    text_scale=text_scale,
    smart_position=True,
)

# Build "<class name> <confidence>" labels
labels = []
for class_id, confidence in zip(detections.class_id, detections.confidence):
    class_name = THREAT_CLASSES.get(int(class_id), f"unknown_class_{class_id}")
    labels.append(f"{class_name} {confidence:.2f}")

annotated_image = image.copy()
annotated_image = bbox_annotator.annotate(annotated_image, detections)
annotated_image = label_annotator.annotate(annotated_image, detections, labels)

annotated_image.thumbnail((800, 800))
# In a notebook, this last expression displays the image;
# in a plain script, save it with annotated_image.save(...) instead.
annotated_image
```

## Acknowledgments

- **Roboflow** for the RF-DETR architecture
- **Hugging Face** for model hosting and distribution
- **PyTorch** ecosystem for the deep learning framework
- **Supervision** library for computer vision utilities

**Disclaimer**: This model is intended for research purposes only. Its predictions should not be relied on in real-world deployments at this time.
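
The single-image script above applies one global confidence threshold (0.5) to every class. Because the reported precision for Explosive is much lower than for the other classes, one possible post-processing step is a stricter threshold for that class. The following is a minimal sketch of that idea, not part of the released code: `CLASS_THRESHOLDS`, `DEFAULT_THRESHOLD`, and `keep_mask` are hypothetical names, and the threshold values are illustrative rather than tuned.

```python
# Hypothetical per-class thresholds keyed by class ID; values are illustrative, not tuned.
CLASS_THRESHOLDS = {1: 0.5, 2: 0.7, 3: 0.5, 4: 0.5}
DEFAULT_THRESHOLD = 0.5


def keep_mask(class_ids, confidences, thresholds=CLASS_THRESHOLDS, default=DEFAULT_THRESHOLD):
    """Return one boolean per detection: True if its confidence meets its class threshold."""
    return [
        float(conf) >= thresholds.get(int(cid), default)
        for cid, conf in zip(class_ids, confidences)
    ]


# Example: Gun 0.55, Explosive 0.65, Explosive 0.80, Knife 0.45
mask = keep_mask([1, 2, 2, 4], [0.55, 0.65, 0.80, 0.45])
print(mask)  # [True, False, True, False]
```

With `supervision`, a mask like this can be converted to a NumPy boolean array and used to index the detections, for example `detections[np.array(mask)]`. For this to have any effect, `model.predict` should be called with a threshold no higher than the lowest per-class value.
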