	---
	license: mit
	language:
	- en
	base_model:
	- qualcomm/RF-DETR
	pipeline_tag: object-detection
	tags:
	- surveillance
	- Threat_detection
	---

	# RF-DETR based Threat Detection Model

	<a href="https://opensource.org/licenses/MIT">
	<img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License">
	</a>
	<a href="https://github.com/roboflow/rf-detr">
	<img src="https://img.shields.io/badge/RF--DETR-Nano-purple?logo=roboflow&logoColor=white" alt="Model">
	</a>
	<a href="#performance-metrics">
	<img src="https://img.shields.io/badge/mAP%4050-84.8%25-darkgreen?style=flat" alt="mAP">
	</a>
	<a href="https://github.com/subh-775/Threat_Detection_YOLO-vs-RF-DETR">
	<img src="https://img.shields.io/badge/-code-black?logo=github" alt="Code">
	</a>


	## Transformers for Object Detection

	CNN-based detectors have traditionally dominated real-time object detection because of their fast inference. RF-DETR (Roboflow's Detection Transformer) is a transformer-based architecture designed to close that gap: it aims to match or exceed CNN detectors in accuracy while remaining fast enough for real-time applications.

	This repository contains a fine-tuned RF-DETR Nano model specifically trained for threat detection, capable of identifying four critical threat categories with high precision and speed.

	## Predicted Results
	![predictions](https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/MDRT7LUt1RQE60CGW8to4.jpeg)

	### Video Inferencing
	<video muted autoplay loop controls src="https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/5Kt3KghZaanzOVaVB6JS9.mp4" width=800></video>


	## Model Overview

	RF-DETR Threat Detection is a specialized computer vision model designed for security and surveillance applications. Built on Roboflow's cutting-edge RF-DETR architecture, this model can accurately detect and classify potential threats in real-time scenarios.

	The threat categories are as follows:

	| Class ID | Threat Type | Description |
	|----------|-------------|-------------|
	| 1 | Gun | Any type of firearm, including pistols and rifles |
	| 2 | Explosive | Fire, explosion scenarios, and explosive devices |
	| 3 | Grenade | Hand grenades and similar explosive devices |
	| 4 | Knife | Bladed weapons including knives, daggers, and sharp objects |

	## Training Dataset

	Our custom threat detection dataset was meticulously curated and annotated to ensure robust model performance across diverse scenarios.

	### Class Distribution
	![class_distribution](https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/5t7k-SJfuZWXJTek_RPWh.png)

	### Sample Annotations (Actual)
	![sample_images_annotated](https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/Mf65kxTEwfq9HPMlzwO5y.png)

	The model is trained to detect threats across various scales, from small concealed weapons to larger explosive devices.

	## Performance Metrics

	### Training Performance
	![Training Metrics](metrics_plot.png)

	The training process demonstrates excellent convergence with:
	- Consistent loss reduction over 50 epochs
	- Stable validation performance indicating good generalization
	- Balanced precision and recall across all threat categories

	### Validation Results

	| Metric | Gun | Explosive | Grenade | Knife | Overall |
	|--------|-----|-----------|---------|-------|---------|
	| mAP@50:95 | 62.3% | 47.2% | 80.5% | 54.4% | 61.1% |
	| mAP@50 | 90.1% | 69.6% | 93.7% | 85.8% | 84.8% |
	| Precision | 92.4% | 54.6% | 97.2% | 91.1% | 83.8% |
	| Recall | 85.0% | 85.0% | 85.0% | 85.0% | 85.0% |
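
The Overall column appears to be the unweighted (macro) average of the four per-class values. This is inferred from the numbers in the table, not stated anywhere in the model card. A quick check, using the validation figures copied from the table above:

```python
# Per-class validation metrics copied from the table above (percent).
# Column order: Gun, Explosive, Grenade, Knife.
validation = {
    "mAP@50:95": [62.3, 47.2, 80.5, 54.4],
    "mAP@50": [90.1, 69.6, 93.7, 85.8],
    "Precision": [92.4, 54.6, 97.2, 91.1],
}

# Unweighted mean over the four classes, rounded to one decimal like the table.
macro_average = {
    metric: round(sum(values) / len(values), 1)
    for metric, values in validation.items()
}

for metric, value in macro_average.items():
    print(f"{metric}: {value}%")
```

The computed means match the Overall column (61.1%, 84.8%, 83.8%), so a weak class such as Explosive pulls the overall figure down as much as any other class, regardless of how many instances it has.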

	### Test Results

	| Metric | Gun | Explosive | Grenade | Knife | Overall |
	|--------|-----|-----------|---------|-------|---------|
	| mAP@50:95 | 65.3% | 35.7% | 83.2% | 49.8% | 58.5% |
	| mAP@50 | 93.1% | 60.5% | 91.1% | 79.7% | 81.1% |
	| Precision | 96.7% | 49.7% | 93.1% | 86.5% | 81.5% |
	| Recall | 83.0% | 83.0% | 83.0% | 83.0% | 83.0% |
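
Precision and recall can be combined into a single per-class F1 score (their harmonic mean). F1 is not reported in this model card; the sketch below only derives it from the test-set precision and recall values in the table above, so treat the result as a convenience figure rather than an official metric.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Test-set precision per class, copied from the table above.
test_precision = {"Gun": 0.967, "Explosive": 0.497, "Grenade": 0.931, "Knife": 0.865}
# The table lists the same recall (83.0%) for every class.
test_recall = 0.83

test_f1 = {name: f1_score(p, test_recall) for name, p in test_precision.items()}
for name, score in test_f1.items():
    print(f"{name}: F1 = {score:.3f}")
```

On these numbers Explosive comes out clearly lowest (roughly 0.62, against roughly 0.85 to 0.89 for the other classes), which reflects its much lower precision.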

	### Key Performance Highlights

	- 84.8% mAP@50 on the validation set (81.1% on the test set)
	- Fast inference with the RF-DETR Nano architecture (~50 ms per image on GPU)
	- High test-set precision for Gun (96.7%) and Grenade (93.1%) detection
	- Consistent recall of 83-85% across all threat categories
	- Modest drop from validation to test: overall mAP@50:95 falls from 61.1% to 58.5%

	## Model Architecture

	- Base Architecture: RF-DETR Nano
	- Input Resolution: 640×640 pixels
	- Backbone: Optimized transformer encoder
	- Detection Head: Custom 4-class threat detection
	- Inference Speed: ~50 ms per image (GPU)
	- Model Size: Lightweight for edge deployment
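
The ~50 ms figure depends on hardware, so it is worth measuring on your own device. The helper below is a minimal sketch of one way to do that. `measure_latency_ms` is a hypothetical name, not part of `rfdetr`, and the lambda is a stand-in workload so the snippet runs without the model installed.

```python
import statistics
import time
from typing import Any, Callable


def measure_latency_ms(predict_fn: Callable[[Any], Any], image: Any,
                       warmup: int = 5, runs: int = 20) -> float:
    """Return the median wall-clock latency of predict_fn(image) in milliseconds."""
    # Warm-up calls are excluded from timing (first calls are often slower).
    for _ in range(warmup):
        predict_fn(image)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict_fn(image)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


# Stand-in workload; replace with the real model call when available.
latency = measure_latency_ms(lambda img: sum(img), list(range(1000)))
print(f"median latency: {latency:.3f} ms")
```

With the model loaded as in the inference instructions, the call would look like `measure_latency_ms(lambda img: model.predict(img, threshold=0.5), image)`.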

	## Training Details

	### Training Configuration
	- Epochs: 50
	- Batch Size: Optimized for available GPU memory
	- Optimizer: AdamW with learning rate scheduling
	- Data Augmentation: Advanced augmentation pipeline for robust training
	- Loss Function: Multi-scale detection loss with class balancing
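
For orientation, a fine-tuning run with the `rfdetr` package might be configured roughly as follows. Only the epoch count comes from this model card. The batch size, gradient accumulation steps, learning rate, and both paths are assumptions chosen for illustration, and this is not the author's actual training script.

```python
def build_train_kwargs(dataset_dir: str, output_dir: str) -> dict:
    """Collect keyword arguments for RFDETRNano.train()."""
    return {
        "dataset_dir": dataset_dir,  # dataset root (hypothetical path)
        "epochs": 50,                # stated in this model card
        "batch_size": 4,             # assumption: not stated in the card
        "grad_accum_steps": 4,       # assumption: not stated in the card
        "lr": 1e-4,                  # assumption: not stated in the card
        "output_dir": output_dir,    # where checkpoints are written (hypothetical path)
    }


def run_training(dataset_dir: str, output_dir: str) -> None:
    """Launch fine-tuning; requires the rfdetr package and a GPU in practice."""
    # Imported lazily so the configuration can be inspected without rfdetr installed.
    from rfdetr import RFDETRNano

    model = RFDETRNano()
    model.train(**build_train_kwargs(dataset_dir, output_dir))


train_kwargs = build_train_kwargs("path/to/dataset", "runs/threat-detection")
print(train_kwargs)
```

Calling `run_training(...)` starts the actual run; the snippet itself only builds and prints the configuration.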

	### Training Strategy
	1. Progressive Training: Started with lower resolution, gradually increased
	2. Class Balancing: Weighted loss to handle class imbalance
	3. Data Augmentation: Extensive augmentation to improve generalization
	4. Early Stopping: Monitored validation mAP to prevent overfitting
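
The early-stopping step above can be sketched as a small patience counter on validation mAP. This is a generic illustration, not the training code used for this model. The `patience` and `min_delta` values and the mAP history below are made up.

```python
class EarlyStopping:
    """Stop when the monitored metric has not improved for `patience` epochs."""

    def __init__(self, patience: int = 10, min_delta: float = 0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.best_epoch = -1
        self.stale_epochs = 0

    def step(self, epoch: int, val_map: float) -> bool:
        """Record this epoch's validation mAP; return True if training should stop."""
        if val_map > self.best + self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best, self.best_epoch = val_map, epoch
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


# Made-up validation mAP curve, for illustration only.
history = [0.40, 0.55, 0.61, 0.60, 0.61, 0.605]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_map in enumerate(history):
    if stopper.step(epoch, val_map):
        stopped_at = epoch
        break

print(f"best epoch: {stopper.best_epoch}, stopped at epoch: {stopped_at}")
```

In this toy run the best value is reached at epoch 2, and training stops at epoch 5 after three epochs without improvement.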

	## Model Files

	- `checkpoint_best_total.pth` - Main model weights

	### Inference Instructions

	```bash
	pip install -q rfdetr==1.2.1 supervision==0.26.1
	```
	- To process large videos, use [video_processing.py](https://huggingface.co/Subh775/Threat-Detection-RFDETR/blob/main/video_processing.py).

	- The script below processes a single image:

	```python
	import os

	import requests
	import supervision as sv
	from PIL import Image
	from rfdetr import RFDETRNano

	# Class IDs predicted by the fine-tuned model
	THREAT_CLASSES = {
	    1: "Gun",
	    2: "Explosive",
	    3: "Grenade",
	    4: "Knife",
	}

	# Replace with the path to your input image
	image = Image.open("Path_to_image").convert("RGB")

	# Fine-tuned weights hosted in this repository
	weights_url = "https://huggingface.co/Subh775/Threat-Detection-RFDETR/resolve/main/checkpoint_best_total.pth"
	weights_filename = "checkpoint_best_total.pth"

	# Download weights if not already present
	if not os.path.exists(weights_filename):
	    print(f"Downloading weights from {weights_url}")
	    response = requests.get(weights_url, stream=True)
	    response.raise_for_status()
	    with open(weights_filename, "wb") as f:
	        for chunk in response.iter_content(chunk_size=8192):
	            f.write(chunk)
	    print("Download complete.")

	# Load the model with the fine-tuned weights
	model = RFDETRNano(resolution=640, pretrain_weights=weights_filename)
	model.optimize_for_inference()

	# Keep detections with confidence >= 0.5
	detections = model.predict(image, threshold=0.5)

	# One color per threat class
	color = sv.ColorPalette.from_hex([
	    "#1E90FF", "#32CD32", "#FF0000", "#FF8C00"
	])

	# Scale text and box thickness to the image size
	text_scale = sv.calculate_optimal_text_scale(resolution_wh=image.size)
	thickness = sv.calculate_optimal_line_thickness(resolution_wh=image.size)

	bbox_annotator = sv.BoxAnnotator(color=color, thickness=thickness)
	label_annotator = sv.LabelAnnotator(
	    color=color,
	    text_color=sv.Color.BLACK,
	    text_scale=text_scale,
	    smart_position=True,
	)

	# Build one "<class name> <confidence>" label per detection
	labels = []
	for class_id, confidence in zip(detections.class_id, detections.confidence):
	    class_name = THREAT_CLASSES.get(int(class_id), f"unknown_class_{class_id}")
	    labels.append(f"{class_name} {confidence:.2f}")

	# Draw boxes and labels on a copy of the input image
	annotated_image = image.copy()
	annotated_image = bbox_annotator.annotate(annotated_image, detections)
	annotated_image = label_annotator.annotate(annotated_image, detections, labels)
	annotated_image.thumbnail((800, 800))

	# Displays inline in a notebook; in a script, call annotated_image.save(...) instead
	annotated_image
	```
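
Because reported precision differs a lot between classes (Explosive is around 50% while Gun and Grenade are above 90%), a single global threshold may not suit every class. The sketch below shows one possible per-class filter. The threshold values are hypothetical and have not been tuned on this dataset, and `keep_mask` is an illustrative helper, not part of `rfdetr` or `supervision`.

```python
# Hypothetical per-class confidence thresholds (not tuned values from this card).
# Explosive (class 2) gets a stricter threshold because its reported precision is lowest.
CLASS_THRESHOLDS = {1: 0.5, 2: 0.7, 3: 0.5, 4: 0.5}
DEFAULT_THRESHOLD = 0.5


def keep_mask(class_ids, confidences, thresholds=CLASS_THRESHOLDS):
    """Return one boolean per detection: True if it passes its class threshold."""
    return [
        float(conf) >= thresholds.get(int(cid), DEFAULT_THRESHOLD)
        for cid, conf in zip(class_ids, confidences)
    ]


# Toy values standing in for detections.class_id and detections.confidence.
mask = keep_mask([1, 2, 2, 4], [0.55, 0.65, 0.81, 0.42])
print(mask)
```

With the script above, the mask can be applied through boolean indexing, for example `detections = detections[np.array(keep_mask(detections.class_id, detections.confidence))]` after `import numpy as np`. Note that `model.predict(image, threshold=0.5)` has already dropped everything below 0.5, so per-class thresholds lower than that would require lowering the `threshold` argument first.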

	## Acknowledgments

	- Roboflow for the RF-DETR architecture
	- Hugging Face for model hosting and distribution
	- PyTorch ecosystem for deep learning framework
	- Supervision library for computer vision utilities

	Disclaimer: This model is intended for research purposes only. Its predictions should not be relied on for deployment at this time.