---
license: mit
language:
- en
base_model:
- qualcomm/RF-DETR
pipeline_tag: object-detection
tags:
- surveillance
- Threat_detection
---
# RF-DETR based Threat Detection Model
<a href="https://opensource.org/licenses/MIT">
<img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License">
</a>
<a href="https://github.com/roboflow/rf-detr">
<img src="https://img.shields.io/badge/RF--DETR-Nano-purple?logo=roboflow&logoColor=white" alt="Model">
</a>
<a href="#performance-metrics">
<img src="https://img.shields.io/badge/mAP%4050-84.8%25-darkgreen?style=flat" alt="mAP">
</a>
<a href="https://github.com/subh-775/Threat_Detection_YOLO-vs-RF-DETR">
<img src="https://img.shields.io/badge/-code-black?logo=github" alt="Code">
</a>
## Transformers for Object Detection
CNNs have traditionally dominated object detection, largely because of their faster inference times. **RF-DETR** (Roboflow's Detection Transformer) is a transformer-based architecture that Roboflow reports to **match or exceed CNN-based detectors** in accuracy while keeping inference **fast enough for real-time applications**.
This repository contains a **fine-tuned RF-DETR Nano model** specifically trained for **threat detection**, capable of identifying four critical threat categories with high precision and speed.
## Predicted Results

### Video Inferencing
<video muted autoplay loop controls src="https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/5Kt3KghZaanzOVaVB6JS9.mp4" width=800></video>
## Model Overview
**RF-DETR Threat Detection** is a specialized computer vision model designed for security and surveillance applications. Built on Roboflow's cutting-edge RF-DETR architecture, this model can accurately detect and classify potential threats in real-time scenarios.
The four threat categories are:
| Class ID | Threat Type | Description |
|----------|-------------|-------------|
| 1 | **Gun** | Any type of firearm weapon including pistols, rifles, and other firearms |
| 2 | **Explosive** | Fire, explosion scenarios, and explosive devices |
| 3 | **Grenade** | Hand grenades and similar explosive devices |
| 4 | **Knife** | Bladed weapons including knives, daggers, and sharp objects |
## Training Dataset
Our custom threat detection dataset was meticulously curated and annotated to ensure robust model performance across diverse scenarios.
### Class Distribution

### Sample Annotations (Actual)

The model is trained to detect threats across various scales, from small concealed weapons to larger explosive devices.
## Performance Metrics
### Training Performance

The training process demonstrates excellent convergence with:
- **Consistent loss reduction** over 50 epochs
- **Stable validation performance** indicating good generalization
- **Balanced precision and recall** across all threat categories
### Validation Results
| Metric | Gun | Explosive | Grenade | Knife | **Overall** |
|--------|-----|-----------|---------|-------|-------------|
| **mAP@50:95** | 62.3% | 47.2% | 80.5% | 54.4% | **61.1%** |
| **mAP@50** | 90.1% | 69.6% | 93.7% | 85.8% | **84.8%** |
| **Precision** | 92.4% | 54.6% | 97.2% | 91.1% | **83.8%** |
| **Recall** | 85.0% | 85.0% | 85.0% | 85.0% | **85.0%** |
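
The **Overall** column appears to be the unweighted (macro) mean of the four per-class values. That is an observation from the numbers in the table, not something stated by the training logs, so treat the sketch below as a consistency check rather than a description of how the metrics were produced.

```python
# Per-class validation metrics copied from the table above (percent),
# in the order Gun, Explosive, Grenade, Knife.
VALIDATION = {
    "mAP@50:95": [62.3, 47.2, 80.5, 54.4],
    "mAP@50": [90.1, 69.6, 93.7, 85.8],
    "Precision": [92.4, 54.6, 97.2, 91.1],
}


def macro_average(values):
    """Unweighted mean over classes, rounded to one decimal place."""
    return round(sum(values) / len(values), 1)


for name, per_class in VALIDATION.items():
    print(f"{name}: {macro_average(per_class)}%")
```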
### Test Results
| Metric | Gun | Explosive | Grenade | Knife | **Overall** |
|--------|-----|-----------|---------|-------|-------------|
| **mAP@50:95** | 65.3% | 35.7% | 83.2% | 49.8% | **58.5%** |
| **mAP@50** | 93.1% | 60.5% | 91.1% | 79.7% | **81.1%** |
| **Precision** | 96.7% | 49.7% | 93.1% | 86.5% | **81.5%** |
| **Recall** | 83.0% | 83.0% | 83.0% | 83.0% | **83.0%** |
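
Precision and recall can be combined into a single F1 score per class. F1 is not reported in this model card; the sketch below only derives it from the test-set figures in the table, using the standard harmonic-mean definition.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Test-set precision per class, copied from the table above.
TEST_PRECISION = {"Gun": 0.967, "Explosive": 0.497, "Grenade": 0.931, "Knife": 0.865}
# The table reports the same recall value for every class.
TEST_RECALL = 0.83

for name, precision in TEST_PRECISION.items():
    print(f"{name}: F1 = {f1_score(precision, TEST_RECALL):.3f}")
```

Because recall is the same for every class in the table, the ranking by F1 follows the ranking by precision, with Explosive clearly the weakest class.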
### Key Performance Highlights
- **84.8% mAP@50** on validation set
- **Fast inference** with RF-DETR Nano architecture
- **High test-set precision** for Gun (96.7%) and Grenade (93.1%) detection
- **Consistent recall** of 83-85% across all threat categories
- **Modest validation-to-test gap**: overall mAP@50 drops from 84.8% to 81.1%
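
For readers new to the metric: mAP@50 counts a predicted box as correct when its intersection over union (IoU) with a ground-truth box of the same class is at least 0.5, while mAP@50:95 averages over IoU thresholds from 0.5 to 0.95. A minimal sketch of the IoU test follows; the function names are illustrative and are not part of `rfdetr` or `supervision`.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


def is_match_at_50(predicted_box, truth_box):
    """True when the prediction would count as a hit at the IoU >= 0.5 threshold."""
    return iou(predicted_box, truth_box) >= 0.5
```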
## Model Architecture
- **Base Architecture**: RF-DETR Nano
- **Input Resolution**: 640×640 pixels
- **Backbone**: Optimized transformer encoder
- **Detection Head**: Custom 4-class threat detection
- **Inference Speed**: ~50ms per image (GPU)
- **Model Size**: Lightweight for edge deployment
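
The ~50 ms latency figure corresponds to roughly 20 images per second. To check latency on your own hardware, a simple timing harness along these lines may help. `predict_fn` is a hypothetical stand-in for whatever callable runs inference (for example a wrapper around the model's `predict` method), and the warm-up and run counts are arbitrary defaults.

```python
import time


def mean_latency_ms(predict_fn, sample, warmup=3, runs=20):
    """Average wall-clock time per call in milliseconds, after a few warm-up calls."""
    for _ in range(warmup):
        predict_fn(sample)  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(runs):
        predict_fn(sample)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs


def images_per_second(latency_ms):
    """Convert a per-image latency in milliseconds to throughput."""
    return 1000.0 / latency_ms


print(images_per_second(50.0))  # 20.0
```

Wall-clock timing is only meaningful if `predict_fn` returns after the GPU work has finished, so results may differ from the figure quoted above.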
## Training Details
### Training Configuration
- **Epochs**: 50
- **Batch Size**: Optimized for available GPU memory
- **Optimizer**: AdamW with learning rate scheduling
- **Data Augmentation**: Advanced augmentation pipeline for robust training
- **Loss Function**: Multi-scale detection loss with class balancing
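
The exact class-balancing scheme is not documented here. One common approach, shown purely as an illustration, is inverse-frequency weighting, where rarer classes receive larger loss weights. The annotation counts below are hypothetical and are not the real dataset statistics.

```python
def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * count).

    A perfectly balanced dataset yields a weight of 1.0 for every class.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * count) for name, count in counts.items()}


# Hypothetical annotation counts, NOT taken from the actual training set.
example_counts = {"Gun": 4000, "Explosive": 1000, "Grenade": 2000, "Knife": 3000}

for name, weight in inverse_frequency_weights(example_counts).items():
    print(f"{name}: {weight:.3f}")
```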
### Training Strategy
1. **Progressive Training**: Started with lower resolution, gradually increased
2. **Class Balancing**: Weighted loss to handle class imbalance
3. **Data Augmentation**: Extensive augmentation to improve generalization
4. **Early Stopping**: Monitored validation mAP to prevent overfitting
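
The early-stopping step can be sketched as a patience counter on validation mAP. This is a generic illustration, not the training code used for this model; the `patience` and `min_delta` values and the example history are assumptions.

```python
class EarlyStopping:
    """Signal a stop when validation mAP has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_map):
        """Record one epoch's validation mAP; return True when training should stop."""
        if val_map > self.best + self.min_delta:
            self.best = val_map
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def stopping_epoch(history, patience):
    """Return the 1-based epoch at which training would stop, or None if it never does."""
    stopper = EarlyStopping(patience=patience)
    for epoch, val_map in enumerate(history, start=1):
        if stopper.step(val_map):
            return epoch
    return None


# Hypothetical validation mAP values, for illustration only.
print(stopping_epoch([0.50, 0.60, 0.70, 0.69, 0.68, 0.67], patience=3))
```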
## Model Files
- `checkpoint_best_total.pth` - Main model weights
### Inference Instructions
```bash
pip install -q rfdetr==1.2.1 supervision==0.26.1
```
- You can use: [video_processing.py](https://huggingface.co/Subh775/Threat-Detection-RF-DETR/blob/main/video_processing.py) to process large videos
- Below is the script to process a single image
```python
import os

import requests
import supervision as sv
from PIL import Image
from rfdetr import RFDETRNano

# Class IDs as listed in the threat category table
THREAT_CLASSES = {
    1: "Gun",
    2: "Explosive",
    3: "Grenade",
    4: "Knife",
}

# Replace with the path to your own image
image = Image.open("Path_to_image")

# Fine-tuned weights hosted in this repository
weights_url = "https://huggingface.co/Subh775/Threat-Detection-RF-DETR/resolve/main/checkpoint_best_total.pth"
weights_filename = "checkpoint_best_total.pth"

# Download weights if not already present
if not os.path.exists(weights_filename):
    print(f"Downloading weights from {weights_url}")
    response = requests.get(weights_url, stream=True)
    response.raise_for_status()
    with open(weights_filename, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    print("Download complete.")

# Load the model at the 640x640 training resolution
model = RFDETRNano(resolution=640, pretrain_weights=weights_filename)
model.optimize_for_inference()

# Keep only detections with confidence >= 0.5
detections = model.predict(image, threshold=0.5)

# One color per threat class
color = sv.ColorPalette.from_hex([
    "#1E90FF", "#32CD32", "#FF0000", "#FF8C00"
])

# Scale text and line thickness to the image size
text_scale = sv.calculate_optimal_text_scale(resolution_wh=image.size)
thickness = sv.calculate_optimal_line_thickness(resolution_wh=image.size)

bbox_annotator = sv.BoxAnnotator(color=color, thickness=thickness)
label_annotator = sv.LabelAnnotator(
    color=color,
    text_color=sv.Color.BLACK,
    text_scale=text_scale,
    smart_position=True
)

# Build a "<class name> <confidence>" label for each detection
labels = []
for class_id, confidence in zip(detections.class_id, detections.confidence):
    class_name = THREAT_CLASSES.get(class_id, f"unknown_class_{class_id}")
    labels.append(f"{class_name} {confidence:.2f}")

# Draw boxes and labels on a copy so the original image stays untouched
annotated_image = image.copy()
annotated_image = bbox_annotator.annotate(annotated_image, detections)
annotated_image = label_annotator.annotate(annotated_image, detections, labels)
annotated_image.thumbnail((800, 800))

# In a notebook this displays the result; in a script, save it with annotated_image.save(...)
annotated_image
```
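
After running the script above, it can be handy to summarize what was found. The sketch below counts detections per threat class from plain sequences of class IDs and confidences, which is what `detections.class_id` and `detections.confidence` provide. The `summarize` helper and its `min_confidence` parameter are illustrative and not part of `rfdetr` or `supervision`.

```python
from collections import Counter

THREAT_CLASSES = {1: "Gun", 2: "Explosive", 3: "Grenade", 4: "Knife"}


def summarize(class_ids, confidences, min_confidence=0.5):
    """Count detections per class name, ignoring those below min_confidence."""
    counts = Counter()
    for class_id, confidence in zip(class_ids, confidences):
        if confidence >= min_confidence:
            # int() normalizes NumPy integer IDs to plain Python ints.
            name = THREAT_CLASSES.get(int(class_id), f"unknown_class_{class_id}")
            counts[name] += 1
    return dict(counts)


# Made-up example values; with real results, pass
# detections.class_id and detections.confidence instead.
print(summarize([1, 1, 4, 2], [0.90, 0.40, 0.80, 0.55]))
```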
## Acknowledgments
- **Roboflow** for the RF-DETR architecture
- **Hugging Face** for model hosting and distribution
- **PyTorch** ecosystem for deep learning framework
- **Supervision** library for computer vision utilities
**Disclaimer**: This model is intended for research purposes only. Its predictions should not be relied on in production deployments at this time.