---
license: mit
language:
- en
base_model:
- qualcomm/RF-DETR
pipeline_tag: object-detection
tags:
- surveillance
- Threat_detection
---

# RF-DETR based Threat Detection Model

<a href="https://opensource.org/licenses/MIT">
    <img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License">
</a>
<a href="https://github.com/roboflow/rf-detr">
    <img src="https://img.shields.io/badge/RF--DETR-Nano-purple?logo=roboflow&logoColor=white" alt="Model">
</a>
<a href="#performance-metrics">
    <img src="https://img.shields.io/badge/mAP%4050-84.8%25-darkgreen?style=flat" alt="mAP">
</a>
<a href="https://github.com/subh-775/Threat_Detection_YOLO-vs-RF-DETR">
    <img src="https://img.shields.io/badge/-code-black?logo=github" alt="Code">
</a>


## Transformers for Object Detection

CNNs have traditionally dominated real-time object detection thanks to their fast inference. **RF-DETR** (Roboflow's Detection Transformer) is a transformer-based architecture that, according to its authors, matches or exceeds CNN-based detectors in accuracy while remaining fast enough for real-time applications.

This repository contains a **fine-tuned RF-DETR Nano model** specifically trained for **threat detection**, capable of identifying four critical threat categories with high precision and speed.

## Predicted Results
![predictions](https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/MDRT7LUt1RQE60CGW8to4.jpeg)

### Video Inference
<video muted autoplay loop controls src="https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/5Kt3KghZaanzOVaVB6JS9.mp4" width=800></video>


## Model Overview

**RF-DETR Threat Detection** is a specialized computer vision model designed for security and surveillance applications. Built on Roboflow's cutting-edge RF-DETR architecture, this model can accurately detect and classify potential threats in real-time scenarios.

The four threat categories are:

| Class ID | Threat Type | Description |
|----------|-------------|-------------|
| 1 | **Gun** | Firearms of any type, including pistols and rifles |
| 2 | **Explosive** | Fire, explosion scenarios, and explosive devices |
| 3 | **Grenade** | Hand grenades and similar explosive devices |
| 4 | **Knife** | Bladed weapons including knives, daggers, and sharp objects |

## Training Dataset 

Our custom threat detection dataset was meticulously curated and annotated to ensure robust model performance across diverse scenarios.

### Class Distribution
![class_distribution](https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/5t7k-SJfuZWXJTek_RPWh.png)

### Sample Annotations (Actual)
![sample_images_annotated](https://cdn-uploads.huggingface.co/production/uploads/66c6048d0bf40704e4159a23/Mf65kxTEwfq9HPMlzwO5y.png)

The model is trained to detect threats across various scales, from small concealed weapons to larger explosive devices.

## Performance Metrics

### Training Performance
![Training Metrics](metrics_plot.png)

The training process demonstrates excellent convergence with:
- **Consistent loss reduction** over 50 epochs
- **Stable validation performance** indicating good generalization
- **Balanced precision and recall** across all threat categories

### Validation Results

| Metric | Gun | Explosive | Grenade | Knife | **Overall** |
|--------|-----|-----------|---------|-------|-------------|
| **mAP@50:95** | 62.3% | 47.2% | 80.5% | 54.4% | **61.1%** |
| **mAP@50** | 90.1% | 69.6% | 93.7% | 85.8% | **84.8%** |
| **Precision** | 92.4% | 54.6% | 97.2% | 91.1% | **83.8%** |
| **Recall** | 85.0% | 85.0% | 85.0% | 85.0% | **85.0%** |

### Test Results

| Metric | Gun | Explosive | Grenade | Knife | **Overall** |
|--------|-----|-----------|---------|-------|-------------|
| **mAP@50:95** | 65.3% | 35.7% | 83.2% | 49.8% | **58.5%** |
| **mAP@50** | 93.1% | 60.5% | 91.1% | 79.7% | **81.1%** |
| **Precision** | 96.7% | 49.7% | 93.1% | 86.5% | **81.5%** |
| **Recall** | 83.0% | 83.0% | 83.0% | 83.0% | **83.0%** |
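
The **Overall** column in both tables is consistent with an unweighted (macro) average of the four per-class values. The sketch below reproduces that arithmetic for the test table; `TEST_METRICS` and `macro_average` are illustrative names introduced here, and the numbers are copied from the table rather than recomputed from model outputs.

```python
# Per-class test-set metrics copied from the table above (percent).
TEST_METRICS = {
    "mAP@50:95": {"Gun": 65.3, "Explosive": 35.7, "Grenade": 83.2, "Knife": 49.8},
    "mAP@50": {"Gun": 93.1, "Explosive": 60.5, "Grenade": 91.1, "Knife": 79.7},
    "Precision": {"Gun": 96.7, "Explosive": 49.7, "Grenade": 93.1, "Knife": 86.5},
}


def macro_average(per_class):
    """Unweighted mean over classes, rounded to one decimal place."""
    return round(sum(per_class.values()) / len(per_class), 1)


for metric_name, per_class in TEST_METRICS.items():
    print(f"{metric_name}: {macro_average(per_class):.1f}%")
```

Because every class counts equally in a macro average, the weaker Explosive scores pull the overall figures down noticeably.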

### Key Performance Highlights

- **84.8% mAP@50** on the validation set (81.1% on the test set)
- **Fast inference** with the RF-DETR Nano architecture
- **High test-set precision** for Gun (96.7%) and Grenade (93.1%) detection
- **Consistent recall** of 85.0% on validation and 83.0% on test
- **Small validation-to-test gap**: overall mAP@50 drops by 3.7 points (84.8% to 81.1%)
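
To see how generalization varies by class, the per-class mAP@50 values from the validation and test tables can be compared directly. This is a small illustrative calculation; `generalization_gap` and the two dictionaries are names introduced here, with values copied from the tables above.

```python
# Per-class mAP@50 copied from the validation and test tables (percent).
VAL_MAP50 = {"Gun": 90.1, "Explosive": 69.6, "Grenade": 93.7, "Knife": 85.8}
TEST_MAP50 = {"Gun": 93.1, "Explosive": 60.5, "Grenade": 91.1, "Knife": 79.7}


def generalization_gap(val, test):
    """Validation minus test score per class, in percentage points."""
    return {name: round(val[name] - test[name], 1) for name in val}


gaps = generalization_gap(VAL_MAP50, TEST_MAP50)
worst_class = max(gaps, key=gaps.get)

for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f} points")
print(f"Largest drop: {worst_class}")
```

A positive gap means the class scored lower on the test set than on validation; a negative gap means it scored higher.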

## Model Architecture

- **Base Architecture**: RF-DETR Nano
- **Input Resolution**: 640×640 pixels
- **Backbone**: Optimized transformer encoder
- **Detection Head**: Custom 4-class threat detection
- **Inference Speed**: ~50ms per image (GPU)
- **Model Size**: Lightweight for edge deployment
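
A latency figure such as the ~50 ms above depends heavily on hardware, so it is worth measuring on your own machine. The helper below is a generic sketch: `measure_latency_ms`, `images_per_second`, and the `fake_predict` stand-in are hypothetical names introduced for illustration and are not part of `rfdetr`.

```python
import statistics
import time


def measure_latency_ms(fn, n_runs=20, n_warmup=3):
    """Return the median wall-clock time of fn() in milliseconds."""
    # Warm-up calls are discarded so one-time setup costs do not skew the result.
    for _ in range(n_warmup):
        fn()
    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


def images_per_second(latency_ms):
    """Convert a per-image latency into throughput."""
    return 1000.0 / latency_ms


def fake_predict():
    """Stand-in workload; replace with a call to the real model."""
    sum(i * i for i in range(10_000))


print(f"median latency: {measure_latency_ms(fake_predict):.2f} ms")
print(f"50 ms per image is {images_per_second(50.0):.0f} images per second")
```

To time the actual model, pass a callable that wraps `model.predict(image, threshold=0.5)`. On a GPU, CUDA work is asynchronous, so calling `torch.cuda.synchronize()` inside that callable may give more trustworthy numbers.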

## Training Details

### Training Configuration
- **Epochs**: 50
- **Batch Size**: Optimized for available GPU memory
- **Optimizer**: AdamW with learning rate scheduling
- **Data Augmentation**: Advanced augmentation pipeline for robust training
- **Loss Function**: Multi-scale detection loss with class balancing
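
The card does not say how class balancing was implemented. One common approach, shown here purely as an assumption, is to weight each class inversely to its frequency in the training set. The counts below are hypothetical placeholders, not the real class distribution of this dataset.

```python
def inverse_frequency_weights(counts):
    """Weight each class by total / (n_classes * class_count).

    A class with an average share of the data gets weight 1.0;
    rarer classes get proportionally larger weights.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * count) for name, count in counts.items()}


# Hypothetical annotation counts, for illustration only.
HYPOTHETICAL_COUNTS = {"Gun": 4000, "Explosive": 1000, "Grenade": 2000, "Knife": 1000}

weights = inverse_frequency_weights(HYPOTHETICAL_COUNTS)
for name, weight in weights.items():
    print(f"{name}: {weight:.2f}")
```

Weights of this kind would typically be passed to the classification term of the loss; whether the training run behind this model did so is not stated in the source.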

### Training Strategy
1. **Progressive Training**: Started with lower resolution, gradually increased
2. **Class Balancing**: Weighted loss to handle class imbalance
3. **Data Augmentation**: Extensive augmentation to improve generalization
4. **Early Stopping**: Monitored validation mAP to prevent overfitting
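
The early-stopping step above can be sketched as a small patience-based monitor on validation mAP. This is a generic illustration, not the training code used for this model: the `EarlyStopping` class, its `patience` and `min_delta` parameters, and the score sequence are all hypothetical.

```python
class EarlyStopping:
    """Stop training when the monitored score has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_score = float("-inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch, score):
        """Record one epoch's validation score; return True when training should stop."""
        if score > self.best_score + self.min_delta:
            self.best_score = score
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation mAP per epoch, for illustration only.
HYPOTHETICAL_VAL_MAP = [0.52, 0.61, 0.70, 0.74, 0.73, 0.74, 0.72]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, score in enumerate(HYPOTHETICAL_VAL_MAP):
    if stopper.step(epoch, score):
        stopped_at = epoch
        break

print(f"best epoch: {stopper.best_epoch}, stopped at epoch: {stopped_at}")
```

In practice the checkpoint from the best epoch is kept, which matches the `checkpoint_best_total.pth` naming used for the released weights.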

## Model Files

- `checkpoint_best_total.pth` - Main model weights 

### Inference Instructions

```bash
pip install -q rfdetr==1.2.1 supervision==0.26.1
```
- To process large videos, use [video_processing.py](https://huggingface.co/Subh775/Threat-Detection-RF-DETR/blob/main/video_processing.py).

- The script below processes a single image:

```python
import os

import requests
import supervision as sv
from PIL import Image

from rfdetr import RFDETRNano

THREAT_CLASSES = {
    1: "Gun",
    2: "Explosive", 
    3: "Grenade",
    4: "Knife"
}

# Replace with the path to your own image
image = Image.open("Path_to_image").convert("RGB")

# Fine-tuned threat detection weights hosted on Hugging Face
weights_url = "https://huggingface.co/Subh775/Threat-Detection-RF-DETR/resolve/main/checkpoint_best_total.pth"
weights_filename = "checkpoint_best_total.pth"

# Download weights if not already present
if not os.path.exists(weights_filename):
    print(f"Downloading weights from {weights_url}")
    response = requests.get(weights_url, stream=True)
    response.raise_for_status()
    with open(weights_filename, 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    print("Download complete.")

model = RFDETRNano(resolution=640, pretrain_weights=weights_filename)
model.optimize_for_inference()

detections = model.predict(image, threshold=0.5)

color = sv.ColorPalette.from_hex([
    "#1E90FF", "#32CD32", "#FF0000", "#FF8C00"
])

text_scale = sv.calculate_optimal_text_scale(resolution_wh=image.size)
thickness = sv.calculate_optimal_line_thickness(resolution_wh=image.size)

bbox_annotator = sv.BoxAnnotator(color=color, thickness=thickness)
label_annotator = sv.LabelAnnotator(
    color=color,
    text_color=sv.Color.BLACK,
    text_scale=text_scale,
    smart_position=True
)

# Build one "<class name> <confidence>" label per detection
labels = []
for class_id, confidence in zip(detections.class_id, detections.confidence):
    class_name = THREAT_CLASSES.get(int(class_id), f"unknown_class_{class_id}")
    labels.append(f"{class_name} {confidence:.2f}")

annotated_image = image.copy()
annotated_image = bbox_annotator.annotate(annotated_image, detections)
annotated_image = label_annotator.annotate(annotated_image, detections, labels)
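annotated_image.thumbnail((800, 800))
# In a notebook, the bare expression below displays the image.
# In a script, call annotated_image.save("annotated.jpg") or annotated_image.show() instead.
annotated_image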
```
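
After inference it is often useful to reduce the raw detections to a per-class count, for example to decide whether a frame should raise an alert. The sketch below is a hypothetical helper (`summarize_detections` is not part of `rfdetr` or `supervision`) and uses plain lists as stand-ins for `detections.class_id` and `detections.confidence`.

```python
from collections import Counter

THREAT_CLASSES = {1: "Gun", 2: "Explosive", 3: "Grenade", 4: "Knife"}


def summarize_detections(class_ids, confidences, min_confidence=0.5):
    """Count detections per threat class, ignoring low-confidence ones."""
    counts = Counter()
    for class_id, confidence in zip(class_ids, confidences):
        if confidence >= min_confidence:
            name = THREAT_CLASSES.get(int(class_id), f"unknown_class_{class_id}")
            counts[name] += 1
    return dict(counts)


# Stand-in values; in the script above these would come from `detections`.
summary = summarize_detections(
    class_ids=[1, 4, 1, 3],
    confidences=[0.91, 0.62, 0.48, 0.77],
    min_confidence=0.6,
)
print(summary)
```

With the inference script above, the same helper could be called as `summarize_detections(detections.class_id, detections.confidence)`.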

## Acknowledgments

- **Roboflow** for the RF-DETR architecture
- **Hugging Face** for model hosting and distribution
- **PyTorch** for the deep learning framework
- **Supervision** library for computer vision utilities

**Disclaimer**: This model is intended for research purposes only. Its predictions should not be relied on in production deployments at this time.