# 🦺 PPE Detection – YOLOv5 (Real-Time)
Real-time Personal Protective Equipment (PPE) detection for construction and industrial environments. Trained on a merged dataset of 9,675 images using YOLOv5s and YOLOv5m architectures.
Published at the AAAI 2025 Summer Symposium – Read the Paper

🔗 GitHub Repository: Darth-Freljord/ppe-detection-yolov5
## 🗂️ Model Files
| File | Description | mAP@0.5 | FPS |
|---|---|---|---|
| `yolov5m_a_best.pt` | Best model: YOLOv5m with anchor evolution, 896×896 input | 0.841 | 25–30 |
| `yolov5s_a_best.pt` | Lightweight: YOLOv5s, 640×640 input | 0.736 | 35–40 |
| `yolov5s_b_best.pt` | Transfer learning variant: YOLOv5s | 0.683 | 35–40 |
## ⚡ Quick Start
```python
import torch

# Load the best-performing model (YOLOv5m with anchor evolution)
model = torch.hub.load('ultralytics/yolov5', 'custom',
                       path='yolov5m_a_best.pt')

# Run inference on an image
results = model('construction_site.jpg')
results.show()
results.print()
```
### Download with `huggingface_hub`

```python
from huggingface_hub import hf_hub_download
import torch

# Fetch the checkpoint from the Hugging Face Hub
model_path = hf_hub_download(
    repo_id="Darth-Freljord/ppe-detection-yolov5",
    filename="yolov5m_a_best.pt"
)
model = torch.hub.load('ultralytics/yolov5', 'custom', path=model_path)
```
## 📊 Performance

### Overall Results
| Model | mAP@0.5 | mAP@0.5:0.95 | Precision | Recall | FPS |
|---|---|---|---|---|---|
| YOLOv5m(a) (recommended) | 0.841 | 0.649 | 0.864 | 0.776 | 25–30 |
| YOLOv5s(a) | 0.736 | 0.409 | 0.748 | 0.721 | 35–40 |
| YOLOv5s(b) | 0.683 | 0.360 | 0.682 | 0.662 | 35–40 |
### YOLOv5m(a) – Per-Class Performance
| Class | Precision | Recall | mAP@0.5 | mAP@0.5:0.95 |
|---|---|---|---|---|
| Mask | 0.948 | 0.945 | 0.969 | 0.850 |
| Helmet | 0.873 | 0.833 | 0.901 | 0.713 |
| Vest | 0.852 | 0.762 | 0.818 | 0.640 |
| Glasses | 0.860 | 0.744 | 0.812 | 0.538 |
| Person | 0.860 | 0.816 | 0.890 | 0.674 |
| Gloves | 0.790 | 0.555 | 0.655 | 0.480 |
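For a quick sense of per-class balance, the F1 score can be derived from the precision and recall values in the table above. This is an illustrative calculation, not part of the original evaluation:

```python
# Per-class precision/recall for YOLOv5m(a), copied from the table above
per_class = {
    "Mask":    (0.948, 0.945),
    "Helmet":  (0.873, 0.833),
    "Vest":    (0.852, 0.762),
    "Glasses": (0.860, 0.744),
    "Person":  (0.860, 0.816),
    "Gloves":  (0.790, 0.555),
}

def f1(precision, recall):
    # F1 is the harmonic mean of precision and recall
    return 2 * precision * recall / (precision + recall)

for cls, (p, r) in per_class.items():
    print(f"{cls:8s} F1 = {f1(p, r):.3f}")
```

The spread (roughly 0.95 for Mask down to 0.65 for Gloves) mirrors the mAP column: small, frequently occluded items like gloves remain the hardest classes.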
## 🏷️ Classes Detected
```
0: Person
1: Helmet
2: Vest
3: Gloves
4: Glasses
5: Mask
```
**No-PPE Detection:** The model also dynamically detects missing PPE (e.g. No-Helmet, No-Vest) at inference time without requiring additional trained classes. See `detect_mod.py` in the GitHub repo.
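The actual logic lives in `detect_mod.py`; the core idea can be sketched as pairing each detected person with gear boxes and flagging persons with no overlap. The function names and the IoU threshold below are my own illustration, not the repo's code:

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def flag_missing_ppe(persons, gear_boxes, gear_name, thresh=0.05):
    """Label each person box with the gear name or a 'No-<gear>' flag,
    depending on whether any gear detection overlaps that person."""
    flags = []
    for p in persons:
        if any(iou(p, g) > thresh for g in gear_boxes):
            flags.append(gear_name)
        else:
            flags.append(f"No-{gear_name}")
    return flags

# Two detected persons; only the first overlaps a helmet detection
persons = [(0, 0, 100, 200), (300, 0, 400, 200)]
helmets = [(20, 0, 80, 40)]
print(flag_missing_ppe(persons, helmets, "Helmet"))  # ['Helmet', 'No-Helmet']
```

Because the flags are derived from the six trained classes at inference time, no extra "No-X" annotations are needed in the training data.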
## 📚 Training Data
Three datasets were merged into a unified dataset of 9,675 images (70:15:15 train/val/test split):
| Dataset | Images | Classes | Annotation Format |
|---|---|---|---|
| Pictor-PPE (ciber-lab, 2019) | 784 | Person, Helmet, Vest | Tab-separated |
| VOC2028 (njvisionpower, 2023) | 7,581 | Person, Helmet | PASCAL VOC XML |
| CHV (ZijianWang-ZW, 2020) | 1,330 | Person, Vest, Helmet (multi-color) | YOLO |
The extended dataset used for YOLOv5m(a) was further augmented using a pretrained YOLOv9 model to generate additional Mask, Gloves, and Glasses labels.
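Merging datasets with three different annotation formats means normalizing everything to one layout (YOLO txt: `class x_center y_center width height`, all normalized to [0, 1]). A minimal sketch of the PASCAL VOC → YOLO step, with a hypothetical class mapping and simplified file handling (the actual merge scripts are in the GitHub repo):

```python
import xml.etree.ElementTree as ET

# Hypothetical name → index mapping matching this model's label set
CLASS_IDS = {"person": 0, "helmet": 1, "vest": 2}

def voc_to_yolo(xml_text):
    """Convert one PASCAL VOC XML annotation to YOLO txt lines."""
    root = ET.fromstring(xml_text)
    w = float(root.findtext("size/width"))
    h = float(root.findtext("size/height"))
    lines = []
    for obj in root.iter("object"):
        name = obj.findtext("name").lower()
        if name not in CLASS_IDS:
            continue  # skip classes outside the unified label set
        box = obj.find("bndbox")
        xmin, ymin = float(box.findtext("xmin")), float(box.findtext("ymin"))
        xmax, ymax = float(box.findtext("xmax")), float(box.findtext("ymax"))
        # YOLO format: normalized box center plus width/height
        cx, cy = (xmin + xmax) / 2 / w, (ymin + ymax) / 2 / h
        bw, bh = (xmax - xmin) / w, (ymax - ymin) / h
        lines.append(f"{CLASS_IDS[name]} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}")
    return lines

sample = """<annotation>
  <size><width>640</width><height>480</height></size>
  <object><name>helmet</name>
    <bndbox><xmin>160</xmin><ymin>120</ymin><xmax>320</xmax><ymax>240</ymax></bndbox>
  </object>
</annotation>"""
print(voc_to_yolo(sample))  # ['1 0.375000 0.375000 0.250000 0.250000']
```

The Pictor-PPE tab-separated labels need an analogous (simpler) pass, and the CHV labels are already in YOLO format.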
## ⚙️ Training Configuration (YOLOv5m(a))
| Parameter | Value |
|---|---|
| Input Resolution | 896Γ896 |
| Batch Size | 4 |
| Epochs | 30 (initial) + 100 (post-evolution) |
| Frozen Layers | 8 (initial), 0 (retraining) |
| Anchor Evolution | 100 iterations |
| Learning Rate (lr0) | 0.005 |
| Momentum | 0.937 |
| Mosaic Augmentation | 0.768 |
| Base Weights | yolov5m.pt (COCO pretrained) |
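Under these settings, the two training stages roughly correspond to YOLOv5 `train.py` invocations like the ones below. Flag values are taken from the table; the dataset YAML and hyperparameter file names are placeholders, not the repo's actual paths:

```shell
# Stage 1: 30 epochs with the first 8 layers frozen, starting from COCO weights.
# ppe_merged.yaml and hyp_ppe.yaml are placeholder names for the unified dataset
# config and the hyperparameter file (lr0=0.005, momentum=0.937, mosaic=0.768).
python train.py --img 896 --batch 4 --epochs 30 --freeze 8 \
  --weights yolov5m.pt --data ppe_merged.yaml --hyp hyp_ppe.yaml

# Stage 2: after anchor evolution, retrain for 100 epochs with no frozen layers
python train.py --img 896 --batch 4 --epochs 100 \
  --weights runs/train/exp/weights/best.pt --data ppe_merged.yaml --hyp hyp_ppe.yaml
```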
## 📖 Citation
If you use this model, please cite:
```bibtex
@inproceedings{ahmed2025seeing,
  title={Seeing Safety: Computer Vision for Real-Time PPE Monitoring in the Middle East Construction Sector},
  author={Syed Suhail Ahmed},
  booktitle={AAAI 2025 Summer Symposium: Context-Awareness in Cyber Physical Systems},
  year={2025},
  url={https://ojs.aaai.org/index.php/AAAI-SS/article/view/36044}
}
```
## 🔗 Links

- 📄 Published Paper (AAAI 2025)
- 💻 GitHub Repository
- 🔗 Author LinkedIn