🦺 PPE Detection: YOLOv5 (Real-Time)

Real-time Personal Protective Equipment (PPE) detection for construction and industrial environments. The models were trained on a merged dataset of 9,675 images using the YOLOv5s and YOLOv5m architectures.

Published at the AAAI 2025 Summer Symposium → Read the Paper: https://ojs.aaai.org/index.php/AAAI-SS/article/view/36044

📁 GitHub Repository → Darth-Freljord/ppe-detection-yolov5


🗂️ Model Files

| File | Description | mAP@0.5 | FPS |
|---|---|---|---|
| yolov5m_a_best.pt | Best model: YOLOv5m with anchor evolution, 896×896 input | 0.841 | 25–30 |
| yolov5s_a_best.pt | Lightweight: YOLOv5s, 640×640 input | 0.736 | 35–40 |
| yolov5s_b_best.pt | Transfer learning variant: YOLOv5s | 0.683 | 35–40 |
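
The table implies a trade-off between accuracy and throughput. As a minimal sketch of how a deployment script might choose a weights file, the helper below picks the highest-mAP model whose lower-bound FPS meets a target. The name `pick_model` and the use of the lower bound of each FPS range are assumptions made for illustration; the reported FPS figures depend on hardware that the table does not specify.

```python
# Values copied from the Model Files table.
# Assumption: "min_fps" is the lower bound of each reported FPS range.
MODELS = [
    {"file": "yolov5m_a_best.pt", "map50": 0.841, "min_fps": 25},
    {"file": "yolov5s_a_best.pt", "map50": 0.736, "min_fps": 35},
    {"file": "yolov5s_b_best.pt", "map50": 0.683, "min_fps": 35},
]


def pick_model(required_fps):
    """Return the most accurate weights file that meets the FPS target, or None."""
    candidates = [m for m in MODELS if m["min_fps"] >= required_fps]
    if not candidates:
        return None
    # Among the models that are fast enough, prefer the highest mAP@0.5.
    return max(candidates, key=lambda m: m["map50"])["file"]


print(pick_model(20))  # yolov5m_a_best.pt
print(pick_model(30))  # yolov5s_a_best.pt
```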

⚡ Quick Start

import torch

# Load best model
model = torch.hub.load('ultralytics/yolov5', 'custom',
                        path='yolov5m_a_best.pt')

# Run inference on an image
results = model('construction_site.jpg')
results.show()
results.print()

Download with huggingface_hub

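import torch
from huggingface_hub import hf_hub_download

# Download the weights file from the Hugging Face Hub (cached locally after the first call)
model_path = hf_hub_download(
    repo_id="Darth-Freljord/ppe-detection-yolov5",
    filename="yolov5m_a_best.pt"
)

# Load the downloaded weights through the YOLOv5 hub entry point
model = torch.hub.load('ultralytics/yolov5', 'custom', path=model_path)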

📊 Performance

Overall Results

| Model | mAP@0.5 | mAP@0.5:0.95 | Precision | Recall | FPS |
|---|---|---|---|---|---|
| YOLOv5m(a) (recommended) | 0.841 | 0.649 | 0.864 | 0.776 | 25–30 |
| YOLOv5s(a) | 0.736 | 0.409 | 0.748 | 0.721 | 35–40 |
| YOLOv5s(b) | 0.683 | 0.360 | 0.682 | 0.662 | 35–40 |

YOLOv5m(a): Per-Class Performance

| Class | Precision | Recall | mAP@0.5 | mAP@0.5:0.95 |
|---|---|---|---|---|
| Mask | 0.948 | 0.945 | 0.969 | 0.850 |
| Helmet | 0.873 | 0.833 | 0.901 | 0.713 |
| Vest | 0.852 | 0.762 | 0.818 | 0.640 |
| Glasses | 0.860 | 0.744 | 0.812 | 0.538 |
| Person | 0.860 | 0.816 | 0.890 | 0.674 |
| Gloves | 0.790 | 0.555 | 0.655 | 0.480 |
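
The overall YOLOv5m(a) figures appear to be the unweighted (macro) mean of the per-class rows. The short check below recomputes them from the per-class table. It is an observation about the published numbers, not a description of the author's evaluation script, and the helper name `macro_average` is introduced here for illustration.

```python
from statistics import mean

# Per-class rows copied from the table above:
# (precision, recall, mAP@0.5, mAP@0.5:0.95)
PER_CLASS = {
    "Mask": (0.948, 0.945, 0.969, 0.850),
    "Helmet": (0.873, 0.833, 0.901, 0.713),
    "Vest": (0.852, 0.762, 0.818, 0.640),
    "Glasses": (0.860, 0.744, 0.812, 0.538),
    "Person": (0.860, 0.816, 0.890, 0.674),
    "Gloves": (0.790, 0.555, 0.655, 0.480),
}


def macro_average(per_class):
    """Average each metric column over classes, giving every class equal weight."""
    columns = zip(*per_class.values())
    return tuple(round(mean(col), 3) for col in columns)


overall = macro_average(PER_CLASS)
print(overall)  # (0.864, 0.776, 0.841, 0.649)
```

Because every class carries equal weight, the weakest class (Gloves, with recall 0.555) pulls the overall recall down noticeably.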

🏷️ Classes Detected

0: Person
1: Helmet
2: Vest
3: Gloves
4: Glasses
5: Mask
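
The class indices above can be used to turn raw detections into readable counts. The sketch below assumes detections arrive as rows of `[x1, y1, x2, y2, confidence, class_id]`, the layout of each row in `results.xyxy[0]` for YOLOv5 hub models. The helper `count_by_class`, the `min_conf` threshold of 0.25, and the example rows are illustrative and not part of the repository.

```python
from collections import Counter

# Index-to-name mapping copied from the class list above.
CLASS_NAMES = {0: "Person", 1: "Helmet", 2: "Vest", 3: "Gloves", 4: "Glasses", 5: "Mask"}


def count_by_class(rows, min_conf=0.25):
    """Count detections per class name, skipping rows below min_conf.

    rows: iterable of [x1, y1, x2, y2, confidence, class_id].
    """
    counts = Counter()
    for *_box, conf, cls_id in rows:
        if conf >= min_conf:
            counts[CLASS_NAMES[int(cls_id)]] += 1
    return dict(counts)


# Hypothetical rows, e.g. what results.xyxy[0].tolist() might return.
example_rows = [
    [10, 20, 110, 320, 0.91, 0.0],   # Person
    [30, 20, 80, 60, 0.88, 1.0],     # Helmet
    [25, 90, 100, 200, 0.18, 2.0],   # Vest, below the threshold
]
print(count_by_class(example_rows))  # {'Person': 1, 'Helmet': 1}
```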

No-PPE Detection: Missing PPE (e.g. No-Helmet, No-Vest) is also flagged dynamically at inference time, so no additional trained classes are required. See detect_mod.py in the GitHub repo.
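
One plausible way to derive such labels from the six trained classes is to check, for each detected person, whether a required item falls inside that person's bounding box. The sketch below illustrates that idea only. It is not the logic of detect_mod.py, which is not reproduced here, and the function names, the centre-in-box rule, and the example boxes are all assumptions.

```python
def box_center(box):
    """Return the centre point of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def contains(outer, point):
    """True if point lies inside (or on the edge of) the outer box."""
    x1, y1, x2, y2 = outer
    px, py = point
    return x1 <= px <= x2 and y1 <= py <= y2


def missing_ppe(detections, required=("Helmet", "Vest")):
    """Return, for each detected person, the list of required items not found.

    detections: list of (class_name, (x1, y1, x2, y2)).
    Assumption: an item counts as worn if its centre lies inside the person box.
    """
    persons = [box for name, box in detections if name == "Person"]
    report = []
    for person in persons:
        missing = []
        for item in required:
            worn = any(
                name == item and contains(person, box_center(box))
                for name, box in detections
            )
            if not worn:
                missing.append(f"No-{item}")
        report.append(missing)
    return report


# Hypothetical detections: the first person has a helmet, the second a vest.
example = [
    ("Person", (0, 0, 100, 300)),
    ("Helmet", (30, 5, 70, 40)),
    ("Person", (200, 0, 300, 300)),
    ("Vest", (210, 80, 290, 180)),
]
print(missing_ppe(example))  # [['No-Vest'], ['No-Helmet']]
```

A centre-in-box rule is simple but can misattribute items when people overlap in the frame; an overlap-ratio test would be a stricter alternative.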


📁 Training Data

Three datasets were merged into a unified dataset of 9,675 images (70:15:15 split):

| Dataset | Images | Classes | Annotation Format |
|---|---|---|---|
| Pictor-PPE (ciber-lab, 2019) | 784 | Person, Helmet, Vest | Tab-separated |
| VOC2028 (njvisionpower, 2023) | 7,581 | Person, Helmet | PASCAL VOC XML |
| CHV (ZijianWang-ZW, 2020) | 1,330 | Person, Vest, Helmet (multi-color) | YOLO |
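
A 70:15:15 split of the merged dataset could be produced along the following lines. This is a sketch under stated assumptions: the three parts are taken to be train, validation, and test in that order, the shuffle seed is arbitrary, and any rounding remainder goes to the last part. The actual per-split image counts are not given in this card.

```python
import random


def split_indices(n_images, ratios=(70, 15, 15), seed=0):
    """Shuffle image indices and cut them into three parts by percentage.

    Assumption: the parts are train, validation, and test, in that order.
    """
    if sum(ratios) != 100:
        raise ValueError("ratios must sum to 100")
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_train = n_images * ratios[0] // 100
    n_val = n_images * ratios[1] // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remainder absorbs rounding
    return train, val, test


train, val, test = split_indices(9675)
print(len(train), len(val), len(test))  # 6772 1451 1452
```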

The extended dataset used for YOLOv5m(a) was further enriched by pseudo-labeling: a pretrained YOLOv9 model was used to generate additional Mask, Gloves, and Glasses labels.


⚙️ Training Configuration (YOLOv5m(a))

| Parameter | Value |
|---|---|
| Input Resolution | 896×896 |
| Batch Size | 4 |
| Epochs | 30 (initial) + 100 (post-evolution) |
| Frozen Layers | 8 (initial), 0 (retraining) |
| Anchor Evolution | 100 iterations |
| Learning Rate (lr0) | 0.005 |
| Momentum | 0.937 |
| Mosaic Augmentation | 0.768 |
| Base Weights | yolov5m.pt (COCO pretrained) |
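
The table can be read as two training stages. The sketch below shows how those settings might map onto arguments for YOLOv5's `train.py`, using flags that script provides (`--img`, `--batch-size`, `--epochs`, `--weights`, `--freeze`, `--data`, `--hyp`). The file names `ppe.yaml` and `hyp.ppe.yaml` and the retraining weights path are hypothetical placeholders, the anchor evolution step is not covered, and the author's exact commands are not given in this card.

```python
# Hyperparameter values from the table, keyed by YOLOv5 hyperparameter names.
# Assumption: they would live in a hyp YAML file passed through --hyp.
HYP_OVERRIDES = {"lr0": 0.005, "momentum": 0.937, "mosaic": 0.768}


def build_train_args(stage):
    """Return a train.py argument list for the 'initial' or 'retrain' stage."""
    common = [
        "--img", "896",
        "--batch-size", "4",
        "--data", "ppe.yaml",       # hypothetical dataset config
        "--hyp", "hyp.ppe.yaml",    # hypothetical file holding HYP_OVERRIDES
    ]
    if stage == "initial":
        # 30 epochs from COCO weights with the first 8 layers frozen.
        return common + ["--weights", "yolov5m.pt", "--epochs", "30", "--freeze", "8"]
    if stage == "retrain":
        # 100 epochs with nothing frozen; the weights path is a placeholder.
        return common + ["--weights", "runs/train/exp/weights/best.pt", "--epochs", "100"]
    raise ValueError(f"unknown stage: {stage}")


print("python train.py " + " ".join(build_train_args("initial")))
```

Freezing the early layers in the first stage keeps the COCO-pretrained features fixed while the detection head adapts; the second stage then fine-tunes the whole network.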

📄 Citation

If you use this model, please cite:

@inproceedings{ahmed2025seeing,
  title={Seeing Safety: Computer Vision for Real-Time PPE Monitoring in the Middle East Construction Sector},
  author={Syed Suhail Ahmed},
  booktitle={AAAI 2025 Summer Symposium: Context-Awareness in Cyber Physical Systems},
  year={2025},
  url={https://ojs.aaai.org/index.php/AAAI-SS/article/view/36044}
}

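🔗 Links

Paper: https://ojs.aaai.org/index.php/AAAI-SS/article/view/36044
GitHub: Darth-Freljord/ppe-detection-yolov5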
