---
license: mit
language:
- en
tags:
- object-detection
- re-identification
- construction
- aerial-vision
- rf-detr
- dinov3
- osnet
- real-time
- tracking
pipeline_tag: object-detection
library_name: pytorch
datasets:
- roboflow
---

# 🏗️ SiteSense: Model Weights

**Real-Time Construction Equipment Monitoring via Aerial Computer Vision**

[![GitHub](https://img.shields.io/badge/GitHub-Repository-181717?logo=github&logoColor=white)](https://github.com/Mahmoud-Zaafan/asdfqer)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![Python 3.11](https://img.shields.io/badge/Python-3.11-3776AB?logo=python&logoColor=white)](https://python.org)
[![PyTorch 2.2+](https://img.shields.io/badge/PyTorch-2.2+-EE4C2C?logo=pytorch&logoColor=white)](https://pytorch.org)

---

## Overview

This repository hosts the trained model weights for [SiteSense](https://github.com/Mahmoud-Zaafan/asdfqer), a real-time pipeline that **detects, tracks, identifies, and classifies the activity** of heavy construction equipment from drone/aerial video footage.

The system processes each frame through a multi-phase pipeline:

```
Video Frame → RF-DETR Detection → BoT-SORT Tracking → DINOv3 Re-ID → Activity Classification → Kafka Events
```

---

## Model Weights

| File | Size | Architecture | Task | Training Data |
|:---|:---:|:---|:---|:---|
| `rfdetr_construction.pth` | 122 MB | RF-DETR (Real-time Foundation DETR) | 8-class object detection | Custom aerial construction dataset (Roboflow) |
| `dinov3_reid_head.pth` | 5.4 MB | Linear projection head (1536→256→128) | Equipment re-identification | Contrastive pairs from tracked equipment |
| `osnet_x0_25_msmt17.pt` | 2.9 MB | OSNet x0.25 | Appearance-based ReID for BoT-SORT | MSMT17 (pretrained) |

> **Note:** The DINOv3 ViT-B/16 backbone (~327 MB) is **not included** here. It is auto-downloaded from [facebook/dinov3-vitb16-pretrain-lvd1689m](https://huggingface.co/facebook/dinov3-vitb16-pretrain-lvd1689m) on first run using your `HF_TOKEN`.

---

## Detection Classes

The RF-DETR detector is fine-tuned to recognize **8 classes** of construction equipment from aerial perspectives:

| ID | Class | ID | Class |
|:---:|:---|:---:|:---|
| 0 | Excavator | 4 | Mobile Crane |
| 1 | Dump Truck | 5 | Tower Crane |
| 2 | Bulldozer | 6 | Roller Compactor |
| 3 | Wheel Loader | 7 | Cement Mixer |

---

## Training Results

### RF-DETR Detector

| Metric | Value |
|:---|:---:|
| **mAP@50** | 0.8340 |
| **mAP@50:95** | 0.7607 |
| **F1 Score** | 0.8859 |
| **Precision** | 0.8666 |
| **Recall** | 0.9061 |
| Resolution | 560×560 |
| Epochs | 70 |

### DINOv3 Re-ID Projection Head

| Metric | Value |
|:---|:---:|
| **Contrastive Loss** | 0.0482 |
| **Accuracy** | 96.8% |
| Embedding Dim | 128-d L2-normalized |
| Training Pairs | ~12,000 positive pairs |

---

## Quick Start

### Option A: Download All Weights (Recommended)

```bash
pip install huggingface_hub
huggingface-cli download Zaafan/sitesense-weights --local-dir models/
```

### Option B: Python API

```python
from huggingface_hub import hf_hub_download

# Download individual weights
hf_hub_download(repo_id="Zaafan/sitesense-weights", filename="rfdetr_construction.pth", local_dir="models/")
hf_hub_download(repo_id="Zaafan/sitesense-weights", filename="dinov3_reid_head.pth", local_dir="models/")
hf_hub_download(repo_id="Zaafan/sitesense-weights", filename="osnet_x0_25_msmt17.pt", local_dir="models/")
```

### Option C: Auto-Download (Zero Setup)

The SiteSense pipeline automatically downloads missing weights on first run:

```python
# In services/cv-inference/main.py: resolve_weights() handles this transparently
weights_path = resolve_weights('rfdetr_construction.pth')  # local first, HF fallback
```

---

## Usage with SiteSense Pipeline

```bash
# 1. Clone the repository
git clone https://github.com/Mahmoud-Zaafan/asdfqer.git
cd asdfqer

# 2. Download weights
huggingface-cli download Zaafan/sitesense-weights --local-dir models/

# 3. Configure environment
cp .env.example .env

# 4. Launch infrastructure + run pipeline
docker compose up --build
docker compose --profile pipeline up cv-inference
```

---

## Citation

If you use these weights in your research or projects, please cite:

```bibtex
@misc{sitesense2025,
  author = {Mahmoud Zaafan},
  title  = {SiteSense: Real-Time Construction Equipment Monitoring via Aerial Computer Vision},
  year   = {2025},
  url    = {https://github.com/Mahmoud-Zaafan/SiteSense}
}
```

---

## License

All weights are released under the [MIT License](LICENSE).
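
---

## Appendix: Weight Resolution Sketch

Option C describes a "local first, HF fallback" lookup. The snippet below is a minimal sketch of how such a lookup could work. It is not the code in `services/cv-inference/main.py`, whose actual signature and behaviour may differ. The `models` default directory and the repo id come from the Quick Start commands; the `download` parameter and the `_hub_download` helper are hypothetical names added for illustration and testing.

```python
from pathlib import Path
from typing import Callable, Optional

REPO_ID = "Zaafan/sitesense-weights"  # repo id used in the Quick Start commands


def _hub_download(filename: str, local_dir: Path) -> str:
    """Fetch one file from the Hub (hypothetical helper).

    The import is lazy so that runs with all weights already on disk
    do not need huggingface_hub or network access.
    """
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=str(local_dir))


def resolve_weights(
    filename: str,
    local_dir: str = "models",
    download: Optional[Callable[[str, Path], str]] = None,
) -> Path:
    """Return a path to `filename`, preferring a local copy over a download.

    `download` is an assumed injection point, not part of the original API.
    """
    directory = Path(local_dir)
    candidate = directory / filename
    if candidate.is_file():
        return candidate  # local copy found, nothing to download

    # No local copy: make sure the target directory exists, then fetch.
    directory.mkdir(parents=True, exist_ok=True)
    fetch = download or _hub_download
    return Path(fetch(filename, directory))
```

With this sketch, `resolve_weights("rfdetr_construction.pth")` returns `models/rfdetr_construction.pth` immediately when the file exists and only contacts the Hub otherwise. Passing a stub as `download` lets the fallback path be exercised offline.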