# Model Card: CivicAi-YOLO11m-v1
## Model Overview
**PotholeNet-YOLO11m-v1** is a fine-tuned object detection model built on the **Ultralytics YOLO11m** architecture and trained to detect potholes, road damage, and garbage in street-level imagery. YOLO11m includes the C2PSA (Cross-Stage Partial with Spatial Attention) block, which is intended to help with irregularly shaped urban defects such as potholes.
Trained on a large-scale, curated civic infrastructure dataset of **23,000+ street-level images** from Indian urban environments, this model is designed to power real-time civic issue detection systems, enabling automated reporting and faster municipal response.
It serves as the **Detection Layer (Layer 1)** of the **Aamchi City AI Civic System**, an end-to-end intelligent dashboard for urban infrastructure monitoring.
---
## Training Details
| Parameter | Value |
|:---|:---|
| **Base Model** | `yolo11m.pt` (COCO pretrained) |
| **Architecture** | YOLO11m (C3k2 + C2PSA Spatial Attention) |
| **Framework** | Ultralytics v8.x |
| **Training Hardware** | Kaggle, NVIDIA T4 ×2 (Dual GPU) |
| **Epochs** | 50 |
| **Input Resolution** | 768×768 |
| **Batch Size** | Auto (`batch=-1`) |
| **Optimizer** | AdamW |
| **Learning Rate** | `lr0=0.001`, cosine decay to a final rate of `lr0 × lrf` (`lrf=0.01`) |
| **Warmup** | 3 epochs |
| **Weight Decay** | 0.0005 |
| **AMP** | Enabled (FP16 mixed precision) |
| **Early Stopping** | `patience=10` (did not trigger; model was still improving) |
### Loss Weights
| Loss | Weight |
|:---|:---|
| Box Loss | 7.5 |
| Classification Loss | 1.0 |
| DFL Loss | 1.5 |
### Augmentation Pipeline
| Augmentation | Value |
|:---|:---|
| Mosaic | 1.0 |
| MixUp | 0.15 |
| Copy-Paste | 0.1 |
| HSV (H/S/V) | 0.015 / 0.7 / 0.4 |
| Rotation | ±10° |
| Scale | 0.5 |
| Shear | 2.0 |
| Horizontal Flip | 0.5 |
| Erasing | 0.3 |
| Label Smoothing | 0.05 |
| Close Mosaic | Last 8 epochs |
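The tables above can be collected into a single training call. The following is a minimal sketch, assuming the values map one-to-one onto Ultralytics train arguments; the `train` helper and the `data_yaml` parameter are hypothetical, and the author's actual script is not shown in this card. Label smoothing (0.05) is left out because support for that argument may differ between Ultralytics releases, so check your installed version before adding it.

```python
# Hyperparameters copied from the Training Details, Loss Weights, and
# Augmentation tables. Argument names are assumed to match the Ultralytics
# train settings; verify them against your installed version.
TRAIN_ARGS = {
    "epochs": 50,
    "imgsz": 768,
    "batch": -1,             # automatic batch size
    "optimizer": "AdamW",
    "lr0": 0.001,
    "lrf": 0.01,             # final learning rate is lr0 * lrf
    "cos_lr": True,          # cosine learning-rate schedule
    "warmup_epochs": 3,
    "weight_decay": 0.0005,
    "amp": True,             # FP16 mixed precision
    "patience": 10,          # early-stopping patience
    "box": 7.5, "cls": 1.0, "dfl": 1.5,
    "mosaic": 1.0, "mixup": 0.15, "copy_paste": 0.1,
    "hsv_h": 0.015, "hsv_s": 0.7, "hsv_v": 0.4,
    "degrees": 10.0, "scale": 0.5, "shear": 2.0,
    "fliplr": 0.5, "erasing": 0.3,
    "close_mosaic": 8,       # disable mosaic for the last 8 epochs
}


def train(data_yaml, device=None):
    """Hypothetical helper: fine-tune yolo11m.pt on a dataset YAML."""
    from ultralytics import YOLO  # imported lazily so the config can be inspected without it

    model = YOLO("yolo11m.pt")
    return model.train(data=data_yaml, device=device, **TRAIN_ARGS)
```

Calling `train("data.yaml")` would start a run with these settings, where `data.yaml` is a placeholder for the dataset definition.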
---
## Dataset Description
The model was trained on a curated subset of **23,179 street-level images** collected from Indian urban environments. The dataset underwent extensive preprocessing:
- **Perceptual Hash (pHash) Deduplication**: Removed near-duplicate images using Hamming distance ≤ 4
- **Corrupt Image Removal**: Verified all images via PIL
- **Intelligent Negative Sampling**: Trimmed empty-label (background) images to 2,000 hard negatives
- **Stratified Split**: 80% Train / 15% Val / 5% Test, stratified by dominant class
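The deduplication step can be illustrated with a small sketch. It assumes each image has already been reduced to a 64-bit perceptual hash stored as an integer (for example with the third-party `imagehash` package, which is an assumption; the card does not say which tool was used). The `hamming` and `deduplicate` helpers are hypothetical names, and the greedy pairwise comparison is only one possible strategy.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def deduplicate(hashes: dict, max_distance: int = 4) -> list:
    """Keep the first image of every near-duplicate group.

    An image is dropped when its hash lies within `max_distance` bits
    of an image that has already been kept. This is O(n^2) and meant
    as an illustration, not as a tuned pipeline.
    """
    kept = []
    for name, value in hashes.items():
        if all(hamming(value, hashes[k]) > max_distance for k in kept):
            kept.append(name)
    return kept


# Toy hashes: b.jpg differs from a.jpg by 2 bits, c.jpg by 12 bits.
example = {"a.jpg": 0b11110000, "b.jpg": 0b11110011, "c.jpg": 0xFF00FF}
unique_images = deduplicate(example)  # ["a.jpg", "c.jpg"]
```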
### Label Classes
| Class ID | Class Name | Description |
|:---|:---|:---|
| 0 | **Pothole** | Road surface cavities and depressions |
| 1 | **Road Damage** | Cracks, surface wear, and structural deterioration |
| 2 | **Garbage** | Street-level waste and debris accumulation |
> **Priority:** Pothole (primary) > Garbage > Road Damage
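One possible reading of the priority note is as a triage order for downstream reporting. The sketch below sorts detections accordingly, breaking ties by confidence. The detection dictionaries and the `sort_by_priority` helper are hypothetical and not part of the released model.

```python
# Lower rank means higher priority: Pothole > Garbage > Road Damage.
PRIORITY = {"pothole": 0, "garbage": 1, "road_damage": 2}


def sort_by_priority(detections):
    """Order detections by class priority, then by descending confidence."""
    return sorted(detections, key=lambda d: (PRIORITY[d["name"]], -d["conf"]))


detections = [
    {"name": "road_damage", "conf": 0.91},
    {"name": "pothole", "conf": 0.55},
    {"name": "garbage", "conf": 0.80},
    {"name": "pothole", "conf": 0.72},
]
ordered = sort_by_priority(detections)
```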
---
## Evaluation Metrics
| Metric | Score |
|:---|:---|
| **mAP50** | **0.86** |
| **mAP50-95** | Not reported |
| **Parameters** | ~20M |
| **Model Size** | ~39 MB |
| **Inference Speed** | Real-time on GPU |
> The model did not trigger early stopping within 50 epochs, which suggests that further training could yield additional gains.
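For context on the mAP50 figure: a predicted box counts as a match at this threshold when its intersection over union (IoU) with a ground-truth box of the same class is at least 0.5. The following is a minimal sketch of that overlap test for boxes in `(x1, y1, x2, y2)` format; the helper names are illustrative, and the full metric (ranking by confidence, per-class averaging) is computed by the Ultralytics validator rather than by this snippet.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_50(pred, truth):
    """True when the prediction overlaps the ground truth with IoU >= 0.5."""
    return iou(pred, truth) >= 0.5
```

Two identical boxes have an IoU of 1.0, while a box shifted sideways by half its width has an IoU of 1/3 and would not count as a match.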
---
## Example Usage
### Python (Ultralytics)
```python
from ultralytics import YOLO

# Map class IDs to names (matches the Label Classes table)
class_names = {0: "pothole", 1: "road_damage", 2: "garbage"}

# Load the fine-tuned weights
model = YOLO("best.pt")

# Run inference at the training resolution
results = model("street_image.jpg", imgsz=768, conf=0.25)

# Display results
results[0].show()

# Access detections
for box in results[0].boxes:
    cls = int(box.cls)
    conf = float(box.conf)
    xyxy = box.xyxy[0].tolist()
    print(f"{class_names[cls]}: {conf:.2f} at {xyxy}")
```
### With Test-Time Augmentation (TTA)
```python
# TTA can improve mAP at the cost of inference speed; the gain depends on the data.
# Reuses `model` from the previous example.
results = model("street_image.jpg", imgsz=768, conf=0.25, augment=True)
```
### Filter Pothole-Only Detections
```python
results = model("street_image.jpg", conf=0.25)
boxes = results[0].boxes
pothole_mask = boxes.cls == 0
pothole_boxes = boxes[pothole_mask]
print(f"Found {len(pothole_boxes)} potholes")
```
---
## Intended Use
- **Real-time pothole detection** from dashcam, mobile phone, or street-view imagery
- **Automated civic issue reporting**: GPS-tagged detection for municipal dashboards
- **Infrastructure health monitoring**: Severity scoring and trend analysis for road maintenance
- **Smart city integration**: Layer 1 detection input for AI-driven civic action systems
- **Mobile deployment**: Exportable to ONNX for edge inference on mobile devices
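As an illustration of the GPS-tagged reporting use case, the sketch below turns one detection into a JSON-serializable record. The field names, the `build_report` helper, and the coordinates are all hypothetical; the actual schema used by the Aamchi City dashboard is not described in this card.

```python
import json
from datetime import datetime, timezone

CLASS_NAMES = {0: "pothole", 1: "road_damage", 2: "garbage"}


def build_report(cls_id, conf, xyxy, lat, lon, captured_at=None):
    """Build a report record for one detection (hypothetical schema)."""
    timestamp = captured_at or datetime.now(timezone.utc)
    return {
        "issue_type": CLASS_NAMES[cls_id],
        "confidence": round(conf, 3),
        "bbox_xyxy": list(xyxy),
        "location": {"lat": lat, "lon": lon},
        "captured_at": timestamp.isoformat(),
    }


# Example values only; coordinates and timestamp are placeholders.
report = build_report(
    0, 0.8731, [10, 20, 110, 220], 19.076, 72.8777,
    captured_at=datetime(2026, 1, 1, tzinfo=timezone.utc),
)
payload = json.dumps(report)
```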
---
## Limitations
- The model is optimized for **Indian urban road conditions**; performance may degrade on highways, rural roads, or non-Indian geographies.
- **Road damage** class has visual overlap with potholes, which may cause occasional misclassification between the two.
- Performance is best on **daytime, clear-weather imagery**; low-light and rain-occluded scenes may reduce accuracy.
- The model was trained for **50 epochs without triggering early stopping**, which suggests the checkpoint is not fully converged and further fine-tuning could improve results.
- **Small potholes** (< 32px at 768px resolution) may be missed in wide-angle shots.
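To check whether a pothole falls under the 32 px limit mentioned above, its size can be rescaled to the 768 px inference frame. This sketch assumes the image is resized so that its longer side becomes `imgsz`, and it measures size as the square root of the box area; both are assumptions, since the card does not define how the 32 px figure is measured. The helper names are hypothetical.

```python
import math


def size_at_inference(xyxy, image_w, image_h, imgsz=768):
    """Approximate box size (sqrt of area) after resizing the long side to imgsz."""
    scale = imgsz / max(image_w, image_h)
    width = (xyxy[2] - xyxy[0]) * scale
    height = (xyxy[3] - xyxy[1]) * scale
    return math.sqrt(width * height)


def is_likely_missed(xyxy, image_w, image_h, min_size=32, imgsz=768):
    """Flag objects smaller than min_size pixels in the inference frame."""
    return size_at_inference(xyxy, image_w, image_h, imgsz) < min_size
```

For a 3840×2160 frame, a 100×100 px pothole shrinks to roughly 20 px and would be flagged, while a 400×400 px one (about 80 px) would not. Cropping or tiling wide-angle frames before inference is one way to keep such objects above the limit.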
---
## Developer
| | |
|:---|:---|
| **Author** | Vansh Momaya |
| **Institution** | D. J. Sanghvi College of Engineering |
| **Focus Area** | Computer Vision, Object Detection, AI for Civic Infrastructure |
| **Email** | vanshmomaya9@gmail.com |
---
## Citation
If you use PotholeNet-YOLO11m-v1 in your research or project:
```bibtex
@online{momaya2026potholenet,
  author = {Vansh Momaya},
  title = {PotholeNet-YOLO11m-v1: Real-Time Pothole and Civic Issue Detection for Indian Urban Roads},
  year = {2026},
  version = {v1},
  url = {https://huggingface.co/Vansh180/PotholeNet-YOLO11m-v1},
  institution = {D. J. Sanghvi College of Engineering},
  note = {Fine-tuned YOLO11m model for detecting potholes, road damage, and garbage in Indian street imagery},
  license = {MIT}
}
```
---
## Acknowledgements
- **[Ultralytics YOLO11](https://github.com/ultralytics/ultralytics)**: Base architecture and training framework
- **[Kaggle](https://www.kaggle.com)**: Training infrastructure (Dual T4 GPU)
- **Aamchi City, Datahack 4**: Hackathon context and dataset
---
*Built for the Aamchi City AI Civic System, Datahack 4, PS2 Core ML*