# 🧾 Model Card: CivicAi-YOLO11m-v1

## 🧠 Model Overview

**PotholeNet-YOLO11m-v1** is a fine-tuned object detection model built on the **Ultralytics YOLO11m** architecture and trained to detect potholes, road damage, and garbage in street-level imagery. YOLO11m includes a C2PSA (Cross-Stage Partial with Spatial Attention) block, whose spatial attention is intended to help with irregularly shaped urban defects such as potholes.

Trained on a large-scale, curated civic infrastructure dataset of **23,000+ street-level images** from Indian urban environments, this model is designed to power real-time civic issue detection systems, enabling automated reporting and faster municipal response.

It serves as the **Detection Layer (Layer 1)** of the **Aamchi City AI Civic System**, an end-to-end intelligent dashboard for urban infrastructure monitoring.

---

## 🏗️ Training Details

| Parameter | Value |
|:---|:---|
| **Base Model** | `yolo11m.pt` (COCO pretrained) |
| **Architecture** | YOLO11m (C3k2 + C2PSA Spatial Attention) |
| **Framework** | Ultralytics v8.x |
| **Training Hardware** | Kaggle, NVIDIA T4 ×2 (dual GPU) |
| **Epochs** | 50 |
| **Input Resolution** | 768×768 |
| **Batch Size** | Auto (`batch=-1`) |
| **Optimizer** | AdamW |
| **Learning Rate** | `lr0=0.001`, cosine decay to a final rate of `lr0 × lrf` with `lrf=0.01` |
| **Warmup** | 3 epochs |
| **Weight Decay** | 0.0005 |
| **AMP** | Enabled (FP16 mixed precision) |
| **Early Stopping** | `patience=10` (did not trigger; model was still improving) |
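
To make the learning-rate row concrete, here is a minimal sketch of a cosine schedule that decays from `lr0` to `lr0 * lrf` over the 50 epochs. It assumes `lrf` is a fraction of the initial rate (as in Ultralytics) and ignores the 3 warmup epochs; the function name `cosine_lr` is hypothetical and this is not the trainer's own code.

```python
import math


def cosine_lr(epoch: float, epochs: int = 50, lr0: float = 0.001, lrf: float = 0.01) -> float:
    """Learning rate at `epoch`, following a half cosine from lr0 down to lr0 * lrf."""
    # cos_factor goes smoothly from 1 (epoch 0) to 0 (final epoch)
    cos_factor = (1 + math.cos(math.pi * epoch / epochs)) / 2
    return lr0 * (lrf + (1 - lrf) * cos_factor)


if __name__ == "__main__":
    for e in (0, 25, 50):
        print(f"epoch {e:2d}: lr = {cosine_lr(e):.6f}")
```

Under these assumptions the final learning rate is `0.001 * 0.01`, not `0.01`.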

### Loss Weights
| Loss | Weight |
|:---|:---|
| Box Loss | 7.5 |
| Classification Loss | 1.0 |
| DFL Loss | 1.5 |
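
Assuming these weights act as multipliers on the individual loss terms (which is how the Ultralytics `box`, `cls`, and `dfl` gains are applied), the training objective can be written as the weighted sum below. The symbols are introduced here for illustration only.

$$
\mathcal{L}_{\text{total}} = 7.5\,\mathcal{L}_{\text{box}} + 1.0\,\mathcal{L}_{\text{cls}} + 1.5\,\mathcal{L}_{\text{dfl}}
$$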

### Augmentation Pipeline
| Augmentation | Value |
|:---|:---|
| Mosaic | 1.0 |
| MixUp | 0.15 |
| Copy-Paste | 0.1 |
| HSV (H/S/V) | 0.015 / 0.7 / 0.4 |
| Rotation | ±10° |
| Scale | 0.5 |
| Shear | 2.0 |
| Horizontal Flip | 0.5 |
| Erasing | 0.3 |
| Label Smoothing | 0.05 |
| Close Mosaic | Last 8 epochs |
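
The tables above map fairly directly onto Ultralytics `train()` keyword arguments. The following is a sketch of how such a run could be configured, not the author's actual training script. Assumptions: `cos_lr=True` is inferred from "cosine decay", the dataset file `data.yaml` is a hypothetical path, device selection is left to the caller, and label smoothing (0.05) is left out because support for that argument may differ between Ultralytics releases.

```python
# Hyperparameters transcribed from the model card tables.
TRAIN_ARGS = {
    # Schedule and optimizer
    "epochs": 50,
    "imgsz": 768,
    "batch": -1,            # auto batch size
    "optimizer": "AdamW",
    "lr0": 0.001,
    "lrf": 0.01,            # final LR = lr0 * lrf
    "cos_lr": True,         # assumption: inferred from "cosine decay"
    "warmup_epochs": 3,
    "weight_decay": 0.0005,
    "amp": True,
    "patience": 10,
    # Loss weights
    "box": 7.5,
    "cls": 1.0,
    "dfl": 1.5,
    # Augmentation
    "mosaic": 1.0,
    "mixup": 0.15,
    "copy_paste": 0.1,
    "hsv_h": 0.015,
    "hsv_s": 0.7,
    "hsv_v": 0.4,
    "degrees": 10.0,
    "scale": 0.5,
    "shear": 2.0,
    "fliplr": 0.5,
    "erasing": 0.3,
    "close_mosaic": 8,
}


def train(data_yaml: str = "data.yaml"):
    """Fine-tune the COCO-pretrained yolo11m.pt on a dataset YAML (hypothetical path)."""
    from ultralytics import YOLO  # imported lazily so the config can be inspected without it

    model = YOLO("yolo11m.pt")
    return model.train(data=data_yaml, **TRAIN_ARGS)
```

Calling `train("path/to/data.yaml")` would start a run with these settings; the dictionary can also be inspected or logged on its own.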

---

## 📊 Dataset Description

The model was trained on a curated subset of **23,179 street-level images** collected from Indian urban environments. The dataset underwent extensive preprocessing:

- **Perceptual Hash (pHash) Deduplication**: Removed near-duplicate images using Hamming distance ≤ 4
- **Corrupt Image Removal**: Verified all images via PIL
- **Intelligent Negative Sampling**: Trimmed empty-label (background) images to 2,000 hard negatives
- **Stratified Split**: 80% Train / 15% Val / 5% Test, stratified by dominant class
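
The deduplication and split steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the pipeline that was actually used: perceptual hashes are assumed to be precomputed as integers (for example with a pHash library), "dominant class" is assumed to be supplied per image, and the function names are hypothetical.

```python
import random
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def dedup(hashes: dict[str, int], max_dist: int = 4) -> list[str]:
    """Greedy near-duplicate removal: keep an image only if its hash is
    more than `max_dist` bits away from every image kept so far."""
    kept: list[str] = []
    for name, h in hashes.items():
        if all(hamming(h, hashes[k]) > max_dist for k in kept):
            kept.append(name)
    return kept


def stratified_split(dominant: dict[str, int], ratios=(0.80, 0.15, 0.05), seed: int = 0):
    """Split image names into train/val/test so each class keeps roughly the same ratios.

    `dominant` maps image name -> dominant class ID (assumed to be given).
    """
    by_class = defaultdict(list)
    for name, cls in dominant.items():
        by_class[cls].append(name)

    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for names in by_class.values():
        names = sorted(names)          # fixed order before shuffling, for reproducibility
        rng.shuffle(names)
        n_train = round(len(names) * ratios[0])
        n_val = round(len(names) * ratios[1])
        splits["train"] += names[:n_train]
        splits["val"] += names[n_train:n_train + n_val]
        splits["test"] += names[n_train + n_val:]   # remainder goes to test
    return splits
```

The greedy loop compares every image against all kept images, so it is quadratic in the worst case; for tens of thousands of images an index over hash prefixes would likely be needed.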

### Label Classes

| Class ID | Class Name | Description |
|:---|:---|:---|
| 🔴 0 | **Pothole** | Road surface cavities and depressions |
| 🟡 1 | **Road Damage** | Cracks, surface wear, and structural deterioration |
| 🟢 2 | **Garbage** | Street-level waste and debris accumulation |

> **Priority:** Pothole (primary) > Garbage > Road Damage

---

## 🎯 Evaluation Metrics

| Metric | Score |
|:---|:---|
| **mAP50** | **0.86** |
| **mAP50-95** | Not reported |
| **Parameters** | ~20M |
| **Model Size** | ~39 MB |
| **Inference Speed** | Real-time on GPU |

> ⚑ The model did not trigger early stopping at 50 epochs, indicating further training could yield additional performance gains.
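
For readers who want to check the reported mAP50 or obtain the missing mAP50-95 on their own data, a validation run along these lines should work. This is a sketch: the weights file `best.pt` and dataset file `data.yaml` are hypothetical paths, and the `test` split is assumed to be defined in that YAML.

```python
def evaluate(weights: str = "best.pt", data_yaml: str = "data.yaml",
             split: str = "test", imgsz: int = 768) -> dict:
    """Run Ultralytics validation and return the two headline box metrics."""
    from ultralytics import YOLO  # imported lazily; requires the ultralytics package

    model = YOLO(weights)
    metrics = model.val(data=data_yaml, split=split, imgsz=imgsz)
    return {
        "mAP50": float(metrics.box.map50),    # mAP at IoU 0.50
        "mAP50-95": float(metrics.box.map),   # mAP averaged over IoU 0.50 to 0.95
    }
```

Results will depend on the split and image size used, so they may not match the table exactly.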

---

## 💬 Example Usage

### Python (Ultralytics)

```python
from ultralytics import YOLO

# Load model
model = YOLO("best.pt")

# Run inference
results = model("street_image.jpg", imgsz=768, conf=0.25)

# Display results
results[0].show()

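# Map class IDs to names once, outside the loop
class_names = {0: "pothole", 1: "road_damage", 2: "garbage"}

# Access detections
for box in results[0].boxes:
    cls = int(box.cls)            # class ID of this detection
    conf = float(box.conf)        # confidence score
    xyxy = box.xyxy[0].tolist()   # [x1, y1, x2, y2] in pixels
    print(f"{class_names[cls]}: {conf:.2f} at {xyxy}")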
```

### With Test-Time Augmentation (TTA)

```python
# Reuses `model` loaded in the previous example.
# TTA may improve mAP slightly, at the cost of slower inference.
results = model("street_image.jpg", imgsz=768, conf=0.25, augment=True)
```

### Filter Pothole-Only Detections

```python
results = model("street_image.jpg", conf=0.25)
boxes = results[0].boxes
pothole_mask = boxes.cls == 0
pothole_boxes = boxes[pothole_mask]
print(f"Found {len(pothole_boxes)} potholes")
```

---

## 🧩 Intended Use

- **Real-time pothole detection** from dashcam, mobile phone, or street-view imagery
- **Automated civic issue reporting**: GPS-tagged detection for municipal dashboards
- **Infrastructure health monitoring**: Severity scoring and trend analysis for road maintenance
- **Smart city integration**: Layer 1 detection input for AI-driven civic action systems
- **Mobile deployment**: Exportable to ONNX for edge inference on mobile devices
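
Two of the uses above lend themselves to a short sketch: packaging detections into a GPS-tagged report, and exporting the weights to ONNX. The report schema, the `Detection` class, and the function names are hypothetical and only illustrate the idea; the export call uses the standard Ultralytics `export(format="onnx")` method with `best.pt` as an assumed weights path.

```python
import json
from dataclasses import dataclass

CLASS_NAMES = {0: "pothole", 1: "road_damage", 2: "garbage"}


@dataclass
class Detection:
    cls: int        # class ID from the model
    conf: float     # confidence score
    xyxy: tuple     # (x1, y1, x2, y2) in pixels


def build_report(detections, lat: float, lon: float, min_conf: float = 0.25) -> dict:
    """Turn detections from one image into a JSON-serializable, GPS-tagged record."""
    issues = [
        {
            "type": CLASS_NAMES[d.cls],
            "confidence": round(d.conf, 3),
            "bbox_xyxy": list(d.xyxy),
        }
        for d in detections
        if d.conf >= min_conf   # drop low-confidence detections
    ]
    return {"location": {"lat": lat, "lon": lon}, "issues": issues}


def export_onnx(weights: str = "best.pt", imgsz: int = 768) -> str:
    """Export the model to ONNX and return the path of the exported file."""
    from ultralytics import YOLO  # imported lazily; requires the ultralytics package

    return YOLO(weights).export(format="onnx", imgsz=imgsz)


if __name__ == "__main__":
    sample = [Detection(0, 0.91, (120, 340, 260, 420))]
    print(json.dumps(build_report(sample, lat=19.07, lon=72.87), indent=2))
```

In practice the `Detection` objects would be filled from `results[0].boxes`, and the coordinates would come from the capturing device.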

---

## ⚠️ Limitations

- The model is optimized for **Indian urban road conditions**; performance may degrade on highways, rural roads, or non-Indian geographies.
- **Road damage** class has visual overlap with potholes, which may cause occasional misclassification between the two.
- Performance is best on **daytime, clear-weather imagery**; low-light and rain-occluded scenes may reduce accuracy.
- Training ran for all **50 epochs without triggering early stopping**, which suggests the checkpoint may not be fully converged and that further fine-tuning could improve results.
- **Small potholes** (< 32px at 768px resolution) may be missed in wide-angle shots.

---

## 🧑‍💻 Developer

| | |
|:---|:---|
| **Author** | Vansh Momaya |
| **Institution** | D. J. Sanghvi College of Engineering |
| **Focus Area** | Computer Vision, Object Detection, AI for Civic Infrastructure |
| **Email** | vanshmomaya9@gmail.com |

---

## 🌍 Citation

If you use PotholeNet-YOLO11m-v1 in your research or project:

```bibtex
@online{momaya2026potholenet,
  author       = {Vansh Momaya},
  title        = {PotholeNet-YOLO11m-v1: Real-Time Pothole and Civic Issue Detection for Indian Urban Roads},
  year         = {2026},
  version      = {v1},
  url          = {https://huggingface.co/Vansh180/PotholeNet-YOLO11m-v1},
  institution  = {D. J. Sanghvi College of Engineering},
  note         = {Fine-tuned YOLO11m model for detecting potholes, road damage, and garbage in Indian street imagery},
  license      = {MIT}
}
```

---

## 🚀 Acknowledgements

- **[Ultralytics YOLO11](https://github.com/ultralytics/ultralytics)**: Base architecture and training framework
- **[Kaggle](https://www.kaggle.com)**: Training infrastructure (Dual T4 GPU)
- **Aamchi City, Datahack 4**: Hackathon context and dataset

---

*Built for the Aamchi City AI Civic System (Datahack 4, PS2 Core ML)*