Vansh180 committed on
Commit b5a0c9c · verified · 1 Parent(s): c1ee626

Upload 5 files

Files changed (5)
  1. Dockerfile +24 -0
  2. README.md +189 -9
  3. Vision Classification.pt +3 -0
  4. app.py +79 -0
  5. requirements.txt +4 -0
Dockerfile ADDED
@@ -0,0 +1,24 @@
+ # Use the official lightweight Python image.
+ FROM python:3.9-slim
+
+ # Set the working directory to /app
+ WORKDIR /app
+
+ # Copy the requirements file into the container
+ COPY requirements.txt .
+
+ # Install dependencies
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ # Copy the rest of the files into the working directory
+ COPY . .
+
+ # Set Gradio environment variables so the server is reachable from outside the container
+ ENV GRADIO_SERVER_NAME="0.0.0.0"
+ ENV GRADIO_SERVER_PORT=7860
+
+ # Expose the port Gradio listens on
+ EXPOSE 7860
+
+ # Command to run the application
+ CMD ["python", "app.py"]
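The two `ENV` lines rely on Gradio reading `GRADIO_SERVER_NAME` and `GRADIO_SERVER_PORT` when `launch()` is called without explicit arguments. As a rough illustration of that lookup, the sketch below resolves the bind address from an environment mapping. `resolve_server_config` is a hypothetical helper, not part of Gradio or of this commit, and the fallback values are assumptions based on Gradio's usual local defaults.

```python
import os
from typing import Mapping, Tuple


def resolve_server_config(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Return (host, port) from Gradio-style environment variables.

    Hypothetical helper for illustration only; the fallbacks are assumptions.
    """
    host = env.get("GRADIO_SERVER_NAME", "127.0.0.1")
    port = int(env.get("GRADIO_SERVER_PORT", "7860"))
    return host, port


# Values as set by the Dockerfile in this commit.
docker_env = {"GRADIO_SERVER_NAME": "0.0.0.0", "GRADIO_SERVER_PORT": "7860"}
print(resolve_server_config(docker_env))
```

Since `app.py` in this commit also passes `server_name` and `server_port` to `launch()` directly, the environment variables act as a second source of the same values.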
README.md CHANGED
@@ -1,12 +1,192 @@
  ---
- title: PotholeNet V1
- emoji: 🐢
- colorFrom: yellow
- colorTo: purple
- sdk: gradio
- sdk_version: 6.12.0
- app_file: app.py
- pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🧾 Model Card — PotholeNet-YOLO11m-v1
+
+ ## 🧠 Model Overview
+
+ **PotholeNet-YOLO11m-v1** is a fine-tuned object detection model built on the **Ultralytics YOLO11m** architecture and trained to detect potholes, road damage, and garbage in street-level imagery. YOLO11m includes the C2PSA (Cross-Stage Partial with Spatial Attention) block, whose spatial attention is intended to help with irregularly shaped urban defects such as potholes.
+
+ Trained on a curated civic infrastructure dataset of **23,000+ street-level images** from Indian urban environments, the model is designed to power real-time civic issue detection systems that support automated reporting and faster municipal response.
+
+ It serves as the **Detection Layer (Layer 1)** of the **Aamchi City AI Civic System** — an end-to-end intelligent dashboard for urban infrastructure monitoring.
+
  ---
+
+ ## 🏗️ Training Details
+
+ | Parameter | Value |
+ |:---|:---|
+ | **Base Model** | `yolo11m.pt` (COCO pretrained) |
+ | **Architecture** | YOLO11m (C3k2 + C2PSA Spatial Attention) |
+ | **Framework** | Ultralytics v8.x |
+ | **Training Hardware** | Kaggle — NVIDIA T4 ×2 (Dual GPU) |
+ | **Epochs** | 50 |
+ | **Input Resolution** | 768×768 |
+ | **Batch Size** | Auto (`batch=-1`) |
+ | **Optimizer** | AdamW |
+ | **Learning Rate** | `lr0=0.001`, cosine decay to `lr0 × lrf` with `lrf=0.01` (final LR 1e-5) |
+ | **Warmup** | 3 epochs |
+ | **Weight Decay** | 0.0005 |
+ | **AMP** | Enabled (FP16 mixed precision) |
+ | **Early Stopping** | `patience=10` (did not trigger — model was still improving) |
+
+ ### Loss Weights
+ | Loss | Weight |
+ |:---|:---|
+ | Box Loss | 7.5 |
+ | Classification Loss | 1.0 |
+ | DFL Loss | 1.5 |
+
+ ### Augmentation Pipeline
+ | Augmentation | Value |
+ |:---|:---|
+ | Mosaic | 1.0 |
+ | MixUp | 0.15 |
+ | Copy-Paste | 0.1 |
+ | HSV (H/S/V) | 0.015 / 0.7 / 0.4 |
+ | Rotation | ±10° |
+ | Scale | 0.5 |
+ | Shear | 2.0 |
+ | Horizontal Flip | 0.5 |
+ | Erasing | 0.3 |
+ | Label Smoothing | 0.05 |
+ | Close Mosaic | Last 8 epochs |
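The three tables above map fairly directly onto keyword arguments of the Ultralytics `train()` call. The following is a minimal sketch under stated assumptions, not the author's actual training script: the `data_yaml` path, the `device=[0, 1]` setting for the dual T4 setup, and the `cos_lr=True` flag for cosine decay are inferred rather than stated in the model card, and argument availability varies between Ultralytics releases, so the key names should be checked against the installed version.

```python
def build_train_args() -> dict:
    """Collect the model card's hyperparameters as Ultralytics-style train kwargs."""
    return {
        # Training details
        "epochs": 50, "imgsz": 768, "batch": -1, "optimizer": "AdamW",
        "lr0": 0.001, "lrf": 0.01, "cos_lr": True,  # cos_lr is an assumption
        "warmup_epochs": 3, "weight_decay": 0.0005, "amp": True, "patience": 10,
        "device": [0, 1],  # assumption: two GPUs, as on Kaggle T4 x2
        # Loss weights
        "box": 7.5, "cls": 1.0, "dfl": 1.5,
        # Augmentation pipeline
        "mosaic": 1.0, "mixup": 0.15, "copy_paste": 0.1,
        "hsv_h": 0.015, "hsv_s": 0.7, "hsv_v": 0.4,
        "degrees": 10.0, "scale": 0.5, "shear": 2.0, "fliplr": 0.5,
        "erasing": 0.3, "label_smoothing": 0.05, "close_mosaic": 8,
    }


def final_learning_rate(args: dict) -> float:
    """In Ultralytics, lrf is a fraction of lr0, so the schedule ends at lr0 * lrf."""
    return args["lr0"] * args["lrf"]


def train(data_yaml: str):
    """Launch training; data_yaml is a hypothetical dataset config path."""
    from ultralytics import YOLO  # imported lazily so the helpers above run without it

    model = YOLO("yolo11m.pt")
    return model.train(data=data_yaml, **build_train_args())


args = build_train_args()
print(f"final learning rate: {final_learning_rate(args):.1e}")
```

Keeping the arguments in a plain dictionary makes it easy to log the exact configuration next to the resulting checkpoint.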
+
+ ---
+
+ ## 📊 Dataset Description
+
+ The model was trained on a curated subset of **23,179 street-level images** collected from Indian urban environments. The dataset underwent extensive preprocessing:
+
+ - **Perceptual Hash (pHash) Deduplication** — Removed near-duplicate images whose hashes differ by a Hamming distance of ≤ 4
+ - **Corrupt Image Removal** — Verified all images with PIL and removed those that failed to load
+ - **Intelligent Negative Sampling** — Trimmed empty-label (background) images to 2,000 hard negatives
+ - **Stratified Split** — 80% Train / 15% Val / 5% Test, stratified by dominant class
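The deduplication step in the list above can be sketched as follows. This is an illustrative greedy filter, not the author's pipeline: it assumes each image already has an integer perceptual hash (for example, one derived from a library such as `imagehash`), and the function names and the sample hash values are hypothetical.

```python
from typing import Dict, List


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def deduplicate(hashes: Dict[str, int], max_distance: int = 4) -> List[str]:
    """Keep an image only if it is more than max_distance bits from every kept image."""
    kept: List[str] = []
    for name in sorted(hashes):  # sorted for a deterministic result
        if all(hamming_distance(hashes[name], hashes[k]) > max_distance for k in kept):
            kept.append(name)
    return kept


# Hypothetical 16-bit hashes for illustration.
example = {
    "a.jpg": 0b1111000011110000,
    "b.jpg": 0b1111000011110011,  # 2 bits away from a.jpg, treated as a near-duplicate
    "c.jpg": 0b0000111100001111,  # 16 bits away from a.jpg, kept
}
print(deduplicate(example))
```

The pairwise comparison is quadratic in the number of kept images, which is simple but may need an index structure for much larger collections.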
+
+ ### Label Classes
+
+ | Class ID | Class Name | Description |
+ |:---|:---|:---|
+ | 🔴 0 | **Pothole** | Road surface cavities and depressions |
+ | 🟡 1 | **Road Damage** | Cracks, surface wear, and structural deterioration |
+ | 🟢 2 | **Garbage** | Street-level waste and debris accumulation |
+
+ > **Priority:** Pothole (primary) > Garbage > Road Damage
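One plausible reading of "stratified by dominant class" is that each image is assigned the highest-priority class it contains, and the 80/15/5 split is then applied within each group. The sketch below follows that reading; the use of the priority order to pick the dominant class, the `-1` bucket for background images, and all function names are assumptions for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, List

PRIORITY = [0, 2, 1]  # pothole > garbage > road damage, per the priority note


def dominant_class(class_ids: List[int]) -> int:
    """Highest-priority class present in an image; -1 for background images."""
    for cls in PRIORITY:
        if cls in class_ids:
            return cls
    return -1


def stratified_split(labels: Dict[str, List[int]], seed: int = 0) -> Dict[str, List[str]]:
    """Split image names 80/15/5 within each dominant-class group."""
    groups = defaultdict(list)
    for name, class_ids in labels.items():
        groups[dominant_class(class_ids)].append(name)

    splits: Dict[str, List[str]] = {"train": [], "val": [], "test": []}
    rng = random.Random(seed)
    for _, names in sorted(groups.items()):
        names = sorted(names)
        rng.shuffle(names)
        n_train = len(names) * 80 // 100
        n_val = len(names) * 15 // 100
        splits["train"] += names[:n_train]
        splits["val"] += names[n_train:n_train + n_val]
        splits["test"] += names[n_train + n_val:]  # remainder goes to test
    return splits


# Hypothetical labels: 20 images with potholes and road damage, 20 with garbage only.
labels = {f"p{i}.jpg": [0, 1] for i in range(20)}
labels.update({f"g{i}.jpg": [2] for i in range(20)})
print({split: len(names) for split, names in stratified_split(labels).items()})
```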
+
+ ---
+
+ ## 🎯 Evaluation Metrics
+
+ | Metric | Score |
+ |:---|:---|
+ | **mAP50** | **0.60** |
+ | **mAP50-95** | — |
+ | **Parameters** | ~20M |
+ | **Model Size** | ~39 MB |
+ | **Inference Speed** | Real-time on GPU |
+
+ > ⚡ Early stopping had not triggered by epoch 50, which suggests that further training could yield additional performance gains.
+
+ ---
+
+ ## 💬 Example Usage
+
+ ### Python (Ultralytics)
+
+ ```python
+ from ultralytics import YOLO
+
+ # Load the fine-tuned weights (adjust the path to wherever the checkpoint is stored)
+ model = YOLO("best.pt")
+
+ # Run inference at the training resolution
+ results = model("street_image.jpg", imgsz=768, conf=0.25)
+
+ # Display results
+ results[0].show()
+
+ # Access detections
+ class_names = {0: "pothole", 1: "road_damage", 2: "garbage"}
+ for box in results[0].boxes:
+     cls = int(box.cls)
+     conf = float(box.conf)
+     xyxy = box.xyxy[0].tolist()
+     print(f"{class_names[cls]}: {conf:.2f} at {xyxy}")
+ ```
+
+ ### With Test-Time Augmentation (TTA)
+
+ ```python
+ # TTA can improve mAP somewhat at the cost of inference speed (reuses `model` from above)
+ results = model("street_image.jpg", imgsz=768, conf=0.25, augment=True)
+ ```
+
+ ### Filter Pothole-Only Detections
+
+ ```python
+ results = model("street_image.jpg", conf=0.25)
+ boxes = results[0].boxes
+ # Keep only class 0 (pothole)
+ pothole_mask = boxes.cls == 0
+ pothole_boxes = boxes[pothole_mask]
+ print(f"Found {len(pothole_boxes)} potholes")
+ ```
+
+ ---
+
+ ## 🧩 Intended Use
+
+ - **Real-time pothole detection** from dashcam, mobile phone, or street-view imagery
+ - **Automated civic issue reporting** — GPS-tagged detection for municipal dashboards
+ - **Infrastructure health monitoring** — Severity scoring and trend analysis for road maintenance
+ - **Smart city integration** — Layer 1 detection input for AI-driven civic action systems
+ - **Mobile deployment** — Exportable to ONNX for edge inference on mobile devices
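For the ONNX export mentioned in the last item, Ultralytics exposes an `export()` method on a loaded model. The sketch below is one possible way to call it, assuming the checkpoint path is supplied by the caller. `build_export_args` and `export_to_onnx` are hypothetical wrapper names, and fixing `imgsz` at 768 simply mirrors the training resolution rather than a documented requirement.

```python
def build_export_args(imgsz: int = 768) -> dict:
    """Keyword arguments for an ONNX export at the training resolution."""
    return {"format": "onnx", "imgsz": imgsz}


def export_to_onnx(weights_path: str):
    """Export a fine-tuned checkpoint to ONNX (weights_path is supplied by the caller)."""
    from ultralytics import YOLO  # imported lazily; requires ultralytics to be installed

    model = YOLO(weights_path)
    return model.export(**build_export_args())


print(build_export_args())
```

Whether the exported model is fast enough on a given phone depends on the runtime and hardware, so it is worth benchmarking before relying on it.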
+
+ ---
+
+ ## ⚠️ Limitations
+
+ - The model is optimized for **Indian urban road conditions**; performance may degrade on highways, rural roads, or non-Indian geographies.
+ - The **Road damage** class overlaps visually with potholes, which may cause occasional misclassification between the two.
+ - Performance is best on **daytime, clear-weather imagery**; low-light and rain-occluded scenes may reduce accuracy.
+ - Training ran for all **50 epochs without triggering early stopping**, suggesting the checkpoint is not fully converged and further fine-tuning could improve results.
+ - **Small potholes** (< 32px at 768px resolution) may be missed in wide-angle shots.
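The small-object limitation can be checked per detection by rescaling a box to the 768px inference size. The sketch below is a rough illustration: it assumes boxes are given in original-image pixel coordinates, that the longer image side is resized to 768, and that "< 32px" refers to the shorter side of the box. The function names are hypothetical.

```python
from typing import Sequence


def side_at_inference(xyxy: Sequence[float], image_w: int, image_h: int, imgsz: int = 768) -> float:
    """Shorter box side in pixels after the longer image side is resized to imgsz."""
    scale = imgsz / max(image_w, image_h)
    x1, y1, x2, y2 = xyxy
    return min(x2 - x1, y2 - y1) * scale


def is_small_detection(xyxy: Sequence[float], image_w: int, image_h: int, min_side: float = 32.0) -> bool:
    """True if the box falls under the size where misses are more likely."""
    return side_at_inference(xyxy, image_w, image_h) < min_side


# A 60x60 px box in a 1920x1080 frame shrinks to about 24 px at 768 px resolution.
print(is_small_detection([100, 100, 160, 160], 1920, 1080))
```

A flag like this could be used to route low-confidence small detections to a second pass on a cropped region.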
+
+ ---
+
+ ## 🧑‍💻 Developer
+
+ | | |
+ |:---|:---|
+ | **Author** | Vansh Momaya |
+ | **Institution** | D. J. Sanghvi College of Engineering |
+ | **Focus Area** | Computer Vision, Object Detection, AI for Civic Infrastructure |
+ | **Email** | vanshmomaya9@gmail.com |
+
+ ---
+
+ ## 🌍 Citation
+
+ If you use PotholeNet-YOLO11m-v1 in your research or project:
+
+ ```bibtex
+ @online{momaya2026potholenet,
+   author = {Vansh Momaya},
+   title = {PotholeNet-YOLO11m-v1: Real-Time Pothole and Civic Issue Detection for Indian Urban Roads},
+   year = {2026},
+   version = {v1},
+   url = {https://huggingface.co/Vansh180/PotholeNet-YOLO11m-v1},
+   institution = {D. J. Sanghvi College of Engineering},
+   note = {Fine-tuned YOLO11m model for detecting potholes, road damage, and garbage in Indian street imagery},
+   license = {MIT}
+ }
+ ```
+
+ ---
+
+ ## 🚀 Acknowledgements
+
+ - **[Ultralytics YOLO11](https://github.com/ultralytics/ultralytics)** — Base architecture and training framework
+ - **[Kaggle](https://www.kaggle.com)** — Training infrastructure (Dual T4 GPU)
+ - **Aamchi City — Datahack 4** — Hackathon context and dataset
+
  ---
 
+ *Built for the Aamchi City AI Civic System — Datahack 4, PS2 Core ML*
Vision Classification.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f380cd373f61f2bc71f7fcc1b0ec072194dc2cd933fd05bc1ae5ad136a333b78
+ size 40540780
app.py ADDED
@@ -0,0 +1,79 @@
+ import gradio as gr
+ from ultralytics import YOLO
+ import cv2
+ from PIL import Image
+
+ # Load the YOLO model - YOLO11m for pothole, road damage, and garbage detection
+ try:
+     model = YOLO("Vision Classification.pt")
+ except Exception as e:
+     print(f"Error loading model: {e}")
+     model = None
+
+ def predict(image, conf_threshold):
+     if image is None or model is None:
+         return None, "Model not loaded or invalid image."
+
+     # Run inference (imgsz=768 based on model card)
+     results = model(image, imgsz=768, conf=conf_threshold)
+
+     # YOLO returns a list of Results objects
+     result = results[0]
+
+     # Plotting the detections on the image
+     # plot() returns a BGR numpy array
+     annotated_image = result.plot()
+
+     # Convert BGR to RGB for Gradio Display
+     annotated_image_rgb = cv2.cvtColor(annotated_image, cv2.COLOR_BGR2RGB)
+
+     # Detection overview text
+     boxes = result.boxes
+     class_names = result.names
+
+     if len(boxes) == 0:
+         detection_summary = "No civic issues detected in this image."
+     else:
+         # Count detections
+         detection_counts = {}
+         for box in boxes:
+             cls_id = int(box.cls)
+             cls_name = class_names[cls_id]
+             detection_counts[cls_name] = detection_counts.get(cls_name, 0) + 1
+
+         summary_lines = ["**Detections:**"]
+         for cls_name, count in detection_counts.items():
+             summary_lines.append(f"- {count} {cls_name}(s)")
+
+         detection_summary = "\n".join(summary_lines)
+
+     return Image.fromarray(annotated_image_rgb), detection_summary
+
+ # Gradio Interface
+ with gr.Blocks(title="PotholeNet-YOLO11m-v1 🛑") as interface:
+     gr.Markdown("# 🛑 PotholeNet-YOLO11m-v1")
+     gr.Markdown("**Aamchi City AI Civic System** — Real-time pothole, road damage, and garbage detection for Indian urban roads.")
+     gr.Markdown("Upload an image of a road to detect infrastructure issues. The model was trained on 23,000+ street-level images.")
+
+     with gr.Row():
+         with gr.Column():
+             input_image = gr.Image(type="pil", label="Upload Street Image")
+             conf_slider = gr.Slider(minimum=0.01, maximum=1.0, value=0.25, step=0.01, label="Confidence Threshold")
+             submit_btn = gr.Button("Detect Civic Issues", variant="primary")
+
+         with gr.Column():
+             output_image = gr.Image(type="pil", label="Detection Results")
+             detection_text = gr.Markdown(label="Detection Summary")
+
+     submit_btn.click(
+         fn=predict,
+         inputs=[input_image, conf_slider],
+         outputs=[output_image, detection_text]
+     )
+
+     gr.Markdown("### Intended Use")
+     gr.Markdown("Real-time pothole detection, Automated civic issue reporting, Infrastructure health monitoring.")
+     gr.Markdown("**Developer:** Vansh Momaya")
+
+ if __name__ == "__main__":
+     interface.launch(server_name="0.0.0.0", server_port=7860)
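The detection summary built inside `predict` can be isolated into a small pure function, which makes it testable without loading the model. The sketch below is an alternative formulation using `collections.Counter`, not code from this commit; `summarize_detections` is a hypothetical name, and it takes plain class IDs instead of Ultralytics box objects.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize_detections(class_ids: Iterable[int], class_names: Dict[int, str]) -> str:
    """Build the Markdown summary shown next to the annotated image."""
    counts = Counter(class_names[cls_id] for cls_id in class_ids)
    if not counts:
        return "No civic issues detected in this image."
    lines = ["**Detections:**"]
    # Counter keeps first-seen order, matching the dict-based loop in app.py.
    lines += [f"- {count} {name}(s)" for name, count in counts.items()]
    return "\n".join(lines)


names = {0: "pothole", 1: "road_damage", 2: "garbage"}
print(summarize_detections([0, 0, 2], names))
```

Inside `predict`, the class IDs could be gathered with `int(box.cls)` for each box, as the existing loop already does.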
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ ultralytics
+ gradio
+ pillow
+ opencv-python-headless