| title: PotholeNet YOLO11m | |
| emoji: π | |
| colorFrom: red | |
| colorTo: gray | |
| sdk: gradio | |
| sdk_version: 4.44.1 | |
| app_file: app.py | |
| pinned: false | |
| license: mit | |
| python_version: 3.10 | |
| # Model Card: PotholeNet-YOLO11m-v1 | |
| ## Model Overview | |
| **PotholeNet-YOLO11m-v1** is a fine-tuned object detection model built on the **Ultralytics YOLO11m** architecture and trained to detect potholes, road damage, and garbage in street-level imagery. YOLO11m includes a C2PSA (Cross-Stage Partial with Spatial Attention) block, whose spatial attention is intended to help with irregularly shaped urban defects such as potholes. | |
| Trained on a large-scale, curated civic infrastructure dataset of **23,000+ street-level images** from Indian urban environments, this model is designed to power real-time civic issue detection systems, enabling automated reporting and faster municipal response. | |
| It serves as the **Detection Layer (Layer 1)** of the **Aamchi City AI Civic System**, an end-to-end intelligent dashboard for urban infrastructure monitoring. | |
| --- | |
| ## Training Details | |
| | Parameter | Value | | |
| |:---|:---| | |
| | **Base Model** | `yolo11m.pt` (COCO pretrained) | | |
| | **Architecture** | YOLO11m (C3k2 + C2PSA Spatial Attention) | | |
| | **Framework** | Ultralytics v8.x | | |
| | **Training Hardware** | Kaggle, NVIDIA T4 ×2 (dual GPU) | |
| | **Epochs** | 50 | | |
| | **Input Resolution** | 768×768 | |
| | **Batch Size** | Auto (`batch=-1`) | | |
| | **Optimizer** | AdamW | | |
| | **Learning Rate** | `lr0=0.001`, cosine decay to a final rate of `lr0 × lrf` (`lrf=0.01`, i.e. 1e-5) | |
| | **Warmup** | 3 epochs | | |
| | **Weight Decay** | 0.0005 | | |
| | **AMP** | Enabled (FP16 mixed precision) | | |
| | **Early Stopping** | `patience=10` (did not trigger; the model was still improving) | |
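To make the learning-rate row concrete, here is a minimal sketch of a cosine decay from `lr0` to `lr0 × lrf` over the 50 epochs. It ignores the 3 warmup epochs, and the function name `cosine_lr` is hypothetical; it illustrates the shape of the schedule rather than reproducing the exact scheduler used in training.

```python
import math

LR0 = 0.001   # initial learning rate (lr0 from the table)
LRF = 0.01    # final learning rate as a fraction of lr0 (lrf from the table)
EPOCHS = 50   # total training epochs


def cosine_lr(epoch: int, lr0: float = LR0, lrf: float = LRF, epochs: int = EPOCHS) -> float:
    """Learning rate at `epoch` for a cosine decay from lr0 down to lr0 * lrf."""
    # progress goes smoothly from 0 (first epoch) to 1 (last epoch)
    progress = (1 - math.cos(math.pi * epoch / epochs)) / 2
    return lr0 * (1 + progress * (lrf - 1))


# Print the rate at the start, midpoint, and end of training
for epoch in (0, 25, 50):
    print(f"epoch {epoch:2d}: lr = {cosine_lr(epoch):.6f}")
```

The final rate is `lr0 × lrf`, not `lrf` itself, which is why the schedule ends two orders of magnitude below where it starts.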
| ### Loss Weights | |
| | Loss | Weight | | |
| |:---|:---| | |
| | Box Loss | 7.5 | | |
| | Classification Loss | 1.0 | | |
| | DFL Loss | 1.5 | | |
| ### Augmentation Pipeline | |
| | Augmentation | Value | | |
| |:---|:---| | |
| | Mosaic | 1.0 | | |
| | MixUp | 0.15 | | |
| | Copy-Paste | 0.1 | | |
| | HSV (H/S/V) | 0.015 / 0.7 / 0.4 | | |
| | Rotation | ±10° | |
| | Scale | 0.5 | | |
| | Shear | 2.0 | | |
| | Horizontal Flip | 0.5 | | |
| | Erasing | 0.3 | | |
| | Label Smoothing | 0.05 | | |
| | Close Mosaic | Last 8 epochs | | |
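The training, loss, and augmentation tables above can be gathered into a single configuration. The sketch below maps each row to what should be the matching Ultralytics training argument; the dataset file `data.yaml` and the helper name `train` are hypothetical, and this is a reconstruction from the tables, not the author's actual training script. Label smoothing (0.05) is left out because whether a `label_smoothing` argument is accepted depends on the installed Ultralytics version.

```python
# Hyperparameters copied from the tables above, keyed by Ultralytics argument names.
TRAIN_ARGS = {
    # Training details
    "epochs": 50, "imgsz": 768, "batch": -1,
    "optimizer": "AdamW", "lr0": 0.001, "lrf": 0.01, "cos_lr": True,
    "warmup_epochs": 3, "weight_decay": 0.0005, "amp": True, "patience": 10,
    # Loss weights
    "box": 7.5, "cls": 1.0, "dfl": 1.5,
    # Augmentation pipeline
    "mosaic": 1.0, "mixup": 0.15, "copy_paste": 0.1,
    "hsv_h": 0.015, "hsv_s": 0.7, "hsv_v": 0.4,
    "degrees": 10.0, "scale": 0.5, "shear": 2.0, "fliplr": 0.5,
    "erasing": 0.3, "close_mosaic": 8,
}


def train(data_yaml: str = "data.yaml"):
    """Fine-tune the COCO-pretrained YOLO11m weights (data_yaml is a hypothetical path)."""
    from ultralytics import YOLO  # imported lazily so the config can be inspected without it

    model = YOLO("yolo11m.pt")
    return model.train(data=data_yaml, **TRAIN_ARGS)
```

Calling `train("path/to/your/data.yaml")` would start a run with these settings, provided the dataset file lists the three classes in the same order as the label table.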
| --- | |
| ## Dataset Description | |
| The model was trained on a curated subset of **23,179 street-level images** collected from Indian urban environments. The dataset underwent extensive preprocessing: | |
| - **Perceptual Hash (pHash) Deduplication**: Removed near-duplicate images using a Hamming distance ≤ 4 | |
| - **Corrupt Image Removal**: Verified all images via PIL | |
| - **Intelligent Negative Sampling**: Trimmed empty-label (background) images to 2,000 hard negatives | |
| - **Stratified Split**: 80% Train / 15% Val / 5% Test, stratified by dominant class | |
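The deduplication step can be sketched as follows, assuming each image has already been reduced to a 64-bit perceptual hash (for instance with the ImageHash library's `imagehash.phash`, which is an assumption about tooling, not something the card states). The hash values and file names below are made up for illustration.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(hash_a ^ hash_b).count("1")


def deduplicate(hashes: dict[str, int], max_distance: int = 4) -> list[str]:
    """Keep an image only if it is more than max_distance bits from every kept image."""
    kept: list[str] = []
    for name, value in hashes.items():
        if all(hamming_distance(value, hashes[k]) > max_distance for k in kept):
            kept.append(name)
    return kept


# Hypothetical 64-bit hashes for three images
example_hashes = {
    "a.jpg": 0xF0F0F0F0F0F0F0F0,
    "a_copy.jpg": 0xF0F0F0F0F0F0F0F3,  # differs from a.jpg in 2 bits: near-duplicate
    "b.jpg": 0x0F0F0F0F0F0F0F0F,       # differs from a.jpg in all 64 bits
}
print(deduplicate(example_hashes))
```

This greedy pass compares every image against all kept images, so it is quadratic in the worst case; for tens of thousands of images that is still workable, though an index over hash prefixes would be faster.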
| ### Label Classes | |
| | Class ID | Class Name | Description | | |
| |:---|:---|:---| | |
| | 🔴 0 | **Pothole** | Road surface cavities and depressions | |
| | 🟡 1 | **Road Damage** | Cracks, surface wear, and structural deterioration | |
| | 🟢 2 | **Garbage** | Street-level waste and debris accumulation | |
| > **Priority:** Pothole (primary) > Garbage > Road Damage | |
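One way to read "stratified by dominant class" is to assign each image the highest-priority class it contains and then split each group 80/15/5. The sketch below works under that assumption; the card does not say how the dominant class was chosen, and the `-1` background label, the function names, and the seed are all hypothetical.

```python
import random
from collections import defaultdict

PRIORITY = [0, 2, 1]  # pothole > garbage > road damage, per the priority note


def dominant_class(class_ids: list[int]) -> int:
    """Highest-priority class present in an image; -1 for background (assumed convention)."""
    for cls in PRIORITY:
        if cls in class_ids:
            return cls
    return -1


def stratified_split(labels: dict[str, list[int]], seed: int = 0) -> dict[str, list[str]]:
    """Split image names 80/15/5 within each dominant-class group."""
    groups: dict[int, list[str]] = defaultdict(list)
    for name, class_ids in labels.items():
        groups[dominant_class(class_ids)].append(name)

    splits: dict[str, list[str]] = {"train": [], "val": [], "test": []}
    rng = random.Random(seed)
    for names in groups.values():
        names.sort()          # fixed order before shuffling keeps the split reproducible
        rng.shuffle(names)
        n_train = round(0.80 * len(names))
        n_val = round(0.15 * len(names))
        splits["train"] += names[:n_train]
        splits["val"] += names[n_train:n_train + n_val]
        splits["test"] += names[n_train + n_val:]  # remainder, roughly 5%
    return splits
```

Splitting per group keeps the class mix similar across train, validation, and test, which matters most for the rarer classes.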
| --- | |
| ## Evaluation Metrics | |
| | Metric | Score | | |
| |:---|:---| | |
| | **mAP50** | **0.60** | | |
| | **mAP50-95** | Not reported | |
| | **Parameters** | ~20M | | |
| | **Model Size** | ~39 MB | | |
| | **Inference Speed** | Real-time on GPU | | |
| > Early stopping did not trigger within the 50 epochs, which suggests that further training could yield additional gains. | |
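For readers unfamiliar with the metric names: mAP50 counts a prediction as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5, while mAP50-95 averages over IoU thresholds from 0.5 to 0.95. The sketch below shows only the IoU test for boxes in `xyxy` format; it is an illustration of the criterion, not the evaluation code behind the reported score.

```python
def iou(box_a, box_b) -> float:
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when boxes are disjoint
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold: float = 0.5) -> bool:
    """True if the prediction overlaps the ground truth enough to count at this threshold."""
    return iou(pred_box, gt_box) >= threshold


# A prediction shifted by half its width overlaps one third of the union
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```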
| --- | |
| ## Example Usage | |
| ### Python (Ultralytics) | |
| ```python | |
| from ultralytics import YOLO | |
| # Load the fine-tuned weights | |
| model = YOLO("best.pt") | |
| # Map class IDs to names (see Label Classes) | |
| class_names = {0: "pothole", 1: "road_damage", 2: "garbage"} | |
| # Run inference at the training resolution | |
| results = model("street_image.jpg", imgsz=768, conf=0.25) | |
| # Display the annotated image | |
| results[0].show() | |
| # Print each detection: class, confidence, and xyxy box | |
| for box in results[0].boxes: | |
|     cls = int(box.cls) | |
|     conf = float(box.conf) | |
|     xyxy = box.xyxy[0].tolist() | |
|     print(f"{class_names[cls]}: {conf:.2f} at {xyxy}") | |
| ``` | |
| ### With Test-Time Augmentation (TTA) | |
| ```python | |
| # Reuses `model` from the example above. | |
| # TTA may raise mAP by roughly 1-3% at the cost of slower inference. | |
| results = model("street_image.jpg", imgsz=768, conf=0.25, augment=True) | |
| ``` | |
| ### Filter Pothole-Only Detections | |
| ```python | |
| # Reuses `model` from the example above | |
| results = model("street_image.jpg", imgsz=768, conf=0.25) | |
| boxes = results[0].boxes | |
| pothole_mask = boxes.cls == 0  # class 0 = pothole | |
| pothole_boxes = boxes[pothole_mask] | |
| print(f"Found {len(pothole_boxes)} potholes") | |
| ``` | |
| --- | |
| ## Intended Use | |
| - **Real-time pothole detection** from dashcam, mobile phone, or street-view imagery | |
| - **Automated civic issue reporting**: GPS-tagged detection for municipal dashboards | |
| - **Infrastructure health monitoring**: Severity scoring and trend analysis for road maintenance | |
| - **Smart city integration**: Layer 1 detection input for AI-driven civic action systems | |
| - **Mobile deployment**: Exportable to ONNX for edge inference on mobile devices | |
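For the ONNX deployment path mentioned in the list above, a minimal export sketch using the Ultralytics export API might look like this. The weights path `best.pt` follows the usage examples, the helper name `export_onnx` is hypothetical, and the export has not been verified against this particular checkpoint.

```python
# Export settings: ONNX format at the training resolution
EXPORT_ARGS = {"format": "onnx", "imgsz": 768}


def export_onnx(weights: str = "best.pt"):
    """Export the fine-tuned weights to ONNX and return whatever the exporter reports."""
    from ultralytics import YOLO  # imported lazily; requires the ultralytics package

    model = YOLO(weights)
    return model.export(**EXPORT_ARGS)
```

Running `export_onnx()` next to `best.pt` should write an `.onnx` file alongside the weights; exporting at a smaller `imgsz` would reduce latency on mobile hardware, at some cost in small-object recall.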
| --- | |
| ## Limitations | |
| - The model is optimized for **Indian urban road conditions**; performance may degrade on highways, rural roads, or non-Indian geographies. | |
| - **Road damage** class has visual overlap with potholes, which may cause occasional misclassification between the two. | |
| - Performance is best on **daytime, clear-weather imagery** β low-light and rain-occluded scenes may reduce accuracy. | |
| - The model was trained for **50 epochs without early stopping triggering**, which suggests the checkpoint had not fully converged and that further fine-tuning could improve results. | |
| - **Small potholes** (< 32px at 768px resolution) may be missed in wide-angle shots. | |
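Given the small-object limitation above, a downstream system may want to flag detections that fall under the 32 px mark so they can be reviewed or re-run on a tighter crop. The sketch below assumes "32px" refers to the shorter side of the box at 768 px inference resolution, which the card does not specify; the helper name and sample boxes are hypothetical.

```python
MIN_SIDE_PX = 32  # threshold from the limitations list, interpreted as the shorter box side


def is_small(xyxy, min_side: float = MIN_SIDE_PX) -> bool:
    """True if the shorter side of an (x1, y1, x2, y2) box is below min_side pixels."""
    x1, y1, x2, y2 = xyxy
    return min(x2 - x1, y2 - y1) < min_side


# Hypothetical detections in pixel coordinates at 768x768
detections = [
    (100.0, 200.0, 180.0, 260.0),  # 80 x 60 px
    (400.0, 410.0, 420.0, 470.0),  # 20 x 60 px, narrow
]
needs_review = [box for box in detections if is_small(box)]
print(f"{len(needs_review)} of {len(detections)} detections are below {MIN_SIDE_PX} px")
```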
| --- | |
| ## Developer | |
| | | | | |
| |:---|:---| | |
| | **Author** | Vansh Momaya | | |
| | **Institution** | D. J. Sanghvi College of Engineering | | |
| | **Focus Area** | Computer Vision, Object Detection, AI for Civic Infrastructure | | |
| | **Email** | vanshmomaya9@gmail.com | | |
| --- | |
| ## Citation | |
| If you use PotholeNet-YOLO11m-v1 in your research or project: | |
| ```bibtex | |
| @online{momaya2026potholenet, | |
| author = {Vansh Momaya}, | |
| title = {PotholeNet-YOLO11m-v1: Real-Time Pothole and Civic Issue Detection for Indian Urban Roads}, | |
| year = {2026}, | |
| version = {v1}, | |
| url = {https://huggingface.co/Vansh180/PotholeNet-YOLO11m-v1}, | |
| institution = {D. J. Sanghvi College of Engineering}, | |
| note = {Fine-tuned YOLO11m model for detecting potholes, road damage, and garbage in Indian street imagery}, | |
| license = {MIT} | |
| } | |
| ``` | |
| --- | |
| ## Acknowledgements | |
| - **[Ultralytics YOLO11](https://github.com/ultralytics/ultralytics)**: Base architecture and training framework | |
| - **[Kaggle](https://www.kaggle.com)**: Training infrastructure (Dual T4 GPU) | |
| - **Aamchi City (Datahack 4)**: Hackathon context and dataset | |
| --- | |
| *Built for the Aamchi City AI Civic System (Datahack 4, PS2 Core ML)* | |