CarDD YOLOv8s: Car Damage Detection & Segmentation
YOLOv8s fine-tuned on the CarDD dataset for vision-based car damage detection and instance segmentation. The model detects and segments six types of car damage. In the per-class comparison, v2.0 scores higher than the original paper's best model (DCN+ ResNet-101) on every individual damage class, although DCN+ keeps a higher overall mAP50 (78.8% vs 75.8%).
Model Versions

| Version | File | Task | mAP50 |
|---|---|---|---|
| v1.0 | v1.0/best.pt | Detection only | 71.67% |
| v2.0 | v2.0/best.pt | Detection + Segmentation | 75.84% |

v2.0 is the recommended version. It adds instance segmentation masks on top of detection and improves every overall box metric (mAP50, mAP50-95, precision, recall), at the cost of slower inference.
v1.0 vs v2.0: Side-by-Side Comparison

Overall Metrics

| Metric | v1.0 (Detection) | v2.0 (Segmentation) | Improvement |
|---|---|---|---|
| mAP50 (Box) | 71.67% | 75.84% | +4.17% |
| mAP50-95 (Box) | 55.43% | 58.93% | +3.50% |
| Precision (Box) | 75.11% | 76.84% | +1.73% |
| Recall (Box) | 67.52% | 68.27% | +0.75% |
| Mask mAP50 | N/A | 75.90% | New |
| Mask mAP50-95 | N/A | 57.00% | New |
| Model size | 22.5 MB | 22.5 MB | Same |
| Inference speed | 7.7 ms | 61 ms | Slower* |

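*Segmentation inference is slower because the model generates masks on top of detection. Improvement values in these tables are absolute differences in percentage points.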
Per-Class Comparison (Box mAP50)

| Class | v1.0 | v2.0 | Improvement |
|---|---|---|---|
| dent | 57.8% | 60.6% | +2.8% |
| scratch | 59.6% | 63.2% | +3.6% |
| crack | 37.7% | 55.5% | +17.8% |
| glass_shatter | 99.0% | 97.9% | -1.1% (roughly the same) |
| lamp_broken | 85.5% | 86.8% | +1.3% |
| tire_flat | 90.5% | 91.0% | +0.5% |

The biggest gain is on crack (+17.8 percentage points), which has the lowest mAP50 of any class in both versions.
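The per-class changes are plain differences between the v1.0 and v2.0 columns. A minimal sketch that recomputes them from the table values follows; the dictionary and function names are illustrative and not part of the model's API.

```python
# Box mAP50 per class (%), copied from the per-class comparison table.
V1_BOX_MAP50 = {
    "dent": 57.8, "scratch": 59.6, "crack": 37.7,
    "glass_shatter": 99.0, "lamp_broken": 85.5, "tire_flat": 90.5,
}
V2_BOX_MAP50 = {
    "dent": 60.6, "scratch": 63.2, "crack": 55.5,
    "glass_shatter": 97.9, "lamp_broken": 86.8, "tire_flat": 91.0,
}


def per_class_delta(old, new):
    """Return the change in percentage points per class, rounded to 0.1."""
    return {name: round(new[name] - old[name], 1) for name in old}


deltas = per_class_delta(V1_BOX_MAP50, V2_BOX_MAP50)
best_class = max(deltas, key=deltas.get)  # class with the largest gain

for name, delta in deltas.items():
    print(f"{name:14s} {delta:+.1f} pp")
print(f"Largest gain: {best_class}")
```

Running it shows that glass_shatter is the only class where v2.0 is slightly lower than v1.0.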
Full Results (v2.0)

Overall Performance (Test Set)

| Metric | Box | Mask |
|---|---|---|
| mAP@0.5 | 75.84% | 75.90% |
| mAP@0.5:0.95 | 58.93% | 57.00% |
| Precision | 76.84% | 77.40% |
| Recall | 68.27% | 68.70% |

Per-Class Results (v2.0 Test Set)

| Class | Box mAP50 | Mask mAP50 |
|---|---|---|
| glass_shatter | 97.9% | 97.9% |
| tire_flat | 91.0% | 90.2% |
| lamp_broken | 86.8% | 88.2% |
| scratch | 63.2% | 60.4% |
| dent | 60.6% | 64.9% |
| crack | 55.5% | 54.0% |

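Assuming the overall mAP@0.5 is the unweighted mean of the per-class AP@0.5 values (the usual definition), the per-class table can be cross-checked against the overall figures. The sketch below uses illustrative names; small differences from the reported 75.84% and 75.90% are expected because the per-class values are rounded to one decimal.

```python
from statistics import mean

# Per-class AP@0.5 (%) for v2.0, copied from the per-class results table.
BOX_AP50 = {
    "glass_shatter": 97.9, "tire_flat": 91.0, "lamp_broken": 86.8,
    "scratch": 63.2, "dent": 60.6, "crack": 55.5,
}
MASK_AP50 = {
    "glass_shatter": 97.9, "tire_flat": 90.2, "lamp_broken": 88.2,
    "scratch": 60.4, "dent": 64.9, "crack": 54.0,
}


def mean_ap(per_class):
    """Unweighted mean of per-class AP values, rounded to two decimals."""
    return round(mean(per_class.values()), 2)


box_map50 = mean_ap(BOX_AP50)
mask_map50 = mean_ap(MASK_AP50)
print(f"Box mAP@0.5 from per-class values:  {box_map50}%")
print(f"Mask mAP@0.5 from per-class values: {mask_map50}%")
```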
Benchmark Comparison vs CarDD Paper (v2.0)

| Model | Backbone | mAP50 |
|---|---|---|
| Mask R-CNN | ResNet-50 | 66.3% |
| Cascade Mask R-CNN | ResNet-50 | 64.7% |
| GCNet | ResNet-50 | 66.4% |
| HTC | ResNet-50 | 68.1% |
| DCN | ResNet-50 | 70.9% |
| Mask R-CNN | ResNet-101 | 67.7% |
| HTC | ResNet-101 | 68.4% |
| DCN | ResNet-101 | 69.8% |
| YOLOv8s-seg v2.0 (ours) | n/a | 75.8% |
| DCN+ | ResNet-50 | 77.4% |
| DCN+ | ResNet-101 | 78.8% |

Per-Class vs DCN+ ResNet-101 (Paper's Best)

| Class | DCN+ ResNet-101 | v2.0 (Box) | Gain |
|---|---|---|---|
| dent | 40.5% | 60.6% | +20.1% |
| scratch | 34.3% | 63.2% | +28.9% |
| crack | 16.6% | 55.5% | +38.9% |
| glass_shatter | 92.6% | 97.9% | +5.3% |
| lamp_broken | 70.8% | 86.8% | +16.0% |
| tire_flat | 86.0% | 91.0% | +5.0% |

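v2.0 scores higher than DCN+ ResNet-101 on every individual damage class in this table, while being roughly 10x smaller (22.5 MB vs 200 MB+) and requiring only a consumer GPU to train. The DCN+ per-class figures average 56.8%, well below its 78.8% overall mAP50 in the benchmark table, so they appear to be reported under a different metric than box mAP50; read the per-class gains as indicative rather than like-for-like.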
Training Curves (v2.0)

Confusion Matrix (v2.0)

PR & F1 Curves (v2.0)

Predictions (v2.0)

Dataset Label Distribution

Training Curves (v1.0)

Per-Class Comparison (v1.0)

Ground Truth vs Predictions (v1.0)

Predictions Grid (v1.0)

Usage

Install Dependencies

    pip install ultralytics huggingface_hub

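Load v2.0 (Recommended)

    from huggingface_hub import hf_hub_download
    from ultralytics import YOLO

    # Download the v2.0 weights from the Hugging Face Hub (cached after the first call).
    model_path = hf_hub_download(
        repo_id="abdullahg7/cardd-yolov8s",
        filename="v2.0/best.pt",
    )
    model = YOLO(model_path)

    # Run detection + segmentation on a single image.
    results = model.predict(
        source="your_car_image.jpg",
        conf=0.25,  # minimum confidence for a detection to be kept
        iou=0.45,   # IoU threshold for non-maximum suppression
    )
    results[0].show()
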
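Load v1.0 (Detection only)

    from huggingface_hub import hf_hub_download
    from ultralytics import YOLO

    # Same repository, but the v1.0 weights produce boxes only (no masks).
    model_path = hf_hub_download(
        repo_id="abdullahg7/cardd-yolov8s",
        filename="v1.0/best.pt",
    )
    model = YOLO(model_path)
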
Interpret Results (v2.0)

    results = model.predict("car.jpg", conf=0.25)

    # Each box carries a class index and a confidence score.
    for box in results[0].boxes:
        class_name = model.names[int(box.cls)]
        confidence = float(box.conf)
        print(f"Detected: {class_name} ({confidence:.2%})")

    # masks is None when nothing is detected or when using the detection-only v1.0 weights.
    if results[0].masks is not None:
        for mask in results[0].masks.data:
            print(f"Mask shape: {mask.shape}")

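For a damage report it is often more useful to aggregate detections per class than to print them one by one. The helper below is a minimal sketch, not part of Ultralytics: `summarize_detections` is a hypothetical name, and it works on plain `(class_name, confidence)` pairs like those printed by the loop above, so it can be tried without loading the model.

```python
from collections import defaultdict


def summarize_detections(detections):
    """Group (class_name, confidence) pairs into a per-class count and top confidence."""
    summary = defaultdict(lambda: {"count": 0, "max_conf": 0.0})
    for class_name, confidence in detections:
        entry = summary[class_name]
        entry["count"] += 1
        entry["max_conf"] = max(entry["max_conf"], confidence)
    return dict(summary)


# Hypothetical detections, used only to illustrate the output format.
example = [("scratch", 0.81), ("dent", 0.64), ("scratch", 0.47)]
report = summarize_detections(example)

for name, stats in sorted(report.items()):
    print(f"{name}: {stats['count']} instance(s), top confidence {stats['max_conf']:.0%}")
```

With a real prediction, the input list can be built as `[(model.names[int(b.cls)], float(b.conf)) for b in results[0].boxes]`.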
Training Configuration (v2.0)

| Parameter | Value |
|---|---|
| Base model | yolov8s-seg.pt |
| Epochs | 50 |
| Batch size | 4 |
| Image size | 1024px |
| Optimizer | SGD |
| LR0 | 0.01 |
| LRF | 0.01 |
| Momentum | 0.937 |
| Weight decay | 0.0005 |
| Warmup epochs | 5.0 |
| Cosine LR | True |
| Close mosaic | 15 |

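The table maps fairly directly onto Ultralytics training arguments. The sketch below shows one way such a run could be launched; it is not the author's actual training script. The mapping of table rows to argument names is an assumption, and `cardd.yaml` is a hypothetical dataset config that would need to point at CarDD converted to the label format Ultralytics expects.

```python
# Hyperparameters copied from the training configuration table.
# Keys are Ultralytics train() argument names (assumed mapping from the table rows).
TRAIN_ARGS = {
    "epochs": 50,
    "batch": 4,
    "imgsz": 1024,
    "optimizer": "SGD",
    "lr0": 0.01,            # initial learning rate
    "lrf": 0.01,            # final learning rate as a fraction of lr0
    "momentum": 0.937,
    "weight_decay": 0.0005,
    "warmup_epochs": 5.0,
    "cos_lr": True,         # cosine learning-rate schedule
    "close_mosaic": 15,     # disable mosaic augmentation for the last 15 epochs
}


def train(data_yaml="cardd.yaml"):
    """Fine-tune yolov8s-seg.pt; data_yaml is a hypothetical dataset config path."""
    # Imported lazily so the config above can be inspected without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO("yolov8s-seg.pt")
    return model.train(data=data_yaml, **TRAIN_ARGS)
```

Calling `train()` starts the run. With a batch size of 4 at 1024px the memory footprint should stay within reach of a consumer GPU, which matches the claim made earlier in this card.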
Damage Categories

| ID | Class | Description |
|---|---|---|
| 0 | dent | Surface deformation on car body |
| 1 | scratch | Linear paint damage |
| 2 | crack | Structural fractures |
| 3 | glass_shatter | Broken windows or windshields |
| 4 | lamp_broken | Damaged headlights or tail lights |
| 5 | tire_flat | Deflated or damaged tires |

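The class IDs are useful for restricting inference to a subset of damage types. The sketch below assumes the IDs in the table match `model.names` for the released weights (worth verifying after loading the model); `CLASS_NAMES` and `ids_for` are illustrative names.

```python
# Class IDs as listed in the damage categories table.
CLASS_NAMES = {
    0: "dent",
    1: "scratch",
    2: "crack",
    3: "glass_shatter",
    4: "lamp_broken",
    5: "tire_flat",
}


def ids_for(names):
    """Translate class names to a sorted list of IDs, raising on unknown names."""
    name_to_id = {name: idx for idx, name in CLASS_NAMES.items()}
    unknown = [n for n in names if n not in name_to_id]
    if unknown:
        raise ValueError(f"Unknown damage classes: {unknown}")
    return sorted(name_to_id[n] for n in names)


# Select only the three body-damage classes.
body_damage_ids = ids_for(["crack", "dent", "scratch"])
print(body_damage_ids)
```

The resulting list can be passed to the `classes` argument of `predict`, for example `model.predict("car.jpg", conf=0.25, classes=body_damage_ids)`.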
Dataset
- Name: CarDD (Car Damage Detection)
- Size: 4,000 high-resolution images, 9,000+ annotated instances
- Access: cardd-ustc.github.io (requires license agreement)
- Paper: Wang et al., IEEE T-ITS, 2023

| Split | Images | Instances |
|---|---|---|
| Train | 2,816 | 6,211 |
| Val | 810 | 1,744 |
| Test | 374 | 785 |

Limitations
- Lower performance on the hard classes (dent, scratch, crack), whose damage regions tend to be small and visually similar to one another
- Trained on exterior car damage only
- Best results on clear, well-lit images
Citation

    @article{wang2023cardd,
      title   = {CarDD: A New Dataset for Vision-Based Car Damage Detection},
      author  = {Wang, Xinkuang and Li, Wenjing and Wu, Zhongcheng},
      journal = {IEEE Transactions on Intelligent Transportation Systems},
      volume  = {24},
      number  = {7},
      pages   = {7202--7214},
      year    = {2023}
    }

License
GNU AGPL-3.0, in compliance with the YOLOv8 framework by Ultralytics.
Dataset Notice: CarDD images are sourced from Flickr and Shutterstock. Proper licensing must be obtained before commercial use.