# Bike Lane Detection Model 
By Dylan Dang

## Model Description

This project uses a **YOLOv11 object detection model** to identify bike lane infrastructure and related objects in urban street images.

YOLOv11 was chosen for its balance of speed and accuracy in real-time object detection tasks.

The model detects features such as bike lane markings, shared lanes, cyclists, and vehicles using bounding boxes and class labels. It was fine-tuned from a pre-trained model rather than trained from scratch, which allows it to perform reasonably well even with a small dataset.
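
A minimal inference sketch with the Ultralytics Python API follows. The weights filename `best.pt` and the helper names `summarize_detections` and `detect` are assumptions for illustration, not files or functions shipped with this project.

```python
from pathlib import Path


def summarize_detections(names, class_ids, confidences, min_conf=0.25):
    """Count detections per class name, keeping those at or above min_conf."""
    counts = {}
    for cls_id, conf in zip(class_ids, confidences):
        if conf >= min_conf:
            label = names[int(cls_id)]
            counts[label] = counts.get(label, 0) + 1
    return counts


def detect(image_path, weights="best.pt"):
    """Run the fine-tuned model on one image (requires `pip install ultralytics`)."""
    from ultralytics import YOLO  # imported lazily so the helper above works without it

    model = YOLO(weights)                     # "best.pt" is an assumed weights filename
    result = model(str(Path(image_path)))[0]  # one Results object per input image
    return summarize_detections(
        result.names,                # dict mapping class id -> class name
        result.boxes.cls.tolist(),   # predicted class ids
        result.boxes.conf.tolist(),  # confidence scores
    )
```

Calling `detect("street.jpg")` on a hypothetical image would return a dictionary of class names and detection counts.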

The goal of this project was not only to train a model, but also to understand how dataset quality and structure affect performance in real-world computer vision tasks.

**Intended Use Cases:**
- Exploring bike lane infrastructure in street imagery  
- Supporting transportation or urban planning research  
- Analyzing cyclist environments and road conditions  


---

## Training Data

### Dataset Source

Roboflow Universe – Bike Lane Computer Vision Dataset  
https://universe.roboflow.com/bike-lane/bike-lane

---

### Dataset Overview

The dataset contains **147 images** of urban street environments with varying road layouts, lighting conditions, and traffic scenarios.

---

### Class Distribution

| Class | Count |
|------|------|
| Vehicle | 253 |
| Bicycle Lane | 129 |
| Shared Dotted Lane | 124 |
| Solid Lane | 59 |
| Cyclist | 13 |
| Bicycle | 2 |
| Car | 2 |

This dataset shows **strong class imbalance**, where some classes appear very frequently while others have very few examples. This directly affects model performance.
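
To make the imbalance concrete, the short sketch below computes each class's share of the total and the ratio between the largest and smallest class. It uses only the counts from the table above and assumes they are annotation counts.

```python
# Counts copied from the class distribution table above.
class_counts = {
    "Vehicle": 253,
    "Bicycle Lane": 129,
    "Shared Dotted Lane": 124,
    "Solid Lane": 59,
    "Cyclist": 13,
    "Bicycle": 2,
    "Car": 2,
}

total = sum(class_counts.values())
shares = {name: count / total for name, count in class_counts.items()}
imbalance_ratio = max(class_counts.values()) / min(class_counts.values())

for name, share in shares.items():
    print(f"{name:<20} {class_counts[name]:>4} {share:6.1%}")
print(f"Total annotations: {total}")
print(f"Largest-to-smallest class ratio: {imbalance_ratio:.1f}")
```

"Vehicle" alone accounts for roughly 43% of the 582 annotations, while "Bicycle" and "Car" each have only 2.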

---

### Annotation Process

The dataset included pre-existing YOLO-format bounding box annotations. I reviewed a subset of images to check annotation quality, focusing on:
- alignment of bounding boxes  
- consistency of class labels  

During this review I observed that some classes, such as "Car" and "Vehicle", overlap conceptually, which may introduce ambiguity during training. No major corrections were made, but this overlap influenced how the results were interpreted.

Leaving the annotations as they were let me focus on model training and evaluation, but it is also a limitation, since annotation quality was not significantly improved.

This project therefore emphasizes **evaluation and understanding of model performance** rather than dataset refinement, which was the focus I wanted for this process.
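
For reference, a YOLO-format label file has one line per object: a class id followed by the box center, width, and height, all normalized to the range 0 to 1. The sketch below shows one way a box-alignment sanity check could be automated. The function names are hypothetical and this is not the review procedure actually used here.

```python
def parse_yolo_line(line):
    """Parse one label line: '<class_id> <x_center> <y_center> <width> <height>'."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    class_id = int(parts[0])
    x_c, y_c, w, h = (float(v) for v in parts[1:])
    return class_id, x_c, y_c, w, h


def is_valid_box(x_c, y_c, w, h):
    """Check that a normalized box has positive size and stays inside the image."""
    eps = 1e-9  # tolerance for floating-point rounding at the image border
    return (
        w > 0 and h > 0
        and x_c - w / 2 >= -eps and x_c + w / 2 <= 1 + eps
        and y_c - h / 2 >= -eps and y_c + h / 2 <= 1 + eps
    )


def to_pixel_corners(x_c, y_c, w, h, img_w, img_h):
    """Convert a normalized center-format box to pixel corners (x1, y1, x2, y2)."""
    return (
        (x_c - w / 2) * img_w,
        (y_c - h / 2) * img_h,
        (x_c + w / 2) * img_w,
        (y_c + h / 2) * img_h,
    )
```

A check like this only catches malformed boxes. Judging whether a box actually fits the object still requires looking at the image.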

---

### Dataset Split

- Train: 102 images (69%)  
- Validation: 20 images (14%)  
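- Test: 16 images (11%)  

Percentages are relative to the full 147-image dataset. The three splits listed here total 138 images, so 9 images are not accounted for in this breakdown.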

---

### Data Augmentation

Default YOLO augmentations were applied during training:
- horizontal flipping  
- color adjustments  
- mosaic augmentation  
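
As a rough illustration, the dictionary below lists the Ultralytics training arguments that correspond to these augmentations, with the values I understand to be the library defaults. Treat the numbers as an assumption and verify them against the installed version. The helper `flip_box_horizontally` is a hypothetical function showing what a horizontal flip does to a normalized YOLO box.

```python
# Assumed Ultralytics default augmentation values; check your installed version.
default_augmentations = {
    "fliplr": 0.5,   # probability of a horizontal flip
    "hsv_h": 0.015,  # hue jitter (fraction)
    "hsv_s": 0.7,    # saturation jitter (fraction)
    "hsv_v": 0.4,    # brightness jitter (fraction)
    "mosaic": 1.0,   # probability of combining four images into one
}


def flip_box_horizontally(x_c, y_c, w, h):
    """Mirror a normalized YOLO box left-to-right; only x_center changes."""
    return 1.0 - x_c, y_c, w, h
```

For example, a box centered at `x_c = 0.25` ends up centered at `x_c = 0.75` after the flip, with the same size and vertical position.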

---

### Known Dataset Limitations

- Strong class imbalance  
- Extremely small sample sizes for some classes  
- Limited total dataset size  
- Mostly daytime, urban conditions  

---

## Training Procedure

The model was trained with the **Ultralytics YOLOv11 framework** in Google Colab on an RTX 3070 GPU. Training took approximately 5-10 minutes.


**Training Details:**
- Framework: YOLOv11 (Ultralytics)  
- Epochs: 50  
- Batch size: 16  
- Image size: 640 × 640  
- Environment: Google Colab  
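
A training run with these settings might look like the sketch below. The checkpoint name `yolo11n.pt` and the dataset config `data.yaml` are assumptions: this card does not state which model size was fine-tuned or where the dataset config lived.

```python
# Settings listed above; the dataset path is an assumption.
train_settings = {
    "data": "data.yaml",  # hypothetical dataset config exported from Roboflow
    "epochs": 50,
    "batch": 16,
    "imgsz": 640,
}


def train(weights="yolo11n.pt"):
    """Fine-tune a pre-trained checkpoint (requires `pip install ultralytics`)."""
    from ultralytics import YOLO  # lazy import keeps this file importable without it

    model = YOLO(weights)  # the card does not say which model size was used
    return model.train(**train_settings)
```

Starting from a pre-trained checkpoint rather than random weights is what makes a 50-epoch run on a dataset this small workable at all.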

---

## Evaluation Results

### Key Metrics

- Precision: ~0.88  
- Recall: ~0.38  
- mAP50: ~0.48  

These metrics show that the model is **highly precise but has low recall**.

This means:
- The model is usually correct when it makes predictions  
- It misses many objects, especially harder or less frequent ones  
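
The gap between the two metrics can be summarized with the F1 score, the harmonic mean of precision and recall. The sketch below uses the approximate aggregate values reported above, so the result is a rough figure rather than a value read from the training logs. The count of 100 ground-truth objects is purely hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


precision, recall = 0.88, 0.38  # approximate values reported above
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2f}")

# Hypothetical illustration: out of 100 ground-truth objects, a recall of 0.38
# means about 38 are found and about 62 are missed.
ground_truth = 100
found = round(recall * ground_truth)
missed = ground_truth - found
print(f"found {found}, missed {missed}")
```

Because the harmonic mean is pulled toward the smaller value, the low recall keeps F1 near 0.53 despite the high precision.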

Common classes such as "Vehicle" achieved higher precision and recall, while underrepresented classes like "Bicycle" and "Car" performed poorly, most likely because of their limited training samples. Class imbalance therefore accounts for much of the performance difference across classes, with larger classes performing more reliably.

Performance varied across classes, as shown in the table below:

| Class | Relative Performance |
|------|--------------------|
| Vehicle | High |
| Bicycle Lane | Moderate |
| Shared Dotted Lane | Moderate |
| Cyclist | Low |
| Bicycle | Very Low |

---

### Example Predictions

![Prediction](./val_batch0_pred.jpg)

This example shows the model’s predictions on a batch of validation images. The model correctly identifies several road features, but it also makes some errors, such as detecting a building as a vehicle and labeling a bus lane as a bike lane. This reflects the model’s tendency to rely on visual similarity and highlights its limitations in distinguishing similar structures.

---

### Confusion Matrix

![Confusion Matrix](./confusion_matrix.png)

The confusion matrix highlights where the model struggles, particularly between similar lane types and rare classes.

---

### Training Results

![Training Results](./results.png)

The training curves show steady learning, but performance plateaus, most likely because of the dataset limitations described above.

---

### Failure Example

![Failure Example](./failure_example.png)

This image shows two types of errors made by the model.

In the first case, the model incorrectly detects a vehicle where there is actually part of a building. This is likely because the building has visual features, such as rectangular shapes and edges, that resemble vehicles in the training data.

In the second case, the model identifies a bike lane where the road marking appears to be a bus lane. This suggests that the model has difficulty distinguishing between different types of lane markings, especially when they share similar visual patterns.

These errors highlight an important limitation: the model relies heavily on visual similarity rather than deeper contextual understanding. Since the dataset contains limited variation and strong class imbalance, the model may generalize incorrectly when encountering unfamiliar scenes.

---

## Evaluation

The model was evaluated using standard object detection metrics.
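
These metrics rest on intersection over union (IoU), the overlap between a predicted box and a ground-truth box. For mAP50, a prediction of the correct class counts as a match only when its IoU with a ground-truth box is at least 0.5. The functions below are a minimal sketch of that rule, not the Ultralytics implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, true_box):
    """Box-overlap test used for mAP50 (class agreement is checked separately)."""
    return iou(pred_box, true_box) >= 0.5
```

Two equal-sized boxes that overlap by half of their width have an IoU of only about 0.33, so they would not count as a match at this threshold.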

The precision and recall curves illustrate the model’s ability to detect bike lanes while minimizing false positives. The F1 curve shows the optimal balance between precision and recall, and the PR curve summarizes overall detection performance across confidence thresholds.

### Curves
![Precision Curve](BoxP_curve.png)
![Recall Curve](BoxR_curve.png)
![PR Curve](BoxPR_curve.png)
![F1 Curve](BoxF1_curve.png)
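
Reading the F1 curve amounts to picking the confidence threshold where F1 peaks. The sketch below shows that selection on a handful of hypothetical curve points. The numbers are made up for illustration and are NOT taken from this model's curves.

```python
def best_f1_threshold(thresholds, precisions, recalls):
    """Return (threshold, f1) for the confidence threshold with the highest F1."""
    best = (None, -1.0)
    for t, p, r in zip(thresholds, precisions, recalls):
        f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
        if f1 > best[1]:
            best = (t, f1)
    return best


# Hypothetical curve points, NOT taken from this model's results.
thresholds = [0.1, 0.3, 0.5, 0.7]
precisions = [0.40, 0.60, 0.80, 0.90]
recalls = [0.70, 0.60, 0.40, 0.20]
threshold, f1 = best_f1_threshold(thresholds, precisions, recalls)
print(f"best threshold {threshold}, F1 {f1:.2f}")
```

Raising the threshold trades recall for precision, so the peak sits where neither metric has dropped too far.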

## Performance Analysis

The model performs best when:
- lane markings are clearly visible  
- lighting conditions are consistent  
- objects are large and unobstructed  

The model struggles when:
- markings are faded or unclear  
- objects overlap or are partially blocked  
- objects are small or rare in the dataset  

This suggests that **dataset quality and balance are more important than model complexity** in this case.

---

## Limitations and Biases

### Failure Cases

- Missed detections of cyclists and small objects  
- Confusion between similar lane types  
- Reduced accuracy in cluttered scenes  

---

### Data Biases

- Overrepresentation of vehicles  
- Underrepresentation of bicycles and cars  
- Limited environmental diversity  

---

### Environmental Limitations

The model may perform poorly under:
- low lighting  
- occlusion  
- worn or faded lane markings  

---

### Additional Observations

The model sometimes misclassifies lane types, such as solid versus shared lanes, when markings are partially broken or unclear. This suggests the model relies heavily on strong visual patterns.

---

### Inappropriate Use Cases

This model should **not** be used for:
- autonomous driving systems  
- real-time safety decisions  
- high-risk environments  

---

### Sample Size Limitations

Some classes, such as "Bicycle" and "Car", have extremely limited training data, making reliable detection difficult. This contributes directly to low recall.

---

## Final Reflection

This project showed me that model performance depends heavily on dataset quality.

Even with a strong model like YOLOv11, issues such as the following can significantly impact results:
- class imbalance  
- small dataset size  
- annotation limitations  

In the end, this project highlighted for me the importance of **data quality, not just model choice**, in computer vision applications.