dyldang committed
Commit 125b90f · verified · 1 Parent(s): d6d15fa

Update README.md

Files changed (1)
  1. README.md +140 -123
README.md CHANGED
@@ -1,214 +1,231 @@
1
- Model Description
2
 
3
- This project uses a YOLOv11 object detection model to identify bike lane infrastructure and related objects in street images.
4
 
5
- The model detects features such as bike lane markings, shared lanes, cyclists, and vehicles using bounding boxes and class labels. It was fine-tuned from a pre-trained YOLO model rather than trained from scratch, which allows it to learn from a relatively small dataset.
6
 
7
- The main goal of this project was not just to build a high-performing model, but to understand how well object detection works in this context and what limitations arise when working with real-world, imperfect data.
8
 
9
- Intended Use Cases:
10
 
11
- Exploring bike lane infrastructure in street imagery
 
 
 
12
 
13
- Supporting transportation research
14
 
15
- Analyzing road design and cyclist environments
16
 
17
- This model is best suited for exploratory or research purposes rather than real-world deployment.
18
 
19
- Training Data
20
 
21
- Dataset Source:
22
- Roboflow Universe – Bike Lane Computer Vision Dataset
23
 
24
- The dataset consists of 147 images of urban street environments, including a mix of road layouts, traffic conditions, and lighting scenarios.
25
 
26
- Classes and Distribution:
27
 
28
- Class Count
29
- Vehicle 253
30
- Bicycle Lane 129
31
- Shared Dotted Lane 124
32
- Solid Lane 59
33
- Cyclist 13
34
- Bicycle 2
35
- Car 2
36
 
37
- One of the most important characteristics of this dataset is strong class imbalance. Some classes, like vehicles and lane markings, appear frequently, while others like bicycles and cars have almost no examples. This has a direct impact on model performance.
38
 
39
- Data Collection & Characteristics:
40
- Images represent real-world urban roads, primarily in daytime conditions, with varying visibility of lane markings and objects.
41
 
42
- Annotation Process
 
 
 
 
 
 
 
 
43
 
44
- The dataset included pre-existing YOLO-format bounding box annotations.
45
-
46
- Instead of creating new annotations, I focused on reviewing and validating the existing ones. I manually inspected a subset of images to check:
47
-
48
- whether bounding boxes aligned correctly with objects
49
-
50
- whether labels were applied consistently
51
-
52
- No major corrections were made. While this allowed me to focus on model training and evaluation, it also represents a limitation, since annotation quality was not improved or standardized further.
53
 
54
- This is important because errors or inconsistencies in annotations can directly affect model performance, especially for less frequent classes.
55
 
56
- Dataset Split
57
 
58
- Train: 102 images (69%)
59
-
60
- Validation: 20 images (14%)
61
-
62
- Test: 16 images (11%)
63
-
64
- Data Augmentation
65
 
66
- Default YOLO augmentation techniques were used during training, including:
 
 
67
 
68
- horizontal flipping
69
 
70
- color variation
71
 
72
- mosaic augmentation
73
 
74
- Known Dataset Limitations
75
 
76
- Significant class imbalance
 
 
77
 
78
- Extremely small number of examples for some classes
79
 
80
- Limited dataset size overall
81
 
82
- Mostly urban, daytime conditions (lack of environmental diversity)
 
 
 
83
 
84
- Training Procedure
85
 
86
- The model was trained using the Ultralytics YOLOv11 framework in Google Colab.
87
 
88
- I fine-tuned a pre-trained model for 50 epochs using images resized to 640 × 640 pixels.
 
 
 
89
 
90
- Training Details:
91
 
92
- Framework: Ultralytics YOLOv11
93
 
94
- Epochs: 50
95
 
96
- Image size: 640
97
 
98
- Batch size: 16
 
 
 
 
 
99
 
100
- Learning rate: default YOLO settings
101
 
102
- Environment: Google Colab
103
 
104
- Training relied on transfer learning, which is especially useful given the small dataset size.
105
 
106
- Evaluation Results
 
 
107
 
108
- Key Metrics:
109
 
110
- Precision: ~0.88
 
 
111
 
112
- Recall: ~0.38
113
 
114
- mAP50: ~0.48
115
 
116
- Rather than focusing only on these numbers, it is more important to understand what they reveal about the model.
117
 
118
- The relatively high precision indicates that when the model makes a prediction, it is usually correct. However, the low recall suggests that the model is missing a significant number of objects.
119
 
120
- This imbalance between precision and recall shows that the model is somewhat conservative — it avoids false positives but fails to detect more difficult or less frequent objects.
121
 
122
- Per-Class Performance
123
 
124
- Strong performance on common classes (vehicles, lane markings)
125
 
126
- Weak performance on rare classes (bicycle, car)
127
 
128
- This is largely due to the extreme imbalance in the dataset.
129
 
130
- Key Visualizations
131
 
132
- ![Confusion Matrix](./confusion_matrix.png)
133
  ![Training Results](./results.png)
134
- ![Prediction Example](./val_batch0_pred.jpg)
135
-
136
- Performance Analysis
137
-
138
- The model performs best when:
139
 
140
- lane markings are clearly visible
141
 
142
- lighting conditions are consistent
143
 
144
- objects are not occluded
145
 
146
- However, the model struggles in several situations:
147
 
148
- faded or worn bike lane markings
 
 
 
149
 
150
- overlapping or partially blocked objects
151
 
152
- rare classes with very limited training data
153
 
154
- These results highlight that performance is not just about the model architecture, but heavily influenced by the dataset.
155
-
156
- In particular, the lack of examples for certain classes makes it difficult for the model to learn meaningful patterns.
 
157
 
158
- Limitations and Biases
 
 
 
159
 
160
- This model has several important limitations that should be clearly acknowledged.
161
 
162
- Failure Cases
163
 
164
- missed detections of bicycles and cars
165
 
166
- incorrect detections when lane markings are unclear
167
 
168
- confusion between similar lane types
 
 
169
 
170
- Data Biases
171
 
172
- overrepresentation of vehicles
173
 
174
- underrepresentation of rare classes
 
 
175
 
176
- limited diversity in environment and conditions
177
 
178
- Environmental Limitations
179
 
180
  The model may perform poorly under:
 
 
 
181
 
182
- low lighting conditions
183
-
184
- occlusion
185
-
186
- faded or damaged road markings
187
 
188
- Inappropriate Use Cases
189
 
190
- This model should not be used for:
191
 
192
- real-time safety systems
193
 
194
- autonomous driving
195
 
196
- decision-making in high-risk environments
 
 
 
197
 
198
- Sample Size Limitations
199
 
200
- Some classes (such as bicycle and car) have extremely limited training data, making reliable detection difficult. This directly impacts recall and overall model performance.
201
 
202
- Final Reflection
203
 
204
- This project demonstrates that even with a strong model like YOLOv11, performance is highly dependent on the dataset.
205
 
206
- Rather than focusing only on improving accuracy, this project highlights the importance of:
207
 
208
- dataset quality
209
 
210
- class balance
 
 
 
211
 
212
- annotation reliability
213
 
214
- Understanding these limitations is essential when applying computer vision models to real-world problems.
 
1
+ # Bike Lane Detection Model (YOLOv11)
2
 
3
+ ## Model Description
4
 
5
+ This project uses a **YOLOv11 object detection model** to identify bike lane infrastructure and related objects in urban street images.
6
 
7
+ The model detects features such as bike lane markings, shared lanes, cyclists, and vehicles using bounding boxes and class labels. It was fine-tuned from a pre-trained model rather than trained from scratch, which lets it reach high precision (~0.88) on a small dataset, although recall stays low (~0.38).
8
 
9
+ The goal of this project was not only to train a model, but to understand how dataset quality and structure affect performance in real-world computer vision tasks.
10
 
11
+ **Intended Use Cases:**
12
+ - Exploring bike lane infrastructure in street imagery
13
+ - Supporting transportation or urban planning research
14
+ - Analyzing cyclist environments and road conditions
15
 
16
+ This model is best suited for **research and learning purposes**, not real-world deployment.
17
 
18
+ ---
19
 
20
+ ## Training Data
21
 
22
+ ### Dataset Source
23
 
24
+ Roboflow Universe – Bike Lane Computer Vision Dataset
 
25
 
26
+ ---
27
 
28
+ ### Dataset Overview
29
 
30
+ The dataset contains **147 images** of urban street environments with varying road layouts, lighting conditions, and traffic scenarios.
 
 
 
 
 
 
 
31
 
32
+ ---
33
 
34
+ ### Class Distribution
 
35
 
36
+ | Class | Count |
37
+ |------|------|
38
+ | Vehicle | 253 |
39
+ | Bicycle Lane | 129 |
40
+ | Shared Dotted Lane | 124 |
41
+ | Solid Lane | 59 |
42
+ | Cyclist | 13 |
43
+ | Bicycle | 2 |
44
+ | Car | 2 |
45
 
46
+ This dataset shows **strong class imbalance**, where some classes appear very frequently while others have very few examples. This directly affects model performance.
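
To make the imbalance concrete, the counts from the table can be turned into per-class shares and a max/min ratio. This is a minimal sketch: the numbers are copied from the table, while the helper names (`class_shares`, `imbalance_ratio`) are illustrative and not part of the project. It also assumes the counts are annotated instances rather than images.

```python
# Class counts copied from the class distribution table.
CLASS_COUNTS = {
    "Vehicle": 253,
    "Bicycle Lane": 129,
    "Shared Dotted Lane": 124,
    "Solid Lane": 59,
    "Cyclist": 13,
    "Bicycle": 2,
    "Car": 2,
}


def class_shares(counts):
    """Return each class's share of the total count (hypothetical helper)."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def imbalance_ratio(counts):
    """Ratio of the most frequent class count to the least frequent one."""
    return max(counts.values()) / min(counts.values())


TOTAL_INSTANCES = sum(CLASS_COUNTS.values())
SHARES = class_shares(CLASS_COUNTS)
RATIO = imbalance_ratio(CLASS_COUNTS)

for name, share in SHARES.items():
    print(f"{name:<20} {CLASS_COUNTS[name]:>4}  {share:6.1%}")
print(f"total: {TOTAL_INSTANCES}, max/min ratio: {RATIO:.1f}")
```

Vehicle alone accounts for a little over 43% of the 582 labeled items, while Bicycle and Car together make up less than 1%.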
 
 
 
 
 
 
 
 
47
 
48
+ ---
49
 
50
+ ### Annotation Process
51
 
52
+ The dataset included pre-existing YOLO-format bounding box annotations.
 
 
 
 
 
 
53
 
54
+ I reviewed a subset of images to validate annotation quality, focusing on:
55
+ - alignment of bounding boxes
56
+ - consistency of class labels
57
 
58
+ No major corrections were made. This allowed me to focus on model training and evaluation, but it also represents a limitation since annotation quality was not significantly improved.
59
 
60
+ This project therefore emphasizes **evaluation and understanding of model performance** rather than dataset refinement.
61
 
62
+ ---
63
 
64
+ ### Dataset Split
65
 
66
+ - Train: 102 images (69%)
67
+ - Validation: 20 images (14%)
68
+ - Test: 16 images (11%)
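
A quick arithmetic check of the split, using only the numbers stated in this README, suggests that the percentages are computed against the 147-image total and that the three listed splits do not cover every image. The variable names below are illustrative.

```python
# Figures copied from the README: stated dataset size and the three splits.
STATED_TOTAL = 147
SPLITS = {"train": 102, "validation": 20, "test": 16}

# Sum of the listed splits and the gap to the stated total.
listed_total = sum(SPLITS.values())
unaccounted = STATED_TOTAL - listed_total

# Percentages relative to the stated total, rounded to whole numbers.
percent_of_stated = {
    name: round(100 * n / STATED_TOTAL) for name, n in SPLITS.items()
}

print(f"listed images: {listed_total} of {STATED_TOTAL} ({unaccounted} unaccounted)")
print(percent_of_stated)
```

The rounded percentages reproduce the 69% / 14% / 11% figures, which is why they sum to 94% rather than 100%. The README does not say where the remaining nine images went.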
69
 
70
+ ---
71
 
72
+ ### Data Augmentation
73
 
74
+ Default YOLO augmentations were applied during training:
75
+ - horizontal flipping
76
+ - color adjustments
77
+ - mosaic augmentation
78
 
79
+ ---
80
 
81
+ ### Known Dataset Limitations
82
 
83
+ - Strong class imbalance
84
+ - Extremely small sample sizes for some classes
85
+ - Limited total dataset size
86
+ - Mostly daytime, urban conditions
87
 
88
+ ---
89
 
90
+ ## Training Procedure
91
 
92
+ The model was trained using the **Ultralytics YOLOv11 framework** in Google Colab.
93
 
94
+ Training used transfer learning, starting from a pre-trained model.
95
 
96
+ **Training Details:**
97
+ - Framework: YOLOv11 (Ultralytics)
98
+ - Epochs: 50
99
+ - Batch size: 16
100
+ - Image size: 640 × 640
101
+ - Environment: Google Colab
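
A training run with these settings might look roughly like the sketch below, which uses the Ultralytics Python API. The epochs, batch size, and image size come from the list above. The checkpoint name `yolo11n.pt` and the dataset config path `data.yaml` are assumptions, since the README does not state which model size or file layout was used.

```python
# Values taken from the training details listed above.
TRAIN_ARGS = {"epochs": 50, "batch": 16, "imgsz": 640}

# Assumptions (not stated in the README): model size and dataset config path.
PRETRAINED_WEIGHTS = "yolo11n.pt"
DATA_CONFIG = "data.yaml"


def train(weights=PRETRAINED_WEIGHTS, data=DATA_CONFIG):
    """Fine-tune a pre-trained checkpoint. Requires `pip install ultralytics`."""
    # Imported lazily so the settings can be inspected without the package.
    from ultralytics import YOLO

    model = YOLO(weights)  # load pre-trained weights (transfer learning)
    return model.train(data=data, **TRAIN_ARGS)
```

Calling `train()` in a Colab cell would start fine-tuning. Any setting not passed explicitly, such as the learning rate, falls back to the framework default.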
102
 
103
+ ---
104
 
105
+ ## Evaluation Results
106
 
107
+ ### Key Metrics
108
 
109
+ - Precision: ~0.88
110
+ - Recall: ~0.38
111
+ - mAP50: ~0.48
112
 
113
+ These metrics show that the model is **highly precise but has low recall**.
114
 
115
+ This means:
116
+ - The model is usually correct when it makes predictions
117
+ - But it misses many objects, especially harder or less frequent ones
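
The relationship between these two metrics can be sketched with the standard definitions. The reported precision and recall come from the metrics above. The true-positive, false-positive, and false-negative counts are hypothetical, chosen only to land close to those values, and are not taken from the actual evaluation.

```python
def precision(tp, fp):
    """Fraction of predicted boxes that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of ground-truth objects that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Approximate values reported in this README.
REPORTED_PRECISION = 0.88
REPORTED_RECALL = 0.38
REPORTED_F1 = f1_score(REPORTED_PRECISION, REPORTED_RECALL)

# Hypothetical counts for illustration: 100 ground-truth objects, 43 predictions.
tp, fp, fn = 38, 5, 62
print(f"precision={precision(tp, fp):.2f} recall={recall(tp, fn):.2f}")
print(f"F1 from reported metrics: {REPORTED_F1:.2f}")
```

In the hypothetical case, only 5 of 43 predictions are wrong, yet 62 of 100 objects go undetected, which matches the "precise but conservative" behaviour described above.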
118
 
119
+ ---
120
 
121
+ ### Example Predictions
122
 
123
+ ![Prediction](./val_batch0_pred.jpg)
124
 
125
+ This example shows successful detection of lane markings and vehicles under clear conditions.
126
 
127
+ ---
128
 
129
+ ### Confusion Matrix
130
 
131
+ ![Confusion Matrix](./confusion_matrix.png)
132
 
133
+ The confusion matrix highlights where the model struggles, particularly between similar lane types and rare classes.
134
 
135
+ ---
136
 
137
+ ### Training Results
138
 
 
139
  ![Training Results](./results.png)
 
 
 
 
 
140
 
141
+ The training curves show steady learning, but performance plateaus, most likely because of the dataset limitations.
142
 
143
+ ---
144
 
145
+ ### Failure Example
146
 
147
+ ![Failure Example](./failure_example.png)
148
 
149
+ This example shows a missed detection of a cyclist. This likely occurs due to:
150
+ - small object size
151
+ - occlusion
152
+ - lack of sufficient training examples
153
 
154
+ ---
155
 
156
+ ## Performance Analysis
157
 
158
+ The model performs best when:
159
+ - lane markings are clearly visible
160
+ - lighting conditions are consistent
161
+ - objects are large and unobstructed
162
 
163
+ The model struggles when:
164
+ - markings are faded or unclear
165
+ - objects overlap or are partially blocked
166
+ - objects are small or rare in the dataset
167
 
168
+ This suggests that **dataset quality and balance are more important than model complexity** in this case.
169
 
170
+ ---
171
 
172
+ ## Limitations and Biases
173
 
174
+ ### Failure Cases
175
 
176
+ - Missed detections of cyclists and small objects
177
+ - Confusion between similar lane types
178
+ - Reduced accuracy in cluttered scenes
179
 
180
+ ---
181
 
182
+ ### Data Biases
183
 
184
+ - Overrepresentation of vehicles
185
+ - Underrepresentation of bicycles and cars
186
+ - Limited environmental diversity
187
 
188
+ ---
189
 
190
+ ### Environmental Limitations
191
 
192
  The model may perform poorly under:
193
+ - low lighting
194
+ - occlusion
195
+ - worn or faded lane markings
196
 
197
+ ---
 
 
 
 
198
 
199
+ ### Additional Observations
200
 
201
+ The model sometimes misclassifies lane types (e.g., solid vs shared lanes) when markings are partially broken or unclear. This suggests the model relies heavily on strong visual patterns.
202
 
203
+ ---
204
 
205
+ ### Inappropriate Use Cases
206
 
207
+ This model should **not** be used for:
208
+ - autonomous driving systems
209
+ - real-time safety decisions
210
+ - high-risk environments
211
 
212
+ ---
213
 
214
+ ### Sample Size Limitations
215
 
216
+ Some classes (e.g., bicycle and car) have extremely limited training data, making reliable detection difficult. This contributes directly to low recall.
217
 
218
+ ---
219
 
220
+ ## Final Reflection
221
 
222
+ This project demonstrates that model performance is heavily dependent on dataset quality.
223
 
224
+ Even with a strong model like YOLOv11, issues such as:
225
+ - class imbalance
226
+ - small dataset size
227
+ - annotation limitations
228
 
229
+ can significantly impact results.
230
 
231
+ Overall, this project highlights the importance of **data quality, not just model choice**, in computer vision applications.