| # Object Detection Model for Rotten Fruits |
| **By Aiden Luo** |
|
|
| # Model Description |
|
|
| This is an object detection model that finds apples and oranges along with their rotten variants. |
| The model is fine-tuned from YOLOv11, which was originally trained on the COCO dataset. |
| It is meant to detect rotten fruit on a conveyor belt during mass processing of produce. |
|
|
| # Training Data |
| ### Dataset |
|
|
| <https://www.kaggle.com/datasets/muhammad0subhan/fruit-and-vegetable-disease-healthy-vs-rotten> |
|
|
| **Classes**: 28 |
|
|
| **Images**: 29,291 |
|
|
| **Data Collection** |
|
|
| The dataset combines several other datasets of healthy and rotten fruit and vegetables, which were then |
| manually validated and sorted. |
|
|
| ### Class Distribution and Annotations |
| I used only 4 of the 28 classes (Apple_Healthy, Orange_Healthy, Apple_Rotten, Orange_Rotten). |
| I annotated 1000 images using Roboflow's SAM3 autolabel and validated all 1000 of them. |
| I manually added 578 detections and fixed about 30% of the annotations, as they were false positives. |
|
|
| | Class Name | Total Count | Training Count (70%) | Validation Count (20%) | Test Count (10%)| |
| | ------------- | ----------- | -------------------- | ---------------------- | --------------- | |
| | Apple | 554 | 389 | 111 | 55 | |
| | Orange | 512 | 359 | 102 | 51 | |
| | RottenApple | 332 | 233 | 66 | 33 | |
| | RottenOrange | 246 | 172 | 49 | 25 | |
|
|
| ### Augmentations |
|
|
| - Rotation |
| - Translation |
| - Horizontal flipping |
| - Mosaic |
|
|
| ### Training Procedure |
|
|
| - **Framework**: Ultralytics |
| - **Hardware**: NVIDIA Tesla T4 |
| - **Batch Size**: 64 |
| - **Epochs**: 100 |
| - **Patience**: 50 |
|
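The settings above map onto Ultralytics training arguments roughly as follows. This is a minimal sketch, not the exact training script: the checkpoint name `yolo11n.pt`, the dataset file `data.yaml`, and every augmentation magnitude are assumptions, since this card documents only the augmentation types, batch size, epochs, and patience.

```python
# Hyperparameters documented in this model card.
TRAIN_ARGS = {
    "epochs": 100,
    "patience": 50,  # stop early after 50 epochs without improvement
    "batch": 64,
}

# Augmentation types listed in this card; the magnitudes are assumed placeholders.
AUGMENT_ARGS = {
    "degrees": 10.0,   # rotation range in degrees (assumed)
    "translate": 0.1,  # translation as a fraction of image size (assumed)
    "fliplr": 0.5,     # probability of a horizontal flip (assumed)
    "mosaic": 1.0,     # probability of mosaic augmentation (assumed)
}


def train(data_yaml="data.yaml", weights="yolo11n.pt"):
    """Fine-tune a pretrained YOLO checkpoint; both defaults are hypothetical."""
    # Imported lazily so the configuration can be inspected without Ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(data=data_yaml, **TRAIN_ARGS, **AUGMENT_ARGS)
```

Calling `train()` requires the `ultralytics` package and a dataset YAML exported in YOLO format, for example from Roboflow.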
|
| ## Metrics (Epoch 100) |
|
|
| | epoch | class/instances | metrics/precision(B) | metrics/recall(B) | metrics/mAP50(B) | metrics/mAP50-95(B) | |
| | ----- | -----------------| -------------------- | ----------------- | ---------------- | ------------------- | |
| | 100 | All(264) | 0.944 | 0.899 | 0.964 | 0.899 | |
| | 100 | Apple(79) | 0.946 | 0.889 | 0.956 | 0.935 | |
| | 100 | Orange(58) | 0.916 | 0.914 | 0.957 | 0.902 | |
| | 100 | RottenApple(73) | 0.935 | 0.984 | 0.981 | 0.915 | |
| | 100 | RottenOrange(54) | 0.978 | 0.808 | 0.961 | 0.844 | |
|
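As a quick consistency check on the table above, the F1-score can be derived from precision and recall as their harmonic mean. The sketch below uses only the values from the table; the helper name `f1_score` is illustrative and not part of the training pipeline.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per class, copied from the metrics table.
METRICS = {
    "Apple": (0.946, 0.889),
    "Orange": (0.916, 0.914),
    "RottenApple": (0.935, 0.984),
    "RottenOrange": (0.978, 0.808),
}

for name, (p, r) in METRICS.items():
    print(f"{name}: F1 = {f1_score(p, r):.3f}")

# The "All" row is the unweighted mean over the four classes.
macro_p = sum(p for p, _ in METRICS.values()) / len(METRICS)
macro_r = sum(r for _, r in METRICS.values()) / len(METRICS)
print(f"All: P = {macro_p:.3f}, R = {macro_r:.3f}, F1 = {f1_score(macro_p, macro_r):.3f}")
```

The averaged precision and recall reproduce the `All` row, and their harmonic mean is about 0.92. The peak of an F1 curve is computed per confidence threshold, so it need not match this figure exactly.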
|
| ## Examples |
|
|
| **Apple** |
| : Red and green apples with a simple background. |
| <img src='simpleapple.png' width="400" height="400"> |
| <img src='simplegreenapple.png' width="400" height="400"> |
|
|
| **Orange** |
| : Oranges with a simple background. |
| <img src='manyoranges.png' width="400" height="400"> |
| <img src='simpleorange.jpg' width="400" height="400"> |
|
|
| **RottenApple** |
| : Moldy apples or deformed/old apples with a simple background. |
| <img src='moldyapple.png' width="400" height="400"> |
| <img src='deformedapple.png' width="400" height="400"> |
|
|
| **RottenOrange** |
| : Moldy oranges or deformed oranges with a simple background. |
| <img src='deformedorange.png' width="400" height="400"> |
| <img src='moldyorange.png' width="400" height="400"> |
|
|
| ## F1-Score |
| <img src='BoxF1_curve.png' alt="F1 Curve Graph" width="700" height="700"> |
|
|
| ## Confusion Matrix |
| <img src='confusion_matrix_normalized.png' alt="Confusion Matrix Graph" width="700" height="700"> |
|
|
| ## Train/Loss and Val/Loss Curves |
| <img src='results.png' alt="Training and Validation Loss Curve Graph" width="700" height="700"> |
|
|
| ## Performance Analysis |
|
|
| Overall, the results show strong performance across the board: the best F1-score over all classes is 0.92, at a confidence threshold of 0.58. |
| The model is good at both finding and identifying objects. However, the confusion matrix does show some problems: the Orange class and the background get mixed up fairly often, |
| despite a relatively strong diagonal in the matrix. There are also no obvious signs of overfitting or underfitting in the train/val loss curves, which both decrease steadily. |
| However, the model could likely still be improved with more training or more images, as the curves do not seem to have completely plateaued yet. |
|
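One way to use the 0.58 confidence threshold at inference time is sketched below. It is an assumption-laden example rather than the deployed pipeline: the weights file `best.pt` and the helper names `detect` and `rotten_detections` are hypothetical, while the class names follow the tables in this card.

```python
ROTTEN_CLASSES = {"RottenApple", "RottenOrange"}
CONF_THRESHOLD = 0.58  # confidence at which the reported F1-score peaks


def rotten_detections(detections, threshold=CONF_THRESHOLD):
    """Keep (class_name, confidence) pairs that are rotten and meet the threshold."""
    return [
        (name, conf)
        for name, conf in detections
        if name in ROTTEN_CLASSES and conf >= threshold
    ]


def detect(image_path, weights="best.pt"):
    """Run the model on one image and return (class_name, confidence) pairs."""
    # Imported lazily so the filtering logic can be used without Ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # hypothetical path to the fine-tuned weights
    result = model.predict(image_path, conf=CONF_THRESHOLD)[0]
    return [
        (result.names[int(cls)], float(conf))
        for cls, conf in zip(result.boxes.cls, result.boxes.conf)
    ]
```

On a conveyor belt, `rotten_detections(detect(frame_path))` would give the list of rotten items to flag in a frame. Raising the threshold trades recall for precision.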
|
| ## Limitations and Biases |
|
|
| **Failure Cases** |
| The model struggles with the insides of fruits and with human hands in the background. |
| <img src='failcase1.png' width="400" height="400"> |
| <img src='failcase2.png' width="400" height="400"> |
|
|
| **Poorly Performing Classes** |
| Although all classes have very high metrics, the Orange class likely performs the worst. It is confused with the background the most, likely because the |
| dataset contains quite a few images with orange backgrounds. |
|
|
| **Data Biases/Contextual limitations** |
| Many images were blurry or low resolution, and some contained logos or stock-image watermarks printed over the fruits. Most of the fruits were of very similar |
| varieties, with fewer green apples and blood oranges in the dataset. The model degrades significantly when the background is not simple or does not resemble the examples. |
|
|
| **Inappropriate Use Cases** |
| This model is meant for conveyor belts. However, anything that creates a non-static background, such as human workers, will reduce model accuracy. |
|
|
| **Sample size limitations** |
| The entire model could benefit from thousands more images in each class. These can be very similar to one another, as I want the model to be successful on a conveyor belt and not much else. It's okay if it gets confused by human hands or can't detect the inside of a fruit, as that usually won't happen on a conveyor belt. |
|
|