Update README.md
README.md
CHANGED
|
@@ -133,4 +133,41 @@ and recall with a mAP50 of 0.992.
|
|
| 133 |
| seasoner | 302 | 572 | 0.986 | 0.965 | 0.993 | 0.849 |
|
| 134 |
| stationery | 162 | 300 | 0.986 | 0.957 | 0.972 | 0.785 |
|
| 135 |
| tissue | 482 | 978 | 0.999 | 0.994 | 0.995 | 0.909 |
|
| 136 |
-
+
|
| 137 |
+
### Visual Examples of Classes
|
| 138 |
+
|
| 139 |
+
TODO: add example images for each class.
|
| 140 |
+
|
| 141 |
+
### Key Visualizations
|
| 142 |
+
|
| 143 |
+
#### Confusion Matrix
|
| 144 |
+

|
| 145 |
+
|
| 146 |
+
#### F1 Confidence Curve
|
| 147 |
+

|
| 148 |
+
|
| 149 |
+
#### Training & Validation Loss Curves
|
| 150 |
+

|
| 151 |
+
|
| 152 |
+
### Performance Analysis
|
| 153 |
+
|
| 154 |
+
The model performs consistently well across all 17 classes on the validation
|
| 155 |
+
dataset, with every class reaching an mAP50 of at least 0.972. The
|
| 156 |
+
strongest-performing classes were **tissue** and **puffed_food** (mAP50-95 of
|
| 157 |
+
0.909 and 0.907, respectively), likely due to their distinct packaging shapes
|
| 158 |
+
and high training sample counts. The weakest-performing class was
|
| 159 |
+
**stationery** (mAP50: 0.972, mAP50-95: 0.785), which is also the
|
| 160 |
+
smallest class at 1,466 training images, suggesting that its performance
|
| 161 |
+
is partially limited by sample size.
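
The ranking described above can be checked directly from the per-class numbers. The following is a minimal sketch, not part of the original evaluation code: it uses only the four mAP50-95 values quoted in this section, and the dictionary name `map50_95` is an illustrative assumption. The full table has 17 classes.

```python
# mAP50-95 values copied from the results table and the paragraph above.
# Only four of the 17 classes are listed here, for illustration.
map50_95 = {
    "seasoner": 0.849,
    "stationery": 0.785,
    "tissue": 0.909,
    "puffed_food": 0.907,
}

# Sort class names from highest to lowest mAP50-95.
ranked = sorted(map50_95, key=map50_95.get, reverse=True)
strongest, weakest = ranked[0], ranked[-1]

# Spread between the best and worst class in this subset.
gap = round(map50_95[strongest] - map50_95[weakest], 3)

print(f"strongest: {strongest} ({map50_95[strongest]:.3f})")
print(f"weakest:   {weakest} ({map50_95[weakest]:.3f})")
print(f"gap:       {gap:.3f}")
```

Extending the dictionary with the remaining classes would give the full ranking. Pairing each value with its training image count would be one way to examine the suggested link to sample size.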
|
| 162 |
+
|
| 163 |
+
## Limitations and Biases
|
| 164 |
+
|
| 165 |
+
When tested on the **D2S dataset** (in-the-wild images),
|
| 166 |
+
performance dropped significantly. The model missed entire
|
| 167 |
+
objects, produced low-confidence detections, and misclassified items.
|
| 168 |
+
For example, it labeled a water bottle as `instant_noodles`. This
|
| 169 |
+
suggests the model may have overfit to the specific visual patterns
|
| 170 |
+
of the training data, or alternatively that there is a domain gap between
|
| 171 |
+
Asian grocery packaging (training data) and the European products in D2S.
|
| 172 |
+
Both explanations are plausible and further testing on diverse datasets
|
| 173 |
+
would be needed to distinguish between them.
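
One way to make such follow-up testing comparable across datasets is to compute the same summary statistics for each one. The sketch below is an assumption-laden illustration, not the evaluation used for this model: the function name `summarize_detections`, its parameters, and the 0.5 confidence cutoff are all hypothetical, and `detections_per_object` is only a crude proxy for missed objects, not a true recall.

```python
from statistics import mean


def summarize_detections(confidences, num_ground_truth, conf_threshold=0.5):
    """Summarize one dataset's detections (hypothetical helper).

    confidences: confidence scores of all predicted boxes on the dataset.
    num_ground_truth: number of annotated objects in the same images.
    conf_threshold: assumed cutoff below which a detection is "low-confidence".
    """
    # Detections at or above the cutoff are treated as usable predictions.
    kept = [c for c in confidences if c >= conf_threshold]
    total = len(confidences)
    return {
        "mean_confidence": mean(confidences) if total else 0.0,
        "low_confidence_fraction": (1 - len(kept) / total) if total else 0.0,
        # Crude proxy for coverage: values well below 1.0 hint at missed objects.
        "detections_per_object": (
            len(kept) / num_ground_truth if num_ground_truth else 0.0
        ),
    }
```

Calling this once on validation-set predictions and once on D2S predictions would put the drop in confidence and coverage side by side. Repeating it on a third, differently sourced dataset could help separate overfitting from a domain gap.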
|