Update README.md
README.md
CHANGED
|
@@ -21,7 +21,7 @@ on unseen brands or product types not represented in the training data.
|
|
| 21 |
## 2. Training Data
|
| 22 |
The training dataset is a subset of the **RPC-Dataset**
|
| 23 |
([rpc-dataset.github.io](https://rpc-dataset.github.io/)),
|
| 24 |
-
a large-scale retail product checkout dataset consisting of
|
| 25 |
across 200 grocery product classes.
|
| 26 |
The working dataset is a subset of this,
|
| 27 |
consisting of 9,616 images across the same 200 classes, sourced via
|
|
@@ -193,14 +193,14 @@ the model struggled significantly with unseen products and environments.
|
|
| 193 |
The image below shows a representative failure case:
|
| 194 |

|
| 195 |
|
| 196 |
-
In this image, the model:
|
| 197 |
- Completely missed the avocado (no detection)
|
| 198 |
- Missed the tea box entirely, along with the drink underneath it
|
| 199 |
- Misclassified a water bottle as `instant_noodles` (0.63 confidence)
|
| 200 |
- Produced a low-confidence `dried_fruit` detection (0.45) on an incorrect region
|
| 201 |
|
| 202 |
This suggests the model learned to recognize specific packaging patterns from its training data
|
| 203 |
-
rather than generalizing to grocery items as a broader category.
|
| 204 |
|
| 205 |
### Poor Performing Classes
|
| 206 |
| Class | mAP50 | mAP50-95 | Likely Reason |
|
|
@@ -216,17 +216,18 @@ rather than generalizing to grocery items as a broader category.
|
|
| 216 |
### Environmental and Contextual Limitations
|
| 217 |
- Performance degrades significantly when used on items not present in the training data, as seen with the D2S Dataset
|
| 218 |
- Overlapping or partially occluded items in a self-checkout camera may cause missed or incorrect detections
|
| 219 |
-
- Model was designed for overhead/top-down perspective, so
|
| 220 |
|
| 221 |
### Inappropriate Use Cases
|
| 222 |
-
|
| 223 |
-
- Should **NOT** be deployed in stores with inventory significantly different from the training data without retraining
|
|
|
|
| 224 |
- Should **NOT** be used to detect fresh produce, unpackaged items, or non-grocery products
|
| 225 |
- Should **NOT** be used in high-stakes applications where misclassification has serious consequences
|
| 226 |
|
| 227 |
### Ethical Considerations
|
| 228 |
- Overhead camera systems at self-checkout may raise **customer privacy concerns** depending on how image/video data is stored and used
|
| 229 |
-
- Model should not be used to make automated decisions that negatively impact customers without human review, as misclassifications may
|
| 230 |
|
| 231 |
### Sample Size Limitations
|
| 232 |
- **Stationery** (1,466 images) is the smallest class and shows the weakest overall performance (albeit still strong). Additional training data would likely improve results
|
|
|
|
| 21 |
## 2. Training Data
|
| 22 |
The training dataset is a subset of the **RPC-Dataset**
|
| 23 |
([rpc-dataset.github.io](https://rpc-dataset.github.io/)),
|
| 24 |
+
a large-scale retail product checkout dataset consisting of 83,699 images
|
| 25 |
across 200 grocery product classes.
|
| 26 |
The working dataset is a subset of this,
|
| 27 |
consisting of 9,616 images across the same 200 classes, sourced via
|
|
|
|
| 193 |
The image below shows a representative failure case:
|
| 194 |

|
| 195 |
|
| 196 |
+
In this test image, the model:
|
| 197 |
- Completely missed the avocado (no detection)
|
| 198 |
- Missed the tea box entirely, along with the drink underneath it
|
| 199 |
- Misclassified a water bottle as `instant_noodles` (0.63 confidence)
|
| 200 |
- Produced a low-confidence `dried_fruit` detection (0.45) on an incorrect region
|
| 201 |
|
| 202 |
This suggests the model learned to recognize specific packaging patterns from its training data
|
| 203 |
+
rather than generalizing to grocery items as a broader category. With this training data, the model should be treated as store-specific, suited only to inventory that matches what it was trained on.
|
| 204 |
|
| 205 |
### Poor Performing Classes
|
| 206 |
| Class | mAP50 | mAP50-95 | Likely Reason |
|
|
|
|
| 216 |
### Environmental and Contextual Limitations
|
| 217 |
- Performance degrades significantly when used on items not present in the training data, as seen with the D2S Dataset
|
| 218 |
- Overlapping or partially occluded items in a self-checkout camera may cause missed or incorrect detections
|
| 219 |
+
- The model was designed for an overhead (top-down) perspective, so other camera angles or views could degrade performance
|
| 220 |
|
| 221 |
### Inappropriate Use Cases
|
| 222 |
+
This specific model:
|
| 223 |
+
- Should **NOT** be deployed in stores with inventory significantly different from the training data without retraining. Stores with different inventory need a model trained on data that matches their stock
|
| 224 |
+
- Should **NOT** be used as a standalone loss prevention or security system
|
| 225 |
- Should **NOT** be used to detect fresh produce, unpackaged items, or non-grocery products
|
| 226 |
- Should **NOT** be used in high-stakes applications where misclassification has serious consequences
|
| 227 |
|
| 228 |
### Ethical Considerations
|
| 229 |
- Overhead camera systems at self-checkout may raise **customer privacy concerns** depending on how image/video data is stored and used
|
| 230 |
+
- Model should not be used to make automated decisions that negatively impact customers without human review, as misclassifications may affect customers purchasing unfamiliar or international products not well represented in the training data
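
The human-review recommendation above could be enforced with a simple confidence gate. The following is a minimal sketch, assuming detections arrive as label and confidence pairs. The `Detection` class, the `route_detections` function, and the 0.80 threshold are hypothetical names and values chosen for illustration, not part of this model's code. The sample values are the two detections reported in the failure case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one model output."""
    label: str
    confidence: float


def route_detections(detections, review_threshold=0.80):
    """Split detections into auto-accepted and needs-human-review lists.

    review_threshold is an assumed value; it would need tuning on
    validation data from the target store.
    """
    accepted, needs_review = [], []
    for det in detections:
        if det.confidence >= review_threshold:
            accepted.append(det)
        else:
            needs_review.append(det)
    return accepted, needs_review


# The two detections reported in the failure case.
failure_case = [
    Detection("instant_noodles", 0.63),
    Detection("dried_fruit", 0.45),
]
accepted, needs_review = route_detections(failure_case)
```

With this assumed threshold, both failure-case detections are sent to review. Confidence alone is not a complete safeguard: the water bottle misclassified as `instant_noodles` scored 0.63, so a lower threshold would have auto-accepted a wrong label, and missed items such as the avocado produce no detection to review at all.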
|
| 231 |
|
| 232 |
### Sample Size Limitations
|
| 233 |
- **Stationery** (1,466 images) is the smallest class and shows the weakest overall performance (albeit still strong). Additional training data would likely improve results
|