@@ -23,4 +23,45 @@ SceneSeg performs robustly across challenging weather and lighting conditions, i
 SceneSeg performs out of the box on roads across the world without any parameter tuning. SceneSeg provides self-driving cars with a core
 safety layer, helping to address 'long-tail' edge cases which plague object-level detectors.
-<img src="https://github.com/autowarefoundation/autoware_vision_pilot/blob/main/Media/SceneSeg_GIF_Rain.gif">

+During training, SceneSeg estimates three semantic classes:
+- `Foreground Objects`
+- `Background Elements`
+- `Drivable Road Surface`
+However, during inference, we only use the outputs from the **`Foreground Objects`** class.
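To illustrate the point above, here is a minimal, dependency-free sketch of how a binary mask for the `Foreground Objects` class could be taken from a three-class output. The channel order in `CLASS_NAMES`, the helper names, and the toy scores are all assumptions made for illustration; they are not taken from the SceneSeg code.

```python
# Hypothetical channel order; the real ordering of the network output is an assumption.
CLASS_NAMES = ("Background Elements", "Foreground Objects", "Drivable Road Surface")
FOREGROUND = CLASS_NAMES.index("Foreground Objects")


def argmax_class(scores):
    """Return the index of the highest score for a single pixel."""
    return max(range(len(scores)), key=lambda i: scores[i])


def foreground_mask(score_map):
    """Turn an H x W grid of per-class scores into a binary foreground mask."""
    return [
        [1 if argmax_class(pixel) == FOREGROUND else 0 for pixel in row]
        for row in score_map
    ]


# Toy 2 x 2 "image": each pixel holds one score per class.
scores = [
    [(0.7, 0.2, 0.1), (0.1, 0.8, 0.1)],
    [(0.2, 0.3, 0.5), (0.1, 0.6, 0.3)],
]
mask = foreground_mask(scores)
print(mask)  # [[0, 1], [0, 1]]
```

The other two classes are still predicted; this sketch simply discards them after the per-pixel argmax.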
+## Watch the explainer video
+Please click the video link to play - [***Video link***](https://drive.google.com/file/d/1riGlT3Ct-O1Y2C0DqxemwWS233dJrY7F/view?usp=sharing)
+## Performance Results
+SceneSeg was trained on a diverse dataset composed of multiple open-source datasets, including ACDC, MUSES, IDDAW, Mapillary Vistas and the Comma10K dataset. These datasets provide challenging training data covering a wide range of countries, road types, lighting conditions and weather conditions. The BDD100K dataset was not used during training and served as a broad and diverse test set.
+Mean Intersection Over Union (mIoU) scores are provided for both validation and test data. Validation results are given for each dataset in the validation set and for the validation set as a whole, which is reported in the Cross Dataset column. Per-class mIoU scores are reported together with their average across classes (Class Average) and an Overall mIoU score, which is calculated between the full multi-class prediction and the multi-class ground truth.
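The metrics described above can be sketched as follows. This is a rough illustration, not the project's evaluation code: labels are flattened into plain lists, the class indices are hypothetical, and the "overall" score is interpreted here as intersections and unions pooled over all classes, which is only one possible reading of the description.

```python
def intersections_and_unions(pred, truth, num_classes):
    """Count per-class intersection and union pixels for two flat label lists."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            # A mismatched pixel belongs to the union of both classes involved.
            union[p] += 1
            union[t] += 1
    return inter, union


# Toy flattened label maps with three hypothetical class indices (0, 1, 2).
pred = [0, 0, 1, 1, 2, 2, 2, 0]
truth = [0, 0, 1, 2, 2, 2, 0, 0]

inter, union = intersections_and_unions(pred, truth, num_classes=3)

# Per-class IoU; classes absent from both maps are skipped.
ious = [i / u for i, u in zip(inter, union) if u > 0]
class_average = sum(ious) / len(ious)

# Assumed reading of "overall": pool intersections and unions across classes.
overall = sum(inter) / sum(union)

print(ious, round(class_average, 3), overall)
```

For a dataset-level mIoU, scores such as these would then be averaged over all images; the per-class figures in the tables are percentages, and their Class Average row is the plain mean of the three class rows.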
+### Validation Set Performance - mIoU Scores
+|| Cross Dataset | Mapillary| MUSES | ACDC | IDDAW | Comma10K |
+|--------|---------------|------------------|-------|------|-------|----------|
+| Overall | **90.7** | 91.1 | 83.7 | 89.3 | 87.2 | **92.5** |
+| Background Objects | **93.5** | 93.7 | 89.1 | 93.2 | 90.0 | **95.1** |
+| Foreground Objects | **58.2** | **60.9** | 35.7 | 46.9 | 58.6 | 58.9 |
+| Drivable Road Surface | **84.2** | 85.7 | 70.8 | 74.4 | 81.8 | **86.3** |
+| Class Average | **78.6** | **80.1** | 65.2 | 71.5 | 76.8 | **80.1** |
+### Test Set Performance - mIoU Scores
+|| BDD100K |
+|-|---------|
+| Overall | **91.5** |
+| Background Objects | **94.3** |
+| Foreground Objects | **69.8** |
+| Drivable Road Surface | **71.3** |
+| Class Average | **78.5** |
+### Inference Speed
+Inference speed tests were performed on a laptop equipped with an RTX3060 Mobile Gaming GPU and an AMD Ryzen 7 5800H CPU. The SceneSeg network comprises a total of 223.43 billion floating-point operations.
+#### FP32 Precision
+At FP32 precision, SceneSeg achieved an inference speed of 18.1 frames per second.
+#### FP16 Precision
+At FP16 precision, SceneSeg achieved an inference speed of 26.7 frames per second.
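As a quick sanity check on these figures, the short calculation below converts the reported frame rates into per-frame latency and sustained throughput. It assumes that the 223.43 billion operations refer to a single forward pass on one frame, which the text does not state explicitly; the variable names are illustrative.

```python
# Figures quoted in this section; "per frame" is an assumption about the FLOP count.
FLOPS_PER_FRAME = 223.43e9
FPS = {"FP32": 18.1, "FP16": 26.7}

latency_ms = {}
sustained_tflops = {}
for precision, fps in FPS.items():
    # Time budget for one frame, in milliseconds.
    latency_ms[precision] = 1000.0 / fps
    # Operations per second implied by the frame rate, in TFLOP/s.
    sustained_tflops[precision] = FLOPS_PER_FRAME * fps / 1e12
    print(
        f"{precision}: {latency_ms[precision]:.1f} ms per frame, "
        f"{sustained_tflops[precision]:.2f} TFLOP/s sustained"
    )

speedup = FPS["FP16"] / FPS["FP32"]
print(f"FP16 speedup over FP32: {speedup:.2f}x")
```

Under that assumption, FP32 works out to roughly 55.2 ms per frame and FP16 to roughly 37.5 ms per frame, a speedup of about 1.48x.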