Update README.md
README.md
CHANGED
@@ -80,27 +80,27 @@ Measures are done with default STEDGEAI configuration with enabled input / outpu
 ### Reference **NPU** memory footprint based on ESC-10 dataset
 |Model | Dataset | Format | Resolution | Series | Internal RAM (KiB) | External RAM (KiB) | Weights Flash (KiB) | STM32Cube.AI version | STEdgeAI Core version |
 |----------|------------------|--------|-------------|------------------|------------------|---------------------|-------|----------------------|-------------------------|
-| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | esc-10 | Int8 | 64x96x1 | STM32N6 | 144 | 0.0 | 176.59 | 10.0.0 | 2.0.0 |
-| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | esc-10 | Int8 | 64x96x1 | STM32N6 | 144 | 0.0 | 3497.24 | 10.0.0 | 2.0.0 |
+| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | esc-10 | Int8 | 64x96x1 | STM32N6 | 144 | 0.0 | 176.59 | 10.0.0 | 2.0.0 |
+| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | esc-10 | Int8 | 64x96x1 | STM32N6 | 144 | 0.0 | 3497.24 | 10.0.0 | 2.0.0 |
 
 ### Reference **NPU** inference time based on ESC-10 dataset
 | Model | Dataset | Format | Resolution | Board | Execution Engine | Inference time (ms) | Inf / sec | STM32Cube.AI version | STEdgeAI Core version |
 |--------|------------------|--------|-------------|------------------|------------------|---------------------|-------|----------------------|-------------------------|
-| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | esc-10 | Int8 | 64x96x1 | STM32N6570-DK | NPU/MCU | 1.07 | 934.58 | 10.0.0 | 2.0.0 |
-| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | esc-10 | Int8 | 64x96x1 | STM32N6570-DK | NPU/MCU | 9.88 | 101.21 | 10.0.0 | 2.0.0 |
+| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | esc-10 | Int8 | 64x96x1 | STM32N6570-DK | NPU/MCU | 1.07 | 934.58 | 10.0.0 | 2.0.0 |
+| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | esc-10 | Int8 | 64x96x1 | STM32N6570-DK | NPU/MCU | 9.88 | 101.21 | 10.0.0 | 2.0.0 |
 
 
 ### Reference **MCU** memory footprint based on ESC-10 dataset
 | Model | Format | Resolution | Series | Activation RAM (kB) | Runtime RAM (kB) | Weights Flash (kB) | Code Flash (kB) | Total RAM (kB) | Total Flash (kB) | STM32Cube.AI version |
 |-------------------|--------|------------|---------|----------------|-------------|---------------|------------|-------------|-------------|-----------------------|
-|[Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | Int8 | 64x96x1 | B-U585I-IOT02A | 109.57 | 7.61 | 135.91 | 57.74 | 117.18 | 193.65 | 10.0.0 |
-|[Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | Int8 | 64x96x1 | STM32N6 | 108.59 | 35.41 | 3162.66 | 334.30 | 144.0 | 3496.96 | 10.0.0 |
+|[Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | Int8 | 64x96x1 | B-U585I-IOT02A | 109.57 | 7.61 | 135.91 | 57.74 | 117.18 | 193.65 | 10.0.0 |
+|[Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | Int8 | 64x96x1 | STM32N6 | 108.59 | 35.41 | 3162.66 | 334.30 | 144.0 | 3496.96 | 10.0.0 |
 
 ### Reference inference time based on ESC-10 dataset
 | Model | Format | Resolution | Board | Execution Engine | Frequency | Inference time | STM32Cube.AI version |
 |-------------------|--------|------------|------------------|------------------|--------------|-----------------|-----------------------|
-| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | Int8 | 64x96x1 | B-U585I-IOT02A | 1 CPU | 160 MHz | 281.95 ms | 10.0.0
-|[Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | Int8 | 64x96x1 | STM32N6 | 1 CPU + 1 NPU | 800MhZ/1000MhZ | 11.949 ms | 10.0.0
+| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | Int8 | 64x96x1 | B-U585I-IOT02A | 1 CPU | 160 MHz | 281.95 ms | 10.0.0
+|[Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | Int8 | 64x96x1 | STM32N6 | 1 CPU + 1 NPU | 800MhZ/1000MhZ | 11.949 ms | 10.0.0
 
 
 ### Accuracy with ESC-10 dataset
@@ -111,10 +111,10 @@ The reason this metric is used instead of patch-level accuracy is because patch-
 
 | Model | Format | Resolution | Clip-level Accuracy |
 |-------|--------|------------|----------------|
-| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl.h5) | float32 | 64x96x1 | 94.9% |
-| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | int8 | 64x96x1 | 94.9% |
-| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl.h5) | float32 | 64x96x1 | 100.0% |
-| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | int8 | 64x96x1 | 100.0% |
+| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl.h5) | float32 | 64x96x1 | 94.9% |
+| [Yamnet 256](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_256_64x96_tl/yamnet_256_64x96_tl_int8.tflite) | int8 | 64x96x1 | 94.9% |
+| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl.h5) | float32 | 64x96x1 | 100.0% |
+| [Yamnet 1024](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/esc10/yamnet_1024_64x96_tl/yamnet_1024_64x96_tl_qdq_int8.onnx) | int8 | 64x96x1 | 100.0% |
 
 
 
@@ -132,10 +132,10 @@ However, contrary to what the numbers might suggest online performance on device
 Note that accuracy with unknown class is lower. This is normal
 | Model | Format | Resolution | Clip-level Accuracy |
 |-------|--------|------------|----------------|
-| [Yamnet 256 without unknown class](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/without_unknown_class/yamnet_256_64x96_tl.h5) | float32 | 64x96x1 | 86.0% |
-| [Yamnet 256 without unknown class](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/without_unknown_class/yamnet_256_64x96_tl_int8.tflite) | float32 | 64x96x1 | 87.0% |
-| [Yamnet 256 with unknown class](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/with_unknown_class/yamnet_256_64x96_tl.h5) | float32 | 64x96x1 | 73.0% |
-| [Yamnet 256 with unknown class](https://github.com/STMicroelectronics/stm32ai-modelzoo/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/with_unknown_class/yamnet_256_64x96_tl_int8.tflite) | int8 | 64x96x1 | 73.9% |
+| [Yamnet 256 without unknown class](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/without_unknown_class/yamnet_256_64x96_tl.h5) | float32 | 64x96x1 | 86.0% |
+| [Yamnet 256 without unknown class](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/without_unknown_class/yamnet_256_64x96_tl_int8.tflite) | float32 | 64x96x1 | 87.0% |
+| [Yamnet 256 with unknown class](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/with_unknown_class/yamnet_256_64x96_tl.h5) | float32 | 64x96x1 | 73.0% |
+| [Yamnet 256 with unknown class](https://github.com/STMicroelectronics/stm32ai-modelzoo/audio_event_detection/yamnet/ST_pretrainedmodel_public_dataset/fsd50k/yamnet_256_64x96_tl/with_unknown_class/yamnet_256_64x96_tl_int8.tflite) | int8 | 64x96x1 | 73.9% |
 
 ## Retraining and Integration in a simple example:
 
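Review note: the derived columns in the ESC-10 tables appear to follow from the measured ones. `Inf / sec` matches 1000 divided by the inference time in milliseconds, and the MCU `Total RAM` and `Total Flash` values match the sum of their two components. These relations are inferred from the numbers and are not stated in the README, so the Python sketch below is only a consistency check under that assumption. The helper names `inferences_per_second` and `mcu_totals` are hypothetical.

```python
# Consistency check for the derived columns of the ESC-10 tables.
# Assumption (not stated in the README): Inf / sec = 1000 / inference time in ms,
# Total RAM = Activation RAM + Runtime RAM, Total Flash = Weights Flash + Code Flash.

# NPU inference times in ms, copied from the table rows.
NPU_INFERENCE_MS = {"Yamnet 256": 1.07, "Yamnet 1024": 9.88}

# MCU footprint components in kB, copied from the table rows.
MCU_FOOTPRINT_KB = {
    "Yamnet 256": {
        "activation_ram": 109.57,
        "runtime_ram": 7.61,
        "weights_flash": 135.91,
        "code_flash": 57.74,
    },
    "Yamnet 1024": {
        "activation_ram": 108.59,
        "runtime_ram": 35.41,
        "weights_flash": 3162.66,
        "code_flash": 334.30,
    },
}


def inferences_per_second(inference_ms: float) -> float:
    """Convert a per-inference latency in milliseconds to a throughput."""
    return 1000.0 / inference_ms


def mcu_totals(footprint: dict) -> tuple:
    """Return (total RAM, total Flash) in kB, rounded to two decimals."""
    total_ram = footprint["activation_ram"] + footprint["runtime_ram"]
    total_flash = footprint["weights_flash"] + footprint["code_flash"]
    return round(total_ram, 2), round(total_flash, 2)


if __name__ == "__main__":
    for name, ms in NPU_INFERENCE_MS.items():
        print(f"{name}: {inferences_per_second(ms):.2f} inf/s")
    for name, footprint in MCU_FOOTPRINT_KB.items():
        ram, flash = mcu_totals(footprint)
        print(f"{name}: total RAM {ram} kB, total Flash {flash} kB")
```

If the measured columns are re-benchmarked in a later commit, rerunning a check like this would show whether the derived columns were updated with them.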