Audio Classification
Committed by FBAGSTM · Commit 56b88ea (verified) · 1 parent: 859e1c6

Release AI-ModelZoo-4.0.0

  1. README.md +15 -14
README.md CHANGED
@@ -35,8 +35,8 @@ Papers : [ResNet](https://arxiv.org/abs/1512.03385)
  | Network Information | Value |
  |-------------------------|-----------------|
  | Framework | TensorFlow Lite |
- | Params 1stack | 135K |
- | Params 2stacks | 450K |
  | Quantization | int8 |
  | Provenance | https://keras.io/api/applications/resnet/ |
 
@@ -59,25 +59,25 @@ It outputs embedding vectors of size 2048 for the 2 stacks version, and 3548 for
  ## Metrics
 
 
- Measures are done with default STM32Cube.AI configuration with enabled input / output allocated option.
 
 
  ### Reference MCU memory footprint based on ESC-10 dataset
 
 
- | Model | Format | Resolution | Series | Activation RAM (KiB) | Runtime RAM (KiB)| Weights Flash (KiB) | Code Flash (KiB) | Total RAM (KiB) | Total Flash (KiB)| STM32Cube.AI version |
  |-------------------|--------|------------|---------|----------------|-------------|---------------|------------|-------------|-------------|-----------------------|
- | [MiniResNet 1stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 59.89 | 5.38 | 123.6 | 55.89 | 65.27 | 179.49 | 10.2.0 |
- | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 59.89 | 8.37 | 431.1 | 62.68 | 68.26 | 493.78 | 10.2.0 |
 
 
  ### Reference inference time based on ESC-10 dataset
 
 
- | Model | Format | Resolution | Board | Execution Engine | Frequency | Inference time (ms) | STM32Cube.AI version |
  |-------------------|--------|------------|------------------|------------------|-------------|-----------------|-----------------------|
- | [MiniResNet 1stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 1 CPU | 160 MHz | 91.47 | 10.2.0 |
- | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 1 CPU | 160 MHz | 141.86 | 10.2.0 |
 
 
  ### Accuracy with ESC-10 dataset
@@ -88,11 +88,12 @@ The reason this metric is used instead of patch-level accuracy is because patch-
 
  | Model | Format | Resolution | Clip-level Accuracy |
  |-------|--------|------------|----------------|
- | [MiniResNet 1stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl.h5) | float32 | 64x50x1 | 89.9% |
- | [MiniResNet 1stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_1stacks_64x50_tl/miniresnet_1stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | 88.9% |
- | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl.h5) | float32 | 64x50x1 | 92.4% |
- | [MiniResNet 2stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnet/ST_pretrainedmodel_public_dataset/esc10/miniresnet_2stacks_64x50_tl/miniresnet_2stacks_64x50_tl_int8.tflite) | int8 | 64x50x1 | 93.6% |
 
  ## Retraining and Integration in a simple example:
 
- Please refer to the stm32ai-modelzoo-services GitHub [here](https://github.com/STMicroelectronics/stm32ai-modelzoo-services)
 
 
  | Network Information | Value |
  |-------------------------|-----------------|
  | Framework | TensorFlow Lite |
+ | Params 1 stack | 135K |
+ | Params 2 stacks | 450K |
  | Quantization | int8 |
  | Provenance | https://keras.io/api/applications/resnet/ |
 
 
  ## Metrics
 
 
+ Measures are done with default STEdgeAI Core configuration with enabled input / output allocated option.
 
 
  ### Reference MCU memory footprint based on ESC-10 dataset
 
 
+ | Model | Format | Resolution | Series | Activation RAM (KiB) | Runtime RAM (KiB)| Weights Flash (KiB) | Code Flash (KiB) | Total RAM (KiB) | Total Flash (KiB)| STEdgeAI Core version |
  |-------------------|--------|------------|---------|----------------|-------------|---------------|------------|-------------|-------------|-----------------------|
+ | [MiniResNet 1 stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnetv1/esc10/miniresnetv1_s1_64x50_tl/miniresnetv1_s1_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 59.89 | 1.08 | 123.6 | 32.36 | 60.97 | 155.96 | 3.0.0 |
+ | [MiniResNet 2 stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnetv1/esc10/miniresnetv1_s2_64x50_tl/miniresnetv1_s2_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 59.89 | 1.69 | 431.1 | 36.81 | 61.58 | 467.91 | 3.0.0 |
 
 
  ### Reference inference time based on ESC-10 dataset
 
 
+ | Model | Format | Resolution | Board | Execution Engine | Frequency | Inference time (ms) | STEdgeAI Core version |
  |-------------------|--------|------------|------------------|------------------|-------------|-----------------|-----------------------|
+ | [MiniResNet 1 stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnetv1/esc10/miniresnetv1_s1_64x50_tl/miniresnetv1_s1_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 1 CPU | 160 MHz | 91.45 | 3.0.0 |
+ | [MiniResNet 2 stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnetv1/esc10/miniresnetv1_s2_64x50_tl/miniresnetv1_s2_64x50_tl_int8.tflite) | int8 | 64x50x1 | B-U585I-IOT02A | 1 CPU | 160 MHz | 141.82 | 3.0.0 |
 
 
  ### Accuracy with ESC-10 dataset
 
 
  | Model | Format | Resolution | Clip-level Accuracy |
  |-------|--------|------------|----------------|
+ | [MiniResNet 1 stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnetv1/esc10/miniresnetv1_s1_64x50_tl/miniresnetv1_s1_64x50_tl.keras) | float32 | 64x50x1 | 90.0% |
+ | [MiniResNet 1 stack ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnetv1/esc10/miniresnetv1_s1_64x50_tl/miniresnetv1_s1_64x50_tl_int8.tflite) | int8 | 64x50x1 | 90.0% |
+ | [MiniResNet 2 stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnetv1/esc10/miniresnetv1_s2_64x50_tl/miniresnetv1_s2_64x50_tl.keras) | float32 | 64x50x1 | 92.5% |
+ | [MiniResNet 2 stacks ](https://github.com/STMicroelectronics/stm32ai-modelzoo/tree/main/audio_event_detection/miniresnetv1/esc10/miniresnetv1_s2_64x50_tl/miniresnetv1_s2_64x50_tl_int8.tflite) | int8 | 64x50x1 | 92.5% |
 
  ## Retraining and Integration in a simple example:
 
+ Please refer to the stm32ai-modelzoo-services GitHub [here](https://github.com/STMicroelectronics/stm32ai-modelzoo-services)
+
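
Reviewer note: in the memory-footprint rows of this diff, the reported totals appear to be simple sums (Total RAM = Activation RAM + Runtime RAM, Total Flash = Weights Flash + Code Flash). That relationship is inferred from the numbers, not stated in the README. The Python sketch below checks it for the B-U585I-IOT02A rows and computes how much the totals dropped between the STM32Cube.AI 10.2.0 rows and the STEdgeAI Core 3.0.0 rows. The dictionary layout and the function names `totals_consistent` and `savings` are hypothetical helpers for illustration and are not part of the model zoo.

```python
# Footprint figures (KiB) copied from the B-U585I-IOT02A rows of the diff.
# The keys and structure are illustrative only, not a model zoo API.
FOOTPRINTS = {
    ("1 stack", "10.2.0"): dict(act=59.89, rt=5.38, weights=123.6, code=55.89, ram=65.27, flash=179.49),
    ("1 stack", "3.0.0"): dict(act=59.89, rt=1.08, weights=123.6, code=32.36, ram=60.97, flash=155.96),
    ("2 stacks", "10.2.0"): dict(act=59.89, rt=8.37, weights=431.1, code=62.68, ram=68.26, flash=493.78),
    ("2 stacks", "3.0.0"): dict(act=59.89, rt=1.69, weights=431.1, code=36.81, ram=61.58, flash=467.91),
}


def totals_consistent(row, tol=0.005):
    """Return True if each reported total matches the sum of its components.

    Assumes Total RAM = Activation + Runtime and Total Flash = Weights + Code,
    which is inferred from the table values rather than documented.
    """
    ram_ok = abs(row["act"] + row["rt"] - row["ram"]) <= tol
    flash_ok = abs(row["weights"] + row["code"] - row["flash"]) <= tol
    return ram_ok and flash_ok


def savings(model):
    """Return (RAM, Flash) reduction in KiB from the 10.2.0 row to the 3.0.0 row."""
    old = FOOTPRINTS[(model, "10.2.0")]
    new = FOOTPRINTS[(model, "3.0.0")]
    return round(old["ram"] - new["ram"], 2), round(old["flash"] - new["flash"], 2)


if __name__ == "__main__":
    for key, row in FOOTPRINTS.items():
        print(key, "consistent" if totals_consistent(row) else "MISMATCH")
    for model in ("1 stack", "2 stacks"):
        ram, flash = savings(model)
        print(f"{model}: Total RAM -{ram} KiB, Total Flash -{flash} KiB")
```

In both models the Activation RAM and Weights Flash columns are unchanged across the two sets of rows, so the reductions come entirely from the Runtime RAM and Code Flash columns.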