---
datasets:
- google/speech_commands
pipeline_tag: image-classification
tags:
- arXiv:1611.02361
---

# DS-CNN

DS-CNN model from the MLCommons repository: https://github.com/mlcommons/tiny/tree/master/benchmark/training/keyword_spotting

The ONNX version was exported from the `.pb` model as follows:

```bash
# set up the environment
python3.10 -m venv pytf
source pytf/bin/activate

# install the latest officially compatible versions
pip install tensorflow==2.15.1 tf2onnx==1.16.1

# use the most recent officially supported opset
python -m tf2onnx.convert \
    --saved-model <path/to/dir> \
    --output converted_ds_cnn.onnx --opset 18
```

This version expects its input in NHWC format.
The following Python code fuses MatMul+Add into Gemm, folds the first Reshape operator, and saves the result as `ds_cnn.onnx` with an NCHW input:

```python
import onnx
import onnxruntime

import aidge_core as ai
import aidge_onnx

# Load the tf2onnx export and clean it with a fixed NHWC input shape.
model_onnx = onnx.load_model("converted_ds_cnn.onnx")
model_onnx_clean_nhwc = aidge_onnx.onnx_cleaner.clean_onnx(
    model_onnx, {"input_1": [[1, 49, 10, 1]]}, "test_clean", opset_version=18
)
model = aidge_onnx.convert_onnx_to_aidge(model_onnx_clean_nhwc)

# Nodes to drop from the graph.
to_replace: set[ai.Node] = set(
    [
        model.get_node("StatefulPartitionedCall_functional_1_conv2d_BiasAdd__6"),
        model.get_node("new_shape__103_out0"),
    ]
)

# Replacing the nodes with an empty set removes them.
model.replace(to_replace, set())
model.set_mandatory_inputs_first()

# Propagate dimensions from an NCHW input, then export back to ONNX.
model.forward_dims(dims=[[1, 1, 49, 10]], allow_data_dependency=True)
model_onnx_clean_nchw = aidge_onnx.convert_aidge_to_onnx(model, "ds_cnn", opset=18)
onnx.save_model(model_onnx_clean_nchw, "ds_cnn.onnx")
```
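
Because the input has a single channel, the NHWC shape [1, 49, 10, 1] and the NCHW shape [1, 1, 49, 10] hold the same values in the same memory order. The pure-Python sketch below illustrates this with explicit index arithmetic. `nhwc_to_nchw` is a hypothetical helper written for this illustration only; it is not part of Aidge or ONNX.

```python
def nhwc_to_nchw(flat, n, h, w, c):
    """Reorder a flat, row-major NHWC buffer into a flat NCHW buffer."""
    out = [None] * (n * c * h * w)
    for ni in range(n):
        for hi in range(h):
            for wi in range(w):
                for ci in range(c):
                    src = ((ni * h + hi) * w + wi) * c + ci  # NHWC offset
                    dst = ((ni * c + ci) * h + hi) * w + wi  # NCHW offset
                    out[dst] = flat[src]
    return out


# Placeholder values standing in for one 49 x 10 single-channel input.
features = list(range(49 * 10))
converted = nhwc_to_nchw(features, n=1, h=49, w=10, c=1)

# With C == 1 the two layouts coincide element by element.
same_order = converted == features
```

With more than one channel the order would change, so this shortcut applies only to single-channel inputs such as this one.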

## Aidge support

> Note: We tested this network for the following features. If you encounter any error, please open an [issue](https://gitlab.eclipse.org/groups/eclipse/aidge/-/issues). Features not tested in CI may not be functional.

| Feature | Tested in CI |
| :----------: | :----------: |
| ONNX import | ✔️ |
| Runtime CPU | ✔️ |
| Runtime CUDA | ✔️ |
| Export CPU | ✔️ |

## Model

* operators: 43 (9 types)
  - AvgPooling2D: 1
  - Conv2D: 4
  - FC: 1
  - PaddedConv2D: 1
  - PaddedConvDepthWise2D: 4
  - Producer: 21
  - ReLU: 9
  - Reshape: 1
  - Softmax: 1

## Google Speech Commands v2

* Opset: 18
* Source: Google
* **Input**
  * size: [N, 1, 49, 10]
  * format: [N, C, H, W]
  * preprocessing:
    * ?
* **Output**
  * size: [N, 12]
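
As a rough illustration of how the [N, 12] output could be consumed, the sketch below picks the highest-scoring class for each row. `top_classes` and the score values are hypothetical. The mapping from class index to keyword is not given here, so only indices are returned.

```python
def top_classes(batch_scores, num_classes=12):
    """Return (class_index, score) for each row of an [N, num_classes] output."""
    results = []
    for row in batch_scores:
        if len(row) != num_classes:
            raise ValueError(f"expected {num_classes} scores, got {len(row)}")
        best = max(range(num_classes), key=lambda i: row[i])
        results.append((best, row[best]))
    return results


# Made-up scores for a batch of two inputs, for illustration only.
scores = [
    [0.01] * 11 + [0.89],
    [0.90] + [0.10 / 11] * 11,
]
predictions = top_classes(scores)
```

Each entry of `predictions` pairs a class index with its score, which can then be looked up in whatever label list was used at training time.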