---
datasets:
- google/speech_commands
pipeline_tag: image-classification
tags:
- arXiv:1611.02361
---
# DS-CNN
DS-CNN model from the MLCommons repository: https://github.com/mlcommons/tiny/tree/master/benchmark/training/keyword_spotting

The ONNX version was exported from the `.pb` model with the following commands:
```bash
# setup environment
python3.10 -m venv pytf
source pytf/bin/activate
# install latest officially compatible versions
pip install tensorflow==2.15.1 tf2onnx==1.16.1
# use most recent opset officially supported
python -m tf2onnx.convert \
    --saved-model <path/to/dir> \
    --output converted_ds_cnn.onnx --opset 18
```
The input of this exported version uses the NHWC format.
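
The exported model expects NHWC input, while the model specification at the end of this card lists the input as `[N, C, H, W]`. As a minimal sketch of what that layout difference means for a feature tensor, the NumPy snippet below converts between the two. The helper names `nhwc_to_nchw` and `nchw_to_nhwc` are hypothetical and only for illustration; they are not part of tf2onnx or Aidge.

```python
import numpy as np


def nhwc_to_nchw(x: np.ndarray) -> np.ndarray:
    """Reorder a 4-D batch from [N, H, W, C] to [N, C, H, W]."""
    if x.ndim != 4:
        raise ValueError(f"expected a 4-D array, got shape {x.shape}")
    return np.ascontiguousarray(np.transpose(x, (0, 3, 1, 2)))


def nchw_to_nhwc(x: np.ndarray) -> np.ndarray:
    """Reorder a 4-D batch from [N, C, H, W] to [N, H, W, C]."""
    if x.ndim != 4:
        raise ValueError(f"expected a 4-D array, got shape {x.shape}")
    return np.ascontiguousarray(np.transpose(x, (0, 2, 3, 1)))


# Illustrative data only: one sample with the NHWC shape used above.
features_nhwc = np.arange(49 * 10, dtype=np.float32).reshape(1, 49, 10, 1)
features_nchw = nhwc_to_nchw(features_nhwc)
print(features_nchw.shape)  # (1, 1, 49, 10)
```

Since this model has a single channel, the transpose does not reorder any values: it gives the same array as reshaping `[1, 49, 10, 1]` to `[1, 1, 49, 10]`.
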
The following Python code fuses each MatMul+Add pair into a Gemm operator, folds the first Reshape operator, and saves the result as `ds_cnn.onnx` with an NCHW input.
```python
import onnx
import onnxruntime
import aidge_core as ai
import aidge_onnx

# Load the model produced by tf2onnx (NHWC input).
model_onnx = onnx.load_model("converted_ds_cnn.onnx")

# Clean the ONNX graph for a fixed NHWC input shape.
model_onnx_clean_nhwc = aidge_onnx.onnx_cleaner.clean_onnx(
    model_onnx, {"input_1": [[1, 49, 10, 1]]}, "test_clean", opset_version=18
)

# Import the cleaned graph into Aidge.
model = aidge_onnx.convert_onnx_to_aidge(model_onnx_clean_nhwc)

# Remove these two nodes by replacing them with an empty set.
to_replace: set[ai.Node] = {
    model.get_node("StatefulPartitionedCall_functional_1_conv2d_BiasAdd__6"),
    model.get_node("new_shape__103_out0"),
}
model.replace(to_replace, set())
model.set_mandatory_inputs_first()

# Propagate dimensions from an NCHW input.
model.forward_dims(dims=[[1, 1, 49, 10]], allow_data_dependency=True)

# Export the resulting graph back to ONNX.
model_onnx_clean_nchw = aidge_onnx.convert_aidge_to_onnx(model, "ds_cnn", opset=18)
onnx.save_model(model_onnx_clean_nchw, "ds_cnn.onnx")
```
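
To illustrate why a MatMul followed by an Add can be fused into a single Gemm, the sketch below checks the numerical equivalence with NumPy. It follows the ONNX Gemm definition `Y = alpha * A' * B' + beta * C`. The function names are hypothetical, and the sizes (64 input features, 12 outputs) are illustrative assumptions rather than values read from the model.

```python
import numpy as np


def matmul_add(x: np.ndarray, w: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Unfused form: a MatMul followed by an Add."""
    return np.matmul(x, w) + b


def gemm(a, b, c, alpha=1.0, beta=1.0, trans_a=False, trans_b=False):
    """Reference for ONNX Gemm: Y = alpha * A' * B' + beta * C."""
    a = a.T if trans_a else a
    b = b.T if trans_b else b
    return alpha * np.matmul(a, b) + beta * c


# Illustrative sizes only (assumption): 64 features in, 12 classes out.
rng = np.random.default_rng(0)
x = rng.standard_normal((1, 64)).astype(np.float32)
w = rng.standard_normal((64, 12)).astype(np.float32)
bias = rng.standard_normal(12).astype(np.float32)

unfused = matmul_add(x, w, bias)
fused = gemm(x, w, bias)  # alpha = beta = 1, no transposition
print(np.allclose(unfused, fused, atol=1e-5))
```

With `alpha = beta = 1` and no transposition, Gemm computes the same result as MatMul+Add, so the fusion reduces the operator count without changing the output.
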
## Aidge support
> Note: We tested this network for the following features. If you encounter any error, please open an [issue](https://gitlab.eclipse.org/groups/eclipse/aidge/-/issues). Features not tested in CI may not be functional.

| Feature | Tested in CI |
| :----------: | :----------: |
| ONNX import | ✔️ |
| Runtime CPU | ✔️ |
| Runtime CUDA | ✔️ |
| Export CPU | ✔️ |
## Model
* operators: 43 (9 types)
  - AvgPooling2D: 1
  - Conv2D: 4
  - FC: 1
  - PaddedConv2D: 1
  - PaddedConvDepthWise2D: 4
  - Producer: 21
  - ReLU: 9
  - Reshape: 1
  - Softmax: 1
## Google Speech Commands v2
* Opset: 18
* Source: Google
* **Input**
  * size: [N, 1, 49, 10]
  * format: [N, C, H, W]
  * preprocessing:
    * ?
* **Output**
  * size: [N, 12]
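
The shapes above can be turned into a small input and output helper. This is a sketch under stated assumptions: the features are already computed as a `[49, 10]` map per sample (the preprocessing is not documented here), the model takes float32 input, and the helper names `to_model_input`, `top_class`, and `run_ds_cnn` are hypothetical. The exported file may only accept `N = 1`, since its dimensions were propagated with a batch size of 1.

```python
import numpy as np

FEATURE_SHAPE = (49, 10)  # per-sample [H, W], from the input size above
NUM_CLASSES = 12          # from the output size above


def to_model_input(features) -> np.ndarray:
    """Turn one [49, 10] map or a [N, 49, 10] batch into float32 [N, 1, 49, 10]."""
    x = np.asarray(features, dtype=np.float32)
    if x.ndim == 2:
        x = x[np.newaxis]  # add the batch axis
    if x.ndim != 3 or x.shape[1:] != FEATURE_SHAPE:
        raise ValueError(f"expected [49, 10] or [N, 49, 10], got {x.shape}")
    return x[:, np.newaxis, :, :]  # add the channel axis (C = 1)


def top_class(scores) -> np.ndarray:
    """Return the index of the highest score for each sample of a [N, 12] output."""
    scores = np.asarray(scores)
    if scores.ndim != 2 or scores.shape[1] != NUM_CLASSES:
        raise ValueError(f"expected shape [N, {NUM_CLASSES}], got {scores.shape}")
    return scores.argmax(axis=1)


def run_ds_cnn(model_path: str, features) -> np.ndarray:
    """Hypothetical helper: run the ONNX model on CPU with ONNX Runtime."""
    import onnxruntime as ort  # imported lazily so the helpers above work without it

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name
    return session.run(None, {input_name: to_model_input(features)})[0]
```

Assuming `ds_cnn.onnx` is available locally, `top_class(run_ds_cnn("ds_cnn.onnx", features))` would return one class index per sample. Mapping indices to keyword labels is left out because the label order is not stated in this card.
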