How to use pyf98/speechcommands_12commands_conformer with ESPnet:
    import soundfile
    from espnet2.bin.asr_inference import Speech2Text

    model = Speech2Text.from_pretrained(
        "pyf98/speechcommands_12commands_conformer"
    )

    speech, rate = soundfile.read("speech.wav")
    text, *_ = model(speech)[0]
Missing key(s) in state_dict: how to resolve
I get missing state_dict keys when loading the pretrained model. Is there a solution?
RuntimeError: Error(s) in loading state_dict for ESPnetASRModel:
Missing key(s) in state_dict: "encoder.encoders.0.conv_module.norm.weight", "encoder.encoders.0.conv_module.norm.bias", "encoder.encoders.0.conv_module.norm.running_mean", "encoder.encoders.0.conv_module.norm.running_var", "encoder.encoders.1.conv_module.norm.weight", "encoder.encoders.1.conv_module.norm.bias", "encoder.encoders.1.conv_module.norm.running_mean", "encoder.encoders.1.conv_module.norm.running_var", "encoder.encoders.2.conv_module.norm.weight", "encoder.encoders.2.conv_module.norm.bias", "encoder.encoders.2.conv_module.norm.running_mean", "encoder.encoders.2.conv_module.norm.running_var", "encoder.encoders.3.conv_module.norm.weight", "encoder.encoders.3.conv_module.norm.bias", "encoder.encoders.3.conv_module.norm.running_mean", "encoder.encoders.3.conv_module.norm.running_var", "encoder.encoders.4.conv_module.norm.weight", "encoder.encoders.4.conv_module.norm.bias", "encoder.encoders.4.conv_module.norm.running_mean", "encoder.encoders.4.conv_module.norm.running_var", "encoder.encoders.5.conv_module.norm.weight", "encoder.encoders.5.conv_module.norm.bias", "encoder.encoders.5.conv_module.norm.running_mean", "encoder.encoders.5.conv_module.norm.running_var", "encoder.encoders.6.conv_module.norm.weight", "encoder.encoders.6.conv_module.norm.bias", "encoder.encoders.6.conv_module.norm.running_mean", "encoder.encoders.6.conv_module.norm.running_var", "encoder.encoders.7.conv_module.norm.weight", "encoder.encoders.7.conv_module.norm.bias", "encoder.encoders.7.conv_module.norm.running_mean", "encoder.encoders.7.conv_module.norm.running_var", "encoder.encoders.8.conv_module.norm.weight", "encoder.encoders.8.conv_module.norm.bias", "encoder.encoders.8.conv_module.norm.running_mean", "encoder.encoders.8.conv_module.norm.running_var", "encoder.encoders.9.conv_module.norm.weight", "encoder.encoders.9.conv_module.norm.bias", "encoder.encoders.9.conv_module.norm.running_mean", "encoder.encoders.9.conv_module.norm.running_var", 
"encoder.encoders.10.conv_module.norm.weight", "encoder.encoders.10.conv_module.norm.bias", "encoder.encoders.10.conv_module.norm.running_mean", "encoder.encoders.10.conv_module.norm.running_var", "encoder.encoders.11.conv_module.norm.weight", "encoder.encoders.11.conv_module.norm.bias", "encoder.encoders.11.conv_module.norm.running_mean", "encoder.encoders.11.conv_module.norm.running_var".
Hi, maybe you need this: https://github.com/espnet/espnet/tree/master/egs2/speechcommands/asr1#notes
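Before changing anything, it can help to confirm exactly which parameter names differ. All 48 missing keys in the error follow one pattern: the `weight`, `bias`, `running_mean`, and `running_var` of `conv_module.norm` in each of the 12 encoder layers. `running_mean` and `running_var` are buffers that PyTorch batch-norm layers register, so the model being built appears to expect batch-norm statistics that the checkpoint does not contain. The sketch below is only a diagnostic aid, not part of ESPnet: the helper names are made up for illustration, and the key lists are placeholders. In practice the model keys would come from `model.state_dict().keys()` and the checkpoint keys from the dict returned by `torch.load(path, map_location="cpu")`.

```python
import re
from collections import defaultdict


def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, using PyTorch's terminology.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not expect
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)
    unexpected = sorted(checkpoint_keys - model_keys)
    return missing, unexpected


def group_by_layer_pattern(keys):
    """Group keys after replacing per-layer indices such as '.7.' with '.N.'."""
    groups = defaultdict(list)
    for key in keys:
        pattern = re.sub(r"\.\d+\.", ".N.", key)
        groups[pattern].append(key)
    return dict(groups)


# Placeholder key lists that mimic the error message above.
suffixes = ["weight", "bias", "running_mean", "running_var"]
model_keys = [
    f"encoder.encoders.{i}.conv_module.norm.{s}"
    for i in range(12)
    for s in suffixes
]
model_keys.append("encoder.after_norm.weight")  # illustrative shared key
checkpoint_keys = ["encoder.after_norm.weight"]  # illustrative, not the real file

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
summary = {p: len(k) for p, k in group_by_layer_pattern(missing).items()}

for pattern, count in sorted(summary.items()):
    print(f"{count:3d} x {pattern}")
```

If the real checkpoint shows unexpected keys under the same `conv_module` prefix, that would point to a different normalization layer having been used in training, which is the kind of mismatch the recipe notes linked above should be checked against.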