---
license: apache-2.0
language:
- en
- zh
metrics:
- cer
pipeline_tag: automatic-speech-recognition
---
# Efficient Conformer v2 for non-streaming ASR

**Specification**: https://github.com/wenet-e2e/wenet/pull/1636

## Results

* Feature info:
    * using fbank feature, cmvn, speed perturb, dither
* Training info:
    * train_u2++_efficonformer_v2.yaml
    * 8 gpu, batch size 16, acc_grad 1, 120 epochs
    * lr 0.001, warmup_steps 35000
* Model info:
    * Model Params: 50,341,278
    * Downsample rate: 1/2 (conv2d2) * 1/4 (efficonformer block)
    * encoder_dim 256, output_size 256, head 8, linear_units 2048
    * num_blocks 12, cnn_module_kernel 15, group_size 3
* Decoding info:
    * ctc_weight 0.5, reverse_weight 0.3, average_num 20
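
The two downsample rates under Model info multiply to an overall rate of 1/8. The sketch below works through that arithmetic and what it would mean for the encoder frame duration and for the number of utterances per optimizer step. Two things are assumptions and not stated in this card: a 10 ms fbank frame shift, and that "batch size 16" is per GPU.

```sh
# Values copied from the Training info and Model info lists.
conv_subsample=2     # 1/2 from conv2d2
block_subsample=4    # 1/4 from the efficonformer blocks
num_gpus=8
batch_size=16
acc_grad=1

# Assumption (not stated in this card): fbank frames use a 10 ms shift.
frame_shift_ms=10

# Overall subsampling factor and the resulting encoder frame duration.
total_subsample=$((conv_subsample * block_subsample))
encoder_frame_ms=$((total_subsample * frame_shift_ms))

# Assumption: batch size is per GPU, so one optimizer step sees
# num_gpus * batch_size * acc_grad utterances.
utts_per_step=$((num_gpus * batch_size * acc_grad))

echo "total subsampling: 1/${total_subsample}"
echo "encoder frame duration: ${encoder_frame_ms} ms"
echo "utterances per optimizer step: ${utts_per_step}"
```

Under these assumptions the sketch prints 1/8, 80 ms, and 128.
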

test clean: WER (%). Columns are the decoding chunk size; `full` is full-context decoding.

| decoding mode          | full | 18   | 16   |
|------------------------|------|------|------|
| attention decoder      | 3.49 | 3.71 | 3.72 |
| ctc_greedy_search      | 3.49 | 3.74 | 3.77 |
| ctc prefix beam search | 3.47 | 3.72 | 3.74 |
| attention rescoring    | 3.12 | 3.38 | 3.36 |


test other: WER (%). Columns are the decoding chunk size; `full` is full-context decoding.

| decoding mode          | full | 18   | 16   |
|------------------------|------|------|------|
| attention decoder      | 8.15 | 9.05 | 9.03 |
| ctc_greedy_search      | 8.73 | 9.82 | 9.83 |
| ctc prefix beam search | 8.70 | 9.81 | 9.79 |
| attention rescoring    | 8.05 | 9.08 | 9.10 |
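
The attention rescoring rows depend on the `ctc_weight` and `reverse_weight` values given under Decoding info. As a rough sketch of how such weights are typically combined in WeNet-style rescoring (an assumption about the scoring rule, not something this card states), each n-best hypothesis from CTC prefix beam search receives a score that mixes the left-to-right decoder, the right-to-left decoder, and the CTC score. The three input scores below are made-up illustrative numbers, not model outputs.

```sh
# Weights from the Decoding info list.
ctc_weight=0.5
reverse_weight=0.3

# Hypothetical log-scores for a single n-best hypothesis (illustrative only).
ctc_score=-4.0   # CTC prefix beam search score
l2r_score=-3.0   # left-to-right attention decoder score
r2l_score=-5.0   # right-to-left attention decoder score

# Assumed rule: blend the two decoder directions by reverse_weight,
# then add the CTC score scaled by ctc_weight.
combined=$(awk -v cw="$ctc_weight" -v rw="$reverse_weight" \
    -v ctc="$ctc_score" -v l2r="$l2r_score" -v r2l="$r2l_score" \
    'BEGIN { printf "%.2f", l2r * (1 - rw) + r2l * rw + ctc * cw }')

echo "combined score: ${combined}"
```

With these inputs the sketch prints `combined score: -5.60`; the hypothesis with the highest combined score would be selected.
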

## Start to Use

Install **WeNet** by following the guide: https://wenet.org.cn/wenet/install.html#install-for-training

Decode:

```sh
cd examples/librispeech/s0

# Copy the helper scripts from the model directory into the recipe directory.
cp exp/wenet_efficient_conformer_librispeech_v2/decode.sh ./
cp exp/wenet_efficient_conformer_librispeech_v2/wer.sh ./

dir=exp/wenet_efficient_conformer_librispeech_v2
decoding_chunk_size=-1  # -1 selects full-context (non-streaming) decoding
# The value 20 matches average_num under Decoding info.
. ./decode.sh ${dir} 20 ${decoding_chunk_size}

# WER
. ./wer.sh test_clean wenet_efficient_conformer_librispeech_v2 ${decoding_chunk_size}
. ./wer.sh test_other wenet_efficient_conformer_librispeech_v2 ${decoding_chunk_size}
```
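
To cover the other table columns, the same two scripts could be looped over several chunk sizes. The sketch below is a dry run that only prints the commands. It assumes that 18 and 16 in the result tables are `decoding_chunk_size` values and that `decode.sh` and `wer.sh` take the same arguments as in the block above. `print_decode_plan` is a hypothetical helper name, not part of WeNet.

```sh
dir=exp/wenet_efficient_conformer_librispeech_v2
model=wenet_efficient_conformer_librispeech_v2

# Hypothetical helper: print (do not run) the decode and scoring commands
# for each chunk size. Assumes -1, 18, 16 map to the full/18/16 columns.
print_decode_plan() {
    for decoding_chunk_size in -1 18 16; do
        echo ". ./decode.sh ${dir} 20 ${decoding_chunk_size}"
        for test_set in test_clean test_other; do
            echo ". ./wer.sh ${test_set} ${model} ${decoding_chunk_size}"
        done
    done
}

print_decode_plan
```

After reviewing the printed commands, they can be run one by one from `examples/librispeech/s0`.
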