---
license: apache-2.0
language:
- en
- zh
metrics:
- cer
pipeline_tag: automatic-speech-recognition
---

## Efficient Conformer v2 for non-streaming ASR

**Specification**: https://github.com/wenet-e2e/wenet/pull/1636

## Aishell-1 Results

* Feature info:
    * using fbank feature, cmvn, speed perturb, dither
* Training info:
    * [train_u2++_efficonformer_v2.yaml](https://github.com/wenet-e2e/wenet/blob/main/examples/aishell/s0/conf/train_u2%2B%2B_efficonformer_v2.yaml)
    * 8 gpu, batch size 16, acc_grad 1, 200 epochs
    * lr 0.001, warmup_steps 25000
* Model info:
    * Model Params: 49,354,651
    * Downsample rate: 1/2 (conv2d2) * 1/4 (efficonformer block)
    * encoder_dim 256, output_size 256, head 8, linear_units 2048
    * num_blocks 12, cnn_module_kernel 15, group_size 3
* Decoding info:
    * ctc_weight 0.5, reverse_weight 0.3, average_num 20
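
The figures above imply a few derived quantities that are easy to check by hand. The sketch below works them out under two assumptions that this card does not state: that the fbank frame shift is 10 ms, and that `batch size 16` is per GPU. All variable names are illustrative only.

```python
from fractions import Fraction

# Values copied from the training and model info above.
num_gpus, batch_size, acc_grad = 8, 16, 1
conv2d2_rate = Fraction(1, 2)   # conv2d2 front-end subsampling
block_rate = Fraction(1, 4)     # further reduction inside the efficonformer blocks

# Assumption (not stated in this card): 10 ms fbank frame shift.
frame_shift_ms = 10

# Overall downsample rate is the product of the two stages.
total_rate = conv2d2_rate * block_rate

# Duration covered by one encoder output frame, under the assumption above.
encoder_frame_ms = frame_shift_ms / total_rate

# Assumption: batch size is per GPU, so utterances per optimizer step is the product.
effective_batch = num_gpus * batch_size * acc_grad

print(total_rate)        # 1/8
print(encoder_frame_ms)  # 80
print(effective_batch)   # 128
```

If the frame shift or the batch-size convention differs in your setup, only the last two numbers change; the 1/8 overall downsample rate follows directly from the model info.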

CER (%) by decoding mode and decoding chunk size:

| decoding mode / chunk size | full | 18   | 16   |
|----------------------------|------|------|------|
| attention decoder          | 4.87 | 5.03 | 5.07 |
| ctc prefix beam search     | 4.97 | 5.18 | 5.20 |
| attention rescoring        | 4.56 | 4.75 | 4.77 |
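
To make the roles of `ctc_weight` and `reverse_weight` more concrete, here is a minimal sketch of how an attention-rescoring pass might combine scores for each n-best hypothesis. It assumes the usual U2++ formulation (left-to-right and right-to-left decoder log-scores interpolated by `reverse_weight`, plus the CTC log-score scaled by `ctc_weight`). The function name and the toy scores are hypothetical, and this is not copied from the WeNet source.

```python
def rescoring_score(ctc_score, l2r_score, r2l_score,
                    ctc_weight=0.5, reverse_weight=0.3):
    """Combine log-scores for one hypothesis (illustrative helper)."""
    # Interpolate the two attention decoder directions.
    attention = (1.0 - reverse_weight) * l2r_score + reverse_weight * r2l_score
    # Add the CTC prefix beam search score, scaled by ctc_weight.
    return attention + ctc_weight * ctc_score


# Hypothetical n-best list: (text, ctc, left-to-right, right-to-left) log-scores.
nbest = [
    ("hyp_a", -4.0, -3.0, -3.5),
    ("hyp_b", -3.0, -4.5, -4.0),
]

# Pick the hypothesis with the highest combined score.
best = max(nbest, key=lambda h: rescoring_score(h[1], h[2], h[3]))
print(best[0])  # hyp_a
```

In this toy case `hyp_b` has the better CTC score, but the decoder scores outweigh it and `hyp_a` is selected.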

## Getting Started

Install **WeNet** by following the instructions at https://wenet.org.cn/wenet/install.html#install-for-training

Decode the Aishell-1 test set and compute CER:

```sh
cd wenet/examples/aishell/s0
dir=exp/wenet_efficient_conformer_aishell_v2/

# Decoding settings (same values as in "Decoding info")
ctc_weight=0.5
reverse_weight=0.3
decoding_chunk_size=-1   # -1 decodes with full context
mode="attention_rescoring"

test_dir=$dir/test_${mode}
# Create the output and log directories before redirecting into them
mkdir -p $test_dir logs

# Decode (runs in the background; progress is written to the log file)
nohup python wenet/bin/recognize.py --gpu 0 \
  --mode $mode \
  --config $dir/train.yaml \
  --data_type "raw" \
  --test_data data/test/data.list \
  --checkpoint $dir/final.pt \
  --beam_size 10 \
  --batch_size 1 \
  --penalty 0.0 \
  --dict $dir/words.txt \
  --ctc_weight $ctc_weight \
  --reverse_weight $reverse_weight \
  --result_file $test_dir/text \
  ${decoding_chunk_size:+--decoding_chunk_size $decoding_chunk_size} > logs/decode_aishell.log &

# Wait for the background decoding job to finish before scoring
wait

# CER
python tools/compute-cer.py --char=1 --v=1 \
  data/test/text $test_dir/text > $test_dir/cer.txt
```
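
For readers unfamiliar with the metric, the following is a minimal sketch of how a character error rate can be computed for a single reference/hypothesis pair using Levenshtein distance. It is not the implementation in `tools/compute-cer.py`; the function names are hypothetical. Corpus-level CER is normally the total number of errors divided by the total number of reference characters, not an average of per-utterance values.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref_text, hyp_text):
    """Character error rate in percent, ignoring spaces."""
    ref_chars = list(ref_text.replace(" ", ""))
    hyp_chars = list(hyp_text.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return 100.0 * edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# One substitution in a six-character reference.
print(f"{cer('今天天气很好', '今天天汽很好'):.2f}")  # 16.67
```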