---
license: apache-2.0
language:
- en
- zh
metrics:
- cer
pipeline_tag: automatic-speech-recognition
---

## Efficient Conformer v2 for non-streaming ASR

**Specification**: https://github.com/wenet-e2e/wenet/pull/1636

## Aishell-1 Results

* Feature info:
    * using fbank feature, cmvn, speed perturb, dither
* Training info:
    * [train_u2++_efficonformer_v2.yaml](https://github.com/wenet-e2e/wenet/blob/main/examples/aishell/s0/conf/train_u2%2B%2B_efficonformer_v2.yaml)
    * 8 gpu, batch size 16, acc_grad 1, 200 epochs
    * lr 0.001, warmup_steps 25000
* Model info:
    * Model Params: 49,354,651
    * Downsample rate: 1/2 (conv2d2) * 1/4 (efficonformer block)
    * encoder_dim 256, output_size 256, head 8, linear_units 2048
    * num_blocks 12, cnn_module_kernel 15, group_size 3
* Decoding info:
    * ctc_weight 0.5, reverse_weight 0.3, average_num 20

CER (%) by decoding mode and decoding chunk size:

| decoding mode          | full | 18   | 16   |
|------------------------|------|------|------|
| attention decoder      | 4.87 | 5.03 | 5.07 |
| ctc prefix beam search | 4.97 | 5.18 | 5.20 |
| attention rescoring    | 4.56 | 4.75 | 4.77 |

## Start to Use

Install **WeNet** by following the instructions at: https://wenet.org.cn/wenet/install.html#install-for-training

Decode

```sh
cd wenet/examples/aishell/s0

dir=exp/wenet_efficient_conformer_aishell_v2
ctc_weight=0.5
reverse_weight=0.3
decoding_chunk_size=-1   # -1 decodes with full attention (non-streaming)
mode="attention_rescoring"
test_dir=$dir/test_${mode}
mkdir -p $test_dir logs  # logs/ must exist before redirecting output into it

# Decode in the background; progress is written to logs/decode_aishell.log
nohup python wenet/bin/recognize.py --gpu 0 \
  --mode $mode \
  --config $dir/train.yaml \
  --data_type "raw" \
  --test_data data/test/data.list \
  --checkpoint $dir/final.pt \
  --beam_size 10 \
  --batch_size 1 \
  --penalty 0.0 \
  --dict $dir/words.txt \
  --ctc_weight $ctc_weight \
  --reverse_weight $reverse_weight \
  --result_file $test_dir/text \
  ${decoding_chunk_size:+--decoding_chunk_size $decoding_chunk_size} > logs/decode_aishell.log 2>&1 &

# Wait for the background decoding job to finish before scoring
wait

# CER
python tools/compute-cer.py --char=1 --v=1 \
  data/test/text $test_dir/text > $test_dir/cer.txt
```
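
The CER step above reports character error rate: the character-level edit distance between reference and hypothesis, divided by the number of reference characters. The sketch below illustrates that definition only. It is not the implementation of `tools/compute-cer.py`, which may apply additional text normalization, and the function names `edit_distance`, `cer`, and `corpus_cer` are hypothetical helpers introduced for this example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))  # distances for the empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i ref symbols into an empty hyp takes i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def _chars(text):
    """Split text into characters, ignoring spaces (an assumption of this sketch)."""
    return list(text.replace(" ", ""))


def cer(ref_text, hyp_text):
    """CER in percent for a single utterance."""
    ref = _chars(ref_text)
    if not ref:
        raise ValueError("reference must contain at least one character")
    return 100.0 * edit_distance(ref, _chars(hyp_text)) / len(ref)


def corpus_cer(pairs):
    """CER in percent over (reference, hypothesis) pairs.

    Errors and reference lengths are summed over all utterances
    before dividing, so long utterances carry more weight.
    """
    errors = sum(edit_distance(_chars(r), _chars(h)) for r, h in pairs)
    total = sum(len(_chars(r)) for r, _ in pairs)
    if total == 0:
        raise ValueError("references must contain at least one character")
    return 100.0 * errors / total
```

For example, `cer("今天天气很好", "今天天汽很好")` counts one substituted character out of six reference characters. Because insertions are counted as errors, CER can exceed 100% when the hypothesis is much longer than the reference.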