Update README.md
README.md
CHANGED
@@ -1,3 +1,75 @@
---
license: apache-2.0
language:
- en
- zh
metrics:
- cer
pipeline_tag: automatic-speech-recognition
---
## Efficient Conformer v2 for non-streaming ASR

**Specification**: https://github.com/wenet-e2e/wenet/pull/1636

## Aishell-1 Results

* Feature info:
    * fbank features, CMVN, speed perturbation, dither
* Training info:
    * [train_u2++_efficonformer_v2.yaml](https://github.com/wenet-e2e/wenet/blob/main/examples/aishell/s0/conf/train_u2%2B%2B_efficonformer_v2.yaml)
    * 8 GPUs, batch size 16, acc_grad 1, 200 epochs
    * lr 0.001, warmup_steps 25000
* Model info:
    * Model Params: 49,354,651
    * Downsample rate: 1/2 (conv2d2) * 1/4 (efficonformer block)
    * encoder_dim 256, output_size 256, head 8, linear_units 2048
    * num_blocks 12, cnn_module_kernel 15, group_size 3
* Decoding info:
    * ctc_weight 0.5, reverse_weight 0.3, average_num 20
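
The training and model settings above imply a few derived quantities. The sketch below works them out with shell arithmetic. It assumes that the batch size of 16 is per GPU and that fbank frames are 10 ms apart; neither is stated here, so check both against the training config before relying on the numbers.

```sh
# Values copied from the lists above.
num_gpus=8
batch_size=16          # assumed to be per GPU
acc_grad=1
conv_downsample=2      # 1/2 from conv2d2
block_downsample=4     # 1/4 from the efficonformer blocks

# Assumption (not stated above): fbank frames are 10 ms apart.
frame_shift_ms=10

# Utterances contributing to each optimizer step, under the per-GPU assumption.
effective_batch=$((num_gpus * batch_size * acc_grad))

# Overall encoder downsampling factor and the resulting encoder frame period.
total_downsample=$((conv_downsample * block_downsample))
encoder_frame_ms=$((frame_shift_ms * total_downsample))

echo "effective batch size: $effective_batch"
echo "total downsample rate: 1/$total_downsample"
echo "encoder frame period: ${encoder_frame_ms} ms"
```

Under these assumptions each optimizer step sees 128 utterances, and the encoder emits one frame for every 8 input frames.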

CER (%) by decoding mode and decoding chunk size:

| decoding mode / chunk size | full | 18   | 16   |
|----------------------------|------|------|------|
| attention decoder          | 4.87 | 5.03 | 5.07 |
| ctc prefix beam search     | 4.97 | 5.18 | 5.20 |
| attention rescoring        | 4.56 | 4.75 | 4.77 |
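
To put the gap between columns in perspective, the relative CER increase can be computed directly from the table. This is a minimal sketch, assuming the `18` and `16` columns are decoding chunk sizes and that the table values are CER percentages; it uses the attention rescoring row as an example.

```sh
# CER (%) for attention rescoring, copied from the table above.
full=4.56
chunk16=4.77

# Relative CER increase when moving from the "full" column to the "16" column.
rel_increase=$(awk -v a="$full" -v b="$chunk16" \
  'BEGIN { printf "%.1f", (b - a) / a * 100 }')

echo "relative CER increase: ${rel_increase}%"
```

Swapping in the values from another row gives the corresponding figure for that decoding mode.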

## Start to Use

Install **WeNet** by following the training installation guide: https://wenet.org.cn/wenet/install.html#install-for-training

Decode:

```sh
cd wenet/examples/aishell/s0
dir=exp/wenet_efficient_conformer_aishell_v2/

ctc_weight=0.5
reverse_weight=0.3
decoding_chunk_size=-1   # negative value: decode with full context
mode="attention_rescoring"

test_dir=$dir/test_${mode}
# The log redirect below fails if logs/ does not exist, so create it too
mkdir -p $test_dir logs

# Decode in the background, writing output to logs/decode_aishell.log
nohup python wenet/bin/recognize.py --gpu 0 \
  --mode $mode \
  --config $dir/train.yaml \
  --data_type "raw" \
  --test_data data/test/data.list \
  --checkpoint $dir/final.pt \
  --beam_size 10 \
  --batch_size 1 \
  --penalty 0.0 \
  --dict $dir/words.txt \
  --ctc_weight $ctc_weight \
  --reverse_weight $reverse_weight \
  --result_file $test_dir/text \
  ${decoding_chunk_size:+--decoding_chunk_size $decoding_chunk_size} > logs/decode_aishell.log &

# Wait for decoding to finish; the CER step needs the complete result file
wait

# CER
python tools/compute-cer.py --char=1 --v=1 \
  data/test/text $test_dir/text > $test_dir/cer.txt
```
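
The last argument of the decode command relies on the shell's `${var:+word}` expansion, which yields `word` only when `var` is set and non-empty. The sketch below isolates that behaviour; `chunk_flag` is a hypothetical helper introduced here for illustration and is not part of WeNet.

```sh
# Hypothetical helper: build the optional flag the same way the decode command does.
chunk_flag() {
  decoding_chunk_size=$1
  printf '%s' "${decoding_chunk_size:+--decoding_chunk_size $decoding_chunk_size}"
}

with_chunk=$(chunk_flag 16)     # non-empty: flag and value are emitted
full_context=$(chunk_flag -1)   # -1 is still non-empty, so the flag is emitted
empty_value=$(chunk_flag "")    # empty: the flag disappears entirely

echo "16 -> '$with_chunk'"
echo "-1 -> '$full_context'"
echo "'' -> '$empty_value'"
```

Leaving `decoding_chunk_size` empty therefore drops the flag from the command line altogether, so `recognize.py` falls back to whatever default it defines for that option.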