---
license: apache-2.0
language:
- en
- zh
metrics:
- cer
pipeline_tag: automatic-speech-recognition
---
# Efficient Conformer v2 for non-streaming ASR

**Specification**: https://github.com/wenet-e2e/wenet/pull/1636

## Results

* Feature info:
    * fbank features with CMVN, speed perturbation, and dither
* Training info:
    * train_u2++_efficonformer_v2.yaml
    * 8 GPUs, batch size 16, acc_grad 1, 120 epochs
    * lr 0.001, warmup_steps 35000
* Model info:
    * Model Params: 50,341,278
    * Downsample rate: 1/2 (conv2d2) * 1/4 (efficonformer block)
    * encoder_dim 256, output_size 256, head 8, linear_units 2048
    * num_blocks 12, cnn_module_kernel 15, group_size 3
* Decoding info:
    * ctc_weight 0.5, reverse_weight 0.3, average_num 20
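
The two downsampling stages listed under "Model info" multiply, giving an overall 1/8 frame rate at the encoder output. The short sketch below works through that arithmetic. The 10 ms fbank frame shift is an assumption (a common default that this card does not state), and `encoder_frames` is a hypothetical helper that ignores padding and convolution edge effects.

```python
# Assumption: 10 ms fbank frame shift (common default, not stated in this card).
FRAME_SHIFT_MS = 10

# Subsampling factors taken from the "Downsample rate" line above.
CONV2D2_FACTOR = 2        # 1/2 from the conv2d2 front end
EFFICONFORMER_FACTOR = 4  # 1/4 from the efficonformer blocks


def encoder_frames(num_fbank_frames: int) -> int:
    """Approximate encoder output length (hypothetical helper, ignores edge effects)."""
    return num_fbank_frames // (CONV2D2_FACTOR * EFFICONFORMER_FACTOR)


total_factor = CONV2D2_FACTOR * EFFICONFORMER_FACTOR
ms_per_encoder_frame = FRAME_SHIFT_MS * total_factor

print(total_factor)          # overall downsampling factor
print(ms_per_encoder_frame)  # audio covered by one encoder frame, in ms
print(encoder_frames(1000))  # encoder frames for roughly 10 s of audio
```

Under these assumptions, one encoder frame covers about 80 ms of audio, so a 10-second utterance yields roughly 125 encoder frames.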

test_clean (WER %; columns are decoding chunk sizes, `full` = full attention)

| decoding mode          | full | 18   | 16   |
|------------------------|------|------|------|
| attention decoder      | 3.49 | 3.71 | 3.72 |
| ctc_greedy_search      | 3.49 | 3.74 | 3.77 |
| ctc prefix beam search | 3.47 | 3.72 | 3.74 |
| attention rescoring    | 3.12 | 3.38 | 3.36 |

test_other (WER %; columns are decoding chunk sizes, `full` = full attention)

| decoding mode          | full | 18   | 16   |
|------------------------|------|------|------|
| attention decoder      | 8.15 | 9.05 | 9.03 |
| ctc_greedy_search      | 8.73 | 9.82 | 9.83 |
| ctc prefix beam search | 8.70 | 9.81 | 9.79 |
| attention rescoring    | 8.05 | 9.08 | 9.10 |
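
The "attention rescoring" rows depend on `ctc_weight` and `reverse_weight` from the decoding info. As a rough illustration of how such a two-pass score is usually combined in a U2++-style model, the sketch below rescores a tiny n-best list. The formula and the function `rescoring_score` are assumptions for illustration and are not copied from this model's code; the hypotheses and log-probabilities are made up.

```python
def rescoring_score(ctc_score: float,
                    l2r_score: float,
                    r2l_score: float,
                    ctc_weight: float = 0.5,
                    reverse_weight: float = 0.3) -> float:
    """Combine CTC and bidirectional attention decoder log-scores (assumed formula)."""
    # Interpolate the left-to-right and right-to-left decoder scores.
    attention = (1.0 - reverse_weight) * l2r_score + reverse_weight * r2l_score
    # Add the first-pass CTC score, scaled by ctc_weight.
    return attention + ctc_weight * ctc_score


# Hypothetical n-best list: (text, ctc_score, l2r_score, r2l_score), all log-probs.
nbest = [
    ("hello world", -4.0, -3.0, -3.5),
    ("hello word", -3.5, -4.5, -5.0),
]

best = max(nbest, key=lambda h: rescoring_score(h[1], h[2], h[3]))
print(best[0])
```

In this toy example the second hypothesis has the better CTC score, but the attention decoder scores outweigh it, so the first hypothesis is selected.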


## Getting Started

Install **WeNet** by following the instructions at: https://wenet.org.cn/wenet/install.html#install-for-training


Decode
```sh
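# Run from the WeNet LibriSpeech recipe directory.
# This assumes the model files have been placed under
# exp/wenet_efficient_conformer_librispeech_v2.
cd examples/librispeech/s0

# Copy the helper scripts into the recipe directory.
cp exp/wenet_efficient_conformer_librispeech_v2/decode.sh ./
cp exp/wenet_efficient_conformer_librispeech_v2/wer.sh ./

dir=exp/wenet_efficient_conformer_librispeech_v2
# -1 decodes with full attention (the "full" column in the tables above).
decoding_chunk_size=-1
# The second argument (20) presumably matches average_num in the decoding info.
. ./decode.sh ${dir} 20 ${decoding_chunk_size}

# WER
. ./wer.sh test_clean wenet_efficient_conformer_librispeech_v2 ${decoding_chunk_size}
. ./wer.sh test_other wenet_efficient_conformer_librispeech_v2 ${decoding_chunk_size}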
```
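
For readers unfamiliar with the metric, the sketch below shows the general definition of word error rate: word-level edit distance divided by the number of reference words. It is a minimal illustration only. The internals of `wer.sh` are not shown in this card, and `word_error_rate` is a hypothetical function, not part of WeNet.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref_words)


wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{100 * wer:.2f}%")
```

Here one substitution and one deletion against six reference words give a WER of 2/6.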