pkufool committed on
Commit 17c09dd · verified · 1 Parent(s): 9ec5447

Update README.md

Files changed (1)
  1. README.md +51 -0
README.md CHANGED
@@ -1,3 +1,54 @@
  ---
  license: apache-2.0
+ language:
+ - zh
+ - en
+ metrics:
+ - wer
+ tags:
+ - ASR
+ - onnx
  ---
+
+ ## Introduction
+
+
+ This is a medium [zipformer](https://arxiv.org/pdf/2310.11230) model developed by the Xiaomi AI Lab Next-gen-Kaldi team. The model was trained on around 200,000 hours of open-source Chinese and English datasets. It has around 68M parameters with the CTC head and around 73M with the transducer head.
+
+ Error rates in percent on some popular test sets (CER for Chinese, WER for English; lower is better):
+
+ | Head | AISHELL-1 / AISHELL-2 test | WenetSpeech test-net / test-meeting | Common Voice zh | KeSpeech test | LibriSpeech test-clean / test-other | GigaSpeech test | Common Voice en | TED-LIUM test |
+ | -- | -- | -- | -- | -- | -- | -- | -- | -- |
+ | CTC | 3.08 / 3.98 | 7.08 / 7.62 | 9.2 | 11.23 | 3.01 / 6.06 | 11.22 | 15.28 | 10.38 |
+ | Transducer | 2.67 / 3.67 | 6.79 / 7.33 | 8.97 | 10.67 | 2.61 / 5.36 | 10.56 | 12.94 | 10.06 |
+
+ Please refer to [zipformer on GitHub](https://github.com/pkufool/zipformer) for model details.
+
+ > Training set list: Librispeech, Gigaspeech, Commonvoice-2022 (zh + en), Libriheavy, Emilia (zh + en), AIshell 2, Wenetspeech, Wenetspeech4tts, Kespeech, AIshell, aidatatang, aishell4, alimeeting, magicdata, primewords, stcmds, thchs30.
+
+
+ ## Documentation
+
+ Please refer to [https://pkufool.github.io/zipformer/en/models/](https://pkufool.github.io/zipformer/en/models/)
+
+
+ ## Citation
+
+ ```
+ @inproceedings{yao2024zipformer,
+ title={Zipformer: A faster and better encoder for automatic speech recognition},
+ author={Yao, Zengwei and Guo, Liyong and Yang, Xiaoyu and Kang, Wei and Kuang, Fangjun and Yang, Yifan and Jin, Zengrui and Lin, Long and Povey, Daniel},
+ booktitle={International Conference on Learning Representations},
+ volume={2024},
+ pages={44440--44455},
+ year={2024}
+ }
+ @inproceedings{yao2025cr,
+ title={Cr-ctc: Consistency regularization on ctc for improved speech recognition},
+ author={Yao, Zengwei and Kang, Wei and Yang, Xiaoyu and Kuang, Fangjun and Guo, Liyong and Zhu, Han and Jin, Zengrui and Li, Zhaoqing and Lin, Long and Povey, Daniel},
+ booktitle={International Conference on Learning Representations},
+ volume={2025},
+ pages={26850--26868},
+ year={2025}
+ }
+ ```
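
The README added in this commit reports CER for Chinese test sets and WER for English ones. As a reading aid, here is a minimal sketch of how such error rates are commonly computed: the Levenshtein distance between reference and hypothesis tokens, divided by the number of reference tokens. The function names `edit_distance` and `error_rate` are hypothetical and do not come from the zipformer repository, and the sketch assumes the text has already been normalized (casing, punctuation); the scoring scripts actually used for the reported numbers may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (lists)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, unit="word"):
    """Corpus-level error rate in percent.

    unit="word" splits on whitespace (WER, typical for English);
    unit="char" compares characters with spaces removed (CER, typical for Chinese).
    """
    total_errors = 0
    total_tokens = 0
    for ref, hyp in zip(refs, hyps):
        if unit == "word":
            r, h = ref.split(), hyp.split()
        elif unit == "char":
            r, h = list(ref.replace(" ", "")), list(hyp.replace(" ", ""))
        else:
            raise ValueError("unit must be 'word' or 'char'")
        total_errors += edit_distance(r, h)
        total_tokens += len(r)
    if total_tokens == 0:
        raise ValueError("references contain no tokens")
    return 100.0 * total_errors / total_tokens


if __name__ == "__main__":
    # One substitution and one deletion against a six-word reference.
    print(error_rate(["the cat sat on the mat"], ["the cat sit on mat"], unit="word"))
    # One substituted character against a six-character reference.
    print(error_rate(["今天天气很好"], ["今天天汽很好"], unit="char"))
```

Errors are summed over the whole test set before dividing, so long utterances weigh more than short ones. This is the usual corpus-level convention, but it is an assumption here rather than something the README states.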