hik63382 committed on
Commit 694b20e · verified · 1 Parent(s): bfe9b1d

Update README.md

Files changed (1)
  1. README.md +106 -1
README.md CHANGED
@@ -1,5 +1,7 @@
1
  ---
2
  license: apache-2.0
 
 
3
  ---
4
 
5
 
@@ -23,4 +25,107 @@ The relative speeds below are measured by transcribing English speech on a A100,
23
 
24
  Whisper's performance varies widely depending on the language. The figure below shows a performance breakdown of `large-v3` and `large-v2` models by language, using WERs (word error rates) or CER (character error rates, shown in *Italic*) evaluated on the Common Voice 15 and Fleurs datasets. Additional WER/CER metrics corresponding to the other models and datasets can be found in Appendix D.1, D.2, and D.4 of [the paper](https://arxiv.org/abs/2212.04356), as well as the BLEU (Bilingual Evaluation Understudy) scores for translation in Appendix D.3.
25
 
26
- ![WER breakdown by language](https://github.com/openai/whisper/assets/266841/f4619d66-1058-4005-8f67-a9d811b77c62)
 
1
  ---
2
  license: apache-2.0
3
+ tags:
4
+ - multimodel
5
  ---
6
 
7
 
 
25
 
26
  Whisper's performance varies widely depending on the language. The figure below shows a performance breakdown of `large-v3` and `large-v2` models by language, using WERs (word error rates) or CER (character error rates, shown in *Italic*) evaluated on the Common Voice 15 and Fleurs datasets. Additional WER/CER metrics corresponding to the other models and datasets can be found in Appendix D.1, D.2, and D.4 of [the paper](https://arxiv.org/abs/2212.04356), as well as the BLEU (Bilingual Evaluation Understudy) scores for translation in Appendix D.3.
27
 
28
+ ![WER breakdown by language](https://github.com/openai/whisper/assets/266841/f4619d66-1058-4005-8f67-a9d811b77c62)
29
+
30
+
31
+ ---
32
+ English
33
+ Chinese
34
+ German
35
+ Spanish
36
+ Russian
37
+ Korean
38
+ French
39
+ Japanese
40
+ Portuguese
41
+ Turkish
42
+ Polish
43
+ Catalan
44
+ Dutch
45
+ Arabic
46
+ Swedish
47
+ Italian
48
+ Indonesian
49
+ Hindi
50
+ Finnish
51
+ Vietnamese
52
+ Hebrew
53
+ Ukrainian
54
+ Greek
55
+ Malay
56
+ Czech
57
+ Romanian
58
+ Danish
59
+ Hungarian
60
+ Tamil
61
+ Norwegian
62
+ Thai
63
+ Urdu
64
+ Croatian
65
+ Bulgarian
66
+ Lithuanian
67
+ Latin
68
+ Māori
69
+ Malayalam
70
+ Welsh
71
+ Slovak
72
+ Telugu
73
+ Persian
74
+ Latvian
75
+ Bengali
76
+ Serbian
77
+ Azerbaijani
78
+ Slovenian
79
+ Kannada
80
+ Estonian
81
+ Macedonian
82
+ Breton
83
+ Basque
84
+ Icelandic
85
+ Armenian
86
+ Nepali
87
+ Mongolian
88
+ Bosnian
89
+ Kazakh
90
+ Albanian
91
+ Swahili
92
+ Galician
93
+ Marathi
94
+ Panjabi
95
+ Sinhala
96
+ Khmer
97
+ Shona
98
+ Yoruba
99
+ Somali
100
+ Afrikaans
101
+ Occitan
102
+ Georgian
103
+ Belarusian
104
+ Tajik
105
+ Sindhi
106
+ Gujarati
107
+ Amharic
108
+ Yiddish
109
+ Lao
110
+ Uzbek
111
+ Faroese
112
+ Haitian
113
+ Pashto
114
+ Turkmen
115
+ Norwegian Nynorsk
116
+ Maltese
117
+ Sanskrit
118
+ Luxembourgish
119
+ Burmese
120
+ Tibetan
121
+ Tagalog
122
+ Malagasy
123
+ Assamese
124
+ Tatar
125
+ Hawaiian
126
+ Lingala
127
+ Hausa
128
+ Bashkir
129
+ jw
130
+ Sundanese
131
+ ===
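
The README paragraph in this diff reports WER (word error rate) and CER (character error rate) without showing how such scores are computed. As a minimal sketch, and not the evaluation code behind the published figure, both can be expressed as a Levenshtein edit distance divided by the reference length. The function names below are hypothetical, and dropping spaces before computing CER is an assumption; real evaluations usually also apply text normalization, which is omitted here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference items
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate, ignoring spaces (an assumption of this sketch)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(f"WER: {wer(reference, hypothesis):.3f}")
    print(f"CER: {cer(reference, hypothesis):.3f}")
```

CER is typically the more useful metric for languages written without spaces between words, which is presumably why the figure marks some languages as CER rather than WER.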