Update README.md
README.md
CHANGED
|
@@ -1,5 +1,7 @@
|
|
| 1 |
---
|
| 2 |
license: apache-2.0
|
| 3 |
---
|
| 4 |
|
| 5 |
|
|
@@ -23,4 +25,107 @@ The relative speeds below are measured by transcribing English speech on a A100,
|
|
| 23 |
|
| 24 |
Whisper's performance varies widely depending on the language. The figure below shows a performance breakdown of `large-v3` and `large-v2` models by language, using WERs (word error rates) or CER (character error rates, shown in *Italic*) evaluated on the Common Voice 15 and Fleurs datasets. Additional WER/CER metrics corresponding to the other models and datasets can be found in Appendix D.1, D.2, and D.4 of [the paper](https://arxiv.org/abs/2212.04356), as well as the BLEU (Bilingual Evaluation Understudy) scores for translation in Appendix D.3.
|
| 25 |
|
| 26 |
-

| 1 |
---
|
| 2 |
license: apache-2.0
|
| 3 |
+
tags:
|
| 4 |
+
- multimodel
|
| 5 |
---
|
| 6 |
|
| 7 |
|
|
|
|
| 25 |
|
| 26 |
Whisper's performance varies widely depending on the language. The figure below shows a performance breakdown of `large-v3` and `large-v2` models by language, using WERs (word error rates) or CER (character error rates, shown in *Italic*) evaluated on the Common Voice 15 and Fleurs datasets. Additional WER/CER metrics corresponding to the other models and datasets can be found in Appendix D.1, D.2, and D.4 of [the paper](https://arxiv.org/abs/2212.04356), as well as the BLEU (Bilingual Evaluation Understudy) scores for translation in Appendix D.3.
|
| 27 |
|
| 28 |
+

|
| 29 |
+
|
| 30 |
+
|
| 31 |
+
---
|
| 32 |
+
English
|
| 33 |
+
Chinese
|
| 34 |
+
German
|
| 35 |
+
Spanish
|
| 36 |
+
Russian
|
| 37 |
+
Korean
|
| 38 |
+
French
|
| 39 |
+
Japanese
|
| 40 |
+
Portuguese
|
| 41 |
+
Turkish
|
| 42 |
+
Polish
|
| 43 |
+
Catalan
|
| 44 |
+
Dutch
|
| 45 |
+
Arabic
|
| 46 |
+
Swedish
|
| 47 |
+
Italian
|
| 48 |
+
Indonesian
|
| 49 |
+
Hindi
|
| 50 |
+
Finnish
|
| 51 |
+
Vietnamese
|
| 52 |
+
Hebrew
|
| 53 |
+
Ukrainian
|
| 54 |
+
Greek
|
| 55 |
+
Malay
|
| 56 |
+
Czech
|
| 57 |
+
Romanian
|
| 58 |
+
Danish
|
| 59 |
+
Hungarian
|
| 60 |
+
Tamil
|
| 61 |
+
Norwegian
|
| 62 |
+
Thai
|
| 63 |
+
Urdu
|
| 64 |
+
Croatian
|
| 65 |
+
Bulgarian
|
| 66 |
+
Lithuanian
|
| 67 |
+
Latin
|
| 68 |
+
Māori
|
| 69 |
+
Malayalam
|
| 70 |
+
Welsh
|
| 71 |
+
Slovak
|
| 72 |
+
Telugu
|
| 73 |
+
Persian
|
| 74 |
+
Latvian
|
| 75 |
+
Bengali
|
| 76 |
+
Serbian
|
| 77 |
+
Azerbaijani
|
| 78 |
+
Slovenian
|
| 79 |
+
Kannada
|
| 80 |
+
Estonian
|
| 81 |
+
Macedonian
|
| 82 |
+
Breton
|
| 83 |
+
Basque
|
| 84 |
+
Icelandic
|
| 85 |
+
Armenian
|
| 86 |
+
Nepali
|
| 87 |
+
Mongolian
|
| 88 |
+
Bosnian
|
| 89 |
+
Kazakh
|
| 90 |
+
Albanian
|
| 91 |
+
Swahili
|
| 92 |
+
Galician
|
| 93 |
+
Marathi
|
| 94 |
+
Panjabi
|
| 95 |
+
Sinhala
|
| 96 |
+
Khmer
|
| 97 |
+
Shona
|
| 98 |
+
Yoruba
|
| 99 |
+
Somali
|
| 100 |
+
Afrikaans
|
| 101 |
+
Occitan
|
| 102 |
+
Georgian
|
| 103 |
+
Belarusian
|
| 104 |
+
Tajik
|
| 105 |
+
Sindhi
|
| 106 |
+
Gujarati
|
| 107 |
+
Amharic
|
| 108 |
+
Yiddish
|
| 109 |
+
Lao
|
| 110 |
+
Uzbek
|
| 111 |
+
Faroese
|
| 112 |
+
Haitian
|
| 113 |
+
Pashto
|
| 114 |
+
Turkmen
|
| 115 |
+
Norwegian Nynorsk
|
| 116 |
+
Maltese
|
| 117 |
+
Sanskrit
|
| 118 |
+
Luxembourgish
|
| 119 |
+
Burmese
|
| 120 |
+
Tibetan
|
| 121 |
+
Tagalog
|
| 122 |
+
Malagasy
|
| 123 |
+
Assamese
|
| 124 |
+
Tatar
|
| 125 |
+
Hawaiian
|
| 126 |
+
Lingala
|
| 127 |
+
Hausa
|
| 128 |
+
Bashkir
|
| 129 |
+
jw
|
| 130 |
+
Sundanese
|
| 131 |
+
===
|
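
The README paragraph in this diff reports WER (word error rate) and CER (character error rate) per language. As a rough illustration of how these two metrics are commonly defined, here is a minimal Python sketch based on edit distance. The helper names `edit_distance`, `wer`, and `cer` are hypothetical and are not part of Whisper. The sketch also assumes whitespace tokenization and applies no text normalization, so its output is not comparable to published figures.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (hypothetical helper)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis costs i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: ref item missing from hyp
                curr[j - 1] + 1,     # insertion: extra item in hyp
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()  # assumption: whitespace-separated words
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must contain at least one character")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    print(wer("the cat sat on the mat", "the cat sit on mat"))
    print(cer("abc", "abd"))
```

CER is usually preferred for languages written without spaces between words, where splitting on whitespace does not yield meaningful word units.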