docs: add evaluation results and environment to model card (openai_whisper-large-v3-v20240930_turbo_632MB)
README.md
CHANGED
|
@@ -1,4 +1,3 @@
|
|
| 1 |
-
|
| 2 |
---
|
| 3 |
pretty_name: "WhisperKit"
|
| 4 |
viewer: false
|
|
@@ -11,13 +10,67 @@ tags:
|
|
| 11 |
- quantized
|
| 12 |
- automatic-speech-recognition
|
| 13 |
---
|
|
|
|
| 14 |
# WhisperKit
|
| 15 |
|
| 16 |
WhisperKit is an on-device speech recognition framework for Apple Silicon:
|
| 17 |
https://github.com/argmaxinc/WhisperKit
|
| 18 |
|
| 19 |
Check out the WhisperKit paper and presentation from ICML 2025:
|
| 20 |
-
https://icml.cc/virtual/2025/47854
|
| 21 |
|
| 22 |
For real-time streaming API, custom vocabulary, speaker diarization, and more, check out Argmax SDK: https://www.argmaxinc.com/blog/argmax-sdk-2
|
| 23 |
|
| 1 |
---
|
| 2 |
pretty_name: "WhisperKit"
|
| 3 |
viewer: false
|
|
|
|
| 10 |
- quantized
|
| 11 |
- automatic-speech-recognition
|
| 12 |
---
|
| 13 |
+
|
| 14 |
# WhisperKit
|
| 15 |
|
| 16 |
WhisperKit is an on-device speech recognition framework for Apple Silicon:
|
| 17 |
https://github.com/argmaxinc/WhisperKit
|
| 18 |
|
| 19 |
Check out the WhisperKit paper and presentation from ICML 2025:
|
| 20 |
+
https://icml.cc/virtual/2025/47854
|
| 21 |
|
| 22 |
For real-time streaming API, custom vocabulary, speaker diarization, and more, check out Argmax SDK: https://www.argmaxinc.com/blog/argmax-sdk-2
|
| 23 |
|
| 24 |
+
---
|
| 25 |
+
|
| 26 |
+
## Evaluation: openai_whisper-large-v3-v20240930_turbo_632MB
|
| 27 |
+
|
| 28 |
+
Transcription test results for the turbo 632MB model from this repo (aoiandroid/whisperkit-coreml).
|
| 29 |
+
|
| 30 |
+
### Environment
|
| 31 |
+
|
| 32 |
+
| Item | Value |
|
| 33 |
+
|------|--------|
|
| 34 |
+
| Platform | macOS 14.x (arm64, Apple Silicon) |
|
| 35 |
+
| WhisperKit | [argmaxinc/WhisperKit](https://github.com/argmaxinc/WhisperKit) 0.15.0+ (Swift Package) |
|
| 36 |
+
| Model repo | aoiandroid/whisperkit-coreml |
|
| 37 |
+
| Test date | 2026-03-17 |
|
| 38 |
+
| Audio formats | m4a, mp3, wav, flac |
|
| 39 |
+
|
| 40 |
+
### Test results (14 files, multi-language)
|
| 41 |
+
|
| 42 |
+
| File | Language / Content | Note |
|
| 43 |
+
|------|--------------------|------|
|
| 44 |
+
| English.mp3 | English | Texas travel narration (Gage Hotel, Padre Island, Corpus Christi, seafood); stable long-form transcription |
|
| 45 |
+
| Euskara.mp3 | Basque | Speech on language and identity |
|
| 46 |
+
| Guaraní.mp3 | Guarani | Short speech |
|
| 47 |
+
| Yorùbá.mp3 | Yoruba | Education and future |
|
| 48 |
+
| afrikaasns.mp3 | Afrikaans | Value of learning a new language |
|
| 49 |
+
| arabic.mp3 | Arabic | Speech on hope and future (full Arabic) |
|
| 50 |
+
| bengali.m4a | Bengali | Some mixed-language / recognition errors |
|
| 51 |
+
| chinese.mp3 | Chinese | Long explanation on smart traffic systems |
|
| 52 |
+
| isiZulu.mp3 | isiZulu | Future, education, youth |
|
| 53 |
+
| kiswahili.mp3 | Kiswahili | Unity (umoja) |
|
| 54 |
+
| korean.mp3 | Korean | "On challenge" (도전에 대하여) |
|
| 55 |
+
| russinan.m4a | Russian | Russia–Latin America parliamentary conference (with some English at end) |
|
| 56 |
+
| test.mp3 | Japanese | Typhoon 14 news; high accuracy |
|
| 57 |
+
| 日本語.mp3 | Japanese | Ostrich facts / comedy; high accuracy |
|
| 58 |
+
|
| 59 |
+
### Quality notes
|
| 60 |
+
|
| 61 |
+
- **English**: Stable long-form narration.
|
| 62 |
+
- **Japanese**: High accuracy on news and narrative (test.mp3, 日本語.mp3).
|
| 63 |
+
- **Korean, Chinese, Arabic, Russian**: Consistent recognition on long content.
|
| 64 |
+
- **Multilingual**: The model tagged many segments as [en] even though the text itself was transcribed correctly in the source language.
|
| 65 |
+
- **Bengali**: Some mixed script/errors.
|
| 66 |
+
|
| 67 |
+
### Reproduce
|
| 68 |
+
|
| 69 |
+
```bash
|
| 70 |
+
cd TranslateBluePackage
|
| 71 |
+
WHISPERKIT_TEST_AUDIO_DIR=/path/to/input/audio \
|
| 72 |
+
WHISPERKIT_TEST_LOG_DIR=/path/to/Log \
|
| 73 |
+
swift test --filter WhisperKitAOIAndroidModelTests
|
| 74 |
+
```
|
| 75 |
+
|
| 76 |
+
(Use `WhisperKitConfig(model: "openai_whisper-large-v3-v20240930_turbo_632MB", modelRepo: "aoiandroid/whisperkit-coreml")` in your Swift code.)
|
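The Reproduce step above assumes the audio directory already holds files in one of the formats listed in the Environment table (m4a, mp3, wav, flac). A minimal pre-flight sketch follows. The function names `count_supported_audio` and `run_whisperkit_eval` are hypothetical helpers introduced here, not part of WhisperKit or the test target. Creating the log directory up front is also an assumption, since the diff does not say whether the test creates it.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of WhisperKit): check the input directory
# before invoking the test target named in the Reproduce section.

# Print the number of top-level files whose extension matches the formats
# listed in the Environment table (m4a, mp3, wav, flac), case-insensitively.
count_supported_audio() {
  local dir="$1"
  find "$dir" -maxdepth 1 -type f \
    \( -iname '*.m4a' -o -iname '*.mp3' -o -iname '*.wav' -o -iname '*.flac' \) \
    2>/dev/null | wc -l | tr -d ' '
}

# Run the evaluation only if at least one supported audio file is present.
# Must be called from inside the TranslateBluePackage directory.
run_whisperkit_eval() {
  local audio_dir="$1" log_dir="$2"
  local n
  n="$(count_supported_audio "$audio_dir")"
  if [ "$n" -eq 0 ]; then
    echo "no supported audio files found in: $audio_dir" >&2
    return 1
  fi
  # Assumption: the test does not create the log directory itself.
  mkdir -p "$log_dir"
  WHISPERKIT_TEST_AUDIO_DIR="$audio_dir" \
  WHISPERKIT_TEST_LOG_DIR="$log_dir" \
  swift test --filter WhisperKitAOIAndroidModelTests
}
```

After sourcing the file, `run_whisperkit_eval /path/to/input/audio /path/to/Log` would run the same command as the Reproduce block, but it exits early with status 1 when the directory contains no supported audio.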