kotoba-whisper-v2.2 CoreML

This is the CoreML conversion of kotoba-tech/kotoba-whisper-v2.2 for use with WhisperKit.

Model Details

Included Files

File                     Description                ANE Support
AudioEncoder.mlmodelc    Audio feature encoder      100%
TextDecoder.mlmodelc     Text decoder               98%
MelSpectrogram.mlmodelc  Mel spectrogram converter  72%
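After downloading, it is worth checking that the model folder contains all three compiled bundles listed above. A minimal sketch using Foundation; the folder path is a placeholder:

```swift
import Foundation

// The three compiled CoreML bundles listed in the table above.
let required = ["AudioEncoder.mlmodelc", "TextDecoder.mlmodelc", "MelSpectrogram.mlmodelc"]
let folder = URL(fileURLWithPath: "path/to/kotoba-tech_kotoba-whisper-v2.2")

for name in required {
    let path = folder.appendingPathComponent(name).path
    let present = FileManager.default.fileExists(atPath: path)
    print("\(name): \(present ? "found" : "missing")")
}
```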

Usage with WhisperKit

import WhisperKit

// Load the converted model from a local folder
let whisperKit = try await WhisperKit(
    modelFolder: "path/to/kotoba-tech_kotoba-whisper-v2.2"
)

// The transcription language is set via DecodingOptions,
// not as a direct parameter of transcribe(...)
let results = try await whisperKit.transcribe(
    audioPath: "path/to/audio.wav",
    decodeOptions: DecodingOptions(language: "ja")
)

Notes

  • This is a distilled model with only 2 decoder layers (vs. 32 in the original Whisper large model)
  • Token-level timestamps are disabled because the distilled architecture's alignment-head configuration is incompatible with token-level timestamp extraction
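Given the second note, word-level timestamps should stay off when decoding with this model. A hedged sketch, assuming WhisperKit's DecodingOptions exposes a wordTimestamps flag (it defaults to off; the explicit setting here is for clarity):

```swift
import WhisperKit

// Keep word timestamps disabled: the distilled model's alignment
// heads cannot support token-level timestamp extraction.
let options = DecodingOptions(
    language: "ja",
    wordTimestamps: false
)
```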

License

This model is released under the Apache License 2.0, following the original model's license.

Attribution

This is a derivative work based on kotoba-tech/kotoba-whisper-v2.2.

Acknowledgments
