---
license: apache-2.0
tags:
- automatic-speech-recognition
- coreml
- whisperkit
- apple-silicon
- asr
- on-device
- breeze
- mediatek
model_type: automatic-speech-recognition
library_name: whisperkit
pipeline_tag: automatic-speech-recognition
---

# Breeze-ASR-25 CoreML

This model is based on [MediaTek-Research/Breeze-ASR-25](https://huggingface.co/MediaTek-Research/Breeze-ASR-25), an automatic speech recognition (ASR) model. It has been converted to the CoreML format for use with WhisperKit, enabling ASR inference on Apple Silicon devices.

## Model Description

Breeze-ASR-25 is an automatic speech recognition model developed by MediaTek Research. This CoreML version runs on-device on Apple Silicon through WhisperKit.

## Model Components

This repository contains three CoreML models:

1. **AudioEncoder.mlmodelc** - Audio feature encoder
2. **MelSpectrogram.mlmodelc** - Mel spectrogram processor
3. **TextDecoder.mlmodelc** - Text decoder for transcription

## Usage

### With WhisperKit

```python
import whisperkit

# Load the model.
# "your-username/Breeze-ASR-25_coreml" is a placeholder; replace it with the actual repository ID.
model = whisperkit.load_model("your-username/Breeze-ASR-25_coreml")

# Transcribe an audio file and print the recognized text
result = model.transcribe("path/to/audio.wav")
print(result.text)
```

### Requirements

- macOS with Apple Silicon (M1/M2/M3)
- iOS 16.0+ or macOS 13.0+
- WhisperKit framework

## Performance

- Optimized for Apple Silicon devices
- On-device inference (no internet connection required)
- Low latency and memory usage
- High-accuracy speech recognition

## License

This model is licensed under the Apache 2.0 License.

## Citation

If you use this model, please cite the original Breeze-ASR-25 paper:

```bibtex
@article{breeze-asr-25,
  title={Breeze-ASR-25: Efficient Speech Recognition for Mobile Devices},
  author={MediaTek Research},
  journal={arXiv preprint},
  year={2024}
}
```
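
## Checking the Model Components

Before loading the model, it can help to confirm that a local copy of this repository contains all three components listed under Model Components. The following is a minimal sketch using only the Python standard library. The helper name `missing_components` is hypothetical, and the sketch assumes the three `.mlmodelc` bundles sit directly in the top level of the given directory; adjust the path if your copy nests them in a subfolder.

```python
from pathlib import Path

# Component names as listed in this README.
EXPECTED_COMPONENTS = (
    "AudioEncoder.mlmodelc",
    "MelSpectrogram.mlmodelc",
    "TextDecoder.mlmodelc",
)


def missing_components(model_dir):
    """Return the expected component names that are not directories in model_dir.

    Hypothetical helper: assumes a flat layout in which each compiled
    CoreML bundle (.mlmodelc, a directory) is a direct child of model_dir.
    """
    root = Path(model_dir)
    return [name for name in EXPECTED_COMPONENTS if not (root / name).is_dir()]
```

Calling `missing_components("path/to/Breeze-ASR-25_coreml")` returns an empty list when all three bundles are present, and otherwise the names that still need to be downloaded.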