	---
	license: apache-2.0
	tags:
	- automatic-speech-recognition
	- coreml
	- whisperkit
	- apple-silicon
	- asr
	- on-device
	- breeze
	- mediatek
	model_type: automatic-speech-recognition
	library_name: whisperkit
	pipeline_tag: automatic-speech-recognition
	---

	# Breeze-ASR-25 CoreML

	This model is based on [MediaTek-Research/Breeze-ASR-25](https://huggingface.co/MediaTek-Research/Breeze-ASR-25), a state-of-the-art automatic speech recognition (ASR) model.
	It has been converted to the CoreML format for use with WhisperKit, enabling efficient ASR inference on Apple Silicon devices.

	## Model Description

	Breeze-ASR-25 is a high-performance automatic speech recognition model developed by MediaTek Research. This CoreML version runs inference on-device on Apple Silicon through WhisperKit integration.

	## Model Components

	This repository contains three compiled CoreML models:

	1. `AudioEncoder.mlmodelc` - Audio feature encoder
	2. `MelSpectrogram.mlmodelc` - Mel spectrogram processor
	3. `TextDecoder.mlmodelc` - Text decoder for transcription
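
Before pointing WhisperKit at a local copy of this repository, it can help to confirm that all three components are present. The following is a minimal sketch, not part of WhisperKit: the helper name `missing_components` and the local path `Breeze-ASR-25_coreml` are assumptions made for illustration, and the check assumes the components sit directly inside the given folder.

```python
from pathlib import Path

# The three compiled CoreML bundles listed in this README.
EXPECTED_COMPONENTS = (
    "AudioEncoder.mlmodelc",
    "MelSpectrogram.mlmodelc",
    "TextDecoder.mlmodelc",
)


def missing_components(model_dir):
    """Return the names of expected components not found in model_dir.

    A compiled CoreML model (.mlmodelc) is a directory bundle, so each
    component is checked with is_dir() rather than is_file().
    """
    root = Path(model_dir)
    return [name for name in EXPECTED_COMPONENTS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Hypothetical local path to a downloaded copy of this repository.
    missing = missing_components("Breeze-ASR-25_coreml")
    print("All components present" if not missing else f"Missing: {missing}")
```

An empty list means the folder has everything named above; any names returned are the components that still need to be downloaded.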

	## Usage

	### With WhisperKit

	```python
	import whisperkit

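	# Load the model.
	# NOTE: "your-username" is a placeholder, and `whisperkit.load_model` is shown
	# for illustration only; WhisperKit itself is distributed as a Swift package.
	model = whisperkit.load_model("your-username/Breeze-ASR-25_coreml")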

	# Transcribe audio
	result = model.transcribe("path/to/audio.wav")
	print(result.text)
	```
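
To work with the model files locally, one option is to fetch the repository with the `huggingface_hub` Python package. This is a sketch under stated assumptions: the repository ID `aoiandroid/Breeze-ASR-25_coreml` is taken from this page's location on the Hub, the function name `download_model` and the default target folder are hypothetical, and `huggingface_hub` must be installed separately (`pip install huggingface_hub`).

```python
# Assumed repository ID, taken from this model card's location on the Hub.
REPO_ID = "aoiandroid/Breeze-ASR-25_coreml"


def download_model(local_dir="Breeze-ASR-25_coreml"):
    """Download every file in the repository and return the local folder path.

    The import is done lazily so that this module can be loaded even when
    huggingface_hub is not installed.
    """
    from huggingface_hub import snapshot_download

    # snapshot_download fetches the whole repository snapshot into local_dir.
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_model()` requires network access once; after that, the returned folder can be used offline.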

	### Requirements

	- A Mac with Apple Silicon (M1/M2/M3), or an iOS device
	- macOS 13.0+ or iOS 16.0+
	- The WhisperKit framework
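
The macOS side of these requirements can be checked from a script. The following is a rough sketch using only the Python standard library; the helper names `parse_version` and `meets_mac_requirements` are hypothetical, and the check covers only the Apple Silicon and macOS 13.0+ items, not iOS or the presence of WhisperKit.

```python
import platform


def parse_version(text):
    """Turn a version string such as "13.4.1" into a tuple of ints.

    The result is padded to at least (major, minor), so "14" becomes (14, 0)
    and an empty string becomes (0, 0).
    """
    parts = [int(part) for part in text.split(".") if part.isdigit()]
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def meets_mac_requirements(system, machine, mac_version, minimum=(13, 0)):
    """Return True for an arm64 macOS system at or above the minimum version."""
    if system != "Darwin" or machine != "arm64":
        return False
    return parse_version(mac_version) >= minimum


if __name__ == "__main__":
    ok = meets_mac_requirements(
        platform.system(), platform.machine(), platform.mac_ver()[0]
    )
    print("Requirements met" if ok else "Requirements not met")
```

Note that a Python interpreter running under Rosetta reports `x86_64` from `platform.machine()` even on Apple Silicon hardware, so a negative result from this check is not conclusive.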

	## Performance

	- Optimized for Apple Silicon devices
	- Runs entirely on-device (no internet connection required for inference)
	- Designed for low latency and low memory usage
	- High-accuracy speech recognition

	## License

	This model is licensed under the Apache 2.0 License.

	## Citation

	If you use this model, please cite the original Breeze-ASR-25 paper:

	```bibtex
	@article{breeze-asr-25,
	title={Breeze-ASR-25: Efficient Speech Recognition for Mobile Devices},
	author={MediaTek Research},
	journal={arXiv preprint},
	year={2024}
	}
	```