Breeze-ASR-26 CoreML (4-bit Palette Quantized)

This repository contains the Apple CoreML version of MediaTek-Research/Breeze-ASR-26, quantized to 4 bits with palette (lookup-table) quantization for efficient on-device inference.

Model Details

Property	Value
Base Model	MediaTek-Research/Breeze-ASR-26
Architecture	Whisper (large-v2 based)
Format	Apple CoreML (`.mlmodelc`)
Quantization	4-bit palette (mixedBitPalettized 4.0-bit)
Model Size	~890 MB
Languages	Chinese (zh), English (en)

Files

├── AudioEncoder.mlmodelc/    # Mel spectrogram → encoder hidden states (~317 MB)
├── MelSpectrogram.mlmodelc/  # Audio waveform → Mel spectrogram (~372 KB)
├── TextDecoder.mlmodelc/     # Encoder states → token predictions (~533 MB)
├── config.json               # Model configuration
└── generation_config.json    # Generation/decoding parameters
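
Before handing a downloaded copy of this repository to a runtime, it can help to confirm that the folder matches the layout above. The following is a minimal sketch using only Foundation; the helper `missingEntries(in:)` and the local path are hypothetical names chosen for illustration and are not part of WhisperKit or this repository.

```swift
import Foundation

// Entries listed in the "Files" section of this README.
let expectedEntries = [
    "AudioEncoder.mlmodelc",
    "MelSpectrogram.mlmodelc",
    "TextDecoder.mlmodelc",
    "config.json",
    "generation_config.json",
]

/// Returns the expected entries that are missing from `folder`.
/// (Hypothetical helper, written for this example only.)
func missingEntries(in folder: URL, fileManager: FileManager = .default) -> [String] {
    expectedEntries.filter { name in
        !fileManager.fileExists(atPath: folder.appendingPathComponent(name).path)
    }
}

// Hypothetical local path; replace it with wherever the repository was downloaded.
let modelFolder = URL(fileURLWithPath: "Models/Breeze-ASR-26-coreml-4bit-palette")
let missing = missingEntries(in: modelFolder)
print(missing.isEmpty
    ? "Model folder looks complete."
    : "Missing: \(missing.joined(separator: ", "))")
```

The check only looks at names, not contents, so it catches an incomplete download but not a corrupted one.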

Usage with WhisperKit

This model is designed to run with WhisperKit on Apple devices (iPhone, iPad, Mac).

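import WhisperKit

// Create a WhisperKit pipeline for this model.
let pipe = try await WhisperKit(
    model: "weiren119/Breeze-ASR-26-coreml-4bit-palette"
)
// Transcribe a local audio file and print the recognized text.
// Note: the return type of transcribe(audioPath:) has changed between WhisperKit
// releases; adjust the `.text` access if your version returns an array of results.
let result = try await pipe.transcribe(audioPath: "audio.wav")
print(result.text)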

Quantization Details

4-bit palette quantization compresses model weights by replacing each weight with a 4-bit index into a lookup table of 16 representative values (2⁴ = 16). Only the indices and the small table need to be stored, which significantly reduces model size while maintaining reasonable accuracy, making the model suitable for on-device deployment where memory is constrained.
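
To make the lookup-table idea concrete, here is a toy sketch of palettization in plain Swift. It is not the procedure used to produce this model: it assumes a simple 1-D k-means over a single flat weight array, and the names `PalettizedWeights`, `palettize`, `dequantize`, and `packNibbles`, as well as the nibble order, are illustrative choices rather than Core ML's actual storage format.

```swift
import Foundation

/// A palettized tensor: a small lookup table plus one index per weight.
struct PalettizedWeights {
    let palette: [Float]   // 2^bits representative values
    let indices: [UInt8]   // one palette index per original weight
}

/// Index of the palette entry closest to `value`.
func nearestIndex(of value: Float, in palette: [Float]) -> Int {
    var best = 0
    for i in 1..<palette.count where abs(palette[i] - value) < abs(palette[best] - value) {
        best = i
    }
    return best
}

/// Builds a palette with 1-D k-means, starting from evenly spaced centroids.
func palettize(_ weights: [Float], bits: Int = 4, iterations: Int = 20) -> PalettizedWeights {
    precondition(!weights.isEmpty && bits >= 1 && bits <= 8)
    let k = 1 << bits
    let lo = weights.min()!
    let hi = weights.max()!
    var palette: [Float] = (0..<k).map { lo + (hi - lo) * Float($0) / Float(k - 1) }
    var indices = [UInt8](repeating: 0, count: weights.count)

    for _ in 0..<iterations {
        // Assignment step: map every weight to its nearest palette entry.
        for (i, w) in weights.enumerated() {
            indices[i] = UInt8(nearestIndex(of: w, in: palette))
        }
        // Update step: move each entry to the mean of the weights assigned to it.
        var sums = [Float](repeating: 0, count: k)
        var counts = [Int](repeating: 0, count: k)
        for (i, w) in weights.enumerated() {
            sums[Int(indices[i])] += w
            counts[Int(indices[i])] += 1
        }
        for c in 0..<k where counts[c] > 0 {
            palette[c] = sums[c] / Float(counts[c])
        }
    }
    // Final assignment so the indices match the final palette.
    for (i, w) in weights.enumerated() {
        indices[i] = UInt8(nearestIndex(of: w, in: palette))
    }
    return PalettizedWeights(palette: palette, indices: indices)
}

/// Reconstructs approximate weights by looking each index up in the palette.
func dequantize(_ p: PalettizedWeights) -> [Float] {
    p.indices.map { p.palette[Int($0)] }
}

/// Packs two 4-bit indices into each byte (low nibble first; an illustrative order).
func packNibbles(_ indices: [UInt8]) -> [UInt8] {
    stride(from: 0, to: indices.count, by: 2).map { i -> UInt8 in
        let low = indices[i] & 0x0F
        let high: UInt8 = i + 1 < indices.count ? (indices[i + 1] & 0x0F) << 4 : 0
        return low | high
    }
}

// Synthetic weights, used only to exercise the functions above.
let weights: [Float] = (0..<1024).map { Float(sin(Double($0) * 0.37)) * 0.1 }
let q = palettize(weights)
let restored = dequantize(q)
let maxError = zip(weights, restored).map { abs($0 - $1) }.max()!
let packed = packNibbles(q.indices)
print("palette entries: \(q.palette.count), packed bytes: \(packed.count), max error: \(maxError)")
```

In this sketch the 1,024 weights shrink to 512 bytes of packed indices plus a 16-entry table. Against a 16-bit floating-point baseline (an assumption here, since the source precision is not stated above), that is roughly a 4x reduction in weight storage.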
