Breeze-ASR-26 CoreML (4-bit Palette Quantized)

This repository contains an Apple CoreML conversion of MediaTek-Research/Breeze-ASR-26, with weights quantized to 4 bits using palette (lookup-table) quantization for efficient on-device inference.

Model Details

Base Model: MediaTek-Research/Breeze-ASR-26
Architecture: Whisper (large-v2 based)
Format: Apple CoreML (.mlmodelc)
Quantization: 4-bit palette (mixedBitPalettized 4.0-bit)
Model Size: ~890 MB
Languages: Chinese (zh), English (en)

Files

├── AudioEncoder.mlmodelc/    # Mel spectrogram → encoder hidden states (~317 MB)
├── MelSpectrogram.mlmodelc/  # Audio waveform → Mel spectrogram (~372 KB)
├── TextDecoder.mlmodelc/     # Encoder states → token predictions (~533 MB)
├── config.json               # Model configuration
└── generation_config.json    # Generation/decoding parameters

Usage with WhisperKit

This model is designed to run with WhisperKit on Apple devices (iPhone, iPad, Mac).

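import WhisperKit

// Initialize the WhisperKit pipeline with this repository's model.
let pipe = try await WhisperKit(
    model: "weiren119/Breeze-ASR-26-coreml-4bit-palette"
)
// Transcribe a local audio file and print the recognized text.
let result = try await pipe.transcribe(audioPath: "audio.wav")
print(result.text)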

Quantization Details

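4-bit palette quantization compresses model weights by mapping each weight to one of 16 representative values (2⁴ = 16) held in a lookup table, so only a 4-bit index is stored per weight. Relative to 16-bit floating-point weights, for example, that is roughly a quarter of the storage, plus a small overhead for the lookup tables. This significantly reduces model size while maintaining reasonable accuracy, making the model suitable for on-device deployment where memory is constrained.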
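To make the lookup-table idea concrete, here is a minimal sketch in Swift of palettizing a small weight array. It is not the code used to produce this model: the names `PalettizedWeights`, `palettize`, and `dequantize` are hypothetical, the palette is fit with a simple 1-D k-means as an assumption, and each 4-bit index is kept in a full byte for readability (a real format would pack two indices per byte).

```swift
// Toy palette quantization of a weight array. Illustrative only;
// all names here are hypothetical and not part of Core ML or WhisperKit.

struct PalettizedWeights {
    let palette: [Float]   // lookup table, 2^bits entries (16 for 4-bit)
    let indices: [UInt8]   // one palette index per weight
}

// For every weight, find the index of the closest palette entry.
func nearestIndices(_ weights: [Float], _ palette: [Float]) -> [UInt8] {
    return weights.map { (w: Float) -> UInt8 in
        var best = 0
        var bestDistance = Float.infinity
        for (j, c) in palette.enumerated() {
            let d = abs(w - c)
            if d < bestDistance {
                best = j
                bestDistance = d
            }
        }
        return UInt8(best)
    }
}

// Fit a palette with a basic 1-D k-means (assumed here for illustration).
func palettize(_ weights: [Float], bits: Int = 4, iterations: Int = 20) -> PalettizedWeights {
    precondition(!weights.isEmpty, "weights must not be empty")
    precondition((1...8).contains(bits), "bits must be between 1 and 8")
    let k = 1 << bits
    let lo = weights.min()!
    let hi = weights.max()!
    // Start with palette entries spread evenly over the weight range.
    var palette = (0..<k).map { (i: Int) -> Float in
        lo + (hi - lo) * Float(i) / Float(k - 1)
    }
    for _ in 0..<iterations {
        let indices = nearestIndices(weights, palette)
        var sums = [Float](repeating: 0, count: k)
        var counts = [Int](repeating: 0, count: k)
        for (w, j) in zip(weights, indices) {
            sums[Int(j)] += w
            counts[Int(j)] += 1
        }
        // Move each used entry to the mean of its assigned weights.
        for j in 0..<k where counts[j] > 0 {
            palette[j] = sums[j] / Float(counts[j])
        }
    }
    return PalettizedWeights(palette: palette, indices: nearestIndices(weights, palette))
}

// Reconstruct approximate weights by looking each index up in the palette.
func dequantize(_ q: PalettizedWeights) -> [Float] {
    return q.indices.map { (i: UInt8) -> Float in q.palette[Int(i)] }
}

let weights: [Float] = [-0.52, -0.48, -0.11, 0.02, 0.05, 0.31, 0.33, 0.90]
let quantized = palettize(weights)
let restored = dequantize(quantized)
print("palette entries:", quantized.palette.count)
print("restored:", restored)
```

The restored values are approximations: weights that share a palette entry collapse to the same number, which is where the accuracy cost of palettization comes from.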
