CoreML conversion of Qwen/Qwen3-ForcedAligner-0.6B with INT8 palettization for Apple Neural Engine.
Predicts word-level timestamps in a single forward pass.
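As a rough illustration of what palettization means, the sketch below stores each weight as a small integer index into a lookup table (the palette) and reconstructs approximate weights by table lookup. It is not the conversion pipeline used for this model: the names `PalettizedTensor` and `palettize` are hypothetical, and the palette here is evenly spaced for simplicity, whereas real converters typically fit palette entries with k-means clustering.

```swift
import Foundation

/// Illustrative palettized tensor: weights are replaced by indices into a
/// lookup table. Hypothetical type, not part of any CoreML API.
struct PalettizedTensor {
    let palette: [Float]   // 2^bits representative values
    let indices: [UInt8]   // one palette index per original weight

    /// Reconstructs approximate weights by looking each index up in the palette.
    func dequantized() -> [Float] {
        indices.map { palette[Int($0)] }
    }
}

/// Builds an evenly spaced palette over [min, max] and maps every weight to
/// its nearest entry. Assumption: uniform spacing stands in for k-means.
func palettize(_ weights: [Float], bits: Int) -> PalettizedTensor {
    precondition((1...8).contains(bits), "indices are stored as UInt8")
    precondition(!weights.isEmpty, "need at least one weight")
    let levels = 1 << bits
    let lo = weights.min()!
    let hi = weights.max()!
    // Guard against a constant tensor, where hi == lo.
    let step = hi > lo ? (hi - lo) / Float(levels - 1) : 1
    let palette = (0..<levels).map { lo + Float($0) * step }
    let indices = weights.map { w -> UInt8 in
        let nearest = ((w - lo) / step).rounded()
        return UInt8(min(max(nearest, 0), Float(levels - 1)))
    }
    return PalettizedTensor(palette: palette, indices: indices)
}

let weights: [Float] = [-1.0, -0.5, 0.0, 0.5, 1.0]
let packed = palettize(weights, bits: 8)
let restored = packed.dequantized()
```

With 8 bits the palette holds 256 entries, so each weight costs one byte plus a share of the table, and the reconstruction error in this uniform sketch is bounded by half the spacing between palette entries.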
| Model | Description | Quantization |
|---|---|---|
| `encoder.mlmodelc` | Audio encoder (24 layers) | INT8 palettized |
| `decoder.mlmodelc` | Text decoder + classify head (28 layers) | INT8 palettized |
```swift
let aligner = try await CoreMLForcedAligner.fromPretrained(
    modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT8"
)
let aligned = aligner.align(audio: samples, text: "Hello world", sampleRate: 24000)
```
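Once word-level timestamps are available, a common next step is to render them as readable cue lines. The sketch below shows one way to do that. The return type of `align(audio:text:sampleRate:)` is not documented here, so `WordTimestamp` and its field names are assumptions made for illustration; adapt them to whatever the library actually returns.

```swift
import Foundation

// Hypothetical shape for one aligned word. The real type returned by
// `align(audio:text:sampleRate:)` may use different names and units.
struct WordTimestamp {
    let word: String
    let start: Double  // seconds from the beginning of the audio
    let end: Double    // seconds from the beginning of the audio
}

/// Formats a duration in seconds as MM:SS.mmm.
func formatTimestamp(_ seconds: Double) -> String {
    let totalMillis = Int((seconds * 1000).rounded())
    let minutes = totalMillis / 60_000
    let secs = (totalMillis % 60_000) / 1000
    let millis = totalMillis % 1000
    return String(format: "%02d:%02d.%03d", minutes, secs, millis)
}

/// Produces one "start --> end  word" line per aligned word.
func transcriptLines(_ words: [WordTimestamp]) -> [String] {
    words.map { "\(formatTimestamp($0.start)) --> \(formatTimestamp($0.end))  \($0.word)" }
}

// Made-up sample values, only to exercise the formatting.
let sampleWords = [
    WordTimestamp(word: "Hello", start: 0.0, end: 0.5),
    WordTimestamp(word: "world", start: 0.5, end: 1.25),
]
let lines = transcriptLines(sampleWords)
```

The timestamps in `sampleWords` are placeholders rather than model output; only the formatting logic is meant to carry over.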
| Variant | Backend | Size | Model ID |
|---|---|---|---|
| CoreML INT4 | Neural Engine | ~630 MB | aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT4 |
| CoreML INT8 | Neural Engine | ~1.0 GB | aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT8 |
| MLX 4-bit | GPU | ~979 MB | aufklarer/Qwen3-ForcedAligner-0.6B-4bit |
| MLX 8-bit | GPU | ~1.4 GB | aufklarer/Qwen3-ForcedAligner-0.6B-8bit |
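The variant table can also be carried in code, for example to choose a model ID by backend at startup. The following is a minimal sketch: `Backend`, `AlignerVariant`, and `smallestVariant` are hypothetical helpers rather than part of any library, and the sizes are the approximate figures from the table, rounded to whole megabytes.

```swift
// Hypothetical helper types for choosing a variant; not a library API.
enum Backend {
    case neuralEngine
    case gpu
}

struct AlignerVariant {
    let name: String
    let backend: Backend
    let approxSizeMB: Int  // approximate download size from the table
    let modelId: String
}

let variants = [
    AlignerVariant(name: "CoreML INT4", backend: .neuralEngine, approxSizeMB: 630,
                   modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT4"),
    AlignerVariant(name: "CoreML INT8", backend: .neuralEngine, approxSizeMB: 1000,
                   modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT8"),
    AlignerVariant(name: "MLX 4-bit", backend: .gpu, approxSizeMB: 979,
                   modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-4bit"),
    AlignerVariant(name: "MLX 8-bit", backend: .gpu, approxSizeMB: 1400,
                   modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-8bit"),
]

/// Returns the smallest variant that runs on the requested backend, if any.
func smallestVariant(for backend: Backend, in variants: [AlignerVariant]) -> AlignerVariant? {
    variants
        .filter { $0.backend == backend }
        .min { $0.approxSizeMB < $1.approxSizeMB }
}

let aneChoice = smallestVariant(for: .neuralEngine, in: variants)
let gpuChoice = smallestVariant(for: .gpu, in: variants)
```

Selecting by size alone ignores accuracy differences between bit widths, so a real application would likely weigh both.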
Base model
Qwen/Qwen3-ForcedAligner-0.6B