Qwen3-ASR-0.6B - CoreML

CoreML conversion of Qwen/Qwen3-ASR-0.6B for the Apple Neural Engine.

Contains both the audio encoder and the text decoder, so the full pipeline can run on the Neural Engine (no GPU required).

Models

| Model | Description | Quantization |
|---|---|---|
| encoder.mlmodelc | Audio encoder (mel → embeddings) | INT8 palettized |
| embedding.mlmodelc | Token embedding lookup | INT8 palettized |
| decoder.mlmodelc | Text decoder with KV cache (28 layers) | INT8 palettized |
| encoder_int4.mlpackage | Audio encoder source | INT4 palettized |
| encoder_int8.mlpackage | Audio encoder source | INT8 palettized |
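
For readers unfamiliar with the term, n-bit palettization stores each weight as an n-bit index into a lookup table of at most 2^n values. The toy sketch below illustrates the idea only. It uses a uniformly spaced palette, which is an assumption made for brevity (converters typically fit the palette with k-means), and the helper names `palettize` and `depalettize` are hypothetical, not part of any CoreML tooling.

```python
def palettize(weights, nbits):
    """Map each weight to the nearest entry of a 2**nbits-entry lookup table."""
    lo, hi = min(weights), max(weights)
    n = 2 ** nbits
    # ASSUMPTION: uniformly spaced palette, chosen to keep the sketch short.
    step = (hi - lo) / (n - 1) if hi > lo else 0.0
    palette = [lo + i * step for i in range(n)]
    # Each weight becomes the index of its nearest palette entry.
    indices = [min(n - 1, round((w - lo) / step)) if step else 0 for w in weights]
    return palette, indices


def depalettize(palette, indices):
    """Reconstruct approximate weights from the palette and the indices."""
    return [palette[i] for i in indices]


weights = [-1.0, -0.4, 0.1, 0.9, 1.0]
palette, indices = palettize(weights, nbits=2)   # 4-entry palette
print(indices)                                   # [0, 1, 2, 3, 3]
print(depalettize(palette, indices))
```

With 8 bits the palette has 256 entries and with 4 bits it has 16, so the INT4 variant trades reconstruction accuracy for roughly half the index storage.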

Usage

Full CoreML pipeline (encoder + decoder on Neural Engine):

Hybrid mode (CoreML encoder + MLX decoder on GPU):

Architecture

  • Audio encoder: 18-layer Whisper-style transformer (896 dim, 14 heads)
  • Text decoder: 28 layers, 1024 hidden, 16 heads (8 KV heads)
  • KV cache: Fixed 1024 tokens via CoreML MLState
  • Requires: macOS 15+ / iOS 18+ (full CoreML mode)
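
As a rough guide to what the fixed 1024-token cache costs in memory, the figures above can be plugged into the usual KV cache size formula. This is a back-of-the-envelope sketch, not a measurement: the per-head dimension (`head_dim=128`) and 2-byte float16 storage are assumptions that the list above does not state, and both function names are hypothetical.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, max_tokens, bytes_per_value=2):
    """Bytes needed to hold keys and values for every layer at full length."""
    # The factor 2 covers the separate key and value tensors.
    return 2 * layers * kv_heads * head_dim * max_tokens * bytes_per_value


def queries_per_kv_head(query_heads, kv_heads):
    """Number of query heads sharing one key/value head (GQA group size)."""
    if query_heads % kv_heads != 0:
        raise ValueError("query_heads must be a multiple of kv_heads")
    return query_heads // kv_heads


# From the list above: 28 layers, 16 heads, 8 KV heads, 1024-token cache.
# ASSUMPTIONS: head_dim=128 and float16 (2-byte) values are not stated above.
size = kv_cache_bytes(layers=28, kv_heads=8, head_dim=128, max_tokens=1024)
print(f"{size / 2**20:.0f} MiB")       # 112 MiB under these assumptions
print(queries_per_kv_head(16, 8))      # 2
```

Because only 8 of the 16 heads carry keys and values, the cache is half the size it would be with one KV head per query head.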

Links

