Mega-ASR: MLX (bf16)
MLX port of Mega-ASR for Apple Silicon, for use with mlx-audio.
Mega-ASR is a robustness layer over Qwen3-ASR-1.7B: a tiny audio-quality router classifies each utterance as clean or degraded and switches a dense LoRA adapter in or out of the base weights at inference. Degraded audio runs the LoRA (robust) path; clean audio runs the unmodified base path. This yields large WER reductions on noisy and far-field speech while leaving clean-speech accuracy essentially unchanged.
The base weights are stored as dense bf16 on purpose: Mega-ASR adds fp32 LoRA deltas to the base at inference, so the base cannot be quantized without losing the runtime router/LoRA switching.
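The switching described above can be sketched in a few lines. This is a minimal illustration, not the mlx-audio implementation: `matmul`, `effective_weight`, the `scale` factor, and the boolean `degraded` flag standing in for the router's decision are all hypothetical names, and plain Python lists stand in for the bf16 and fp32 arrays.

```python
# Illustrative sketch only: names and shapes are assumptions, not mlx-audio API.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def effective_weight(base, lora_b, lora_a, scale, degraded):
    """Return the weight matrix used for one utterance.

    base:     dense base weight (out x in), kept unquantized
    lora_b:   LoRA up-projection (out x rank)
    lora_a:   LoRA down-projection (rank x in)
    degraded: hypothetical router decision for this utterance
    """
    if not degraded:
        # Clean audio: use the base weights untouched.
        return base
    delta = matmul(lora_b, lora_a)  # low-rank update, shape out x in
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base, delta)
    ]


base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]  # rank 1
lora_a = [[0.5, -0.5]]

clean = effective_weight(base, lora_b, lora_a, scale=2.0, degraded=False)
robust = effective_weight(base, lora_b, lora_a, scale=2.0, degraded=True)
```

Adding the delta element-wise requires the base in a dense floating-point form, which is the reason given above for shipping bf16 weights rather than a quantized base.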
Usage
from mlx_audio.stt import load
model = load("mlx-community/Mega-ASR-MLX-bf16")
result = model.generate("audio.wav", language="en")
print(result.text)
CLI:
python -m mlx_audio.stt.generate --model mlx-community/Mega-ASR-MLX-bf16 --audio audio.wav
The router decides per-utterance automatically; no flags needed.
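For more than one file, the same two calls can be wrapped in a loop. A minimal sketch, assuming only the `load` and `generate(..., language=...)` calls and the `.text` attribute shown above; `transcribe_many`, `main`, and the file names are hypothetical.

```python
def transcribe_many(model, paths, language="en"):
    """Transcribe each path with one loaded model; returns {path: text}."""
    transcripts = {}
    for path in paths:
        # The router picks the clean or robust path per utterance on its own.
        result = model.generate(path, language=language)
        transcripts[path] = result.text
    return transcripts


def main():
    # Imported here so the helper above can be reused without mlx-audio installed.
    from mlx_audio.stt import load

    model = load("mlx-community/Mega-ASR-MLX-bf16")
    for path, text in transcribe_many(model, ["a.wav", "b.wav"]).items():
        print(path, text)
```

Call `main()` on an Apple Silicon machine with mlx-audio installed. Loading the model once and reusing it avoids re-reading the weights for every file.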
Validation
Reproduces the paper's published robustness gains. Word Error Rate (%) on the real NOIZEUS corpus (8 noise types × 4 SNRs × 30 utterances, Apple Silicon):
| SNR | base (Qwen3-ASR) | Mega-ASR (robust) | paper base | paper robust |
|---|---|---|---|---|
| 0 dB | 23.35 | 20.61 | 23.97 | 19.80 |
| 5 dB | 8.47 | 6.51 | - | - |
| 10 dB | 3.31 | 2.17 | 3.41 | 2.79 |
| 15 dB | 2.12 | 0.83 | - | - |
| overall | 9.31 | 7.53 | 9.45 | 7.52 |
Overall robust WER is 7.53 versus the paper's 7.52. That is a ~19% relative reduction over the Qwen3-ASR baseline in this port (9.31 to 7.53), in line with the paper's ~20% (9.45 to 7.52). On clean read speech (FLEURS) the model matches plain Qwen3-ASR, as intended.
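The overall row and the relative reductions can be re-derived from the per-SNR rows. The sketch below assumes the overall figure is the unweighted mean of the four SNR conditions, which matches the table to two decimals; the function and variable names are illustrative.

```python
# Per-SNR WER (%) from the table above, in order: 0, 5, 10, 15 dB.
base_wer = [23.35, 8.47, 3.31, 2.12]
robust_wer = [20.61, 6.51, 2.17, 0.83]


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


def relative_reduction(baseline, improved):
    """Relative WER reduction in percent."""
    return 100.0 * (baseline - improved) / baseline


overall_base = mean(base_wer)
overall_robust = mean(robust_wer)
port_reduction = relative_reduction(overall_base, overall_robust)
paper_reduction = relative_reduction(9.45, 7.52)  # paper's overall row

print(f"overall base   {overall_base:.2f}")
print(f"overall robust {overall_robust:.2f}")
print(f"this port: {port_reduction:.1f}% relative reduction")
print(f"paper:     {paper_reduction:.1f}% relative reduction")
```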
License & attribution
Apache-2.0. Built on zhifeixie/Mega-ASR (adapter + router) and Qwen/Qwen3-ASR-1.7B (base).