---
license: apache-2.0
library_name: mlx
pipeline_tag: automatic-speech-recognition
base_model:
- XiaomiMiMo/MiMo-V2.5-ASR
tags:
- mlx
- apple-silicon
- automatic-speech-recognition
- speech-recognition
- mimo
language:
- zh
- en
---
Part of the [MiMo V2.5 ASR MLX](https://huggingface.co/collections/mlx-community/mimo-v25-asr-6a02cb99466d6a36475a4d70) collection.
# MiMo-V2.5-ASR-MLX
Apple MLX weights for [XiaomiMiMo/MiMo-V2.5-ASR](https://github.com/XiaomiMiMo/MiMo-V2.5-ASR), converted for local ASR on Apple Silicon.
## TL;DR
| | |
|---|---|
| **Variant** | default MLX entry |
| **Best for** | default checkpoint entry |
| **Official code** | [`XiaomiMiMo/MiMo-V2.5-ASR`](https://github.com/XiaomiMiMo/MiMo-V2.5-ASR) |
| **MLX helper** | [`ailuntx/MiMo-V2.5-ASR-MLX`](https://github.com/ailuntx/MiMo-V2.5-ASR-MLX) |
| **MLX runtime dependency** | [`ailuntx/mlx-audio`](https://github.com/ailuntx/mlx-audio) |
| **Audio tokenizer** | [`mlx-community/MiMo-Audio-Tokenizer`](https://huggingface.co/mlx-community/MiMo-Audio-Tokenizer) |
| **Hardware** | Apple Silicon recommended; the HF Spaces CPU fallback only checks that the model loads |
## Quick Start
```bash
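# Clone the MLX helper repo, which contains run_mimo_asr_mlx.py
git clone https://github.com/ailuntx/MiMo-V2.5-ASR-MLX.git
cd MiMo-V2.5-ASR-MLX
# Create a virtual environment and install the MLX runtime dependency
python -m venv .venv
.venv/bin/pip install git+https://github.com/ailuntx/mlx-audio.git
# Download the audio tokenizer and the ASR weights (`hf` is the Hugging Face CLI from huggingface_hub)
hf download mlx-community/MiMo-Audio-Tokenizer --local-dir ./models/MiMo-Audio-Tokenizer
hf download mlx-community/MiMo-V2.5-ASR-MLX --local-dir ./models/MiMo-V2.5-ASR-MLX
# Transcribe an audio file
.venv/bin/python run_mimo_asr_mlx.py --model ./models/MiMo-V2.5-ASR-MLX --audio path/to/audio.wav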
```
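The same steps can also be scripted from Python. The sketch below is an illustration under stated assumptions, not part of the helper repo: it assumes `huggingface_hub` is installed in the environment, uses its `snapshot_download` function to fetch both repos, and then calls `run_mimo_asr_mlx.py` with only the `--model` and `--audio` flags shown above. `MODELS_DIR` and the function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Repo IDs taken from the Quick Start commands.
REPOS = [
    "mlx-community/MiMo-Audio-Tokenizer",
    "mlx-community/MiMo-V2.5-ASR-MLX",
]
MODELS_DIR = Path("models")  # hypothetical name; mirrors ./models/<name>


def local_dir_for(repo_id: str) -> Path:
    """Map 'org/name' to models/name, as in the shell commands."""
    return MODELS_DIR / repo_id.split("/")[-1]


def download_all() -> None:
    """Fetch every repo in REPOS (needs network access)."""
    # Imported lazily so the other helpers work without huggingface_hub.
    from huggingface_hub import snapshot_download

    for repo_id in REPOS:
        snapshot_download(repo_id=repo_id, local_dir=str(local_dir_for(repo_id)))


def build_command(audio: str, python: str = sys.executable) -> list[str]:
    """Build the helper-script invocation with the documented flags only."""
    model_dir = local_dir_for("mlx-community/MiMo-V2.5-ASR-MLX")
    return [python, "run_mimo_asr_mlx.py", "--model", str(model_dir), "--audio", audio]


def transcribe(audio: str) -> None:
    """Run the helper script; call this from the cloned helper checkout."""
    subprocess.run(build_command(audio), check=True)
```

Calling `download_all()` and then `transcribe("path/to/audio.wav")` from inside the cloned `MiMo-V2.5-ASR-MLX` checkout should reproduce the shell commands, provided the interpreter is the one from `.venv`.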
## Variants
| Variant | Best for |
|---|---|
| [`MiMo-V2.5-ASR-MLX`](https://huggingface.co/mlx-community/MiMo-V2.5-ASR-MLX) | default entry |
| [`MiMo-V2.5-ASR-MLX-fp32`](https://huggingface.co/mlx-community/MiMo-V2.5-ASR-MLX-fp32) | high-precision baseline |
| [`MiMo-V2.5-ASR-MLX-bf16`](https://huggingface.co/mlx-community/MiMo-V2.5-ASR-MLX-bf16) | high-quality Apple Silicon use |
| [`MiMo-V2.5-ASR-MLX-8bit`](https://huggingface.co/mlx-community/MiMo-V2.5-ASR-MLX-8bit) | smaller local checkpoint |
| [`MiMo-V2.5-ASR-MLX-4bit`](https://huggingface.co/mlx-community/MiMo-V2.5-ASR-MLX-4bit) | smallest checkpoint |
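
When switching between variants in a script, a small lookup keeps the repo IDs in one place. This is a minimal sketch: the repo IDs are copied from the table above, while the short keys (`"default"`, `"fp32"`, and so on) and the function name are hypothetical labels chosen for illustration.

```python
# Repo IDs copied from the Variants table; the short keys are hypothetical.
VARIANTS = {
    "default": "mlx-community/MiMo-V2.5-ASR-MLX",
    "fp32": "mlx-community/MiMo-V2.5-ASR-MLX-fp32",
    "bf16": "mlx-community/MiMo-V2.5-ASR-MLX-bf16",
    "8bit": "mlx-community/MiMo-V2.5-ASR-MLX-8bit",
    "4bit": "mlx-community/MiMo-V2.5-ASR-MLX-4bit",
}


def repo_for(variant: str = "default") -> str:
    """Return the Hugging Face repo ID for a variant label."""
    try:
        return VARIANTS[variant]
    except KeyError:
        choices = ", ".join(sorted(VARIANTS))
        raise ValueError(f"unknown variant {variant!r}; choose from: {choices}") from None
```

The returned ID can be passed to `hf download` in place of `mlx-community/MiMo-V2.5-ASR-MLX`.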
## Layout
```text
MiMo-V2.5-ASR-MLX/
├── config.json
├── model.safetensors / shards
├── tokenizer files
└── mlx_manifest.json
```
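A quick way to confirm that a download is complete is to check a local directory against this layout. The sketch below is an assumption-based helper, not part of the repo: it checks only the two files named explicitly above and accepts any `*.safetensors` file, because the shard and tokenizer file names are not specified here. The function name is hypothetical.

```python
from pathlib import Path

# Files named explicitly in the layout listing.
REQUIRED_FILES = ("config.json", "mlx_manifest.json")


def missing_items(model_dir: str) -> list[str]:
    """Return the expected items that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # Weights may be one file or several shards; shard naming is not
    # documented, so any *.safetensors file counts (an assumption).
    if not any(root.glob("*.safetensors")):
        missing.append("*.safetensors")
    return missing
```

An empty list from `missing_items("./models/MiMo-V2.5-ASR-MLX")` suggests the weights and config are in place; it does not validate the tokenizer files or the file contents.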
## Conversion Notes
| Component | Source | MLX handling |
|---|---|---|
| ASR model | `XiaomiMiMo/MiMo-V2.5-ASR` | converted to MLX weights |
| audio tokenizer | `mlx-community/MiMo-Audio-Tokenizer` | downloaded separately and passed to the runtime |
| helper script | `ailuntx/MiMo-V2.5-ASR-MLX` | wraps `mlx-audio` loading/generation |
## Validation
Apple Silicon is the intended runtime. The HF Space starts and loads the model on its Linux CPU fallback, but a full transcription on the `cpu-basic` tier can exceed the request timeout.
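
A script can check for the intended platform before loading the model. This is a minimal sketch using only the standard library `platform` module; the function name is hypothetical, and the check assumes a native arm64 Python, since an x86_64 interpreter running under Rosetta reports a different machine type.

```python
import platform


def is_apple_silicon(system=None, machine=None) -> bool:
    """Return True on macOS with a native arm64 Python interpreter.

    The optional arguments override the detected values, which makes
    the check easy to exercise on other machines.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Darwin" and machine == "arm64"
```

For example, a launcher could print a warning and fall back to a load-only check when `is_apple_silicon()` returns `False`.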
## License and Citation
License follows the upstream MiMo release. Cite the original MiMo project for the model and cite [`ailuntx/MiMo-V2.5-ASR-MLX`](https://github.com/ailuntx/MiMo-V2.5-ASR-MLX) for the MLX helper.