---
library_name: mlx
tags:
- mlx
- speech-recognition
- asr
- canary
- apple-silicon
license: cc-by-4.0
language:
- en
- bg
- hr
- cs
- da
- nl
- et
- fi
- fr
- de
- el
- hu
- it
- lv
- lt
- mt
- pl
- pt
- ro
- sk
- sl
- es
- sv
- ru
- uk
---

# Canary MLX

NVIDIA Canary ASR model converted to MLX format for Apple Silicon.

## Usage

```bash
pip install canary-mlx
```

```python
from canary_mlx import load_model

# Load the converted model by its repository ID
model = load_model("qfuxa/canary-mlx")

# Transcribe an audio file, passing the spoken language as a two-letter code
result = model.transcribe("audio.wav", language="en")
print(result)
```
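
To transcribe several files in one run, the call above can be wrapped in a small loop. The following is a minimal sketch, not part of the package: `transcribe_many` is a hypothetical helper, and it assumes only the `model.transcribe(path, language=...)` call shown in the usage example. Whatever that call returns is stored unchanged.

```python
from typing import Any, Dict, Iterable


def transcribe_many(model: Any, paths: Iterable[str], language: str = "en") -> Dict[str, Any]:
    """Hypothetical helper: transcribe each file and map its path to the result."""
    results: Dict[str, Any] = {}
    for path in paths:
        # Same call as in the usage example; the result is kept as returned.
        results[str(path)] = model.transcribe(str(path), language=language)
    return results
```

For example, `transcribe_many(model, ["a.wav", "b.wav"], language="de")` would return a dictionary keyed by file path, with one entry per input file.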

## Model Details

This model is a conversion of NVIDIA's Canary ASR model to Apple's MLX framework.

- **Architecture**: Conformer encoder + Transformer decoder
- **Parameters**: ~1B
- **Supported Languages**: 25 languages (see the `language` list in the metadata above)
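
Since only the 25 languages in the metadata are supported, it can help to validate a language code before calling `transcribe`. The following is a minimal sketch, assuming the `language` argument takes the same two-letter codes as the metadata list. `SUPPORTED_LANGUAGES` and `check_language` are hypothetical names and not part of `canary_mlx`.

```python
# Language codes copied from the model card metadata (25 entries).
SUPPORTED_LANGUAGES = frozenset({
    "en", "bg", "hr", "cs", "da", "nl", "et", "fi", "fr", "de",
    "el", "hu", "it", "lv", "lt", "mt", "pl", "pt", "ro", "sk",
    "sl", "es", "sv", "ru", "uk",
})


def check_language(code: str) -> str:
    """Hypothetical helper: normalize a language code and reject unsupported ones."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(
            f"Unsupported language {code!r}; expected one of {sorted(SUPPORTED_LANGUAGES)}"
        )
    return normalized
```

The normalized code can then be passed on, for example `model.transcribe("audio.wav", language=check_language("FR"))`.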

## Original Model

Based on the NVIDIA NeMo Canary model. See [NVIDIA NeMo](https://github.com/NVIDIA/NeMo) for the original implementation.

## License

Model weights are released under the CC-BY-4.0 license (the same license as the original NVIDIA model).