# mlx-community/Fun-ASR-Nano-2512-fp16
This model was converted to MLX format from FunAudioLLM/Fun-ASR-Nano-2512 using mlx-audio-plus version 0.1.4.
## Features

| Feature | Description |
| --- | --- |
| Multilingual | Supports 13+ languages |
| Translation | Translate speech directly to English text |
| Custom prompting | Guide recognition with domain-specific context |
| Streaming | Real-time token-by-token output |
## Installation

```bash
pip install -U mlx-audio-plus
```
## Usage

### Basic Transcription
```python
from mlx_audio.stt.models.funasr import Model

model = Model.from_pretrained("mlx-community/Fun-ASR-Nano-2512-fp16")

result = model.generate("audio.wav")
print(result.text)
print(f"Duration: {result.duration:.2f}s")
print(f"Language: {result.language}")
```
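To transcribe many recordings, the same `generate` call can be wrapped in a small loop. A minimal sketch (the `transcribe_dir` helper and directory layout are illustrative, not part of the mlx-audio API; the transcription callable is passed in so the sketch stands on its own):

```python
from pathlib import Path

def transcribe_dir(directory, transcribe):
    """Run a transcription callable over every .wav file in a directory.

    `transcribe` stands in for model.generate; passing it as an argument
    keeps this sketch independent of the loaded model.
    """
    results = {}
    for wav in sorted(Path(directory).glob("*.wav")):
        results[wav.name] = transcribe(str(wav))
    return results

# With the model loaded as above, usage would look like:
# texts = transcribe_dir("recordings/", lambda path: model.generate(path).text)
```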
### Translation (Speech to English Text)
```python
result = model.generate(
    "chinese_speech.wav",
    task="translate",
    target_language="en",
)
print(result.text)
```
### Custom Prompting

Provide context to improve recognition accuracy for specialized domains:
```python
# Medical context
result = model.generate(
    "doctor_notes.wav",
    initial_prompt="Medical consultation discussing cardiac symptoms and treatment options.",
)

# Technical context
result = model.generate(
    "tech_podcast.wav",
    initial_prompt="Discussion about machine learning, APIs, and software development.",
)
```
### Streaming Output

Get real-time output as the model generates:
```python
# Print progress while decoding
result = model.generate("audio.wav", verbose=True)

# Or consume tokens as they are produced
for chunk in model.generate("audio.wav", stream=True):
    print(chunk, end="", flush=True)
```
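Streamed chunks can be printed incrementally and still collected into a full transcript. A minimal sketch (the list of strings below simulates the `stream=True` iterator, so the helper runs without loading the model):

```python
def collect_stream(chunks):
    """Print text chunks as they arrive and return the joined transcript."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)  # live output, no trailing newline per chunk
        parts.append(chunk)
    print()  # finish the line once the stream ends
    return "".join(parts)

# Simulated stream standing in for model.generate("audio.wav", stream=True)
transcript = collect_stream(["Hello", ", ", "world", "."])
```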
## Supported Languages

See the [original model](https://huggingface.co/FunAudioLLM/Fun-ASR-Nano-2512) for the full list of supported languages.