# mlx-community/Fun-ASR-Nano-2512-fp16
This model was converted to MLX format from FunAudioLLM/Fun-ASR-Nano-2512 using mlx-audio-plus version 0.1.4.
## Features

| Feature | Description |
| --- | --- |
| Multilingual | Supports 13+ languages |
| Translation | Translate speech directly to English text |
| Custom prompting | Guide recognition with domain-specific context |
| Streaming | Real-time token-by-token output |
## Installation

```bash
pip install -U mlx-audio-plus
```
## Usage

### Basic Transcription
```python
from mlx_audio.stt.models.funasr import Model

model = Model.from_pretrained("mlx-community/Fun-ASR-Nano-2512-fp16")

result = model.generate("audio.wav")
print(result.text)
print(f"Duration: {result.duration:.2f}s")
print(f"Language: {result.language}")
```
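To transcribe many recordings, the same `generate` call can be wrapped in a small loop. A minimal sketch (the `transcribe_dir` helper and directory layout are illustrative, not part of the mlx-audio API; the transcription callable is passed in so the sketch stands on its own):

```python
from pathlib import Path

def transcribe_dir(directory, transcribe):
    """Run a transcription callable over every .wav file in a directory.

    `transcribe` stands in for model.generate; passing it as an argument
    keeps this sketch independent of the loaded model.
    """
    results = {}
    for wav in sorted(Path(directory).glob("*.wav")):
        results[wav.name] = transcribe(str(wav))
    return results

# With the model loaded as above, usage would look like:
# texts = transcribe_dir("recordings/", lambda path: model.generate(path).text)
```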
### Translation (Speech to English Text)
```python
result = model.generate(
    "chinese_speech.wav",
    task="translate",
    target_language="en",
)
print(result.text)
```
### Custom Prompting

Provide context to improve recognition accuracy for specialized domains:
```python
# Medical context
result = model.generate(
    "doctor_notes.wav",
    initial_prompt="Medical consultation discussing cardiac symptoms and treatment options.",
)

# Technical context
result = model.generate(
    "tech_podcast.wav",
    initial_prompt="Discussion about machine learning, APIs, and software development.",
)
```
### Streaming Output

Get real-time output as the model generates:
```python
# Print progress while decoding
result = model.generate("audio.wav", verbose=True)

# Or consume tokens as they are produced
for chunk in model.generate("audio.wav", stream=True):
    print(chunk, end="", flush=True)
```
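Streamed chunks can be printed incrementally and still collected into a full transcript. A minimal sketch (the list of strings below simulates the `stream=True` iterator, so the helper runs without loading the model):

```python
def collect_stream(chunks):
    """Print text chunks as they arrive and return the joined transcript."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)  # live output, no trailing newline per chunk
        parts.append(chunk)
    print()  # finish the line once the stream ends
    return "".join(parts)

# Simulated stream standing in for model.generate("audio.wav", stream=True)
transcript = collect_stream(["Hello", ", ", "world", "."])
```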
## Supported Languages

See the [original model](https://huggingface.co/FunAudioLLM/Fun-ASR-Nano-2512) for the full list of supported languages.