---
library_name: mlx-audio-plus
base_model:
- FunAudioLLM/Fun-ASR-Nano-2512
tags:
- mlx
- funasr
- speech-recognition
- speech-to-text
- stt
pipeline_tag: automatic-speech-recognition
language:
- multilingual
---

# mlx-community/Fun-ASR-Nano-2512-8bit

This model was converted to MLX format from [FunAudioLLM/Fun-ASR-Nano-2512](https://huggingface.co/FunAudioLLM/Fun-ASR-Nano-2512) using [mlx-audio-plus](https://github.com/DePasqualeOrg/mlx-audio-plus) version **0.1.4**.

## Features

| Feature | Description |
|---------|-------------|
| **Multilingual** | Supports 13+ languages |
| **Translation** | Translate speech directly to English text |
| **Custom prompting** | Guide recognition with domain-specific context |
| **Streaming** | Real-time token-by-token output |

## Installation

```bash
pip install -U mlx-audio-plus
```

## Usage

### Basic Transcription

```python
from mlx_audio.stt.models.funasr import Model

# Load the model
model = Model.from_pretrained("mlx-community/Fun-ASR-Nano-2512-8bit")

# Transcribe audio
result = model.generate("audio.wav")
print(result.text)
# Output: "The quick brown fox jumps over the lazy dog."

print(f"Duration: {result.duration:.2f}s")
print(f"Language: {result.language}")
```

### Translation (Speech to English Text)

```python
# Translate Chinese, Japanese, etc. audio to English
result = model.generate(
    "chinese_speech.wav",
    task="translate",
    target_language="en"
)
print(result.text)  # English translation
```

### Custom Prompting

Provide context to improve recognition accuracy in specialized domains:

```python
# Medical transcription
result = model.generate(
    "doctor_notes.wav",
    initial_prompt="Medical consultation discussing cardiac symptoms and treatment options."
)

# Technical content
result = model.generate(
    "tech_podcast.wav",
    initial_prompt="Discussion about machine learning, APIs, and software development."
)
```

### Streaming Output

Get output in real time as the model generates:

```python
# Print tokens as they're generated
result = model.generate("audio.wav", verbose=True)
# Tokens stream to stdout in real time

# Or use the streaming generator
for chunk in model.generate("audio.wav", stream=True):
    print(chunk, end="", flush=True)
```

## Supported Languages

See the [original model](https://huggingface.co/FunAudioLLM/Fun-ASR-Nano-2512) for the full list of supported languages.
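
When using the streaming generator from the Streaming Output example above, you often want the full transcript afterwards as well as live output. The sketch below shows one way to do that; `stream_transcript` is a hypothetical helper (not part of the library), and the only library behavior it assumes is that `model.generate("audio.wav", stream=True)` yields text chunks as shown above:

```python
def stream_transcript(chunks):
    """Echo each streamed text chunk as it arrives, then return
    the accumulated full transcript as a single string."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)  # live output
        parts.append(chunk)
    print()  # final newline after streaming ends
    return "".join(parts)

# With a loaded model (see Basic Transcription above):
#   text = stream_transcript(model.generate("audio.wav", stream=True))
```

Because the helper takes any iterable of strings, it also works with other chunk sources and is easy to unit-test without loading a model.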