metadata
language:
  - en
  - zh
license: mit
pipeline_tag: automatic-speech-recognition
tags:
  - ASR
  - Transcription
  - Diarization
  - Speech-to-Text
  - mlx
  - speech-to-text
  - speech
  - transcription
  - asr
  - stt
  - mlx-audio
library_name: mlx-audio

mlx-community/VibeVoice-ASR-bf16

This model was converted to MLX format from microsoft/VibeVoice-ASR using mlx-audio version 0.3.0.

Refer to the original model card for more details on the model.

Use with mlx-audio

pip install -U mlx-audio

CLI Example:

python -m mlx_audio.stt.generate --model mlx-community/VibeVoice-ASR-bf16 --audio "audio.wav"
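To transcribe several files, the CLI call above can be wrapped in a small script. The following is a minimal sketch, not part of mlx-audio: the helper names `build_command` and `transcribe_folder`, the `*.wav` pattern, and the `dry_run` switch are all hypothetical, and only the `--model` and `--audio` flags shown above are used.

```python
import subprocess
import sys
from pathlib import Path

MODEL_ID = "mlx-community/VibeVoice-ASR-bf16"


def build_command(audio_path, model=MODEL_ID):
    """Build the argv list for one run, using the same flags as the CLI example."""
    return [
        sys.executable, "-m", "mlx_audio.stt.generate",
        "--model", model,
        "--audio", str(audio_path),
    ]


def transcribe_folder(folder, pattern="*.wav", dry_run=False):
    """Invoke the CLI once per matching file and return the commands that were built.

    With dry_run=True nothing is executed, which is handy for checking the
    file list before starting a long job.
    """
    commands = [build_command(p) for p in sorted(Path(folder).glob(pattern))]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if a run exits non-zero
            subprocess.run(cmd, check=True)
    return commands
```

Calling `transcribe_folder("recordings", dry_run=True)` returns the commands without running them. Because each command starts a new process, the model is loaded again for every file; for large batches, loading it once through the Python API is likely the better fit.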

Python Example:

from mlx_audio.stt.utils import load_model
from mlx_audio.stt.generate import generate_transcription

# Load the MLX weights by Hugging Face repo ID
model = load_model("mlx-community/VibeVoice-ASR-bf16")

# Transcribe a single audio file and save the result as plain text
transcription = generate_transcription(
    model=model,
    audio_path="path_to_audio.wav",
    output_path="path_to_output.txt",
    format="txt",
    verbose=True,
)
print(transcription.text)