Instructions to use openai/whisper-large-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use openai/whisper-large-v2 with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("automatic-speech-recognition", model="openai/whisper-large-v2")

    # Load model directly
    from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

    processor = AutoProcessor.from_pretrained("openai/whisper-large-v2")
    model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large-v2")
- Notebooks
- Google Colab
- Kaggle
Difference in Transcription Quality Between Local Whisper Large V2 and Model Card Inference API
#103
by nkanaka1 - opened
I've recently started using OpenAI's Whisper for transcribing audio files, specifically using the whisper.load_model("large-v2") configuration in my local environment. I expected to achieve a high level of accuracy based on the model's reported capabilities.
However, I've noticed that the transcription results I get locally are significantly worse than those I get when using the model's inference API as showcased on the model's card.
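One way to make "significantly worse" concrete is to transcribe the same file both ways and score each transcript against a hand-checked reference. The sketch below is only illustrative: `word_error_rate`, `transcribe_local`, and `transcribe_transformers` are hypothetical helper names, and it assumes the `openai-whisper` and `transformers` packages are installed. Whether the hosted inference API uses the same pipeline settings (for example `chunk_length_s=30`) is an assumption, not something stated in this thread.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # Dynamic-programming edit distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def transcribe_local(audio_path: str) -> str:
    """Transcribe with the openai-whisper package, as described in the post."""
    import whisper  # imported lazily so the scoring helper works without it

    model = whisper.load_model("large-v2")
    return model.transcribe(audio_path)["text"]


def transcribe_transformers(audio_path: str) -> str:
    """Transcribe with the Transformers pipeline (assumed to be close to the hosted API)."""
    from transformers import pipeline  # imported lazily for the same reason

    pipe = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-large-v2",
        chunk_length_s=30,  # assumption: chunked long-form decoding
    )
    return pipe(audio_path)["text"]
```

With a reference transcript in hand, comparing `word_error_rate(reference, transcribe_local(path))` against `word_error_rate(reference, transcribe_transformers(path))` gives a number to report alongside the audio sample, which makes the difference easier for others to reproduce.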