Fix error in config.json #9
Opened by pere
The decoder_start_token_id should refer to the <|startoftranscript|> token in the vocabulary.
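For anyone who wants to sanity-check this kind of thing, here is a minimal sketch of the comparison. The helper name `token_id_matches` and the toy ids are made up for illustration; they are not the real Whisper values.

```python
def token_id_matches(config: dict, vocab: dict, key: str, token: str) -> bool:
    """Return True if config[key] equals the id the vocabulary assigns to token."""
    expected = vocab[token]  # raises KeyError if the token is not in the vocabulary
    return config.get(key) == expected


# Toy values for illustration only (hypothetical ids, not the real ones).
toy_vocab = {"<|endoftext|>": 100, "<|startoftranscript|>": 101}
broken_config = {"decoder_start_token_id": 100}
fixed_config = {"decoder_start_token_id": 101}

broken_ok = token_id_matches(
    broken_config, toy_vocab, "decoder_start_token_id", "<|startoftranscript|>"
)
fixed_ok = token_id_matches(
    fixed_config, toy_vocab, "decoder_start_token_id", "<|startoftranscript|>"
)
print(broken_ok, fixed_ok)
```

Against the real repo, the expected id could presumably be looked up with the tokenizer's `convert_tokens_to_ids("<|startoftranscript|>")` and compared with `decoder_start_token_id` from the loaded config.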
Thanks for the fix. I agree that this needs to be corrected, as it should match v2 in its generation config: https://huggingface.co/openai/whisper-large-v2/blob/696465c62215e36a9ab3f9b7672fe7749f1a1df5/config.json#L19
patrickvonplaten changed pull request status to merged
Thanks a lot @pere
Good catch @pere! We converted the generation_config on its own but missed the generation attributes in the config. The bos_token_id and eos_token_id also need updating: https://huggingface.co/openai/whisper-large-v3/discussions/25#6555f5d2ef6e96329fd5db2f
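A rough sketch of how drift between the two files could be spotted, assuming both have been loaded into plain dicts (for example with `json.load`). The function name `find_mismatches` and the toy values are hypothetical, not taken from the actual repo.

```python
# Generation attributes named in this thread that live in both files.
GENERATION_KEYS = ("decoder_start_token_id", "bos_token_id", "eos_token_id")


def find_mismatches(config: dict, generation_config: dict, keys=GENERATION_KEYS) -> dict:
    """Map each differing key to (value in config, value in generation_config)."""
    mismatches = {}
    for key in keys:
        in_config = config.get(key)
        in_generation = generation_config.get(key)
        if in_config != in_generation:
            mismatches[key] = (in_config, in_generation)
    return mismatches


# Toy values for illustration only (hypothetical ids, not the real ones).
toy_config = {"decoder_start_token_id": 100, "bos_token_id": 7, "eos_token_id": 100}
toy_generation_config = {"decoder_start_token_id": 101, "bos_token_id": 100, "eos_token_id": 100}

diff = find_mismatches(toy_config, toy_generation_config)
print(diff)
```

An empty result would mean the generation attributes in the config agree with the standalone generation config for the keys checked.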