ONNX implementation
Can anyone suggest how to use the exported whisper-large model (ONNX version) for transcription or translation?
This may not be exactly what you wanted, but there is an example of transcribing an audio stream on GitHub.
I used the library from GitHub, since I couldn't find an inference example for Hugging Face.
I got the following:
import whisper
...
model = whisper.load_model("medium.en")
result = model.transcribe("/path/to/file.mp3", language="en")
...
You can use it with ORT pipeline: https://github.com/huggingface/optimum/pull/420#issue-1406136285
Or with ONNX Runtime directly: https://huggingface.co/docs/transformers/serialization#exporting-a-model-to-onnx (here you'll need to modify the template code snippet to pass the appropriate inputs to the ONNX model)
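For the ORT pipeline route, a rough sketch could look like the following. It assumes a reasonably recent Optimum install in which `ORTModelForSpeechSeq2Seq.from_pretrained` accepts `export=True` (older releases used `from_transformers=True`), and a Transformers version whose Whisper generation accepts `task` and `language` through `generate_kwargs`; older versions may need forced decoder IDs instead. The helper names `whisper_generate_kwargs` and `build_onnx_asr_pipeline` are made up for illustration and are not part of either library.

```python
from typing import Any, Dict, Optional


def whisper_generate_kwargs(task: str = "transcribe",
                            language: Optional[str] = None) -> Dict[str, Any]:
    """Build the generate_kwargs dict for an ASR pipeline call (hypothetical helper)."""
    if task not in ("transcribe", "translate"):
        raise ValueError(f"task must be 'transcribe' or 'translate', got {task!r}")
    kwargs: Dict[str, Any] = {"task": task}
    if language is not None:
        # Assumption: the installed Transformers version accepts a language
        # argument for Whisper generation; the accepted spelling may vary.
        kwargs["language"] = language
    return kwargs


def build_onnx_asr_pipeline(model_id: str = "openai/whisper-large-v2"):
    """Load Whisper through ONNX Runtime and wrap it in an ASR pipeline."""
    # Imported here so that defining the function does not require
    # optimum or transformers to be installed.
    from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
    from transformers import AutoProcessor, pipeline

    processor = AutoProcessor.from_pretrained(model_id)
    # export=True converts the PyTorch checkpoint to ONNX at load time.
    model = ORTModelForSpeechSeq2Seq.from_pretrained(model_id, export=True)
    return pipeline(
        "automatic-speech-recognition",
        model=model,
        tokenizer=processor.tokenizer,
        feature_extractor=processor.feature_extractor,
    )


# Example call (downloads and exports the model, so it is left commented out):
# asr = build_onnx_asr_pipeline()
# result = asr("/path/to/file.mp3",
#              generate_kwargs=whisper_generate_kwargs("translate"))
# print(result["text"])
```

Exporting the large checkpoint on every load is slow, so in practice it is probably worth saving the exported model once with `save_pretrained` and loading it from that directory afterwards.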
Hi @sanchit-gandhi.
The ORT pipeline from https://github.com/huggingface/optimum/pull/420#issue-1406136285 throws an error with the latest version of transformers (4.26.0).
Hey @kirankumaram - could you open an issue on the optimum repo please with a description of the error and a link to your Colab? https://github.com/huggingface/optimum/issues/new?assignees=&labels=bug&template=bug-report.yml
