Paper: Robust Speech Recognition via Large-Scale Weak Supervision (arXiv:2212.04356)
How to use versae/whisper-large-v3 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="versae/whisper-large-v3")

# Load model directly
from transformers import AutoProcessor, AutoModel
processor = AutoProcessor.from_pretrained("versae/whisper-large-v3")
model = AutoModel.from_pretrained("versae/whisper-large-v3")

Version 3 of OpenAI's Whisper Large model, converted from https://openaipublic.azureedge.net/main/whisper/models/e5b1a55b89c1367dacf97e3e19bfd829a01529dbfdeefa8caeb59b3f1b81dadb/large-v3.pt using Hugging Face's conversion script.
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalise to many datasets and domains without the need for fine-tuning.
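The two tasks mentioned above, speech recognition and speech translation, can be selected at inference time through the pipeline's `generate_kwargs`. The following is a minimal sketch rather than an official recipe: the helper names `build_generate_kwargs` and `run_whisper` are hypothetical and introduced only for illustration, and the audio path passed in is assumed to point to a file the pipeline can decode. Whisper's `translate` task produces English text; `transcribe` keeps the spoken language.

```python
from typing import Optional

MODEL_ID = "versae/whisper-large-v3"  # checkpoint named in this card
VALID_TASKS = ("transcribe", "translate")


def build_generate_kwargs(task: str = "transcribe",
                          language: Optional[str] = None) -> dict:
    """Build generation options selecting Whisper's task and source language.

    Hypothetical helper: it only assembles a dict that is later passed to the
    pipeline as `generate_kwargs`.
    """
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {VALID_TASKS}, got {task!r}")
    kwargs = {"task": task}
    if language is not None:
        # Source language of the audio, e.g. "french"; omit to auto-detect.
        kwargs["language"] = language
    return kwargs


def run_whisper(audio_path: str,
                task: str = "transcribe",
                language: Optional[str] = None) -> str:
    """Transcribe or translate one audio file and return the text.

    Downloads the checkpoint on first use, so this needs network access
    and enough memory for a large model.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    pipe = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = pipe(audio_path,
                  generate_kwargs=build_generate_kwargs(task, language))
    return result["text"]
```

For example, `run_whisper("sample.wav")` would return a transcript in the spoken language, while `run_whisper("sample.wav", task="translate", language="french")` would return an English translation of French speech. Here `sample.wav` is a placeholder file name.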
Whisper was proposed in the paper Robust Speech Recognition via Large-Scale Weak Supervision by Alec Radford et al. from OpenAI. The original code repository is available at https://github.com/openai/whisper.