Large audio file (more than 2 hours)
#59
by jonfv - opened
My code:
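from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-large-v2",
    # Note: "br" is Whisper's language code for Breton ("pt" is Portuguese)
    generate_kwargs={"language": "br", "task": "transcribe"},
    device="cpu",
    use_fast=True,
)
# Transcribe in 30 s chunks that overlap by 4 s on the left and 2 s on the right
res = pipe(YT_AUDIO_FILE, batch_size=10, return_timestamps=True, chunk_length_s=30, stride_length_s=(4, 2))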
Why does the pipe finish before reaching the end of the audio? The audio is more than 2 hours long, but only minutes of transcription are generated.
Thx!!!
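For context on what `chunk_length_s=30` and `stride_length_s=(4, 2)` ask for, here is a minimal sketch of overlapping-window chunking in plain Python. It assumes each window advances by the chunk length minus both strides, which is my understanding of how chunked inference is usually laid out; `chunk_windows` is a hypothetical helper written for illustration, not a function from Transformers.

```python
def chunk_windows(total_s, chunk_s=30.0, stride_left_s=4.0, stride_right_s=2.0):
    """Return (start, end) windows in seconds for overlapping chunked inference.

    Assumption: consecutive windows advance by chunk_s - stride_left_s - stride_right_s.
    """
    step = chunk_s - stride_left_s - stride_right_s
    if step <= 0:
        raise ValueError("strides must sum to less than the chunk length")
    windows = []
    start = 0.0
    while start < total_s:
        # Clip the final window to the end of the audio
        end = min(start + chunk_s, total_s)
        windows.append((start, end))
        if start + chunk_s >= total_s:
            break  # this window already reaches the end of the audio
        start += step
    return windows


windows = chunk_windows(2 * 60 * 60)  # a 2-hour file
print(len(windows), windows[:2], windows[-1])
```

Under these assumptions a 2-hour file is split into 300 windows, so with `batch_size=10` the model would need roughly 30 batches to cover the whole file. If the output stops after a few minutes, only a small fraction of those windows made it into the result.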
Hey @jonfv - your code looks good. Could you share the audio file so I can reproduce locally on my end?
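When reproducing, it may help to measure how far into the audio the transcription actually gets. The sketch below assumes the result has the shape the pipeline returns with `return_timestamps=True`, that is, a `"chunks"` list whose items carry a `"timestamp"` tuple of `(start, end)` in seconds. `transcript_coverage` is a hypothetical helper and `example_chunks` is made-up data for illustration, not output from this thread.

```python
def transcript_coverage(chunks, audio_duration_s):
    """Return (last_end_s, fraction_covered) for a list of timestamped chunks."""
    ends = []
    for chunk in chunks:
        start, end = chunk["timestamp"]
        # The final segment's end timestamp can be None; fall back to its start
        last = end if end is not None else start
        if last is not None:
            ends.append(last)
    last_end = max(ends, default=0.0)
    return last_end, last_end / audio_duration_s


# Made-up chunks standing in for res["chunks"]
example_chunks = [
    {"timestamp": (0.0, 4.5), "text": " first"},
    {"timestamp": (4.5, 9.0), "text": " second"},
    {"timestamp": (9.0, None), "text": " third"},
]
last_end, fraction = transcript_coverage(example_chunks, audio_duration_s=7200.0)
print(f"last timestamp: {last_end:.1f} s, covered: {fraction:.2%}")
```

With the real output, `transcript_coverage(res["chunks"], duration)` would show whether the timestamps stop early or whether the full duration is covered but the text is sparse.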