Instructions to use openai/whisper-tiny with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use openai/whisper-tiny with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="openai/whisper-tiny")
# Load model directly
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
processor = AutoProcessor.from_pretrained("openai/whisper-tiny")
model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-tiny")
- Notebooks
- Google Colab
- Kaggle
Commit f8fb469
1 Parent(s): 4954caa
Fix Syntax Error, close parenthesis. (#2)
- Fix Syntax Error, close parenthesis. (4204457610c9d68596a993f543b5ebd970ce44f1)
Co-authored-by: steven tartakovsky <startakovsky@users.noreply.huggingface.co>
README.md CHANGED
@@ -227,7 +227,7 @@ The "<|en|>" token is used to specify that the speech is in english and should b
 >>> input_features = processor(ds[0]["audio"]["array"], return_tensors="pt").input_features
 
 >>> # Generate logits
->>> logits = model(input_features, decoder_input_ids = torch.tensor([[50258]]).logits
+>>> logits = model(input_features, decoder_input_ids = torch.tensor([[50258]])).logits
 >>> # take argmax and decode
 >>> predicted_ids = torch.argmax(logits, dim=-1)
 >>> transcription = processor.batch_decode(predicted_ids)
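The one-character change in this diff can be checked without downloading the model. The sketch below is not part of the commit. It assumes only that the two versions of line 230 are copied verbatim with the `>>> ` prompt stripped, and it uses the standard-library `ast` module to parse them without executing anything. The helper names `parses` and `paren_balance` are hypothetical and exist only for illustration.

```python
import ast

# The two versions of README line 230 from the diff, ">>> " prompt stripped.
OLD_LINE = "logits = model(input_features, decoder_input_ids = torch.tensor([[50258]]).logits"
NEW_LINE = "logits = model(input_features, decoder_input_ids = torch.tensor([[50258]])).logits"


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python (illustrative helper)."""
    try:
        # ast.parse only builds a syntax tree; names like `model` are never looked up.
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def paren_balance(source: str) -> int:
    """Count opening minus closing round brackets; 0 means balanced."""
    return source.count("(") - source.count(")")


print(parses(OLD_LINE), paren_balance(OLD_LINE))  # False 1
print(parses(NEW_LINE), paren_balance(NEW_LINE))  # True 0
```

The old line leaves the call to `model(` unclosed, so `.logits` would have been read off the tensor argument rather than the model output had the line parsed at all. The added parenthesis closes the call before the attribute access.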