How to use openai/whisper-large-v2 with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("automatic-speech-recognition", model="openai/whisper-large-v2")

    # Load model directly
    from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

    processor = AutoProcessor.from_pretrained("openai/whisper-large-v2")
    model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large-v2")

Add forced decoder ids (#2)
Opened by sanchit-gandhi
Files changed: config.json (+18 -0)
@@ -23,6 +23,24 @@
   "encoder_layerdrop": 0.0,
   "encoder_layers": 32,
   "eos_token_id": 50256,
+  "forced_decoder_ids": [
+    [
+      1,
+      50258
+    ],
+    [
+      2,
+      50259
+    ],
+    [
+      3,
+      50359
+    ],
+    [
+      4,
+      50363
+    ]
+  ],
   "init_std": 0.02,
   "is_encoder_decoder": true,
   "max_source_positions": 1500,
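
As a rough illustration of what the added entry does, the sketch below assumes that each pair in `forced_decoder_ids` is `[decoding position, token id]` and that the listed token must be emitted at that position regardless of the model's scores. The helper `pick_token` and the toy scores are hypothetical stand-ins for illustration, not part of Transformers.

```python
import json

# Fragment mirroring the "forced_decoder_ids" entry added in this diff.
config_fragment = json.loads(
    '{"forced_decoder_ids": [[1, 50258], [2, 50259], [3, 50359], [4, 50363]]}'
)

# Map decoding position -> token id that must be emitted at that position.
forced = {pos: tok for pos, tok in config_fragment["forced_decoder_ids"]}


def pick_token(position, scores, forced_map):
    """Return the forced token for this position, else the highest-scoring token.

    `scores` is a hypothetical dict of token id -> score that stands in
    for the model's logits at one decoding step.
    """
    if position in forced_map:
        return forced_map[position]
    return max(scores, key=scores.get)


# Toy scores: this stand-in "model" always prefers token 7.
toy_scores = {7: 0.9, 50258: 0.1}

# Positions 1-4 are overridden by the config; positions 5-6 fall back to the scores.
decoded = [pick_token(p, toy_scores, forced) for p in range(1, 7)]
print(decoded)
```

With this configuration the first four generated positions are fixed, and free decoding only begins at position 5.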