Example code fixed
#2
by samyxdev - opened
README.md
CHANGED
@@ -43,7 +43,7 @@ To transcribe audio files the model can be used as a standalone acoustic model a
 ds = load_dataset("patrickvonplaten/librispeech_asr_dummy", "clean", split="validation")
 
 # tokenize
-input_values = processor(ds[0]["audio"]["array"],
+input_values = processor(ds[0]["audio"]["array"], return_tensors="pt", padding="longest").input_values # Batch size 1
 
 # retrieve logits
 logits = model(input_values).logits
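
For readers unfamiliar with the `padding="longest"` argument added in this change, here is a minimal sketch of what padding a batch to its longest sequence means conceptually. It is plain Python and not the Transformers implementation; the helper name `pad_to_longest` and the `0.0` pad value are assumptions made for illustration only.

```python
from typing import List, Sequence


def pad_to_longest(
    batch: Sequence[Sequence[float]], pad_value: float = 0.0
) -> List[List[float]]:
    """Right-pad every sequence in the batch to the length of the longest one.

    Hypothetical helper for illustration; the real processor handles this
    internally when called with padding="longest".
    """
    if not batch:
        return []
    target_len = max(len(seq) for seq in batch)
    return [list(seq) + [pad_value] * (target_len - len(seq)) for seq in batch]


# A batch of one sequence (the "Batch size 1" case) is left unchanged.
single = pad_to_longest([[0.1, -0.2, 0.3]])

# With several sequences, the shorter ones are padded on the right.
mixed = pad_to_longest([[0.1, -0.2, 0.3], [0.5]])
```

With a single audio clip, as in the README example, padding therefore has no visible effect; it only matters once several clips of different lengths are passed together.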