How to use speech-seq2seq/wav2vec2-2-gpt2-medium with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="speech-seq2seq/wav2vec2-2-gpt2-medium")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSpeechSeq2Seq

tokenizer = AutoTokenizer.from_pretrained("speech-seq2seq/wav2vec2-2-gpt2-medium")
model = AutoModelForSpeechSeq2Seq.from_pretrained("speech-seq2seq/wav2vec2-2-gpt2-medium")
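
Once a pipeline is loaded, it can be called on raw audio. The following is a minimal sketch of one way to do that, not a documented recipe for this checkpoint. The helper names `build_asr_input` and `transcribe` are hypothetical, and the 16 kHz sampling rate is an assumption based on typical wav2vec2 encoders, so check the model's feature extractor configuration before relying on it. The sine tone is placeholder audio standing in for a real recording.

```python
import numpy as np


def build_asr_input(samples, sampling_rate=16000):
    """Wrap raw mono audio in the dict form the ASR pipeline accepts.

    The 16 kHz default is an assumption, not a value taken from the model card.
    """
    waveform = np.asarray(samples, dtype=np.float32)
    if waveform.ndim != 1:
        raise ValueError("expected mono audio as a 1-D sequence of samples")
    return {"raw": waveform, "sampling_rate": int(sampling_rate)}


def transcribe(pipe, samples, sampling_rate=16000):
    """Run an ASR pipeline on raw samples and return the decoded text."""
    result = pipe(build_asr_input(samples, sampling_rate))
    return result["text"]


# One second of a 440 Hz tone as placeholder audio (not real speech).
t = np.arange(16000) / 16000.0
tone = 0.1 * np.sin(2 * np.pi * 440.0 * t)
asr_input = build_asr_input(tone)

# With the real model (downloads the weights on first use):
# from transformers import pipeline
# pipe = pipeline("automatic-speech-recognition",
#                 model="speech-seq2seq/wav2vec2-2-gpt2-medium")
# print(transcribe(pipe, tone))
```

Passing `pipe` in as an argument keeps the helper independent of how the model was loaded, so the same function works with any automatic-speech-recognition pipeline.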