Instructions to use openai/whisper-medium with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use openai/whisper-medium with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("automatic-speech-recognition", model="openai/whisper-medium")

# Load model directly
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

processor = AutoProcessor.from_pretrained("openai/whisper-medium")
model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-medium")
```
- Notebooks
- Google Colab
- Kaggle
Inference API for other languages
I am trying to use the Inference API for other languages, but without success so far.
```python
import base64

import requests

API_URL = "https://api-inference.huggingface.co/models/openai/whisper-medium"
headers = {"Authorization": "Bearer xxxx"}

def query(filename, language="pt"):
    with open(filename, "rb") as f:
        sound = f.read()
    sound_base64 = base64.b64encode(sound).decode("utf-8")  # Encode as base64
    data = {"inputs": {"speech": sound_base64, "language": language}}
    response = requests.post(API_URL, headers=headers, json=data)
    return response.json()
```
I get this response:
{'error': ['Error in inputs: Malformed soundfile']}
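For comparison, the Inference API's automatic-speech-recognition examples send the raw audio file bytes as the request body, with no JSON wrapper and no base64 encoding, which may be why the payload above is rejected as a malformed sound file. A minimal sketch of that request shape (the token is a placeholder, and the response format is an assumption):

```python
import requests

API_URL = "https://api-inference.huggingface.co/models/openai/whisper-medium"

def build_asr_request(audio_bytes, token):
    # Raw audio bytes as the request body: no JSON wrapper, no base64.
    return requests.Request(
        "POST",
        API_URL,
        headers={"Authorization": f"Bearer {token}"},
        data=audio_bytes,
    ).prepare()

def query(filename, token):
    # Read the file and POST its bytes directly to the endpoint.
    with open(filename, "rb") as f:
        prepared = build_asr_request(f.read(), token)
    with requests.Session() as session:
        return session.send(prepared).json()
```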
I am also looking for a way to get a Spanish transcript from Spanish audio, instead of a translation from Spanish audio to English. I have looked for instructions but haven't managed to find anything.
Any help is welcome.
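When running the model locally with Transformers, Whisper can be forced to transcribe in the source language (rather than translate to English) by passing `language` and `task` through the pipeline's `generate_kwargs`. A sketch under those assumptions; the helper and function names are mine, and the call downloads the full whisper-medium weights on first use:

```python
def build_generate_kwargs(language, task="transcribe"):
    # task="transcribe" keeps the output in the source language;
    # task="translate" would produce English instead.
    return {"language": language, "task": task}

def transcribe_in_source_language(audio_path, language):
    # Heavyweight: downloads openai/whisper-medium on first use.
    from transformers import pipeline

    pipe = pipeline("automatic-speech-recognition", model="openai/whisper-medium")
    return pipe(audio_path, generate_kwargs=build_generate_kwargs(language))["text"]
```

For Spanish audio to Spanish text this would be called as `transcribe_in_source_language("audio.mp3", "spanish")`.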