Unable to process complete audio

#1
by eshwarpilli - opened

I'm passing in a 40-second audio clip, but the model only transcribes up to about the 20-second mark. What could be the issue? Am I missing anything?

I even tried setting max_new_tokens=8192, but still no luck.

from transformers import AutoModel
import librosa

repo_name = "ekacare/parrotlet-a-en-5b"
model = AutoModel.from_pretrained(repo_name, trust_remote_code=True)

audio_path = "<myaudio.wav>"
audio, sample_rate = librosa.load(audio_path, sr=16000)

transcription = model.transcribe(audio, sample_rate)
print("Transcription:", transcription)

Also, on a side note: do you provide an API? The API documentation seems to be in progress; I get a 404.

Eka Care org

Each audio input has to be 30 seconds or shorter, because the model uses a Whisper-based encoder.
Hope that helps.
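
For longer recordings, one possible workaround is to split the audio into chunks of at most 30 seconds and transcribe each chunk separately. The following is only a minimal sketch: `split_audio` is a hypothetical helper (not part of the model's repo), and it cuts at fixed boundaries, so a word that straddles a boundary may be split. Cutting at silences would likely give cleaner results.

```python
MAX_SECONDS = 30.0  # per-input limit stated in the reply above


def split_audio(audio, sample_rate, max_seconds=MAX_SECONDS):
    """Split a 1-D audio sequence into consecutive chunks of at most max_seconds.

    Works on anything sliceable, e.g. the NumPy array returned by librosa.load.
    """
    chunk_len = int(max_seconds * sample_rate)  # samples per chunk
    if chunk_len <= 0:
        raise ValueError("max_seconds * sample_rate must be positive")
    return [audio[i:i + chunk_len] for i in range(0, len(audio), chunk_len)]


# 40 seconds of silence standing in for a real recording (illustrative data only)
sample_rate = 16000
audio = [0.0] * (40 * sample_rate)

chunks = split_audio(audio, sample_rate)
print([len(c) / sample_rate for c in chunks])  # chunk durations in seconds
```

With the model loaded as in the original post, each chunk could then be passed to `model.transcribe(chunk, sample_rate)` and the results joined. This assumes `transcribe` returns text for a single chunk, which the thread does not confirm.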

ds-EkaCare changed discussion status to closed
