Unable to process complete audio
#1
by eshwarpilli - opened
I'm passing in a 40-second audio file, but the model only transcribes up to the 20-second mark. What could be the issue? Am I missing anything?
I even tried setting max_new_tokens=8192, but still no luck.
from transformers import AutoModel
import librosa
repo_name = "ekacare/parrotlet-a-en-5b"
# trust_remote_code=True is needed because the repo ships its own model code
model = AutoModel.from_pretrained(repo_name, trust_remote_code=True)
audio_path = "<myaudio.wav>"
# librosa.load resamples to 16 kHz and returns a mono float waveform
audio, sample_rate = librosa.load(audio_path, sr=16000)
transcription = model.transcribe(audio, sample_rate)
print("Transcription:", transcription)
Also, on a side note: do you provide an API? The API documentation seems to be in progress; I get a 404.
Each audio clip must be 30 seconds or shorter, because the model uses a Whisper-based encoder.
Hope that helps.
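One possible workaround for longer recordings is to cut the waveform into pieces of at most 30 seconds and transcribe each piece separately. The following is only a minimal sketch under stated assumptions: `split_audio` and `transcribe_long` are hypothetical helper names, not part of the model repo, and it assumes `model.transcribe(audio, sample_rate)` (the call used in the question) returns something that can be converted to a string. Fixed-length cuts can split a word at a boundary, so splitting on silence may give cleaner results.

```python
MAX_CHUNK_SECONDS = 30  # limit stated in the reply above


def split_audio(audio, sample_rate, max_seconds=MAX_CHUNK_SECONDS):
    """Split a 1-D waveform into consecutive chunks of at most max_seconds.

    Works on any sliceable sequence (a list or a NumPy array).
    """
    chunk_len = int(max_seconds * sample_rate)
    if chunk_len <= 0:
        raise ValueError("max_seconds * sample_rate must be positive")
    return [audio[start:start + chunk_len]
            for start in range(0, len(audio), chunk_len)]


def transcribe_long(model, audio, sample_rate, max_seconds=MAX_CHUNK_SECONDS):
    """Transcribe each chunk and join the results with spaces.

    Assumption: model.transcribe(chunk, sample_rate) returns string-like text,
    as the call in the question suggests.
    """
    pieces = [model.transcribe(chunk, sample_rate)
              for chunk in split_audio(audio, sample_rate, max_seconds)]
    return " ".join(str(piece).strip() for piece in pieces)
```

With the variables from the question, this would be called as `transcription = transcribe_long(model, audio, sample_rate)`; a 40-second clip at 16 kHz would be processed as one 30-second chunk followed by one 10-second chunk.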
ds-EkaCare changed discussion status to closed