Unable to process complete audio

by eshwarpilli - opened Oct 19, 2025

edited Oct 19, 2025

I'm passing in 40 seconds of audio, but the model only transcribes up to the 20-second mark. What could be the issue? Am I missing anything?

I even tried setting max_new_tokens=8192, still no luck.

from transformers import AutoModel
import librosa

repo_name = "ekacare/parrotlet-a-en-5b"
model = AutoModel.from_pretrained(repo_name, trust_remote_code=True)

audio_path = "<myaudio.wav>"
audio, sample_rate = librosa.load(audio_path, sr=16000)

transcription = model.transcribe(audio, sample_rate)
print("Transcription:", transcription)

On a side note: do you provide an API? The API documentation seems to be in progress; I get a 404.

ds-EkaCare

Eka Care org Oct 28, 2025

Each audio clip must be 30 seconds or shorter, because the model uses a Whisper-based encoder.
Hope that helps.
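
For longer recordings, one possible workaround is to split the audio into consecutive chunks of at most 30 seconds and transcribe each chunk separately. The sketch below is only an illustration, not an official recipe: `split_audio` and `transcribe_long` are hypothetical helper names, and it assumes that `model.transcribe(audio, sample_rate)` returns a string, as the snippet in the question suggests.

```python
def split_audio(audio, sample_rate, max_seconds=30.0):
    """Split a 1-D sequence of samples into consecutive chunks of at most max_seconds."""
    chunk_len = int(max_seconds * sample_rate)
    if chunk_len <= 0:
        raise ValueError("max_seconds * sample_rate must cover at least one sample")
    # Plain slicing works for Python lists and NumPy arrays alike.
    return [audio[start:start + chunk_len] for start in range(0, len(audio), chunk_len)]


def transcribe_long(model, audio, sample_rate, max_seconds=30.0):
    """Transcribe each chunk on its own and join the pieces with spaces."""
    # Assumption: model.transcribe(chunk, sample_rate) returns text for one chunk.
    parts = [
        model.transcribe(chunk, sample_rate)
        for chunk in split_audio(audio, sample_rate, max_seconds)
    ]
    return " ".join(str(part).strip() for part in parts)
```

With the `model`, `audio`, and `sample_rate` from the question, this would be called as `transcribe_long(model, audio, sample_rate)`. Fixed-length cuts can split a word at a chunk boundary, so cutting at silences or overlapping the chunks slightly may give cleaner results.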

ds-EkaCare changed discussion status to closed Oct 28, 2025
