Using NumPy arrays instead of audio files

by Respair - opened Apr 23, 2025

Apr 23, 2025 • edited Apr 23, 2025

Hi. I checked the code, but it was really finicky to get this working on NumPy audio arrays.
Is there an easier way to get the embeddings with array inputs?

kpeverson

Dec 12, 2025

I assume you already figured this out, but for future reference, you can just use "speaker_model.infer_segment(wav_np_array)":

https://github.com/NVIDIA-NeMo/NeMo/blob/a599d89f66abeb5f61c99c73902567fda7fae76a/nemo/collections/asr/models/label_models.py#L609
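For anyone landing here later, a minimal sketch of how that call might be wrapped. The helper names `to_mono_float32` and `embed_array` are made up for illustration and are not part of NeMo. It assumes `speaker_model` is an already loaded speaker model, that `infer_segment` expects a 1-D array and returns `(emb, logits)` in that order (this is my reading of the linked source, so please check it against your NeMo version), and that multi-channel audio is shaped `(samples, channels)`. Resampling is not handled, so the array should already be at the sample rate the model was trained on.

```python
import numpy as np


def to_mono_float32(wav):
    """Convert an audio array to a 1-D float32 mono signal."""
    wav = np.asarray(wav)
    if np.issubdtype(wav.dtype, np.signedinteger):
        # Scale signed integer PCM (e.g. int16) into the range [-1, 1).
        scale = float(np.iinfo(wav.dtype).max) + 1.0
        wav = wav.astype(np.float32) / scale
    else:
        wav = wav.astype(np.float32)
    if wav.ndim == 2:
        # Assumption: shape is (samples, channels); average the channels.
        wav = wav.mean(axis=1)
    if wav.ndim != 1:
        raise ValueError(f"expected 1-D or 2-D audio, got {wav.ndim} dimensions")
    return np.ascontiguousarray(wav, dtype=np.float32)


def embed_array(speaker_model, wav):
    """Return the speaker embedding for an in-memory audio array."""
    segment = to_mono_float32(wav)
    # Assumption: infer_segment returns (emb, logits) in this order.
    emb, _logits = speaker_model.infer_segment(segment)
    return emb
```

With a model loaded as `speaker_model`, `embed_array(speaker_model, wav_np_array)` would then give the embedding directly, without writing the audio to a temporary file first.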
