Using NumPy arrays instead of audio files

#7
by Respair - opened

Hi. I checked the code, but it was really finicky to make this work on NumPy audio arrays.
Is there an easier way to get the embeddings from array inputs?

I assume you've already figured this out, but for future reference, you can just call "speaker_model.infer_segment(wav_np_array)":

https://github.com/NVIDIA-NeMo/NeMo/blob/a599d89f66abeb5f61c99c73902567fda7fae76a/nemo/collections/asr/models/label_models.py#L609
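In case it helps anyone landing here, a minimal sketch of how that call could be wrapped. The helper names `to_mono_float32` and `embed_array` are made up for illustration and are not part of NeMo. It assumes `speaker_model` is an already-loaded NeMo speaker model exposing `infer_segment`, that the array is already at the sample rate the model expects (no resampling is done here), and that 2-D input is shaped `(samples, channels)`.

```python
import numpy as np


def to_mono_float32(wav):
    """Hypothetical helper: coerce an audio array to a 1-D float32 array."""
    wav = np.asarray(wav)
    if np.issubdtype(wav.dtype, np.integer):
        # Integer PCM (e.g. int16) is scaled to roughly [-1, 1).
        scale = float(np.iinfo(wav.dtype).max) + 1.0
        wav = wav.astype(np.float32) / scale
    wav = wav.astype(np.float32)
    if wav.ndim == 2:
        # Assumption: shape is (samples, channels); average down to mono.
        wav = wav.mean(axis=1)
    if wav.ndim != 1:
        raise ValueError(f"expected 1-D or 2-D audio, got {wav.ndim} dims")
    return wav


def embed_array(speaker_model, wav):
    """Hypothetical wrapper: run infer_segment on an in-memory array.

    Returns whatever infer_segment returns; check the linked source
    for the exact return value in your NeMo version.
    """
    segment = to_mono_float32(wav)
    return speaker_model.infer_segment(segment)
```

Usage would then be something like `result = embed_array(speaker_model, wav_np_array)`. If your audio is already mono float32 at the right sample rate, calling `speaker_model.infer_segment(wav_np_array)` directly should be enough.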
