Can't match expected performance on VoxCeleb test set

by 607HF - opened Mar 3, 2025

Mar 3, 2025

When evaluating this model on VoxCeleb-O, I get an EER of 4.9% at a threshold of around 0.87, which is very close to the threshold reported here. That EER seems high: according to the WavLM paper it should be 0.84%. What EER do you get using this model?
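For anyone comparing numbers, here is a minimal sketch of how EER and its threshold can be computed from trial scores. It assumes similarity scores where higher means "same speaker" and labels of 1 for target trials and 0 for non-target trials. The function name `compute_eer` and the toy scores are made up for illustration; this is not the evaluation script behind the 4.9% figure.

```python
def compute_eer(scores, labels):
    """Return (eer, threshold) for similarity scores and 0/1 labels.

    A trial is accepted when score >= threshold. The EER is taken at the
    candidate threshold where the false acceptance rate (FAR) and the
    false rejection rate (FRR) are closest, averaging the two rates.
    """
    targets = [s for s, y in zip(scores, labels) if y == 1]
    nontargets = [s for s, y in zip(scores, labels) if y == 0]
    best = None
    for t in sorted(set(scores)):  # every observed score is a candidate threshold
        far = sum(s >= t for s in nontargets) / len(nontargets)
        frr = sum(s < t for s in targets) / len(targets)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy trial list (hypothetical values, not VoxCeleb scores).
scores = [0.95, 0.90, 0.80, 0.60, 0.85, 0.40, 0.30, 0.20]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
eer, threshold = compute_eer(scores, labels)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

The loop is quadratic in the number of trials, so for a full trial list a sorted or vectorized variant would be preferable. Different interpolation choices at the FAR/FRR crossing can shift the result slightly, but not enough to explain a gap between 4.9% and 0.84%.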

607HF

Mar 21, 2025

One possible clue: when I pass the input through the model, I get the following warning:
torch\nn\functional.py:5962: UserWarning: Support for mismatched key_padding_mask and attn_mask is deprecated. Use same type for both instead.
I have not been able to figure out what causes it, though. I also tried setting up the UniSpeech Git repository to compare against the model checkpoints released there, but I could not get the environment set up successfully.
