---
license: cc-by-4.0
tags:
- onnx
- speaker-verification
- wespeaker
- pyannote
---
# speaker-embedding-onnx
ONNX export of the ResNet34 backbone from [pyannote/wespeaker-voxceleb-resnet34-LM](https://huggingface.co/pyannote/wespeaker-voxceleb-resnet34-LM).
The export follows the official [wespeaker/bin/export_onnx.py](https://github.com/wenet-e2e/wespeaker/blob/master/wespeaker/bin/export_onnx.py) approach: fbank features are computed outside the model, and only the backbone is exported to ONNX.
## Inputs / Outputs
| Name | Direction | Shape | Description |
|---|---|---|---|
| `input_features` | input | `(batch, T, 80)` | Kaldi fbank features (T is dynamic) |
| `embedding` | output | `(batch, 256)` | Speaker embedding vector |
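
The tensor names and shapes above can be exercised with a minimal sketch like the following. It assumes ONNX Runtime as the inference engine, a float32 input dtype, and a hypothetical file name `model.onnx`; none of these are stated in this card, so check them against the actual export. The helper names `load_session` and `embed` are illustrative only.

```python
import numpy as np


def load_session(model_path):
    """Open the ONNX file with ONNX Runtime (assumed runtime, CPU provider)."""
    # Imported lazily so that embed() can be used or tested without onnxruntime.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def embed(session, feats):
    """Run one utterance through the backbone.

    feats: array of shape (T, 80), already mean-normalized fbank features.
    Returns a vector of shape (256,).
    """
    # float32 is an assumption about the exported graph's input dtype.
    feats = np.asarray(feats, dtype=np.float32)
    if feats.ndim != 2 or feats.shape[1] != 80:
        raise ValueError(f"expected features of shape (T, 80), got {feats.shape}")

    # Add the batch axis: (T, 80) -> (1, T, 80).
    batch = feats[np.newaxis, :, :]

    # Input and output names are taken from the table above.
    (embedding,) = session.run(["embedding"], {"input_features": batch})

    # Drop the batch axis: (1, 256) -> (256,).
    return embedding[0]
```

A call would look like `embed(load_session("model.onnx"), feats)`, where `feats` comes from the fbank recipe described in this card. Because T is dynamic, utterances of different lengths can be passed one at a time without padding.
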
## Fbank parameters (must match at inference)
`kaldi.fbank(wav * 32768, num_mel_bins=80, frame_length=25, frame_shift=10,
round_to_power_of_two=True, window_type="hamming", use_energy=False,
snip_edges=True, dither=0.0, sample_frequency=16000)`
Then subtract the per-bin mean, computed over the time axis of each utterance: `feats -= feats.mean(axis=0)`.
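
The two steps above can be sketched as a pair of helpers. This assumes that `kaldi` refers to `torchaudio.compliance.kaldi` and that `wav` is a mono float tensor of shape `(1, num_samples)` sampled at 16 kHz with values in `[-1, 1]`; the function names `compute_fbank` and `subtract_mean` are hypothetical.

```python
import numpy as np


def compute_fbank(wav):
    """Compute 80-bin Kaldi fbank features with the parameters listed above.

    wav: torch.Tensor of shape (1, num_samples), 16 kHz, floats in [-1, 1]
         (an assumption about the input format).
    Returns a NumPy array of shape (T, 80).
    """
    # Imported lazily so subtract_mean() works without torchaudio installed.
    # Mapping `kaldi` to torchaudio.compliance.kaldi is an assumption.
    import torchaudio.compliance.kaldi as kaldi

    feats = kaldi.fbank(
        wav * 32768,  # rescale to the 16-bit integer range, as in the recipe
        num_mel_bins=80,
        frame_length=25,
        frame_shift=10,
        round_to_power_of_two=True,
        window_type="hamming",
        use_energy=False,
        snip_edges=True,
        dither=0.0,
        sample_frequency=16000,
    )
    return feats.numpy()


def subtract_mean(feats):
    """Subtract each mel bin's mean over the time axis (axis 0)."""
    feats = np.asarray(feats, dtype=np.float32)
    return feats - feats.mean(axis=0, keepdims=True)
```

`subtract_mean(compute_fbank(wav))` then yields a `(T, 80)` array; adding a leading batch axis gives the `(batch, T, 80)` layout that `input_features` expects.
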