segmentation-3.0 ONNX
ONNX export of pyannote/segmentation-3.0 for speaker diarization (voice activity and speaker segmentation).
- Input: waveform [batch, channels, samples], 16 kHz mono, e.g. [1, 1, 160000] for 10 seconds.
- Output: logits [batch, num_frames, num_classes] (7 classes, powerset decoding).
- Exported with opset 14. Use ONNX Runtime to run on device (Core ML conversion is not supported for this model due to control-flow ops).
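The 7 output classes need to be mapped back to per-speaker activity. The sketch below shows one way to do that in plain Python. It assumes the powerset covers up to 3 speakers per chunk with at most 2 active at once, enumerated by set size (silence, then single speakers, then pairs). This ordering is an assumption that should be checked against the original pyannote/segmentation-3.0 configuration. The names `build_powerset_classes` and `decode_frames` are hypothetical helpers, not part of any library.

```python
from itertools import combinations

# Assumptions (not stated in this card): 3 speakers per chunk,
# at most 2 speaking simultaneously, which yields 1 + 3 + 3 = 7 classes.
NUM_SPEAKERS = 3
MAX_OVERLAP = 2


def build_powerset_classes(num_speakers=NUM_SPEAKERS, max_overlap=MAX_OVERLAP):
    """List the speaker subsets, ordered by size and then lexicographically."""
    classes = []
    for size in range(max_overlap + 1):
        classes.extend(combinations(range(num_speakers), size))
    return classes


def decode_frames(frame_logits, num_speakers=NUM_SPEAKERS, max_overlap=MAX_OVERLAP):
    """Turn per-frame powerset scores into per-frame 0/1 speaker activity.

    frame_logits: list of frames, each a list of num_classes scores
                  (one batch item of the model output).
    """
    classes = build_powerset_classes(num_speakers, max_overlap)
    decoded = []
    for scores in frame_logits:
        if len(scores) != len(classes):
            raise ValueError(
                f"expected {len(classes)} scores per frame, got {len(scores)}"
            )
        # Pick the most likely subset of speakers for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        active = [0] * num_speakers
        for speaker in classes[best]:
            active[speaker] = 1
        decoded.append(active)
    return decoded


if __name__ == "__main__":
    # Three toy frames: silence, speaker 1 alone, speakers 1 and 2 overlapping.
    toy = [
        [5, 0, 0, 0, 0, 0, 0],
        [0, 0, 3, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 9],
    ]
    print(decode_frames(toy))
```

Here `frame_logits` stands for one batch item of the logits output, for example the first output returned by ONNX Runtime's `InferenceSession.run` converted to nested lists. Taking the argmax per frame is the simplest decoding choice; it discards the score margins, so applications that need soft speaker probabilities would have to apply a softmax and sum over the subsets containing each speaker instead.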
Derived from pyannote.audio; see pyannote/segmentation-3.0 for the original model and license.