---
license: mit
tags:
- audio
- speaker-diarization
- onnx
- pyannote
---
# segmentation-3.0 ONNX
ONNX export of [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) for speaker diarization (voice activity and speaker segmentation).
- **Input**: waveform `[batch, channels, samples]`, 16 kHz mono, e.g. `[1, 1, 160000]` for 10 seconds.
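- **Output**: logits `[batch, num_frames, num_classes]` with 7 powerset classes, as in the original model: non-speech, each of 3 speakers alone, and each of the 3 speaker pairs (at most 2 overlapping speakers).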
- Exported with opset 14. Use ONNX Runtime to run on device (Core ML conversion is not supported for this model due to control-flow ops).
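
The input and output shapes above can be exercised with a minimal Python sketch. It makes several assumptions that this README does not state: the model file path is supplied by the caller, the graph has a single input and the logits are its first output, and the 7 classes follow the upstream powerset ordering (non-speech, single speakers, then speaker pairs). The helper names `powerset_mapping`, `decode_powerset`, and `run_segmentation` are hypothetical and used only for illustration.

```python
import itertools

import numpy as np

SAMPLE_RATE = 16_000  # the model expects 16 kHz mono audio


def powerset_mapping(num_speakers=3, max_overlap=2):
    """Build a (num_classes, num_speakers) 0/1 matrix.

    Row k marks which speakers are active in powerset class k.
    Assumed ordering: empty set, single speakers, then pairs.
    """
    rows = []
    for size in range(max_overlap + 1):
        for combo in itertools.combinations(range(num_speakers), size):
            row = np.zeros(num_speakers, dtype=np.float32)
            for speaker in combo:
                row[speaker] = 1.0
            rows.append(row)
    return np.stack(rows)


def decode_powerset(logits, mapping):
    """Turn [batch, frames, classes] logits into [batch, frames, speakers] activity."""
    best_class = logits.argmax(axis=-1)  # most likely class per frame
    return mapping[best_class]           # look up the active speakers


def run_segmentation(model_path, waveform):
    """Run the ONNX model on a [batch, 1, samples] float32 waveform."""
    import onnxruntime as ort  # imported lazily so decoding works without it

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name  # assumes a single graph input
    outputs = session.run(None, {input_name: waveform.astype(np.float32)})
    return outputs[0]  # assumes logits are the first output
```

For a 10-second chunk, one might call `run_segmentation("model.onnx", np.zeros((1, 1, 10 * SAMPLE_RATE), dtype=np.float32))` (the file name is a placeholder) and pass the result to `decode_powerset(logits, powerset_mapping())` to get per-frame activity for up to 3 speakers.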
Derived from pyannote.audio; see [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) for the original model and license.