---
license: mit
tags:
- audio
- speaker-diarization
- onnx
- pyannote
---

# segmentation-3.0 ONNX

ONNX export of [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) for speaker diarization (voice activity and speaker segmentation).

- **Input**: waveform `[batch, channels, samples]`, 16 kHz mono, e.g. `[1, 1, 160000]` for 10 seconds of audio.
- **Output**: logits `[batch, num_frames, num_classes]` with 7 classes that require powerset decoding to recover per-speaker activity.
- Exported with opset 14. Run it on device with ONNX Runtime; Core ML conversion is not supported for this model because of its control-flow ops.

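The input and output shapes above can be exercised with a short sketch like the one below. It is illustrative rather than part of the export: the function names and the `model.onnx` file name are hypothetical, and the class order (non-speech, then single speakers, then speaker pairs) is an assumption taken from the upstream model card that has not been verified against this file. The input name is read from the session instead of being guessed.

```python
from itertools import combinations


def powerset_classes(num_speakers=3, max_simultaneous=2):
    """List speaker subsets ordered by size, then by speaker index.

    Assumed order (from the upstream model card): non-speech, each single
    speaker, then each pair of overlapping speakers.
    """
    classes = []
    for size in range(max_simultaneous + 1):
        classes.extend(combinations(range(num_speakers), size))
    return classes  # 7 subsets for 3 speakers with at most 2 overlapping


def decode_powerset(frames, num_speakers=3, max_simultaneous=2):
    """Map per-frame class scores to per-speaker 0/1 activity."""
    classes = powerset_classes(num_speakers, max_simultaneous)
    decoded = []
    for scores in frames:
        # Pick the highest-scoring powerset class for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        active = classes[best]
        decoded.append([1 if s in active else 0 for s in range(num_speakers)])
    return decoded


def run_model(model_path, waveform):
    """Run the model on a float32 array shaped [batch, 1, samples]."""
    import onnxruntime as ort  # third-party, imported lazily

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name  # read the name, do not guess it
    return session.run(None, {input_name: waveform})[0]


# Example usage (needs numpy, onnxruntime, and the model file):
# import numpy as np
# waveform = np.zeros((1, 1, 160000), dtype=np.float32)  # 10 s at 16 kHz
# logits = run_model("model.onnx", waveform)  # hypothetical file name
# activity = decode_powerset(logits[0].tolist())  # [num_frames][3] of 0/1
```

Taking the argmax per frame is the simplest way to decode; it yields a hard 0/1 decision per speaker and discards the score margins, so a downstream pipeline that needs soft scores would have to decode differently.
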
Derived from pyannote.audio; see [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) for the original model and license.