---
license: mit
tags:
  - audio
  - speaker-diarization
  - onnx
  - pyannote
---

# segmentation-3.0 ONNX

ONNX export of [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) for speaker diarization (voice activity and speaker segmentation).

- **Input**: waveform `[batch, channels, samples]`, 16 kHz mono, e.g. `[1, 1, 160000]` for 10 seconds.
- **Output**: logits `[batch, num_frames, num_classes]` with 7 powerset classes; decode them to per-speaker activity (powerset decoding).
- Exported with opset 14. Run it on device with ONNX Runtime; Core ML conversion is not supported for this model because it contains control-flow ops.
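
The input and output shapes above can be exercised with a short Python sketch. It rests on several assumptions that are not stated in this repository: the model file path is supplied by the caller, the input tensor name is read from the session instead of being hard-coded, and the 7 classes follow the order given on the upstream pyannote/segmentation-3.0 card (non-speech, three single speakers, then the three speaker pairs). The helper names `make_waveform`, `run_model`, and `powerset_to_multilabel` are hypothetical.

```python
import numpy as np

# Assumed class order, taken from the upstream pyannote/segmentation-3.0 card:
# non-speech, speaker 1, 2, 3, then pairs (1,2), (1,3), (2,3); 0-indexed here.
POWERSET_CLASSES = [(), (0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
NUM_SPEAKERS = 3


def make_waveform(seconds=10.0, sample_rate=16000):
    """Return a silent placeholder input shaped [batch, channels, samples]."""
    return np.zeros((1, 1, int(seconds * sample_rate)), dtype=np.float32)


def run_model(model_path, waveform):
    """Run the exported model on CPU and return logits [batch, frames, classes]."""
    import onnxruntime as ort  # imported lazily so decoding works without it

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name  # avoid guessing the tensor name
    return session.run(None, {input_name: waveform})[0]


def powerset_to_multilabel(logits):
    """Map per-frame powerset logits to hard per-speaker activity (0 or 1)."""
    mapping = np.zeros((len(POWERSET_CLASSES), NUM_SPEAKERS), dtype=np.float32)
    for class_index, speakers in enumerate(POWERSET_CLASSES):
        for speaker in speakers:
            mapping[class_index, speaker] = 1.0
    best_class = np.argmax(logits, axis=-1)  # [batch, frames]
    return mapping[best_class]               # [batch, frames, NUM_SPEAKERS]
```

With a local copy of the model, `powerset_to_multilabel(run_model(path, make_waveform()))` would give one activity row per frame; taking the maximum across the speaker axis yields a simple voice-activity mask.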

Derived from pyannote.audio; see [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) for the original model and license.