license: mit
tags:
  - audio
  - speaker-diarization
  - onnx
  - pyannote

segmentation-3.0 ONNX

ONNX export of pyannote/segmentation-3.0 for speaker diarization (voice activity and speaker segmentation).

Input: waveform [batch, channels, samples], 16 kHz mono (channels = 1), e.g. [1, 1, 160000] for 10 seconds.
Output: logits [batch, num_frames, num_classes] over 7 powerset classes: non-speech, each of 3 speakers alone, and each of the 3 speaker pairs. Take the per-frame argmax to decode.
Exported with opset 14. Run it on device with ONNX Runtime; Core ML conversion is not supported for this model because of its control-flow ops.
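
A minimal sketch of the input shape, powerset decoding, and an ONNX Runtime call, under stated assumptions: the helper names (input_shape, powerset_classes, decode_frame, run) are hypothetical, and the class order (non-speech, then single speakers, then speaker pairs, for up to 3 speakers with at most 2 overlapping) is assumed to follow the original pyannote model and to be preserved by the export. Check it against your own outputs before relying on it.

```python
from itertools import combinations

SAMPLE_RATE = 16_000  # the model expects 16 kHz mono audio
MAX_SPEAKERS = 3      # assumption: up to 3 speakers per chunk
MAX_OVERLAP = 2       # assumption: at most 2 speakers active at once


def input_shape(seconds, batch=1):
    """Shape of the waveform tensor: [batch, channels, samples]."""
    return (batch, 1, int(seconds * SAMPLE_RATE))


def powerset_classes(num_speakers=MAX_SPEAKERS, max_overlap=MAX_OVERLAP):
    """List speaker subsets: the empty set, then singles, then pairs."""
    classes = []
    for size in range(max_overlap + 1):
        classes.extend(combinations(range(num_speakers), size))
    return classes


def decode_frame(frame_logits, classes, num_speakers=MAX_SPEAKERS):
    """Map one frame of class logits to a 0/1 activity flag per speaker."""
    best = max(range(len(frame_logits)), key=lambda i: frame_logits[i])
    active = classes[best]
    return [1 if s in active else 0 for s in range(num_speakers)]


def run(model_path, waveform):
    """Run the model on a NumPy waveform of shape [batch, 1, samples].

    Requires the onnxruntime and numpy packages; they are imported here
    so the helpers above stay dependency-free.
    """
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    # Read the input name from the model instead of guessing it.
    input_name = session.get_inputs()[0].name
    logits = session.run(None, {input_name: waveform.astype(np.float32)})[0]
    classes = powerset_classes()
    # Decode the first batch item frame by frame.
    return [decode_frame(frame, classes) for frame in logits[0].tolist()]
```

For example, run("model.onnx", np.zeros(input_shape(10))) would return one [speaker 1, speaker 2, speaker 3] activity triple per output frame; the file name model.onnx is a placeholder for wherever the export is stored.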

Derived from pyannote.audio; see pyannote/segmentation-3.0 for the original model and license.