---
license: mit
tags:
- audio
- speaker-diarization
- onnx
- pyannote
---
# segmentation-3.0 ONNX
ONNX export of [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) for speaker diarization (voice activity and speaker segmentation).
- **Input**: waveform `[batch, channels, samples]`, 16 kHz mono, e.g. `[1, 1, 160000]` for 10 seconds.
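- **Output**: logits `[batch, num_frames, num_classes]` with 7 powerset classes (non-speech, each of 3 speakers alone, and each pair of overlapping speakers), which must be decoded back to per-speaker activity.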
- Exported with opset 14. Run it on device with ONNX Runtime; Core ML conversion is not supported for this model because of its control-flow ops.
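
The input and output shapes above can be exercised with a short script. The following is a minimal sketch, not a reference implementation. It assumes the class order follows the upstream pyannote model card (non-speech, single speakers, then speaker pairs), that the export has a single output, and that the model file path is supplied by the caller; the helper names are illustrative and not part of this repository. Check these assumptions against the actual export before relying on the decoded result.

```python
from itertools import combinations

import numpy as np

NUM_SPEAKERS = 3      # assumption: taken from the upstream pyannote model card
MAX_SIMULTANEOUS = 2  # assumption: at most two overlapping speakers per frame


def build_powerset_mapping(num_speakers=NUM_SPEAKERS, max_set_size=MAX_SIMULTANEOUS):
    """Return a (num_powerset_classes, num_speakers) 0/1 matrix.

    Row order: empty set, single speakers, then speaker pairs.
    """
    rows = []
    for size in range(max_set_size + 1):
        for combo in combinations(range(num_speakers), size):
            row = np.zeros(num_speakers, dtype=np.float32)
            for speaker in combo:
                row[speaker] = 1.0
            rows.append(row)
    return np.stack(rows)


def powerset_to_multilabel(logits, mapping):
    """Hard-decode [batch, frames, classes] scores to [batch, frames, speakers]."""
    best = np.argmax(logits, axis=-1)  # winning powerset class per frame
    return mapping[best]               # look up which speakers that class contains


def run_segmentation(model_path, waveform):
    """Run the ONNX model on a waveform shaped [batch, 1, samples] at 16 kHz."""
    # Imported here so the decoding helpers work without ONNX Runtime installed.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name  # read the name instead of hard-coding it
    outputs = session.run(None, {input_name: waveform.astype(np.float32)})
    return outputs[0]  # assumption: the first (only) output holds the logits
```

For example, `run_segmentation(path, np.zeros((1, 1, 160000), dtype=np.float32))` would feed 10 seconds of silence, and passing the result to `powerset_to_multilabel(logits, build_powerset_mapping())` yields a `[batch, num_frames, 3]` array with one column per speaker.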
Derived from pyannote.audio; see [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) for the original model and license.