speech-turn-detector-onnx
ONNX export of pyannote/segmentation-3.0 (PyanNet architecture).
Exported from local pytorch_model.bin using torch.jit.trace + legacy ONNX exporter (opset 14).
Inputs / Outputs
| Name | Shape | Description |
|---|---|---|
| input_values | (batch, 1, 160000) | Raw waveform, 10 s @ 16 kHz |
| logits | (batch, 589, 7) | Powerset speaker-activity logits |
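
A minimal sketch of running one 10 s window through the model with ONNX Runtime, using the tensor names and shapes listed above. The file name `model.onnx`, the float32 input dtype, and the zero-padding of short clips are assumptions, not something this card specifies; `fit_to_window` and `run_model` are hypothetical helper names.

```python
from typing import List, Sequence

SAMPLE_RATE = 16_000       # Hz, per the input description
WINDOW_SAMPLES = 160_000   # 10 s at 16 kHz


def fit_to_window(samples: Sequence[float], length: int = WINDOW_SAMPLES) -> List[float]:
    """Crop or zero-pad a mono waveform to exactly `length` samples.

    Zero-padding is an assumption; the card does not say how short clips
    should be handled.
    """
    clipped = list(samples[:length])
    return clipped + [0.0] * (length - len(clipped))


def run_model(model_path: str, samples: Sequence[float]):
    """Run one window through the model and return the raw logits array."""
    # Imported lazily so fit_to_window stays usable without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    # Shape (batch=1, channels=1, 160000); float32 is assumed here.
    x = np.asarray(fit_to_window(samples), dtype=np.float32).reshape(1, 1, -1)
    (logits,) = session.run(["logits"], {"input_values": x})
    return logits  # expected shape per the table: (1, 589, 7)
```

Usage would look like `logits = run_model("model.onnx", waveform)`, where `waveform` is a mono 16 kHz signal; longer recordings would need to be split into 10 s windows first.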
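The 7 output classes are a powerset encoding of speaker activity for up to 3 speakers, with at most 2 active simultaneously: 1 class for no speech, 3 classes for a single active speaker, and 3 classes for a pair of overlapping speakers.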
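
As an illustration of the powerset encoding, the sketch below enumerates the 7 classes and maps one frame of logits to per-speaker activity by taking the highest-scoring class. The class ordering (empty set first, then single speakers, then pairs) is an assumption and should be checked against the original model; `powerset_classes` and `decode_frame` are hypothetical helper names.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

NUM_SPEAKERS = 3       # from the description above
MAX_SIMULTANEOUS = 2   # at most 2 speakers active at once


def powerset_classes(num_speakers: int = NUM_SPEAKERS,
                     max_set_size: int = MAX_SIMULTANEOUS) -> List[Tuple[int, ...]]:
    """List every allowed set of active speakers, smallest sets first.

    The ordering is assumed, not confirmed by this card.
    """
    classes: List[Tuple[int, ...]] = []
    for size in range(max_set_size + 1):
        classes.extend(combinations(range(num_speakers), size))
    return classes


def decode_frame(frame_logits: Sequence[float],
                 num_speakers: int = NUM_SPEAKERS,
                 max_set_size: int = MAX_SIMULTANEOUS) -> List[int]:
    """Turn one frame of 7 logits into a 0/1 activity flag per speaker."""
    classes = powerset_classes(num_speakers, max_set_size)
    # Pick the index of the highest-scoring powerset class.
    best = max(range(len(frame_logits)), key=lambda i: frame_logits[i])
    active = classes[best]
    return [1 if speaker in active else 0 for speaker in range(num_speakers)]


print(powerset_classes())
# → [(), (0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
print(decode_frame([0.0, 0.0, 0.0, 0.0, 0.0, 9.0, 0.0]))
# → [1, 0, 1]
```

Applying `decode_frame` to each of the 589 frames of a `logits` window yields a (589, 3) multi-label activity matrix.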