	---
	license: mit
	tags:
	- audio
	- speaker-diarization
	- onnx
	- pyannote
	---

	# segmentation-3.0 ONNX

	ONNX export of [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) for speaker diarization (voice activity and speaker segmentation).

	- Input: waveform `[batch, channels, samples]`, 16 kHz mono, e.g. `[1, 1, 160000]` for 10 seconds.
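	- Output: logits `[batch, num_frames, num_classes]` with 7 powerset classes (non-speech, each of 3 speakers alone, and each pair of overlapping speakers). Decode the powerset classes to get per-speaker activity.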
	- Exported with opset 14. Run it on device with ONNX Runtime; Core ML conversion is not supported for this model because of its control-flow ops.
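
The input and output shapes above can be exercised with a short Python sketch. This is an illustration, not the exporter's reference code: the helper names (`make_input`, `decode_powerset`, `run_model`) are hypothetical, the model file path is a placeholder, and the class order is assumed to match the one listed on the upstream pyannote/segmentation-3.0 card (non-speech, speakers 1 to 3 alone, then the pairs 1+2, 1+3, 2+3). The input name is read from the session rather than assumed.

```python
import numpy as np

SAMPLE_RATE = 16_000  # the card specifies 16 kHz mono input

# Assumed class order (from the upstream pyannote/segmentation-3.0 card):
# non-speech, spk1, spk2, spk3, spk1+spk2, spk1+spk3, spk2+spk3.
# Each row marks which of the 3 speakers is active for that class.
POWERSET_TO_SPEAKERS = np.array(
    [
        [0, 0, 0],
        [1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
        [1, 1, 0],
        [1, 0, 1],
        [0, 1, 1],
    ],
    dtype=np.float32,
)


def make_input(mono: np.ndarray) -> np.ndarray:
    """Reshape a 1-D 16 kHz mono signal to [batch=1, channels=1, samples]."""
    return np.asarray(mono, dtype=np.float32).reshape(1, 1, -1)


def decode_powerset(logits: np.ndarray) -> np.ndarray:
    """Map [batch, num_frames, 7] scores to binary [batch, num_frames, 3] activity."""
    best_class = logits.argmax(axis=-1)      # [batch, num_frames]
    return POWERSET_TO_SPEAKERS[best_class]  # [batch, num_frames, 3]


def run_model(model_path: str, waveform: np.ndarray) -> np.ndarray:
    """Run the ONNX model on a [batch, 1, samples] float32 waveform."""
    # Imported lazily so the helpers above only need NumPy.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name  # avoids assuming the input name
    return session.run(None, {input_name: waveform})[0]
```

With a local copy of the model, usage would look like `decode_powerset(run_model("model.onnx", make_input(audio)))`, where `"model.onnx"` stands in for the actual file name in this repository and `audio` is a 1-D array of 16 kHz samples. Taking the argmax per frame is the simplest decoding choice; it yields hard 0/1 speaker activity rather than per-speaker probabilities.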

	Derived from pyannote.audio; see [pyannote/segmentation-3.0](https://huggingface.co/pyannote/segmentation-3.0) for the original model and license.