	---
	license: mit
	base_model: onnx-community/pyannote-segmentation-3.0
	pipeline_tag: voice-activity-detection
	library_name: openasr
	tags:
	- speaker-diarization
	- openasr
	- oasr
	---

	<div align="center">

	# pyannote Segmentation 3.0 · OpenASR

	pyannote segmentation-3.0: speaker-change- and overlap-aware speech segmentation for OpenASR diarization

	[![License](https://img.shields.io/badge/license-MIT-2563eb.svg)](https://huggingface.co/onnx-community/pyannote-segmentation-3.0)
	[![Format](https://img.shields.io/badge/format-.oasr-7c3aed.svg)](https://github.com/QuintinShaw/openasr)
	[![Runtime](https://img.shields.io/badge/runtime-OpenASR-111827.svg)](https://openasr.org)
	[![Base model](https://img.shields.io/badge/base-pyannote--segmentation--3.0-f59e0b.svg)](https://huggingface.co/onnx-community/pyannote-segmentation-3.0)

	Speaker-diarization support pack for the [OpenASR](https://github.com/QuintinShaw/openasr) runtime —
	pure-Rust inference, no Python at inference time.

	</div>

	---

	## ✨ Highlights

	- ✂️ Speaker-change-aware segmentation: PyanNet (SincNet + BiLSTM) with a powerset head that tracks up to 3 speakers per window, with up to 2 of them overlapping at any instant
	- 🤝 Quality upgrade for `--diarize`: installed alongside the CAM++ embedder pack, it replaces coarse VAD slices with fine speaker-turn boundaries
	- 🔒 Diarization, not identification: labels are anonymous and session-relative; nothing leaves the machine
	- 🎯 Bit-exact packaging: a single raw-f32 build stores the weights unchanged, and the pure-Rust forward pass matches the upstream ONNX logits to within a max abs error of ~7e-5
	- 🦀 Native in OpenASR: `.oasr` packs run with no Python at inference, engineered for peak performance on CPU & GPU

	## 🚀 Quickstart

	```bash
	# 1. Install the OpenASR CLI · https://openasr.org
	# 2. Pull the pack
	openasr pull pyannote-segmentation-3.0:f32

	# 3. Diarize any transcription (works with every OpenASR ASR model)
	openasr transcribe meeting.wav --model xasr-zh-en --diarize --format srt
	```

	## 📦 Pack

	| Quant | File (`.oasr`) | Size |
	|:------|:---------------|-----:|
	| f32 | `pyannote-segmentation-3.0-f32.oasr` | 6 MB |

	<sub>Single raw-f32 build: the pure-Rust forward pass consumes f32 directly, and the
	parity gates check the packed tensors bit-for-bit against the upstream weights, so no
	integer quantization is produced.</sub>

	## 🧠 About pyannote Segmentation 3.0

	pyannote segmentation-3.0 is the local speech-segmentation model from the pyannote speaker
	diarization toolkit: a PyanNet (SincNet front-end + bidirectional LSTM) classifier over a 7-class
	powerset. For each frame of a 10 s window it predicts which of up to three speakers are active,
	covering silence, the three single speakers, and the three two-speaker overlaps. OpenASR uses it as
	the optional segmentation stage of its model-agnostic diarization pipeline: when this pack is
	installed, `--diarize` splits speech at speaker changes instead of relying on coarse VAD slices,
	then the CAM++ embedder pack clusters the segments into anonymous speaker turns. Weights are
	extracted from the un-gated, MIT-licensed onnx-community ONNX mirror at a pinned revision and
	repackaged as a raw-f32 `.oasr` pack that runs in pure Rust, with no Python at inference time.
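The 7-class powerset can be read as 1 silence class + 3 single-speaker classes + 3 two-speaker overlap classes. The sketch below is illustrative Python, not OpenASR's Rust code. It enumerates those classes and groups per-frame class indices into runs that break at every speaker change. The class ordering, the per-frame argmax input `frame_ids`, and the helper names are assumptions made for the example; the real pack may order classes differently and post-process the logits in other ways.

```python
from itertools import combinations

NUM_SPEAKERS = 3  # speakers tracked per window
MAX_OVERLAP = 2   # 1 + 3 + 3 = 7 classes implies at most two speakers at once


def powerset_classes(num_speakers=NUM_SPEAKERS, max_overlap=MAX_OVERLAP):
    """Enumerate speaker subsets by size: silence, singles, then pairs.

    Assumed ordering for illustration only.
    """
    classes = []
    for size in range(max_overlap + 1):
        classes.extend(frozenset(c) for c in combinations(range(num_speakers), size))
    return classes


def decode_frames(class_ids, classes):
    """Map each frame's class index to the set of active speakers."""
    return [classes[i] for i in class_ids]


def speaker_turns(frames):
    """Group consecutive frames with the same active-speaker set.

    Returns (start_frame, end_frame_exclusive, speakers) tuples and
    drops runs of silence.
    """
    turns = []
    start = 0
    for i in range(1, len(frames) + 1):
        if i == len(frames) or frames[i] != frames[start]:
            if frames[start]:
                turns.append((start, i, sorted(frames[start])))
            start = i
    return turns


CLASSES = powerset_classes()
frame_ids = [0, 1, 1, 4, 4, 2, 2, 0]  # hypothetical argmax class per frame
turns = speaker_turns(decode_frames(frame_ids, CLASSES))
print(len(CLASSES))
print(turns)
```

Frame indices here are relative to one window; converting them to seconds needs the model's frame step, which this sketch deliberately leaves out.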

	## ⚙️ How this pack was made

	Converted from [onnx-community/pyannote-segmentation-3.0](https://huggingface.co/onnx-community/pyannote-segmentation-3.0) with the OpenASR importer:

	```bash
	openasr model-pack import-pyannote-local <src>.safetensors <out>.oasr \
	--package-id pyannote-segmentation-3.0
	```

	The `.oasr` container is GGUF-backed; every tensor is stored as raw f32 so the
	pack round-trips bit-identically against the source weights.
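The card makes two different parity claims: stored weights are bit-identical to the source, while forward-pass logits match within a small absolute error. The sketch below shows how the two checks differ. It is illustrative Python with made-up values, not the OpenASR parity gates; `TOLERANCE` and all sample data are hypothetical.

```python
import struct


def f32_bits(values):
    """Pack values as little-endian IEEE-754 f32 and return the raw bytes."""
    return struct.pack(f"<{len(values)}f", *values)


def bit_identical(a, b):
    """True only if both sequences have the same f32 bit patterns."""
    return len(a) == len(b) and f32_bits(a) == f32_bits(b)


def max_abs_error(a, b):
    """Largest element-wise absolute difference between two sequences."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


# Weight check: a round trip through raw f32 bytes must preserve every bit.
weights = [0.5, -1.25, 0.0]
roundtrip = list(struct.unpack("<3f", f32_bits(weights)))
print(bit_identical(weights, roundtrip))

# Logit check: small numeric drift is tolerated up to a threshold.
TOLERANCE = 1e-4  # hypothetical threshold, not taken from OpenASR
ref = [1.0, -2.0, 0.5]
out = [1.0, -2.0, 0.5 + 2 ** -14]
print(max_abs_error(ref, out) < TOLERANCE)
```

Comparing bytes is stricter than comparing values: `0.0` and `-0.0` are equal as numbers but differ in their sign bit, so `bit_identical([0.0], [-0.0])` is false.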

	## ⚖️ License

	This pack inherits the upstream model's license: MIT
	([source](https://huggingface.co/onnx-community/pyannote-segmentation-3.0)). OpenASR packaging retains the upstream copyright;
	the only modification is format conversion.

	## 🙏 Acknowledgements

	This pack is a redistribution of pyannote segmentation-3.0, created by Hervé Bredin and the
	pyannote.audio project, via the un-gated ONNX mirror
	([onnx-community/pyannote-segmentation-3.0](https://huggingface.co/onnx-community/pyannote-segmentation-3.0)).
	All credit for the architecture, training, and weights belongs to the upstream authors; the license
	is inherited from and identical to the upstream model (MIT).

	## 🔗 Links

	- 🦀 OpenASR — <https://github.com/QuintinShaw/openasr>
	- 🌐 Website — <https://openasr.org>
	- 🤗 Upstream model — [onnx-community/pyannote-segmentation-3.0](https://huggingface.co/onnx-community/pyannote-segmentation-3.0)