| --- |
| license: mit |
| base_model: onnx-community/pyannote-segmentation-3.0 |
| pipeline_tag: voice-activity-detection |
| library_name: openasr |
| tags: |
| - speaker-diarization |
| - openasr |
| - oasr |
| --- |
| |
| <div align="center"> |
|
|
| # pyannote Segmentation 3.0 · OpenASR |
|
|
| **pyannote segmentation-3.0: speaker-change and overlap-aware speech segmentation for OpenASR diarization** |
|
|
|
|
| Speaker-diarization support pack for the **[OpenASR](https://github.com/QuintinShaw/openasr)** runtime: |
| pure-Rust inference, **no Python at inference time**. |
|
|
| </div> |
|
|
| --- |
|
|
| ## Highlights |
|
|
| - **Speaker-change aware segmentation**: PyanNet (SincNet + BiLSTM) with a powerset head that tracks up to 3 speakers per window, including overlapped speech (at most 2 speakers at once) |
| - **Quality upgrade for `--diarize`**: installed alongside the CAM++ embedder pack, it replaces coarse VAD slices with fine-grained speaker-turn boundaries |
| - **Diarization, not identification**: speaker labels are anonymous and session-relative; nothing leaves the machine |
| - **Bit-exact packaging**: single raw-f32 build; weights are stored unmodified, and the pure-Rust forward pass matches the upstream ONNX logits (max abs error ~7e-5) |
| - **Native in OpenASR**: `.oasr` packs run with no Python at inference time and are engineered for peak performance on CPU and GPU |
|
|
| ## Quickstart |
|
|
| ```bash |
| # 1. Install the OpenASR CLI · https://openasr.org |
| # 2. Pull the pack |
| openasr pull pyannote-segmentation-3.0:f32 |
| |
| # 3. Diarize any transcription (works with every OpenASR ASR model) |
| openasr transcribe meeting.wav --model xasr-zh-en --diarize --format srt |
| ``` |
|
|
| ## Pack |
|
|
| | Quant | File (`.oasr`) | Size | |
| |:------|:---------------|-----:| |
| | f32 | `pyannote-segmentation-3.0-f32.oasr` | 6 MB | |
|
|
| <sub>Single raw-**f32** build: the pure-Rust forward pass consumes f32 directly, and the |
| parity gates require the packed tensors to match the upstream weights bit for bit, so no |
| integer-quantized build is produced.</sub> |
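This card quotes a maximum absolute error of about 7e-5 between the pure-Rust forward pass and the upstream ONNX logits. The sketch below shows how such a figure can be computed and turned into a pass/fail gate. The names `max_abs_error` and `within_tolerance` are hypothetical and are not taken from the OpenASR codebase; this is an illustration, not the project's actual parity gate.

```rust
/// Largest absolute element-wise difference between two logit buffers.
/// Returns `None` when the lengths differ or a difference is NaN.
pub fn max_abs_error(reference: &[f32], candidate: &[f32]) -> Option<f32> {
    if reference.len() != candidate.len() {
        return None;
    }
    let mut worst = 0.0f32;
    for (r, c) in reference.iter().zip(candidate) {
        let diff = (r - c).abs();
        if diff.is_nan() {
            return None;
        }
        worst = worst.max(diff);
    }
    Some(worst)
}

/// Hypothetical gate: passes when the worst-case error stays within `tolerance`.
pub fn within_tolerance(reference: &[f32], candidate: &[f32], tolerance: f32) -> bool {
    matches!(max_abs_error(reference, candidate), Some(e) if e <= tolerance)
}
```

A gate like `within_tolerance(&onnx_logits, &rust_logits, 1e-4)` would accept the error level quoted on this card; the `1e-4` threshold is an illustrative choice, not a documented setting.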
|
|
| ## About pyannote Segmentation 3.0 |
|
|
| pyannote **segmentation-3.0** is the local speech-segmentation model from the pyannote speaker |
| diarization toolkit: a PyanNet (SincNet front-end + bidirectional LSTM) classifier over a 7-class |
| powerset that labels each frame of a 10 s window with which of up to three speakers are active, |
| including overlapped speech (at most two speakers at once). OpenASR uses it as the optional |
| segmentation stage of its model-agnostic diarization pipeline: when this pack is installed, |
| `--diarize` splits speech at speaker changes instead of relying on coarse VAD slices, and the |
| CAM++ embedder pack then clusters the segments into anonymous speaker turns. Weights are extracted |
| from the un-gated, MIT-licensed **onnx-community** ONNX mirror at a pinned revision and repackaged |
| as a raw-f32 `.oasr` pack that runs in pure Rust, with no Python at inference time. |
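To make the 7-class powerset concrete: with three speakers and at most two active at once, the classes are the empty set, three single speakers, and three speaker pairs (1 + 3 + 3 = 7). The sketch below decodes per-frame logits into per-speaker activity flags. The class order in `POWERSET` and all function names are assumptions made for illustration; they are not taken from the OpenASR source and should be checked against the model's actual output layout.

```rust
pub const NUM_SPEAKERS: usize = 3;
pub const NUM_CLASSES: usize = 7;

/// Assumed class order: empty set, singletons, then pairs.
pub const POWERSET: [[bool; NUM_SPEAKERS]; NUM_CLASSES] = [
    [false, false, false], // no speech
    [true, false, false],  // speaker 1
    [false, true, false],  // speaker 2
    [false, false, true],  // speaker 3
    [true, true, false],   // speakers 1 + 2
    [true, false, true],   // speakers 1 + 3
    [false, true, true],   // speakers 2 + 3
];

/// Index of the highest logit in one frame (the first index wins on ties).
pub fn argmax(frame: &[f32; NUM_CLASSES]) -> usize {
    let mut best = 0;
    for (i, &v) in frame.iter().enumerate() {
        if v > frame[best] {
            best = i;
        }
    }
    best
}

/// Convert per-frame powerset logits into per-frame speaker activity flags.
pub fn decode(frames: &[[f32; NUM_CLASSES]]) -> Vec<[bool; NUM_SPEAKERS]> {
    frames.iter().map(|f| POWERSET[argmax(f)]).collect()
}

/// A frame is overlapped when two speakers are active at once.
pub fn is_overlap(active: &[bool; NUM_SPEAKERS]) -> bool {
    active.iter().filter(|&&a| a).count() >= 2
}
```

Speaker-change points then fall wherever the decoded flags differ between consecutive frames, which is what lets a diarization pipeline cut at turn boundaries rather than at VAD slice edges.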
|
|
| ## How this pack was made |
|
|
| Converted from [onnx-community/pyannote-segmentation-3.0](https://huggingface.co/onnx-community/pyannote-segmentation-3.0) with the OpenASR importer: |
|
|
| ```bash |
| openasr model-pack import-pyannote-local <src>.safetensors <out>.oasr \ |
| --package-id pyannote-segmentation-3.0 |
| ``` |
|
|
| The `.oasr` container is GGUF-backed; every tensor is stored as raw f32 so the |
| pack round-trips bit-identically against the source weights. |
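Storing every tensor as raw f32 means a write followed by a read can reproduce each value's exact bit pattern, which is not true of quantized formats. The sketch below illustrates that property with plain byte buffers. It assumes little-endian byte order and uses hypothetical helper names (`to_raw_le`, `from_raw_le`, `bit_identical`); it does not read or write the actual `.oasr` container.

```rust
/// Serialize f32 values as raw little-endian bytes (4 bytes per value).
pub fn to_raw_le(values: &[f32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(values.len() * 4);
    for v in values {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Parse raw little-endian bytes back into f32 values.
/// Returns `None` when the byte length is not a multiple of 4.
pub fn from_raw_le(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Compare bit patterns rather than numeric values, so 0.0 and -0.0 differ.
pub fn bit_identical(a: &[f32], b: &[f32]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}
```

Comparing with `to_bits` is stricter than `==` on floats: it distinguishes signed zeros, so a check built this way only passes when the stored weights are truly unchanged.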
|
|
| ## License |
|
|
| This pack **inherits the upstream model's license: MIT** |
| ([source](https://huggingface.co/onnx-community/pyannote-segmentation-3.0)). OpenASR packaging retains the upstream copyright; |
| the only modification is format conversion. |
|
|
| ## Acknowledgements |
|
|
| This pack is a redistribution of **pyannote segmentation-3.0**, created by Hervé Bredin and the |
| **pyannote.audio** project, via the un-gated ONNX mirror |
| ([onnx-community/pyannote-segmentation-3.0](https://huggingface.co/onnx-community/pyannote-segmentation-3.0)). |
| All credit for the architecture, training, and weights belongs to the upstream authors; the license |
| is inherited from and identical to the upstream model (MIT). |
|
|
| ## Links |
|
|
| - **OpenASR**: <https://github.com/QuintinShaw/openasr> |
| - **Website**: <https://openasr.org> |
| - **Upstream model**: [onnx-community/pyannote-segmentation-3.0](https://huggingface.co/onnx-community/pyannote-segmentation-3.0) |