diarization-js · community-1 artifacts

ONNX-exported model weights and PLDA parameters bundled for the diarization-js JavaScript library — a port of pyannote/speaker-diarization-community-1 that runs fully in the browser (WebGPU / WASM) or in Node, no Python required.

⚠️ This repository hosts artifacts only (model files). The pipeline code lives in the npm package diarization-js.

Live demo

Test a full-scale demo in the browser.

Contents

| File | Size | Purpose | Origin | License |
| --- | --- | --- | --- | --- |
| segmentation-3.0.onnx | ~6 MB | Powerset multi-label segmentation backbone | pyannote/segmentation-3.0 | MIT |
| embedding-resnet34.onnx | ~26 MB | ResNet34 speaker embedding extractor | pyannote/wespeaker-voxceleb-resnet34-LM | CC-BY-4.0 |
| plda-params-vbx.json | ~1 MB | PLDA parameters for VBx clustering | pyannote/pyannote-audio | MIT |

Together, these three files are functionally equivalent to the upstream pyannote/speaker-diarization-community-1 pipeline as it stood at the time of export. The exported pipeline was validated end-to-end against the Python reference (DER = 1.73% on VoxConverse v0.3 dev).
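
For readers unfamiliar with the metric, DER (diarization error rate) is conventionally the sum of false-alarm, missed-speech, and speaker-confusion time divided by the total reference speech time. The sketch below only illustrates that ratio. The function name and argument names are hypothetical, and this README does not state the scoring settings (collar, overlap handling) behind the 1.73% figure.

```javascript
// Hypothetical helper, not part of diarization-js: computes the standard
// diarization error rate from error durations (all values in seconds).
function diarizationErrorRate({ falseAlarm, missed, confusion, total }) {
  if (total <= 0) throw new Error("total reference speech must be positive");
  return (falseAlarm + missed + confusion) / total;
}

// Example with made-up durations: 4 s of errors over 100 s of reference speech.
const der = diarizationErrorRate({ falseAlarm: 1, missed: 2, confusion: 1, total: 100 });
console.log(`${(der * 100).toFixed(2)}%`);
```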

Usage

Install the runtime:

npm install diarization-js onnxruntime-web   # browser
npm install diarization-js onnxruntime-node  # Node

Load and run:

import * as ort from "onnxruntime-web/webgpu";
import { DiarizationPipeline, ensureArtifacts } from "diarization-js";

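// Download the artifact files from this repo (cached after the first call).
const artifacts = await ensureArtifacts();
const pipeline = await DiarizationPipeline.load({
  ort, // the ONNX Runtime build imported above
  ...artifacts,
});

// float32Audio16k: float32 audio samples at 16 kHz, as the name suggests.
const result = await pipeline.run(float32Audio16k);
console.log(result.segments);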
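
The snippet above takes float32Audio16k as given. Below is a minimal sketch of producing such a buffer from interleaved PCM at another sample rate. It assumes the pipeline expects a mono Float32Array at 16 kHz, which the variable name suggests but this README does not spell out. Both helper names are hypothetical and not part of diarization-js. Linear interpolation has no anti-aliasing filter, so in the browser resampling through an OfflineAudioContext is likely the better choice for real audio.

```javascript
// Hypothetical helper: average interleaved channels into one mono channel.
function downmixToMono(interleaved, channels) {
  const frames = Math.floor(interleaved.length / channels);
  const mono = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    let sum = 0;
    for (let c = 0; c < channels; c++) sum += interleaved[i * channels + c];
    mono[i] = sum / channels;
  }
  return mono;
}

// Hypothetical helper: resample mono audio by linear interpolation.
// No low-pass filtering is applied, so downsampling may alias.
function resampleLinear(samples, sourceRate, targetRate = 16000) {
  if (sourceRate === targetRate) return Float32Array.from(samples);
  const outLength = Math.floor((samples.length * targetRate) / sourceRate);
  const out = new Float32Array(outLength);
  const step = sourceRate / targetRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.floor(pos);
    const right = Math.min(left + 1, samples.length - 1);
    const frac = pos - left;
    out[i] = samples[left] * (1 - frac) + samples[right] * frac;
  }
  return out;
}

// Example: one second of silent 48 kHz stereo becomes 16000 mono samples.
const stereo48k = new Float32Array(48000 * 2);
const prepared = resampleLinear(downmixToMono(stereo48k, 2), 48000);
console.log(prepared.length);
```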

In Node the artifacts are cached on disk under ~/.cache/diarization-js/. In the browser the HTTP cache (max-age headers from huggingface.co) handles deduplication — first load fetches ~33 MB, subsequent loads are local.

Self-hosting

The library never hard-requires huggingface.co. Pass any base URL that serves the same three filenames to ensureArtifacts({ source: "..." }), or load the bytes yourself and pass them directly to DiarizationPipeline.load.
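
As a rough illustration of the second option, the sketch below downloads the three files from a self-hosted location as raw bytes. The helper names are hypothetical and not part of diarization-js, and any base URL you pass is your own. How the bytes are then handed to DiarizationPipeline.load is not described in this README, so that step is left out.

```javascript
// The three filenames listed in the Contents table of this README.
const ARTIFACT_FILES = [
  "segmentation-3.0.onnx",
  "embedding-resnet34.onnx",
  "plda-params-vbx.json",
];

// Hypothetical helper: build one URL per artifact under a self-hosted base URL.
function artifactUrls(baseUrl) {
  // A trailing slash keeps new URL() from replacing the last path segment.
  const base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
  return ARTIFACT_FILES.map((name) => new URL(name, base).href);
}

// Hypothetical helper: download each artifact as raw bytes, keyed by filename.
// fetchImpl is injectable so the function can be exercised without a network.
async function fetchArtifactBytes(baseUrl, fetchImpl = fetch) {
  const entries = await Promise.all(
    artifactUrls(baseUrl).map(async (url, i) => {
      const response = await fetchImpl(url);
      if (!response.ok) {
        throw new Error(`Failed to fetch ${url}: ${response.status}`);
      }
      return [ARTIFACT_FILES[i], new Uint8Array(await response.arrayBuffer())];
    })
  );
  return Object.fromEntries(entries);
}
```

Calling fetchArtifactBytes with your own base URL resolves to an object with one Uint8Array per filename. Fetching the files in parallel keeps the first load close to the time of the largest download rather than the sum of all three.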

Citing

If you use this package in academic work, please cite the upstream pyannote.audio papers:

@inproceedings{Bredin2023,
  title={pyannote.audio 2.1 speaker diarization pipeline},
  author={Bredin, Hervé},
  booktitle={Interspeech},
  year={2023}
}

License

  • Code (diarization-js npm package): MIT
  • Weights in this repository: see per-file table above. Redistributed here under the upstream licenses; original copyright holders retain their rights.