diarization-js · community-1 artifacts

ONNX-exported model weights and PLDA parameters bundled for the diarization-js JavaScript library — a port of pyannote/speaker-diarization-community-1 that runs fully in the browser (WebGPU / WASM) or in Node, no Python required.

⚠️ This repository hosts artifacts only (model files). The pipeline code lives in the npm package diarization-js.

Live demo

Try the full-scale demo in your browser.

File	Size	Purpose	Origin	License
`segmentation-3.0.onnx`	~6 MB	Powerset multi-label segmentation backbone	`pyannote/segmentation-3.0`	MIT
`embedding-resnet34.onnx`	~26 MB	ResNet34 speaker embedding extractor	`pyannote/wespeaker-voxceleb-resnet34-LM`	CC-BY-4.0
`plda-params-vbx.json`	~1 MB	PLDA parameters for VBx clustering	`pyannote/pyannote-audio`	MIT

Together, these three files reproduce the upstream pyannote/speaker-diarization-community-1 pipeline as it stood at the time of export. The port was validated end-to-end against the Python reference (DER = 1.73% on VoxConverse v0.3 dev).

Usage

Install the runtime:

npm install diarization-js onnxruntime-web   # browser
npm install diarization-js onnxruntime-node  # Node

Load and run:

import * as ort from "onnxruntime-web/webgpu";
import { DiarizationPipeline, ensureArtifacts } from "diarization-js";

const artifacts = await ensureArtifacts(); // fetches this repo, caches
const pipeline = await DiarizationPipeline.load({
  ort,
  ...artifacts,
});

const result = await pipeline.run(float32Audio16k);
console.log(result.segments);
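
The snippet above does not define `float32Audio16k`. Judging by the name, it is a `Float32Array` of samples at 16 kHz; that it must also be mono is an assumption here, not something this page states. A minimal sketch of preparing such an array from already-decoded PCM channels follows. `toMono16k` is a hypothetical helper, not part of diarization-js, and it uses plain linear interpolation for brevity.

```javascript
// Hypothetical helper (not part of diarization-js): downmix decoded PCM
// channels to mono and resample to 16 kHz with linear interpolation.
const TARGET_RATE = 16000;

function toMono16k(channels, sampleRate) {
  if (channels.length === 0) throw new Error("need at least one channel");
  const length = channels[0].length;

  // Downmix: average all channels sample by sample.
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) mono[i] += channel[i] / channels.length;
  }
  if (sampleRate === TARGET_RATE) return mono;

  // Resample: read the mono signal at fractional positions.
  const step = sampleRate / TARGET_RATE;
  const outLength = Math.round(length / step);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.min(Math.floor(pos), length - 1);
    const right = Math.min(left + 1, length - 1);
    const frac = pos - left;
    out[i] = mono[left] * (1 - frac) + mono[right] * frac;
  }
  return out;
}
```

Linear interpolation applies no low-pass filter, so downsampling this way can alias. In the browser, decoding with the Web Audio API and rendering through an `OfflineAudioContext` created at 16000 Hz is likely a better choice for real recordings.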

In Node, the artifacts are cached on disk under ~/.cache/diarization-js/. In the browser, the HTTP cache (driven by the max-age headers from huggingface.co) handles deduplication: the first load fetches ~33 MB, and subsequent loads are served locally.

Self-hosting

The library never hard-requires huggingface.co. Pass any base URL that serves the same three filenames to ensureArtifacts({ source: "..." }), or load the bytes yourself and pass them directly to DiarizationPipeline.load.
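
For the "load the bytes yourself" route, a minimal sketch of fetching the three files from your own host might look like the following. `fetchArtifactBytes` and `artifactUrl` are hypothetical helpers, not part of diarization-js, and the base URL in the usage line is a placeholder. The filenames come from the table above.

```javascript
// Hypothetical helpers (not part of diarization-js): download the three
// artifact files from a self-hosted base URL and return their raw bytes.
const ARTIFACT_FILES = [
  "segmentation-3.0.onnx",
  "embedding-resnet34.onnx",
  "plda-params-vbx.json",
];

function artifactUrl(baseUrl, filename) {
  // Strip trailing slashes so the join always yields exactly one.
  return baseUrl.replace(/\/+$/, "") + "/" + filename;
}

async function fetchArtifactBytes(baseUrl, fetchImpl = fetch) {
  const entries = await Promise.all(
    ARTIFACT_FILES.map(async (filename) => {
      const response = await fetchImpl(artifactUrl(baseUrl, filename));
      if (!response.ok) {
        throw new Error(`failed to fetch ${filename}: HTTP ${response.status}`);
      }
      return [filename, new Uint8Array(await response.arrayBuffer())];
    })
  );
  // Result: { "segmentation-3.0.onnx": Uint8Array, ... }
  return Object.fromEntries(entries);
}

// Usage (placeholder URL):
// const bytes = await fetchArtifactBytes("https://models.example.com/diarization/");
```

This page does not spell out which option names `DiarizationPipeline.load` expects for raw bytes, so the sketch stops at the download step; check the package documentation before wiring the result in.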

Citing

If you use this package in academic work, please cite the upstream pyannote.audio paper:

@inproceedings{Bredin2023,
  title={pyannote.audio 2.1 speaker diarization pipeline},
  author={Bredin, Hervé},
  booktitle={Interspeech},
  year={2023}
}

License

Code (diarization-js npm package): MIT
Weights in this repository: see per-file table above. Redistributed here under the upstream licenses; original copyright holders retain their rights.
