Sortformer-ExecuTorch-XNNPACK

Pre-exported ExecuTorch .pte file for Streaming Sortformer with the XNNPACK backend (CPU). Streaming Sortformer is a streaming speaker diarization model that identifies up to 4 speakers in audio.

Installation

git clone https://github.com/pytorch/executorch/ ~/executorch
cd ~/executorch && ./install_executorch.sh
make sortformer-cpu

Download

pip install huggingface_hub
huggingface-cli download younghan-meta/Sortformer-ExecuTorch-XNNPACK --local-dir ~/sortformer
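
The same download can also be scripted from Python. This is a minimal sketch, assuming `huggingface_hub` is installed as above; it uses the library's `snapshot_download` function, and the helper name `download_model` is hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Repository and target directory are taken from the CLI command above.
REPO_ID = "younghan-meta/Sortformer-ExecuTorch-XNNPACK"


def download_model(local_dir: str = "~/sortformer") -> Path:
    """Download the repository snapshot into local_dir (hypothetical helper)."""
    # Imported lazily so this file can be loaded before huggingface_hub is installed.
    from huggingface_hub import snapshot_download

    # Python does not expand "~" on its own, so resolve it explicitly.
    target = Path(local_dir).expanduser()
    snapshot_download(repo_id=REPO_ID, local_dir=str(target))
    return target


# Usage (performs a network download):
#   model_dir = download_model()
```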

Run

cmake-out/examples/models/sortformer/sortformer_runner \
    --model_path ~/sortformer/sortformer.pte \
    --audio_path ~/sortformer/poem.wav

The output lists the detected speaker segments with their start and end times.
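
To call the runner from a script rather than a shell, something like the following may work. It is a sketch under two assumptions: the runner was built inside `~/executorch` (so the relative `cmake-out/...` path resolves from there), and the runner prints its results to standard output. The helper names `build_command` and `run_diarization` are hypothetical. The output is returned as raw text, since its exact format is not specified here.

```python
import subprocess
from pathlib import Path

# Relative runner path from the Run section; assumed to resolve from ~/executorch.
RUNNER = "cmake-out/examples/models/sortformer/sortformer_runner"


def build_command(model_path, audio_path, runner=RUNNER):
    """Build the argument list for the runner (hypothetical helper)."""
    # subprocess does not go through a shell, so "~" must be expanded here.
    return [
        runner,
        "--model_path", str(Path(model_path).expanduser()),
        "--audio_path", str(Path(audio_path).expanduser()),
    ]


def run_diarization(model_path, audio_path, cwd="~/executorch"):
    """Run the runner and return its standard output as text."""
    result = subprocess.run(
        build_command(model_path, audio_path),
        cwd=Path(cwd).expanduser(),  # assumption: build directory lives here
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Usage (requires the built runner and the downloaded files):
#   print(run_diarization("~/sortformer/sortformer.pte", "~/sortformer/poem.wav"))
```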

Optional flags:

  • --threshold 0.5 -- speaker activity threshold (0.0-1.0)
  • --chunk_len 124 -- encode chunk size in 80ms frames
  • --fifo_len 124 -- FIFO buffer size in 80ms frames
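
Because --chunk_len and --fifo_len are expressed in 80ms frames, it can help to convert between frames and seconds when choosing values. The sketch below only does that arithmetic; the function names are hypothetical. With the values shown above, 124 frames correspond to 9.92 seconds of audio.

```python
FRAME_MS = 80  # frame duration stated in the flag descriptions above


def frames_to_seconds(frames: int) -> float:
    """Convert a frame count to a duration in seconds."""
    return frames * FRAME_MS / 1000


def seconds_to_frames(seconds: float) -> int:
    """Convert a duration in seconds to the nearest whole number of frames."""
    return round(seconds * 1000 / FRAME_MS)


# Usage:
#   frames_to_seconds(124)   # duration covered by --chunk_len 124
#   seconds_to_frames(5.0)   # frame count for roughly 5 seconds
```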

Export Command

pip install "nemo_toolkit[asr]"
python examples/models/sortformer/export_sortformer.py --backend xnnpack --output-dir ./sortformer_exports
