Upload folder using huggingface_hub
- .gitattributes +2 -0
- README.md +60 -0
- poem.wav +3 -0
- sortformer.pte +3 -0
.gitattributes
CHANGED
|
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
| 33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
+
poem.wav filter=lfs diff=lfs merge=lfs -text
|
| 37 |
+
sortformer.pte filter=lfs diff=lfs merge=lfs -text
|
README.md
ADDED
|
@@ -0,0 +1,60 @@
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
tags:
|
| 4 |
+
- executorch
|
| 5 |
+
- xnnpack
|
| 6 |
+
- speaker-diarization
|
| 7 |
+
- on-device
|
| 8 |
+
- streaming
|
| 9 |
+
pipeline_tag: audio-classification
|
| 10 |
+
base_model: nvidia/diar_streaming_sortformer_4spk-v2
|
| 11 |
+
---
|
| 12 |
+
|
| 13 |
+
# Sortformer-ExecuTorch-XNNPACK
|
| 14 |
+
|
| 15 |
+
Pre-exported [ExecuTorch](https://github.com/pytorch/executorch) `.pte` file
|
| 16 |
+
for [Streaming Sortformer](https://huggingface.co/nvidia/diar_streaming_sortformer_4spk-v2)
|
| 17 |
+
with **XNNPACK** backend (CPU). A streaming speaker diarization model that
|
| 18 |
+
identifies up to 4 speakers in audio.
|
| 19 |
+
|
| 20 |
+
## Installation
|
| 21 |
+
|
| 22 |
+
```bash
|
| 23 |
+
git clone https://github.com/pytorch/executorch/ ~/executorch
|
| 24 |
+
cd ~/executorch && ./install_executorch.sh
|
| 25 |
+
make sortformer-cpu
|
| 26 |
+
```
|
| 27 |
+
|
| 28 |
+
## Download
|
| 29 |
+
|
| 30 |
+
```bash
|
| 31 |
+
pip install huggingface_hub
|
| 32 |
+
huggingface-cli download younghan-meta/Sortformer-ExecuTorch-XNNPACK --local-dir ~/sortformer
|
| 33 |
+
```
|
| 34 |
+
|
| 35 |
+
## Run
|
| 36 |
+
|
| 37 |
+
```bash
|
| 38 |
+
cmake-out/examples/models/sortformer/sortformer_runner \
|
| 39 |
+
--model_path ~/sortformer/sortformer.pte \
|
| 40 |
+
--audio_path ~/sortformer/poem.wav
|
| 41 |
+
```
|
| 42 |
+
|
| 43 |
+
Output shows detected speaker segments with start/end times.
|
| 44 |
+
|
| 45 |
+
Optional flags:
|
| 46 |
+
- `--threshold 0.5` -- speaker activity threshold (0.0-1.0)
|
| 47 |
+
- `--chunk_len 124` -- encode chunk size in 80ms frames
|
| 48 |
+
- `--fifo_len 124` -- FIFO buffer size in 80ms frames
|
| 49 |
+
|
| 50 |
+
## Export Command
|
| 51 |
+
|
| 52 |
+
```bash
|
| 53 |
+
pip install "nemo_toolkit[asr]"
|
| 54 |
+
python examples/models/sortformer/export_sortformer.py --backend xnnpack --output-dir ./sortformer_exports
|
| 55 |
+
```
|
| 56 |
+
|
| 57 |
+
## More Info
|
| 58 |
+
|
| 59 |
+
- [Official ExecuTorch Sortformer guide](https://github.com/pytorch/executorch/tree/main/examples/models/sortformer)
|
| 60 |
+
- [Original model](https://huggingface.co/nvidia/diar_streaming_sortformer_4spk-v2)
|
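The `--chunk_len` and `--fifo_len` flags in the README above are counted in 80 ms frames, so the example value of 124 corresponds to about 9.92 s of audio. A minimal Python sketch of that conversion follows. The helper names are hypothetical and are not part of the sortformer runner; only the 80 ms frame duration comes from the README.

```python
# Frame duration in milliseconds, as stated in the README's flag descriptions.
FRAME_MS = 80


def frames_to_seconds(frames: int) -> float:
    """Convert a frame count (e.g. a --chunk_len value) to seconds."""
    return frames * FRAME_MS / 1000


def seconds_to_frames(seconds: float) -> int:
    """Return how many whole 80 ms frames fit in `seconds` (rounded down)."""
    return int(seconds * 1000 // FRAME_MS)


if __name__ == "__main__":
    # 124 frames is the value shown for both --chunk_len and --fifo_len.
    print(f"124 frames = {frames_to_seconds(124):.2f} s")
```

Going the other way, `seconds_to_frames(10)` gives the largest frame count that fits in a 10-second window, which may help when picking a different `--chunk_len`.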
poem.wav
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:0dd03dfb6fe83b7d10df166cb77d28bf139f9be2c739e9927c757d88255aa88b
|
| 3 |
+
size 768042
|
sortformer.pte
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:e763fae031bc8675252f2d8de0e84ff71992db4eb04257e4a50b43c9b31a77c1
|
| 3 |
+
size 492384528
|
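The two Git LFS pointers above record a SHA-256 digest and a byte size for `poem.wav` and `sortformer.pte`. After downloading, a local file can be checked against its pointer. The sketch below is one way to do that in Python using only the standard library; the function names are hypothetical, and it assumes the pointer text is copied verbatim from this commit.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer_text: str) -> bool:
    """Return True if the file at `path` has the size and SHA-256 in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    file_path = Path(path)
    # Cheap size check first, so a truncated download fails without hashing.
    if file_path.stat().st_size != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with file_path.open("rb") as f:
        # Hash in 1 MiB chunks so the ~490 MB .pte file is never fully in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


# Pointer contents for poem.wav, copied from the diff above.
POEM_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:0dd03dfb6fe83b7d10df166cb77d28bf139f9be2c739e9927c757d88255aa88b
size 768042
"""
```

Assuming the files were downloaded to `~/sortformer` as in the README, `matches_pointer(Path.home() / "sortformer" / "poem.wav", POEM_POINTER)` should return `True` for an intact download.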