Commit c0638dc · verified · nguyenvulebinh · 1 Parent(s): edb3c27

Upload folder using huggingface_hub

config.yaml ADDED
@@ -0,0 +1,21 @@
+ dependencies:
+   pyannote.audio: 4.0.0
+
+ pipeline:
+   name: pyannote.audio.pipelines.SpeakerDiarization
+   params:
+     clustering: VBxClustering
+     segmentation: $model/segmentation
+     segmentation_batch_size: 32
+     embedding: $model/embedding
+     embedding_batch_size: 32
+     embedding_exclude_overlap: true
+     plda: $model/plda
+
+ params:
+   clustering:
+     threshold: 0.6
+     Fa: 0.07
+     Fb: 0.8
+   segmentation:
+     min_duration_off: 0.0
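
The `$model/...` values in `config.yaml` appear to point at the `segmentation/`, `embedding/` and `plda/` folders added in this same commit. Below is a minimal sketch of that reading, assuming `$model` stands for the root of the uploaded folder. The helper `resolve_model_refs` and the `diarization` directory name are hypothetical and are not part of pyannote.audio; the parameters are copied by hand so the sketch needs no YAML parser.

```python
from pathlib import Path

# Pipeline parameters as listed in config.yaml (copied by hand).
PIPELINE_PARAMS = {
    "clustering": "VBxClustering",
    "segmentation": "$model/segmentation",
    "segmentation_batch_size": 32,
    "embedding": "$model/embedding",
    "embedding_batch_size": 32,
    "embedding_exclude_overlap": True,
    "plda": "$model/plda",
}


def resolve_model_refs(params, model_root):
    """Replace a leading "$model/" in string values with model_root.

    Assumption: "$model" means the root of the uploaded folder. This is an
    illustrative helper, not the library's own resolution logic.
    """
    root = Path(model_root)
    prefix = "$model/"
    resolved = {}
    for key, value in params.items():
        if isinstance(value, str) and value.startswith(prefix):
            resolved[key] = root / value[len(prefix):]
        else:
            resolved[key] = value
    return resolved


# "diarization" is a hypothetical local copy of this repository.
resolved = resolve_model_refs(PIPELINE_PARAMS, "diarization")
for key in ("segmentation", "embedding", "plda"):
    print(key, "->", resolved[key])
```

Non-path values such as `clustering` and the batch sizes pass through unchanged, so the result can be compared key by key against the folders in the commit.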
embedding/README.md ADDED
@@ -0,0 +1,20 @@
+ Copied from https://huggingface.co/pyannote/wespeaker-voxceleb-resnet34-LM
+
+ ## License
+
+ According to [this page](https://github.com/wenet-e2e/wespeaker/blob/master/docs/pretrained.md):
+
+ > The pretrained model in WeNet follows the license of it's corresponding dataset. For example, the pretrained model on VoxCeleb follows Creative Commons Attribution 4.0 International License., since it is used as license of the VoxCeleb dataset, see https://mm.kaist.ac.kr/datasets/voxceleb/.
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{Wang2023,
+ title={Wespeaker: A research and production oriented speaker embedding learning toolkit},
+ author={Wang, Hongji and Liang, Chengdong and Wang, Shuai and Chen, Zhengyang and Zhang, Binbin and Xiang, Xu and Deng, Yanlei and Qian, Yanmin},
+ booktitle={ICASSP 2023, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+ pages={1--5},
+ year={2023},
+ organization={IEEE}
+ }
+ ```
embedding/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6f10ff60898a1d185fa22e1d11e0bfa8a92efec811f11bca48cb8cafebefd929
+ size 26646242
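
The three-line files added for the binary artifacts are Git LFS pointers: each records a spec version, a SHA-256 object id, and a byte size instead of the binary itself. The sketch below parses such a pointer and checks a downloaded file against it, using only the Python standard library. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, and the pointer text is copied from the `embedding/pytorch_model.bin` hunk above.

```python
import hashlib

# Pointer content copied from the embedding/pytorch_model.bin hunk.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:6f10ff60898a1d185fa22e1d11e0bfa8a92efec811f11bca48cb8cafebefd929
size 26646242
"""


def parse_lfs_pointer(text):
    """Split "key value" lines of a Git LFS pointer into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at path has the size and digest in pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large model files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("embedding/pytorch_model.bin", pointer)` on a local download (path assumed) would then report whether the file is the one this commit refers to.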
plda/README.md ADDED
@@ -0,0 +1,3 @@
+ PLDA model trained by [BUT Speech@FIT](https://speech.fit.vut.cz/) group.
+
+ Thanks to [Jiangyu Han](https://github.com/jyhan03) and [Petr Pálka](https://github.com/Selesnyan) for the integration of VBx in pyannote.audio.
plda/plda.npz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9b77bcd840692710dd3496f62ecfeed8d8e5f002fd991b785079b244eab7d255
+ size 133852
plda/xvec_transform.npz ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:325f1ce8e48f7e55e9c8aa47e05d2766b7c48c4b25b8de8dd751e7a4cc5fbe8f
+ size 134376
segmentation/pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ad24338d844fb95985486eb1a464e32d229f6d7a03c9abe60f978bacf3f816e
+ size 5906507