nielsr (HF Staff) committed · verified
Commit d45b38a · 1 Parent(s): bd36143

Add comprehensive model card for w2v-BERT 2.0 Speaker Verification model


This PR adds a comprehensive model card for the w2v-BERT 2.0 based Speaker Verification model, as described in the paper [Enhancing Speaker Verification with w2v-BERT 2.0 and Knowledge Distillation guided Structured Pruning](https://huggingface.co/papers/2510.04213).

The updates include:
- Adding the `pipeline_tag: audio-classification` for better discoverability of speaker verification models on the Hub.
- Specifying the `license: mit`.
- Including an additional `speaker-verification` tag.
- Linking to the academic paper and the official GitHub repository.
- Incorporating the paper's abstract for a quick overview.
- Adding key diagrams and performance tables directly from the GitHub README.
- Providing a BibTeX citation for the paper.

As per instructions, no direct Python usage snippet is included, since the GitHub README only provides shell commands and scripts rather than a readily copy-pasteable inference snippet for a Hugging Face library.

Please review and merge if these improvements align with expectations.

Files changed (1)
  1. README.md +46 -0
README.md ADDED
---
pipeline_tag: audio-classification
license: mit
tags:
- speaker-verification
---

# Enhancing Speaker Verification with w2v-BERT 2.0 and Knowledge Distillation guided Structured Pruning

This repository contains the models and code presented in the paper [Enhancing Speaker Verification with w2v-BERT 2.0 and Knowledge Distillation guided Structured Pruning](https://huggingface.co/papers/2510.04213).

The official GitHub repository can be found at: [https://github.com/ZXHY-82/w2v-BERT-2.0_SV](https://github.com/ZXHY-82/w2v-BERT-2.0_SV)

## Abstract

Large-scale self-supervised Pre-Trained Models (PTMs) have shown significant improvements in the speaker verification (SV) task by providing rich feature representations. In this paper, we utilize w2v-BERT 2.0, a model with approximately 600 million parameters trained on 4.5 million hours of unlabeled data across 143 languages, for the SV task. The MFA structure with Layer Adapter is employed to process the multi-layer feature outputs from the PTM and extract speaker embeddings. Additionally, we incorporate LoRA for efficient fine-tuning. Our model achieves state-of-the-art results with 0.12% and 0.55% EER on the Vox1-O and Vox1-H test sets, respectively. Furthermore, we apply knowledge distillation guided structured pruning, reducing the model size by 80% while achieving only a 0.04% EER degradation.
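The MFA structure described in the abstract aggregates hidden states from every transformer layer of the PTM before pooling them into a speaker embedding. As a rough illustration only — the function, shapes, and the softmax-weighted sum below are hypothetical sketch choices, not the paper's actual implementation:

```python
import numpy as np

def aggregate_layers(layer_outputs, layer_logits):
    """Weighted sum of per-layer PTM features (MFA-style aggregation).

    layer_outputs: shape (num_layers, time, dim) -- hidden states from
                   each transformer layer of the pre-trained model.
    layer_logits:  shape (num_layers,) -- learnable scores that are
                   softmax-normalized into per-layer weights.
    """
    w = np.exp(layer_logits - layer_logits.max())
    w = w / w.sum()                                # softmax over layers
    return np.tensordot(w, layer_outputs, axes=1)  # -> (time, dim)

# Toy example: 3 layers, 5 frames, 4-dim features.
feats = np.random.randn(3, 5, 4)
logits = np.zeros(3)                 # equal logits -> uniform weights
agg = aggregate_layers(feats, logits)
assert agg.shape == (5, 4)
# With uniform weights the aggregate is just the mean over layers.
assert np.allclose(agg, feats.mean(axis=0))
```

In practice the per-layer logits would be trained jointly with the downstream speaker-embedding head, letting the model learn which PTM layers carry the most speaker-discriminative information.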
## Framework

![Framework Diagram](https://github.com/ZXHY-82/w2v-BERT-2.0_SV/raw/main/assets/framework.png)

## Performance

The model demonstrates state-of-the-art results on speaker verification tasks.

### Speaker Verification Results

| Vox1-O (EER) | Vox1-E (EER) | Vox1-H (EER) | LMFT (large-margin fine-tuning) |
| :----------- | :----------- | :----------- | :------------------------------ |
| 0.23% | 0.38% | 0.81% | ✗ |
| 0.14% | 0.31% | 0.73% | ✓ |
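EER (equal error rate), the metric in the table above, is the operating point at which the false-accept and false-reject rates coincide over a list of scored verification trials. A minimal sketch with made-up cosine scores (not VoxCeleb data):

```python
import numpy as np

def cosine_score(e1, e2):
    """Cosine similarity between two speaker embeddings."""
    return float(np.dot(e1, e2) / (np.linalg.norm(e1) * np.linalg.norm(e2)))

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER on a finite trial list: sweep every candidate
    threshold and return (FAR + FRR) / 2 where the two rates are closest."""
    tgt = np.asarray(target_scores)
    non = np.asarray(nontarget_scores)
    best_gap, eer = float("inf"), 1.0
    for t in np.sort(np.concatenate([tgt, non])):
        far = np.mean(non >= t)   # impostor trials wrongly accepted
        frr = np.mean(tgt < t)    # genuine trials wrongly rejected
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

# Made-up scores: genuine pairs score high, impostor pairs score low.
eer = equal_error_rate([0.9, 0.8, 0.75, 0.6], [0.5, 0.4, 0.3, 0.1])
assert eer == 0.0  # perfectly separable trial lists -> 0% EER
```

An EER of 0.14% on Vox1-O means that at the crossover threshold, only about 1.4 in 1000 genuine trials are rejected and the same fraction of impostor trials are accepted.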
### Pruning Results

The paper also presents knowledge distillation guided structured pruning:

![Pruning Diagram](https://github.com/ZXHY-82/w2v-BERT-2.0_SV/raw/main/assets/prune.png)
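Distillation-guided pruning keeps the smaller student's embeddings close to the unpruned teacher's while structured sparsity removes parameters. A hedged sketch of one possible distillation objective — cosine distance between student and teacher embeddings; the paper's exact loss may differ:

```python
import numpy as np

def kd_embedding_loss(student_emb, teacher_emb):
    """Cosine-distance distillation loss: 0 when the pruned student
    reproduces the teacher's embedding direction exactly."""
    s = np.asarray(student_emb, dtype=float)
    t = np.asarray(teacher_emb, dtype=float)
    s = s / np.linalg.norm(s)
    t = t / np.linalg.norm(t)
    return 1.0 - float(np.dot(s, t))

teacher = np.array([1.0, 2.0, 3.0])
# A student embedding with the same direction (any positive scale)
# incurs essentially zero distillation loss.
loss_same = kd_embedding_loss(2.0 * teacher, teacher)
assert abs(loss_same) < 1e-9
```

Minimizing such a loss during pruning pushes the 80%-smaller student to preserve the teacher's speaker-embedding geometry, which is consistent with the small (0.04%) EER degradation reported.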
## How to use

For detailed instructions on preparation, training, pruning, and testing, please refer to the [GitHub repository](https://github.com/ZXHY-82/w2v-BERT-2.0_SV). The repository provides shell commands for each stage, including a `get_embd_w2v.py` script for extracting embeddings at test time.
## Citation

If you find this work helpful or inspiring, please cite the paper:

```bibtex
@article{Hu2025EnhancingSV,
  title={Enhancing Speaker Verification with w2v-BERT 2.0 and Knowledge Distillation guided Structured Pruning},
  author={Zixuan Hu and Yi Hu and Pengcheng Wei and Hongting Bai and Li Yan},
  journal={arXiv preprint arXiv:2510.04213},
  year={2025}
}
```