Fine‑Tuning WeSpeaker with CA‑MHFA WavLM‑Base‑Plus on VoxCeleb 2
This guide explains how to replace a few core WeSpeaker components with improved SSL back‑end code, plug in the CA‑MHFA_WavLM‑Base‑Plus_VoxCeleb2 checkpoint, and reproduce the reported speaker‑verification scores.
1 Prerequisites
Install the WeSpeaker toolkit before proceeding.
2 Patch WeSpeaker
In the current directory you will find an updated wespeaker folder containing new and modified code. Replace the corresponding files in your original repository:
- Frontend
  - Replace wespeaker/frontend/s3prl.py with the version provided here.
- SSL back‑end
  - Create a new folder wespeaker/models/ssl_backend in the original repo.
  - Copy all contents from wespeaker/models/ssl_backend (this directory) into the newly created folder.
- Speaker model wrapper
  - Overwrite wespeaker/models/speaker_model.py with the patched file.
# From the root of this (patched) repo.
# Replace <PATH_TO_ORIG> with the path to your original WeSpeaker checkout.
cp -f wespeaker/frontend/s3prl.py <PATH_TO_ORIG>/wespeaker/frontend/
mkdir -p <PATH_TO_ORIG>/wespeaker/models/ssl_backend
cp -r wespeaker/models/ssl_backend/* <PATH_TO_ORIG>/wespeaker/models/ssl_backend/
cp -f wespeaker/models/speaker_model.py <PATH_TO_ORIG>/wespeaker/models/
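To confirm that the copy succeeded, the patched files can be compared byte for byte against their destinations. The following is a minimal sketch: the function name `verify_patch` and its two arguments are illustrative and not part of WeSpeaker, and it assumes the three paths listed above are the only ones that were patched.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of WeSpeaker): checks that each patched file
# matches its copy in the original checkout.
verify_patch() {
  local patched_root="$1"   # root of this (patched) repo
  local orig_root="$2"      # root of the original WeSpeaker checkout
  local status=0
  local rel

  # cmp -s is silent and returns 0 only when both files are identical.
  for rel in wespeaker/frontend/s3prl.py wespeaker/models/speaker_model.py; do
    if cmp -s "$patched_root/$rel" "$orig_root/$rel"; then
      echo "OK       $rel"
    else
      echo "MISMATCH $rel"
      status=1
    fi
  done

  # diff -r -q compares the ssl_backend directory recursively.
  # Extra files on either side (e.g. caches) are also reported as a mismatch.
  if diff -r -q "$patched_root/wespeaker/models/ssl_backend" \
                "$orig_root/wespeaker/models/ssl_backend" >/dev/null 2>&1; then
    echo "OK       wespeaker/models/ssl_backend"
  else
    echo "MISMATCH wespeaker/models/ssl_backend"
    status=1
  fi
  return "$status"
}
```

From the root of the patched repo, it could be called as `verify_patch . "$ORIG"`, where `ORIG` is a variable you set to the original checkout path. A non-zero exit status means at least one file differs.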
3 Add the Pre‑trained Checkpoint
Copy the directory CA-MHFA_WavLM-Base-Plus_VoxCeleb2 into WeSpeaker’s VoxCeleb v2 experiment folder:
cp -r CA-MHFA_WavLM-Base-Plus_VoxCeleb2 \
<PATH_TO_ORIG>/examples/voxceleb/v2/exp/
The resulting tree should look like:
examples/voxceleb/v2
└── exp
    └── CA-MHFA_WavLM-Base-Plus_VoxCeleb2
        ├── models
        │   └── avg_model.pt
        └── config.yaml
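The layout above can be checked before running the recipe. This is a small sketch under the assumption that `models/avg_model.pt` and `config.yaml` are the only files that matter; `check_checkpoint` is a hypothetical name, not a WeSpeaker tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports whether the checkpoint directory contains the
# two files shown in the tree above.
check_checkpoint() {
  local exp_dir="$1"   # e.g. exp/CA-MHFA_WavLM-Base-Plus_VoxCeleb2
  local missing=0
  local f

  for f in models/avg_model.pt config.yaml; do
    if [ ! -f "$exp_dir/$f" ]; then
      echo "missing: $exp_dir/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run from examples/voxceleb/v2, `check_checkpoint exp/CA-MHFA_WavLM-Base-Plus_VoxCeleb2` returns 0 when both files are present and lists any missing file on stderr otherwise.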
4 Edit run.sh
Inside examples/voxceleb/v2/run.sh make two quick edits:
# Where to write logs, checkpoints, and scores
exp_dir=exp/CA-MHFA_WavLM-Base-Plus_VoxCeleb2
# Which stage to start from (we only need scoring)
stage=4
Why stage 4? Stages 0‑3 perform data preparation and training, which are already complete in the supplied checkpoint.
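The two edits can also be applied without opening an editor. The sketch below assumes that run.sh assigns `exp_dir` and `stage` on their own lines with no leading whitespace; check your copy first, since the helper `set_recipe_vars` is hypothetical and only rewrites lines of that shape.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrites top-level `exp_dir=` and `stage=` assignments
# in a run script. Assumes the new values contain no `|`, `&`, or backslash,
# which are special in a sed replacement.
set_recipe_vars() {
  local script="$1" exp_dir="$2" stage="$3"
  local tmp
  tmp="$(mktemp)" || return 1

  # Anchoring on ^ leaves lines such as `stop_stage=` untouched.
  sed -e "s|^exp_dir=.*|exp_dir=${exp_dir}|" \
      -e "s|^stage=.*|stage=${stage}|" "$script" > "$tmp" || return 1

  # Write back with cat (not mv) so the script keeps its permissions.
  cat "$tmp" > "$script"
  rm -f "$tmp"
}
```

For this guide the call would be `set_recipe_vars run.sh exp/CA-MHFA_WavLM-Base-Plus_VoxCeleb2 4`, run from examples/voxceleb/v2. Writing to a temporary file first avoids relying on `sed -i`, whose syntax differs between GNU and BSD sed.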
5 Run the Recipe
cd examples/voxceleb/v2
bash run.sh
The script will score the VoxCeleb 1 cleaned trials and print results similar to the block below.
6 Expected Results
---- cali_vox2_dev_asnorm300_vox1_O_cleaned.kaldi.score -----
EER = 0.627
minDCF (p_target:0.01 c_miss:1 c_fa:1) = 0.095
---- cali_vox2_dev_asnorm300_vox1_E_cleaned.kaldi.score -----
EER = 0.674
minDCF (p_target:0.01 c_miss:1 c_fa:1) = 0.069
---- cali_vox2_dev_asnorm300_vox1_H_cleaned.kaldi.score -----
EER = 1.314
minDCF (p_target:0.01 c_miss:1 c_fa:1) = 0.125
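To compare a run against these numbers, the report can be condensed to one row per trial list. The following sketch assumes the report has exactly the three-line shape shown above and is supplied on stdin; `summarize_scores` is a hypothetical helper, not something WeSpeaker provides.

```shell
#!/usr/bin/env bash
# Hypothetical helper: condenses the score report into
# "<score file> <EER> <minDCF>" rows. Reads the report from stdin.
summarize_scores() {
  awk '
    /^---- /   { name = $2 }             # header line: second field is the score file
    /^EER = /  { eer = $3 }              # e.g. "EER = 0.627"
    /^minDCF / { print name, eer, $NF }  # last field is the minDCF value
  '
}
```

If the report was saved to a file, for example a log you named run.log, then `summarize_scores < run.log` prints one line per score file, which is easier to diff against the expected values.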