Parameter Count Mismatch: WavLM-Large vs FINALLY Speech Enhancement Model's Claim
#8 opened by fahim-inverseai
I was examining the FINALLY speech enhancement model (https://arxiv.org/abs/2410.05920) released by SamsungLabs, which uses WavLM-Large as part of its architecture. I noticed a discrepancy in the reported number of parameters:
- The original WavLM-Large model on Hugging Face is documented to have around 316M parameters.
- However, the FINALLY model description attributes 358M parameters to the WavLM component, which does not match the standard WavLM-Large count.

I’m curious if this is due to:
- A modified WavLM-Large backbone in FINALLY (e.g., some layers added),
- A different counting method (trainable vs total parameters), or
- A documentation oversight.
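One way to test the counting-method hypothesis is to load the checkpoint and count total vs. trainable parameters directly. Below is a minimal sketch using PyTorch; the `microsoft/wavlm-large` checkpoint name is assumed from the Hugging Face hub, and the small `nn.Sequential` demo just illustrates how freezing layers makes the two counts diverge:

```python
import torch
from torch import nn

def count_params(model: nn.Module) -> tuple[int, int]:
    """Return (total, trainable) parameter counts for a module."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable

# To check WavLM-Large itself (requires `transformers` and a ~1.2 GB
# download; checkpoint name assumed, not verified here):
#   from transformers import WavLMModel
#   wavlm = WavLMModel.from_pretrained("microsoft/wavlm-large")
#   print(count_params(wavlm))

# Small illustration: freezing a layer lowers the trainable count
# while the total count stays the same.
demo = nn.Sequential(nn.Linear(10, 20), nn.Linear(20, 5))
for p in demo[0].parameters():
    p.requires_grad = False
print(count_params(demo))  # (325, 105)
```

If FINALLY reports the total including extra adapter or projection layers on top of the backbone, the two counting methods would give exactly this kind of mismatch.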
Has anyone else noticed this mismatch? Clarifying it would help anyone trying to reproduce the results or estimate computational requirements.
Thanks in advance!