Parameter Count Mismatch: WavLM-Large vs FINALLY Speech Enhancement Model's Claim

#8
by fahim-inverseai - opened

I was examining the FINALLY speech enhancement model (https://arxiv.org/abs/2410.05920) released by SamsungLabs, which uses WavLM-Large as part of its architecture. I noticed a discrepancy in the reported number of parameters:

  • The original WavLM-Large model on Hugging Face is documented to have around 316M parameters.
  • However, the FINALLY model description attributes 358M parameters to the WavLM component, which does not match the standard WavLM-Large count.
    (Screenshot from 2025-08-13 13-40-47.png attached)

I’m curious if this is due to:

  • A modified WavLM-Large backbone in FINALLY (e.g., additional layers),
  • A different counting method (trainable vs total parameters), or
  • A documentation oversight.
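For the second possibility, the two counting methods are easy to compare in PyTorch. Below is a minimal sketch using a toy module as a stand-in for WavLM-Large (downloading the real checkpoint works the same way via `transformers.WavLMModel.from_pretrained("microsoft/wavlm-large")`); the toy model and the frozen layer are illustrative assumptions, not part of FINALLY:

```python
import torch.nn as nn

def count_params(model: nn.Module) -> tuple[int, int]:
    """Return (total, trainable) parameter counts for a module."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable

# Toy stand-in for a large backbone (hypothetical, for illustration only).
toy = nn.Sequential(nn.Linear(10, 20), nn.Linear(20, 5))

# Freeze one weight matrix, as a partially frozen backbone would be.
toy[0].weight.requires_grad_(False)

total, trainable = count_params(toy)
print(f"total: {total}, trainable: {trainable}")
```

If a paper reports the total count while the Hugging Face card reports only the trainable (or vice versa, or one count includes surrounding layers of the larger system), the two numbers will diverge exactly as observed here.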

Has anyone else noticed this mismatch? It would be helpful to clarify for anyone trying to reproduce results or analyze computational requirements.

Thanks in advance!
