Parameter Count Mismatch: WavLM-Large vs FINALLY Speech Enhancement Model's Claim
#8
by fahim-inverseai - opened
I was examining the FINALLY speech enhancement model (https://arxiv.org/abs/2410.05920) released by SamsungLabs, which uses WavLM-Large as part of its architecture. I noticed a discrepancy in the reported number of parameters:
- The original WavLM-Large model on Hugging Face is documented to have around 316M parameters.
- However, the FINALLY model description attributes 358M parameters to the WavLM component, roughly 42M more than the standard WavLM-Large count.

I’m curious if this is due to:
- A modified WavLM-Large backbone in FINALLY (e.g., some layers added),
- A different counting method (trainable vs total parameters), or
- A documentation oversight.
Has anyone else noticed this mismatch? It would be helpful to clarify for anyone trying to reproduce results or analyze computational requirements.
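One way to check the baseline figure and the "trainable vs total" hypothesis is to count the parameters of the Hugging Face checkpoint directly. The following is a minimal sketch, not the method used by either paper: the helper names `count_parameters`, `count_by_top_level_module`, and `load_wavlm_large` are hypothetical, and the counting helpers assume only that the model exposes `parameters()` and `named_parameters()` the way a `torch.nn.Module` does.

```python
from typing import Dict


def count_parameters(model) -> Dict[str, int]:
    """Count total, trainable, and frozen parameters of a torch.nn.Module-like object.

    Relies only on model.parameters() yielding tensors that expose
    numel() and requires_grad, as torch.nn.Parameter does.
    """
    total = 0
    trainable = 0
    for p in model.parameters():
        n = p.numel()
        total += n
        if p.requires_grad:
            trainable += n
    return {"total": total, "trainable": trainable, "frozen": total - trainable}


def count_by_top_level_module(model) -> Dict[str, int]:
    """Group parameter counts by the first component of each parameter name."""
    counts: Dict[str, int] = {}
    for name, p in model.named_parameters():
        key = name.split(".", 1)[0]
        counts[key] = counts.get(key, 0) + p.numel()
    return counts


def load_wavlm_large():
    """Load the checkpoint discussed here (downloads the weights on first call)."""
    # Imported lazily because transformers and torch are heavy dependencies.
    from transformers import AutoModel

    return AutoModel.from_pretrained("microsoft/wavlm-large")
```

Calling `count_parameters(load_wavlm_large())` gives the total for the published checkpoint, and `count_by_top_level_module` shows how that total splits across the top-level submodules. Running the same helpers on the WavLM component inside a FINALLY checkpoint, if one is available, would show whether the extra parameters come from added layers or from a different counting convention.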
Thanks in advance!