
Steps to build the LM

#10
by kundangraticare - opened

Kindly advise the steps to prepare the text and build a custom LM to use in beam search.

Also, is it feasible to mix the provided lm_6.kenlm with my custom data?

Please guide.

Google org

Hi @kundangraticare ,

Just to be clear, are you asking about fine-tuning the MedASR model so that it can be used in beam search?

Thank you!

I am actually building a KenLM ARPA model and passing it to MedASR to increase accuracy.

Google org

Hey @kundangraticare , building a KenLM n-gram model for CTC beam search is the correct approach to increase accuracy.

First, ensure your LM training text matches the acoustic model's vocabulary. If MedASR uses subword tokens, you can try tokenising your normalised text with the exact same spiece.model vocabulary before training KenLM. If it uses character-level CTC, train KenLM directly on properly normalised text. In all cases, make sure normalisation matches the acoustic model's training transcripts.
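To illustrate that first step, here is a minimal normalisation sketch. The `normalise` function and its rules are hypothetical; the real rules must mirror whatever preprocessing produced MedASR's training transcripts, so treat this only as a template:

```python
import re
import unicodedata

def normalise(line: str) -> str:
    """Hypothetical normaliser: lowercase, strip punctuation,
    collapse whitespace. Must match the acoustic model's own
    transcript preparation to be useful for KenLM training."""
    line = unicodedata.normalize("NFKC", line).lower()
    line = re.sub(r"[^a-z0-9' ]+", " ", line)  # drop punctuation/symbols
    return re.sub(r"\s+", " ", line).strip()

# If MedASR is subword-based, additionally tokenise each normalised line
# with its spiece.model before writing the KenLM training file, e.g.:
#   import sentencepiece as spm
#   sp = spm.SentencePieceProcessor(model_file="spiece.model")
#   tokens = " ".join(sp.encode(normalise(line), out_type=str))
```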

Next, estimate the model using lmplz (choose the n-gram order based on corpus size; 4–6 is common, and lm_6.kenlm is a 6-gram), then convert the ARPA file to binary format using build_binary for optimised memory usage and faster loading.
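A typical invocation might look like the following, assuming the KenLM binaries are built locally; the corpus file name and the order 5 are placeholders:

```shell
# Estimate an n-gram ARPA model from normalised text (order is a choice;
# the name lm_6.kenlm suggests the provided model used order 6).
lmplz -o 5 --text corpus_normalised.txt --arpa medical_5gram.arpa

# Compile to KenLM binary format for lower memory use and faster loading.
build_binary medical_5gram.arpa medical_5gram.kenlm
```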

To combine your custom model with the provided lm_6.kenlm, you cannot merge compiled binaries directly. If you have access to the original ARPA files or training text, you can retrain or interpolate offline. Otherwise, if your decoder supports it, use shallow fusion (multi-LM decoding) during beam search so both language models are evaluated simultaneously with configurable weights.
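For the offline route, one hedged sketch: if you have both ARPA files, a tool such as SRILM can interpolate them statically. The file names and the 0.5 weight below are placeholders to be tuned on held-out data:

```shell
# SRILM static interpolation of two ARPA models (requires SRILM's ngram tool).
# -lambda is the weight of the first model (-lm); the mix LM gets 1 - lambda.
ngram -order 6 \
      -lm lm_6.arpa -mix-lm custom.arpa -lambda 0.5 \
      -write-lm mixed.arpa

# Then compile the interpolated model with KenLM as usual.
build_binary mixed.arpa mixed.kenlm
```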

Finally, systematically tune the LM weight (alpha) and word insertion bonus (beta) on a validation set. Proper tuning is critical for achieving actual Word Error Rate (WER) improvements.
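The tuning loop can be sketched in plain Python. Here `decode` stands in for your beam-search decoder (for example, a pyctcdecode decoder parameterised by `alpha`/`beta`); the function names and grid values are illustrative, not part of any library API:

```python
import itertools

def wer(ref: str, hyp: str) -> float:
    """Word error rate via word-level edit distance."""
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + (r[i - 1] != h[j - 1]))  # sub
    return d[len(r)][len(h)] / max(len(r), 1)

def tune(decode, val_set, alphas, betas):
    """Grid-search (alpha, beta); return (best_wer, alpha, beta).
    val_set is a list of (utterance, reference_transcript) pairs."""
    best = None
    for a, b in itertools.product(alphas, betas):
        score = sum(wer(ref, decode(x, a, b)) for x, ref in val_set) / len(val_set)
        if best is None or score < best[0]:
            best = (score, a, b)
    return best
```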

Thank you!
