Is it possible to use "prompt" or "hotwords" to steer decoding similar to Whisper?

by spashii - opened Aug 25, 2025

Discussion

spashii

Aug 25, 2025

^title

grofte

Aug 26, 2025

It should be possible to do with a corpus at least: https://docs.nvidia.com/nemo-framework/user-guide/latest/nemotoolkit/asr/asr_customization/ngpulm_language_modeling_and_customization.html#ngpulm-ngram-modeling

Training the n-gram model is really fast.

But whenever I add it and transcribe an audio file longer than 40 seconds (still under a minute), the output drops a bunch of sentences.
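One workaround that might be worth trying is to split longer audio into windows of at most 40 seconds and transcribe each window separately. I have not verified that this avoids the dropped sentences. The sketch below only computes the window boundaries; `chunk_bounds`, the 40-second cap, and the 2-second overlap are illustrative assumptions, not part of NeMo.

```python
def chunk_bounds(total_s, max_chunk_s=40.0, overlap_s=2.0):
    """Return (start, end) times in seconds covering [0, total_s].

    Hypothetical helper: each window is at most max_chunk_s long and
    consecutive windows overlap by overlap_s so that a word cut at one
    boundary appears whole in the next window.
    """
    if overlap_s < 0 or max_chunk_s <= overlap_s:
        raise ValueError("max_chunk_s must be larger than a non-negative overlap_s")
    bounds = []
    start = 0.0
    step = max_chunk_s - overlap_s  # how far each new window advances
    while start < total_s:
        end = min(start + max_chunk_s, total_s)
        bounds.append((start, end))
        if end >= total_s:  # the last window reached the end of the audio
            break
        start += step
    return bounds


if __name__ == "__main__":
    # A 55-second file is split into two overlapping windows.
    for start, end in chunk_bounds(55.0):
        print(f"{start:.1f}s to {end:.1f}s")
```

Each window would then be cut from the file and transcribed on its own, and the overlapping text would need to be de-duplicated when the pieces are joined.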

grofte

Aug 26, 2025

And then they have word boosting, which doesn't seem to work on an AED model like Canary:

https://docs.nvidia.com/nemo-framework/user-guide/latest/nemotoolkit/asr/asr_customization/word_boosting.html#
