How to use nvidia/canary-180m-flash with NeMo:

    import nemo.collections.asr as nemo_asr
    asr_model = nemo_asr.models.ASRModel.from_pretrained("nvidia/canary-180m-flash")
    transcriptions = asr_model.transcribe(["file.wav"])

finetune the decoder on text only data?
#7
by leestevennz - opened
Hi there,
Just wondering if it is possible to fine-tune the decoder on text-only data for domain adaptation?
Canary's decoder is a Transformer LM conditioned on encoder outputs, so it is possible to adapt the decoder of a sequence-to-sequence ASR model like Canary using only text data.
It can be done through shallow or cold/deep fusion, or (what I would recommend) continued pretraining.
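
To make the shallow fusion option concrete, here is a minimal sketch of the scoring rule only. In shallow fusion the ASR model is left untouched; a separate language model trained on the text-only domain data is combined with it at decoding time by adding weighted log-probabilities. Everything in the snippet is an assumption for illustration: the three-word vocabulary, the logits, the `lm_weight` value, and the `shallow_fusion_step` helper are hypothetical and are not part of the NeMo API.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - lse for v in logits]


def shallow_fusion_step(asr_logits, lm_logits, lm_weight=0.3):
    """Fuse one decoding step: log p_asr(y | audio) + lm_weight * log p_lm(y).

    Hypothetical helper: both inputs are per-token logits over the same vocabulary.
    """
    asr_lp = log_softmax(asr_logits)
    lm_lp = log_softmax(lm_logits)
    return [a + lm_weight * l for a, l in zip(asr_lp, lm_lp)]


def argmax(scores):
    return max(range(len(scores)), key=scores.__getitem__)


# Toy values (assumed): the ASR decoder slightly prefers "cap",
# while the in-domain text LM strongly prefers "cat".
vocab = ["cat", "cap", "cab"]
asr_logits = [2.0, 2.1, 0.5]
lm_logits = [3.0, 0.0, 0.0]

asr_only = vocab[argmax(log_softmax(asr_logits))]
fused = vocab[argmax(shallow_fusion_step(asr_logits, lm_logits, lm_weight=0.5))]

print(asr_only)  # cap
print(fused)     # cat
```

In a real decoder the same sum would be applied to every beam hypothesis at every step, and the LM weight would be tuned on a held-out in-domain set. Continued pretraining, by contrast, updates the decoder weights themselves, which this sketch does not cover.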