Instructions to use nvidia/canary-1b-v2 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- NeMo
How to use nvidia/canary-1b-v2 with NeMo:
    import nemo.collections.asr as nemo_asr

    asr_model = nemo_asr.models.ASRModel.from_pretrained("nvidia/canary-1b-v2")
    transcriptions = asr_model.transcribe(["file.wav"])
- Notebooks
- Google Colab
- Kaggle
Output feature size differs between EncDecMultiTaskModel and the loaded pretrained canary-1b-v2.nemo
#14
by mrunique007 - opened
I'm trying to fine-tune canary-1b-v2 using the fast-conformer_aed.conf multitask config. I don't know why, but the output size of canary-1b-v2.nemo is 16384, while that of EncDecMultiTaskModel is 16400. It seems like 180m-flash and 1b-flash are 16384 but 1b-v2 is 16400?
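One way to narrow down where the two sizes diverge might be to compare parameter shapes between the two models. The following is only a sketch: `shape_mismatches` is a hypothetical helper, and the parameter names and the hidden size of 1024 in the demo are made-up stand-ins, not the real canary-1b-v2 layer names. With real models, the two dictionaries would come from calling `state_dict()` on each model.

```python
from types import SimpleNamespace


def shape_mismatches(pretrained, fresh):
    """Return {name: (pretrained_shape, fresh_shape)} for every parameter
    that exists in both mappings but has a different shape."""
    out = {}
    for name in sorted(pretrained.keys() & fresh.keys()):
        a = tuple(pretrained[name].shape)
        b = tuple(fresh[name].shape)
        if a != b:
            out[name] = (a, b)
    return out


def fake(*shape):
    # Stand-in for a tensor so the sketch runs without NeMo or PyTorch.
    return SimpleNamespace(shape=shape)


# Hypothetical parameter names; only the 16384 / 16400 sizes come from the post.
pretrained_sd = {
    "encoder.proj.weight": fake(1024, 1024),
    "head.weight": fake(16384, 1024),
}
fresh_sd = {
    "encoder.proj.weight": fake(1024, 1024),
    "head.weight": fake(16400, 1024),
}

mismatches = shape_mismatches(pretrained_sd, fresh_sd)
for name, (a, b) in mismatches.items():
    print(f"{name}: pretrained {a} vs fresh {b}")
```

If only the embedding and output-projection parameters show up in the result, the difference most likely comes from the vocabulary size implied by the tokenizer config rather than from the encoder or decoder architecture.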
