Run canary-qwen-2.5b on Jetson

#15
by raymondlo84 - opened

I finally figured out how to run this on a Jetson AGX Orin on the GPU! Here are a few key things ;)

  1. Install JetPack 6.2 -- you can flash it with SDK Manager.
  2. Install torch and torchvision from the Jetson-specific wheels:
pip install torch torchvision --index-url=https://pypi.jetson-ai-lab.io/jp6/cu126
  3. Install NeMo with pip only (not from the GitHub repo):
pip install sacrebleu
pip install nemo_toolkit[asr]
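Before loading the model, it's worth confirming that the wheel you installed can actually see the Orin's GPU. A minimal sketch (the `cuda_status` helper name is mine):

```python
def cuda_status():
    """Report whether PyTorch can see a CUDA device, as a short status string."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        # Usually means the generic PyPI wheel was installed instead of the jp6/cu126 one
        return "CUDA not available -- check that you installed the Jetson (jp6/cu126) wheels"
    return f"CUDA OK: {torch.cuda.get_device_name(0)}"

print(cuda_status())
```

If this doesn't print "CUDA OK", the model will silently run on the CPU in the steps below.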

Lastly, make sure you use torch.device("cuda") and move the model to it, so inference runs on the GPU and not the CPU.

from nemo.collections.speechlm2.models import SALM
import torch

device = torch.device("cuda")

# Load the model, cast to bfloat16, and move it onto the GPU
model = SALM.from_pretrained('nvidia/canary-qwen-2.5b').bfloat16().eval().to(device)

answer_ids = model.generate(
    prompts=[
        [{"role": "user", "content": f"Transcribe the following: {model.audio_locator_tag}", "audio": ["harvard.wav"]}]
    ],
    max_new_tokens=128,
)
# Move the generated token IDs back to the CPU before decoding
print(model.tokenizer.ids_to_text(answer_ids[0].cpu()))
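One more tip: Canary models generally expect 16 kHz mono WAV input, so if the transcription comes out garbled, a quick stdlib check of your audio file can rule that out. A small helper (the `wav_info` name is mine):

```python
import wave

def wav_info(path):
    """Return (sample_rate, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        channels = w.getnchannels()
        duration = w.getnframes() / rate
    return rate, channels, duration
```

Running `wav_info("harvard.wav")` should report 16000 Hz and 1 channel for audio the model accepts directly; otherwise resample/downmix it first.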

