Run canary-qwen-2.5b on Jetson
#15
by raymondlo84 - opened
I finally figured out how to run this on the Jetson AGX Orin's GPU! Here are a few key things ;)
- Install JetPack 6.2 -- you can flash it with SDK Manager.
- Install torch and torchvision from the Jetson-specific wheels:

```shell
pip install torch torchvision --index-url=https://pypi.jetson-ai-lab.io/jp6/cu126
```

- Install NeMo with pip only, not from the GitHub repo (quoting the extras spec so the shell doesn't expand the brackets):

```shell
pip install sacrebleu
pip install "nemo_toolkit[asr]"
```
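Before loading the model, it's worth confirming that the Jetson wheels installed correctly and that PyTorch can actually see the Orin's integrated GPU (a minimal sanity check, not part of the original steps):

```python
import torch

# Quick sanity check after installing the Jetson wheels:
# the version string should mention a CUDA build, and
# cuda.is_available() should be True on a flashed AGX Orin.
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # Prints the GPU name, e.g. the Orin's integrated GPU
    print("device:", torch.cuda.get_device_name(0))
```

If `CUDA available` prints `False`, the CPU-only wheel was probably picked up instead of the one from the Jetson index.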
Lastly, make sure you add torch.device("cuda") and move the model to it, so inference runs on the GPU rather than the CPU.
```python
import torch
from nemo.collections.speechlm2.models import SALM

device = torch.device("cuda")

# Load in bfloat16, switch to eval mode, and move the model onto the GPU
model = SALM.from_pretrained("nvidia/canary-qwen-2.5b").bfloat16().eval().to(device)

answer_ids = model.generate(
    prompts=[
        [{"role": "user", "content": f"Transcribe the following: {model.audio_locator_tag}", "audio": ["harvard.wav"]}]
    ],
    max_new_tokens=128,
)

# Bring the token IDs back to the CPU before decoding them to text
print(model.tokenizer.ids_to_text(answer_ids[0].cpu()))
```
