Kernels

Update build/torch-universal/triton_kernels/target_info.py

#7
by KernelMC - opened

If num_sms is called on a system running HIP, it currently returns None. However, the expression in the is_cuda() branch of this function, torch.cuda.get_device_properties(0).multi_processor_count, also works on a HIP system. This can be verified by evaluating the expression in a Docker container running rocm/pytorch:rocm7.2_ubuntu24.04_py3.12_pytorch_release_2.9.1 on a system with a supported AMD GPU or APU. I therefore propose that this branch be taken if is_cuda() or is_hip().
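A minimal sketch of the proposed change, for illustration only: the is_cuda()/is_hip() helpers and the device-property lookup are stubbed here so the example runs without a GPU or a PyTorch install, but the branch logic matches the proposal above.

```python
# Hypothetical sketch of the proposed num_sms() change. The helper names
# mirror those in target_info.py, but their bodies are stubs standing in
# for the real backend checks.

def is_cuda() -> bool:
    # Stubbed as False to simulate a non-CUDA system.
    return False

def is_hip() -> bool:
    # Stubbed as True to simulate a ROCm system.
    return True

def multi_processor_count() -> int:
    # Stand-in for torch.cuda.get_device_properties(0).multi_processor_count,
    # which is populated on both CUDA and ROCm builds of PyTorch.
    return 104  # illustrative value only

def num_sms():
    # Proposed change: take this branch on CUDA *or* HIP, since the same
    # torch.cuda property is available on ROCm as well. Previously only
    # is_cuda() was checked, so HIP fell through and returned None.
    if is_cuda() or is_hip():
        return multi_processor_count()
    return None

print(num_sms())  # on the simulated ROCm system: 104, not None
```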

I just realized that @janantos opened a similar PR 5 months ago. I'm not sure it makes sense to add a separate branch to num_sms when the expression is identical in both cases. Either way, can we get one of these small changes merged to fix gpt-oss support on ROCm?

