AMD R9700 - vLLM crashes on startup

#6
by jmander11 - opened

Hello,

Does anyone have a solution for loading this model in vLLM with ROCm? I'm hitting KeyError: 'layers.0.attn.qkv_proj.output_scale'.
I have only gotten GGUF models working with vLLM, but I really want to test out an FP8 model to see what the R9700 can do. I am using the rocm7.2_navi_ubuntu24.04_py3.12_pytorch_2.9_vllm_0.14.0rc0 docker image.

Thanks!

Edit:
I attempted to follow the deployment tips, which point to two pull requests, but they don't include instructions for applying them. I now get:
alar_t = __hip_bfloat16, THRDS = 64, YTILE = 4, WvPrGrp = 16, A_CHUNK = 8, UNRL = 1, N = 4]: Device-side assertion `false' failed.
GPU coredump: execvp failed: No such file or directory
Failed to write segment data to pipe: Broken pipe
GPU coredump: handler exited with error (status: 1)
GPU core dump failed
:0:rocdevice.cpp :3586: 697995489480 us: Callback: Queue 0x7d42b4200000 aborting with error : HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception. code: 0x1016

It's pretty disappointing that the deployment instructions don't give any guidance whatsoever :(

jmander11 changed discussion title from AMD R9700 - KeyError: 'layers.0.attn.qkv_proj.output_scale' to AMD R9700 - vLLM crashes on startup
AMD org

Please make sure PR #29008 is reflected in your working vLLM branch. Thanks.

XuebinWang changed discussion status to closed

@XuebinWang What is the suggested method for doing this? Applying that PR is what caused the exception on startup above, so I am not sure how it should be applied. It also upgrades my vLLM to a 0.15 version. Is this not the way to do it?

Do I clone the repo inside the official Navi Docker image and build from that? Which repo/branch do I apply the fixes to, and with what merge strategy?
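For reference, this is roughly what I tried — a sketch only, assuming the upstream vllm-project/vllm GitHub repo and GitHub's read-only pull refs; the branch name pr-29008 is just a local label I picked:

```shell
# Sketch: bring vLLM PR #29008 into a local checkout via GitHub's pull refs.
# Assumes network access and that the PR still exists upstream.
git clone https://github.com/vllm-project/vllm.git
cd vllm
git fetch origin pull/29008/head:pr-29008   # GitHub exposes every PR at refs/pull/<id>/head
git merge pr-29008                          # or rebase it onto the tag my Docker image ships
pip install -e .                            # rebuild vLLM in place inside the container
```

The merge step is where I'm unsure: merging onto main pulls me forward to 0.15, while cherry-picking onto the image's 0.14.0rc0 tag might conflict.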
