AMD R9700 - vLLM crashes on startup
Hello,
Does anyone have a solution for loading this model in vLLM with ROCm? It crashes on startup with KeyError: 'layers.0.attn.qkv_proj.output_scale'.
I have only gotten GGUF models working with vLLM, but I really want to test out an fp8 model to see what the R9700 can do. I am using the rocm7.2_navi_ubuntu24.04_py3.12_pytorch_2.9_vllm_0.14.0rc0 Docker image.
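For reference, here is roughly how I am launching the container and serving the model. This is a sketch, not the exact command: the image tag is the one above, but the model name, port, and mount path are placeholders, and the --quantization flag may not be needed if the checkpoint's config already declares fp8.

```shell
# Launch the ROCm Navi vLLM image with GPU device access
# (standard ROCm device flags; paths/model are placeholders)
docker run -it --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video \
  --ipc=host \
  -v /path/to/models:/models \
  rocm/vllm:rocm7.2_navi_ubuntu24.04_py3.12_pytorch_2.9_vllm_0.14.0rc0 \
  vllm serve /models/some-fp8-model \
    --quantization fp8 \
    --port 8000
```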
Thanks!
Edit:
I attempted to follow the deployment tips, which point at two pull requests, but they don't include any instructions for applying them. I now get:
alar_t = __hip_bfloat16, THRDS = 64, YTILE = 4, WvPrGrp = 16, A_CHUNK = 8, UNRL = 1, N = 4]: Device-side assertion `false' failed.
GPU coredump: execvp failed: No such file or directory
Failed to write segment data to pipe: Broken pipe
GPU coredump: handler exited with error (status: 1)
GPU core dump failed
:0:rocdevice.cpp :3586: 697995489480 us: Callback: Queue 0x7d42b4200000 aborting with error : HSA_STATUS_ERROR_EXCEPTION: An HSAIL operation resulted in a hardware exception. code: 0x1016
It's pretty disappointing the deployment instructions don't give any guidance whatsoever :(
Please make sure PR #29008 is reflected in your working vLLM branch. Thanks.
@XuebinWang What is the suggested method for doing this? Applying that PR is what caused the exception on startup, so I am not sure how it should be applied; it also upgrades my vLLM to a 0.15 version. Is that not the way to do it?
Do I clone the repo from inside the official Navi Docker image and build on that? Which repo/branch am I supposed to apply the fixes to, and with what merge strategy?
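In case it helps anyone else following along, one common way to bring a single PR into an existing checkout without upgrading the whole tree is to fetch GitHub's pull-request ref and cherry-pick it onto the branch matching your installed version. This is a sketch of that workflow; the branch/tag name you check out (here v0.14.0rc0) is an assumption based on the Docker image above, and whether the cherry-pick applies cleanly depends on how far the PR has drifted from that release:

```shell
# Check out the vLLM source at the version matching the Docker image
git clone https://github.com/vllm-project/vllm.git
cd vllm
git checkout v0.14.0rc0   # assumed tag; match your installed version

# Fetch the PR head into a local branch (GitHub exposes pull/<N>/head refs)
git fetch origin pull/29008/head:pr-29008

# Bring just that PR's commits onto the current branch
git cherry-pick v0.14.0rc0..pr-29008

# Rebuild/reinstall vLLM from this tree afterwards, e.g.:
# pip install -e . --no-build-isolation
```

If the cherry-pick conflicts, that usually means the PR targets a newer base, which would explain why checking out the PR branch directly pulled in a 0.15-series version.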