Error running model with vLLM

#1
by lcollini - opened

I am running this model with vLLM on 4 H200 GPUs.

Loading the weights fails with the following error:

(Worker_TP3 pid=2573463) INFO 01-22 09:41:01 [weight_utils.py:413] Time spent downloading weights for MediaTek-Research/DeepSeek-V2-Lite-Coder-Instruct-RTLCoder-Finetune: 40.286573 seconds
(Worker_TP0 pid=2573460) Loading safetensors checkpoint shards:   0% Completed | 0/13 [00:00<?, ?it/s]
(Worker_TP0 pid=2573460) Loading safetensors checkpoint shards:   8% Completed | 1/13 [00:00<00:04,  2.78it/s]
(Worker_TP0 pid=2573460) Loading safetensors checkpoint shards:  15% Completed | 2/13 [00:00<00:04,  2.66it/s]
(Worker_TP0 pid=2573460) Loading safetensors checkpoint shards:  23% Completed | 3/13 [00:01<00:03,  2.61it/s]
(Worker_TP0 pid=2573460) Loading safetensors checkpoint shards:  31% Completed | 4/13 [00:01<00:03,  2.59it/s]
(Worker_TP0 pid=2573460) Loading safetensors checkpoint shards:  38% Completed | 5/13 [00:01<00:03,  2.60it/s]
(Worker_TP0 pid=2573460) Loading safetensors checkpoint shards:  46% Completed | 6/13 [00:02<00:02,  2.60it/s]
(Worker_TP0 pid=2573460) Loading safetensors checkpoint shards:  54% Completed | 7/13 [00:02<00:02,  2.61it/s]
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597] WorkerProc failed to start.
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597] Traceback (most recent call last):
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]   File "/ext3/miniforge3/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 571, in worker_main
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]     worker = WorkerProc(*args, **kwargs)
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]   File "/ext3/miniforge3/lib/python3.12/site-packages/vllm/v1/executor/multiproc_executor.py", line 437, in __init__
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]     self.worker.load_model()
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]   File "/ext3/miniforge3/lib/python3.12/site-packages/vllm/v1/worker/gpu_worker.py", line 213, in load_model
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]     self.model_runner.load_model(eep_scale_up=eep_scale_up)
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]   File "/ext3/miniforge3/lib/python3.12/site-packages/vllm/v1/worker/gpu_model_runner.py", line 2635, in load_model
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]     self.model = model_loader.load_model(
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]                  ^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]   File "/ext3/miniforge3/lib/python3.12/site-packages/vllm/model_executor/model_loader/base_loader.py", line 50, in load_model
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]     self.load_weights(model, model_config)
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]   File "/ext3/miniforge3/lib/python3.12/site-packages/vllm/model_executor/model_loader/default_loader.py", line 264, in load_weights
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]     loaded_weights = model.load_weights(
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]                      ^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]   File "/ext3/miniforge3/lib/python3.12/site-packages/vllm/model_executor/models/deepseek_v2.py", line 1420, in load_weights
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]     param = params_dict[name]
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597]             ~~~~~~~~~~~^^^^^^
(Worker_TP3 pid=2573463) ERROR 01-22 09:41:04 [multiproc_executor.py:597] KeyError: '_flat_param'

Other models do not give me this problem.
Any idea of what's going wrong?
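
In case it helps narrow this down: the traceback shows `params_dict[name]` failing on a tensor named `_flat_param`, and `_flat_param` is the attribute name PyTorch FSDP uses for its flattened parameters. One untested hypothesis is that some shards in this repo were saved from an FSDP-wrapped model without unflattening. The sketch below lists checkpoint tensor names containing that marker. It assumes the repo ships a `model.safetensors.index.json` with a `weight_map` entry (the usual layout for sharded safetensors checkpoints, not confirmed here); the file path and the helper names `load_weight_map` and `find_suspect_keys` are made up for illustration.

```python
import json
from pathlib import Path


def find_suspect_keys(weight_map, marker="_flat_param"):
    """Return the tensor names in a weight map that contain `marker`, sorted."""
    return sorted(name for name in weight_map if marker in name)


def load_weight_map(index_path):
    """Read the `weight_map` dict (tensor name -> shard file) from an index file."""
    with open(index_path, encoding="utf-8") as f:
        return json.load(f)["weight_map"]


if __name__ == "__main__":
    # Hypothetical location of the downloaded index file; adjust to the local snapshot.
    index_file = Path("model.safetensors.index.json")
    if index_file.exists():
        for name in find_suspect_keys(load_weight_map(index_file)):
            print(name)
    else:
        print(f"{index_file} not found")
```

If this prints any names, the checkpoint itself contains tensors that vLLM's DeepSeek-V2 loader has no parameter for, which would point at the uploaded weights rather than at my vLLM setup.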
