| /share/liangzy/miniconda3/envs/vllm/lib/python3.10/site-packages/_distutils_hack/__init__.py:54: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml |
| warnings.warn( |
| Total Video Size: 31436 |
| 100%|██████████| 31436/31436 [00:00<00:00, 801488.92it/s] |
| Total Clips Size: 37658 |
| Start: 0, End: 40 |
| to process size: 40 |
| Total size: 40 |
| Sample show: <|im_start|>system |
| You are an AI assistant tasked with generating |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:48:40 multiproc_worker_utils.py:229] Worker ready; awaiting tasks |
| INFO 04-29 00:48:40 custom_cache_manager.py:19] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager |
| INFO 04-29 00:48:40 custom_cache_manager.py:19] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager |
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:48:40 multiproc_worker_utils.py:229] Worker ready; awaiting tasks |
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:48:40 multiproc_worker_utils.py:229] Worker ready; awaiting tasks |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:48:40 multiproc_worker_utils.py:229] Worker ready; awaiting tasks |
| INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend. |
| INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend. |
| INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend. |
| INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend. |
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend. |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend. |
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend. |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend. |
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2 |
| INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2 |
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5 |
| INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Bootstrap : Using net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so) |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO NET/Plugin: Using internal network plugin. |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO cudaDriverVersion 12040 |
| NCCL version 2.21.5+cuda12.4 |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2 |
| INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2 |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5 |
| INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Bootstrap : Using net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so) |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO NET/Plugin: Using internal network plugin. |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO cudaDriverVersion 12040 |
| NCCL version 2.21.5+cuda12.4 |
| INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2 |
| INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5 |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2 |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Bootstrap : Using net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so) |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO NET/Plugin: Using internal network plugin. |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO cudaDriverVersion 12040 |
| NCCL version 2.21.5+cuda12.4 |
| INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2 |
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2 |
| INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5 |
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Bootstrap : Using net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so) |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO NET/Plugin: Using internal network plugin. |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO cudaDriverVersion 12040 |
| NCCL version 2.21.5+cuda12.4 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO ncclCommInitRank comm 0xfcadc20 rank 0 nranks 2 cudaDev 0 nvmlDev 0 busId 70 commId 0x107ef7d20723ace8 - Init START |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0. |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Setting affinity for GPU 0 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO comm 0xfcadc20 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 00/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 01/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 02/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 03/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 04/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 05/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 06/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 07/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 08/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 09/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 10/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 11/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 12/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 13/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 14/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 15/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 01/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 04/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 05/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 06/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 07/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 08/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 09/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 10/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 11/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 12/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 13/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO cudaDriverVersion 12040 |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Bootstrap : Using net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so) |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO NET/Plugin: Using internal network plugin. |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO ncclCommInitRank comm 0xfcaf480 rank 1 nranks 2 cudaDev 1 nvmlDev 1 busId 80 commId 0x107ef7d20723ace8 - Init START |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0. |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Setting affinity for GPU 1 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO comm 0xfcaf480 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1 |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 00/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 01/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 02/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 03/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 04/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 05/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 06/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 07/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 08/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 09/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 10/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 11/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 12/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 13/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 14/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 15/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared obj |
| INFO 04-29 00:48:41 custom_all_reduce_utils.py:206] generating GPU P2P access cache in /root/.cache/vllm/gpu_p2p_access_cache_for_0,1.json |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO ncclCommInitRank comm 0xfcae060 rank 0 nranks 2 cudaDev 0 nvmlDev 2 busId 90 commId 0x2c8677096eda1f49 - Init START |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0. |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Setting affinity for GPU 2 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO comm 0xfcae060 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 00/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 01/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 02/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 03/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 04/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 05/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 06/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 07/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 08/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 09/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 10/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 11/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 12/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 13/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 14/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 15/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 00/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 01/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 02/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 03/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 04/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 05/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 06/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 07/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 08/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 09/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 10/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 11/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 12/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 13/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO ncclCommInitRank comm 0xfcae910 rank 0 nranks 2 cudaDev 0 nvmlDev 4 busId b0 commId 0x9c435407d5d72319 - Init START |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0. |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Setting affinity for GPU 4 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO comm 0xfcae910 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 00/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 01/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 02/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 03/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 04/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 05/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 06/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 07/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 08/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 09/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 10/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 11/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 12/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 13/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 14/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 15/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 00/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 01/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 02/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 03/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 04/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 05/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 06/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 07/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 08/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 09/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 10/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 11/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 12/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 13/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO cudaDriverVersion 12040 |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Bootstrap : Using net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so) |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO NET/Plugin: Using internal network plugin. |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO ncclCommInitRank comm 0xfce82a0 rank 1 nranks 2 cudaDev 1 nvmlDev 3 busId a0 commId 0x2c8677096eda1f49 - Init START |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0. |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Setting affinity for GPU 3 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO comm 0xfce82a0 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1 |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 00/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 01/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 02/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 03/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 04/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 05/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 06/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 07/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 08/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 09/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 10/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 11/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 12/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 13/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 14/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 15/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared obj |
| INFO 04-29 00:48:41 custom_all_reduce_utils.py:206] generating GPU P2P access cache in /root/.cache/vllm/gpu_p2p_access_cache_for_2,3.json |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO ncclCommInitRank comm 0xfce9680 rank 0 nranks 2 cudaDev 0 nvmlDev 6 busId d0 commId 0xb7bbad76a9287dbc - Init START |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0. |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Setting affinity for GPU 6 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO comm 0xfce9680 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 00/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 01/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 02/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 03/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 04/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 05/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 06/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 07/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 08/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 09/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 10/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 11/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 12/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 13/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 14/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 15/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 00/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 01/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 02/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 03/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 04/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 05/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 06/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 07/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 08/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 09/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 10/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 11/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 12/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 13/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO cudaDriverVersion 12040 |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Bootstrap : Using net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so) |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO NET/Plugin: Using internal network plugin. |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO ncclCommInitRank comm 0xfcb0800 rank 1 nranks 2 cudaDev 1 nvmlDev 5 busId c0 commId 0x9c435407d5d72319 - Init START |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0. |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Setting affinity for GPU 5 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO comm 0xfcb0800 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1 |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 00/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 01/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 02/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 03/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 04/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 05/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 06/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 07/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 08/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 09/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 10/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 11/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 12/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 13/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 14/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 15/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared obj |
| INFO 04-29 00:48:41 custom_all_reduce_utils.py:206] generating GPU P2P access cache in /root/.cache/vllm/gpu_p2p_access_cache_for_4,5.json |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO cudaDriverVersion 12040 |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Bootstrap : Using net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so) |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO NET/Plugin: Using internal network plugin. |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0> |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO ncclCommInitRank comm 0xfce99f0 rank 1 nranks 2 cudaDev 1 nvmlDev 7 busId e0 commId 0xb7bbad76a9287dbc - Init START |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0. |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Setting affinity for GPU 7 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO comm 0xfce99f0 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1 |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 00/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 01/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 02/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 03/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 04/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 05/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 06/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 07/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 08/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 09/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 10/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 11/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 12/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 13/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 14/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 15/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared obj |
| INFO 04-29 00:48:41 custom_all_reduce_utils.py:206] generating GPU P2P access cache in /root/.cache/vllm/gpu_p2p_access_cache_for_6,7.json |
| INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_4,5.json |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_4,5.json |
| INFO 04-29 00:49:11 shm_broadcast.py:258] vLLM message queue communication handle: Handle(connect_ip='127.0.0.1', local_reader_ranks=[1], buffer_handle=(1, 4194304, 6, 'psm_8a09a9f1'), local_subscribe_port=36381, remote_subscribe_port=None) |
| INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8... |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8... |
| INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_6,7.json |
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_6,7.json |
| INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod |
| INFO 04-29 00:49:11 shm_broadcast.py:258] vLLM message queue communication handle: Handle(connect_ip='127.0.0.1', local_reader_ranks=[1], buffer_handle=(1, 4194304, 6, 'psm_61344bd8'), local_subscribe_port=56743, remote_subscribe_port=None) |
| INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8... |
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8... |
| INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod |
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod |
| INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_2,3.json |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_2,3.json |
| INFO 04-29 00:49:11 shm_broadcast.py:258] vLLM message queue communication handle: Handle(connect_ip='127.0.0.1', local_reader_ranks=[1], buffer_handle=(1, 4194304, 6, 'psm_133de269'), local_subscribe_port=40037, remote_subscribe_port=None) |
| INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_0,1.json |
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_0,1.json |
| INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8... |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8... |
| INFO 04-29 00:49:11 shm_broadcast.py:258] vLLM message queue communication handle: Handle(connect_ip='127.0.0.1', local_reader_ranks=[1], buffer_handle=(1, 4194304, 6, 'psm_27adf207'), local_subscribe_port=51079, remote_subscribe_port=None) |
| INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod |
| INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8... |
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8... |
| INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod |
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod |
|
Loading safetensors checkpoint shards: 0% Completed | 0/20 [00:00<?, ?it/s] |
|
Loading safetensors checkpoint shards: 0% Completed | 0/20 [00:00<?, ?it/s] |
|
Loading safetensors checkpoint shards: 0% Completed | 0/20 [00:00<?, ?it/s] |
|
Loading safetensors checkpoint shards: 0% Completed | 0/20 [00:00<?, ?it/s] |
|
Loading safetensors checkpoint shards: 5% Completed | 1/20 [00:02<00:44, 2.36s/it] |
|
Loading safetensors checkpoint shards: 10% Completed | 2/20 [00:04<00:42, 2.39s/it] |
|
Loading safetensors checkpoint shards: 5% Completed | 1/20 [00:04<01:32, 4.89s/it] |
|
Loading safetensors checkpoint shards: 5% Completed | 1/20 [00:04<01:34, 4.98s/it] |
|
Loading safetensors checkpoint shards: 5% Completed | 1/20 [00:04<01:32, 4.86s/it] |
|
Loading safetensors checkpoint shards: 15% Completed | 3/20 [00:07<00:41, 2.46s/it] |
|
Loading safetensors checkpoint shards: 10% Completed | 2/20 [00:07<01:09, 3.84s/it] |
|
Loading safetensors checkpoint shards: 10% Completed | 2/20 [00:08<01:09, 3.86s/it] |
|
Loading safetensors checkpoint shards: 10% Completed | 2/20 [00:08<01:10, 3.90s/it] |
|
Loading safetensors checkpoint shards: 20% Completed | 4/20 [00:09<00:38, 2.43s/it] |
|
Loading safetensors checkpoint shards: 15% Completed | 3/20 [00:10<00:55, 3.25s/it] |
|
Loading safetensors checkpoint shards: 15% Completed | 3/20 [00:10<00:54, 3.23s/it] |
|
Loading safetensors checkpoint shards: 15% Completed | 3/20 [00:10<00:54, 3.23s/it] |
|
Loading safetensors checkpoint shards: 25% Completed | 5/20 [00:12<00:36, 2.47s/it] |
|
Loading safetensors checkpoint shards: 20% Completed | 4/20 [00:13<00:47, 2.98s/it] |
|
Loading safetensors checkpoint shards: 20% Completed | 4/20 [00:13<00:47, 2.97s/it] |
|
Loading safetensors checkpoint shards: 20% Completed | 4/20 [00:13<00:47, 2.97s/it] |
|
Loading safetensors checkpoint shards: 30% Completed | 6/20 [00:14<00:34, 2.46s/it] |
|
Loading safetensors checkpoint shards: 25% Completed | 5/20 [00:15<00:42, 2.83s/it] |
|
Loading safetensors checkpoint shards: 25% Completed | 5/20 [00:15<00:42, 2.84s/it] |
|
Loading safetensors checkpoint shards: 25% Completed | 5/20 [00:15<00:42, 2.83s/it] |
|
Loading safetensors checkpoint shards: 35% Completed | 7/20 [00:17<00:32, 2.50s/it] |
|
Loading safetensors checkpoint shards: 30% Completed | 6/20 [00:18<00:37, 2.71s/it] |
|
Loading safetensors checkpoint shards: 30% Completed | 6/20 [00:18<00:37, 2.70s/it] |
|
Loading safetensors checkpoint shards: 30% Completed | 6/20 [00:18<00:37, 2.70s/it] |
|
Loading safetensors checkpoint shards: 40% Completed | 8/20 [00:19<00:30, 2.51s/it] |
|
Loading safetensors checkpoint shards: 35% Completed | 7/20 [00:20<00:34, 2.65s/it] |
|
Loading safetensors checkpoint shards: 35% Completed | 7/20 [00:20<00:34, 2.65s/it] |
|
Loading safetensors checkpoint shards: 35% Completed | 7/20 [00:20<00:34, 2.65s/it] |
|
Loading safetensors checkpoint shards: 45% Completed | 9/20 [00:20<00:22, 2.08s/it] |
|
Loading safetensors checkpoint shards: 40% Completed | 8/20 [00:23<00:33, 2.81s/it] |
|
Loading safetensors checkpoint shards: 40% Completed | 8/20 [00:23<00:33, 2.81s/it] |
|
Loading safetensors checkpoint shards: 40% Completed | 8/20 [00:23<00:33, 2.81s/it] |
|
Loading safetensors checkpoint shards: 50% Completed | 10/20 [00:23<00:23, 2.38s/it] |
|
Loading safetensors checkpoint shards: 45% Completed | 9/20 [00:25<00:25, 2.30s/it] |
|
Loading safetensors checkpoint shards: 45% Completed | 9/20 [00:24<00:25, 2.30s/it] |
|
Loading safetensors checkpoint shards: 45% Completed | 9/20 [00:24<00:25, 2.30s/it] |
|
Loading safetensors checkpoint shards: 55% Completed | 11/20 [00:26<00:22, 2.50s/it] |
|
Loading safetensors checkpoint shards: 50% Completed | 10/20 [00:27<00:22, 2.29s/it] |
|
Loading safetensors checkpoint shards: 50% Completed | 10/20 [00:27<00:22, 2.29s/it] |
|
Loading safetensors checkpoint shards: 50% Completed | 10/20 [00:27<00:22, 2.29s/it] |
|
Loading safetensors checkpoint shards: 60% Completed | 12/20 [00:29<00:20, 2.58s/it] |
|
Loading safetensors checkpoint shards: 55% Completed | 11/20 [00:29<00:21, 2.35s/it] |
|
Loading safetensors checkpoint shards: 55% Completed | 11/20 [00:29<00:21, 2.34s/it] |
|
Loading safetensors checkpoint shards: 55% Completed | 11/20 [00:29<00:21, 2.35s/it] |
|
Loading safetensors checkpoint shards: 65% Completed | 13/20 [00:32<00:18, 2.57s/it] |
|
Loading safetensors checkpoint shards: 60% Completed | 12/20 [00:32<00:20, 2.51s/it] |
|
Loading safetensors checkpoint shards: 60% Completed | 12/20 [00:32<00:20, 2.51s/it] |
|
Loading safetensors checkpoint shards: 60% Completed | 12/20 [00:32<00:20, 2.51s/it] |
|
Loading safetensors checkpoint shards: 70% Completed | 14/20 [00:34<00:15, 2.54s/it] |
|
Loading safetensors checkpoint shards: 65% Completed | 13/20 [00:35<00:17, 2.54s/it] |
|
Loading safetensors checkpoint shards: 65% Completed | 13/20 [00:35<00:17, 2.54s/it] |
|
Loading safetensors checkpoint shards: 65% Completed | 13/20 [00:35<00:17, 2.54s/it] |
|
Loading safetensors checkpoint shards: 75% Completed | 15/20 [00:35<00:10, 2.19s/it] |
|
Loading safetensors checkpoint shards: 70% Completed | 14/20 [00:37<00:15, 2.55s/it] |
|
Loading safetensors checkpoint shards: 70% Completed | 14/20 [00:37<00:15, 2.55s/it] |
|
Loading safetensors checkpoint shards: 70% Completed | 14/20 [00:37<00:15, 2.55s/it] |
|
Loading safetensors checkpoint shards: 80% Completed | 16/20 [00:38<00:08, 2.24s/it] |
|
Loading safetensors checkpoint shards: 75% Completed | 15/20 [00:39<00:11, 2.25s/it] |
|
Loading safetensors checkpoint shards: 75% Completed | 15/20 [00:39<00:11, 2.25s/it] |
|
Loading safetensors checkpoint shards: 75% Completed | 15/20 [00:39<00:11, 2.25s/it] |
|
Loading safetensors checkpoint shards: 85% Completed | 17/20 [00:40<00:06, 2.33s/it] |
|
Loading safetensors checkpoint shards: 80% Completed | 16/20 [00:41<00:09, 2.31s/it] |
|
Loading safetensors checkpoint shards: 80% Completed | 16/20 [00:41<00:09, 2.31s/it] |
|
Loading safetensors checkpoint shards: 80% Completed | 16/20 [00:41<00:09, 2.31s/it] |
|
Loading safetensors checkpoint shards: 90% Completed | 18/20 [00:43<00:04, 2.33s/it] |
|
Loading safetensors checkpoint shards: 85% Completed | 17/20 [00:44<00:07, 2.41s/it] |
|
Loading safetensors checkpoint shards: 85% Completed | 17/20 [00:44<00:07, 2.41s/it] |
|
Loading safetensors checkpoint shards: 85% Completed | 17/20 [00:44<00:07, 2.41s/it] |
|
Loading safetensors checkpoint shards: 95% Completed | 19/20 [00:45<00:02, 2.38s/it] |
|
Loading safetensors checkpoint shards: 90% Completed | 18/20 [00:46<00:04, 2.42s/it] |
|
Loading safetensors checkpoint shards: 90% Completed | 18/20 [00:46<00:04, 2.42s/it] |
|
Loading safetensors checkpoint shards: 90% Completed | 18/20 [00:46<00:04, 2.42s/it] |
|
Loading safetensors checkpoint shards: 100% Completed | 20/20 [00:48<00:00, 2.41s/it] |
|
Loading safetensors checkpoint shards: 100% Completed | 20/20 [00:48<00:00, 2.41s/it] |
|
|
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:50:00 model_runner.py:1115] Loading model weights took 35.6627 GB |
| INFO 04-29 00:50:00 model_runner.py:1115] Loading model weights took 35.6627 GB |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:50:00 model_runner.py:1115] Loading model weights took 35.6627 GB |
|
Loading safetensors checkpoint shards: 95% Completed | 19/20 [00:49<00:02, 2.43s/it] |
|
Loading safetensors checkpoint shards: 95% Completed | 19/20 [00:49<00:02, 2.43s/it] |
|
Loading safetensors checkpoint shards: 95% Completed | 19/20 [00:49<00:02, 2.43s/it] |
|
Loading safetensors checkpoint shards: 100% Completed | 20/20 [00:51<00:00, 2.49s/it] |
|
Loading safetensors checkpoint shards: 100% Completed | 20/20 [00:52<00:00, 2.49s/it] |
|
Loading safetensors checkpoint shards: 100% Completed | 20/20 [00:51<00:00, 2.49s/it] |
|
Loading safetensors checkpoint shards: 100% Completed | 20/20 [00:51<00:00, 2.60s/it] |
|
|
|
Loading safetensors checkpoint shards: 100% Completed | 20/20 [00:52<00:00, 2.60s/it] |
|
|
|
Loading safetensors checkpoint shards: 100% Completed | 20/20 [00:51<00:00, 2.60s/it] |
|
|
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:50:04 model_runner.py:1115] Loading model weights took 35.6627 GB |
| INFO 04-29 00:50:04 model_runner.py:1115] Loading model weights took 35.6627 GB |
| INFO 04-29 00:50:04 model_runner.py:1115] Loading model weights took 35.6627 GB |
| INFO 04-29 00:50:04 model_runner.py:1115] Loading model weights took 35.6627 GB |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:50:04 model_runner.py:1115] Loading model weights took 35.6627 GB |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 14/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 15/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-tuner.so |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO TUNER/Plugin: Using internal tuner plugin. |
| dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO ncclCommInitRank comm 0xfce9680 rank 0 nranks 2 cudaDev 0 nvmlDev 6 busId d0 commId 0xb7bbad76a9287dbc - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO ncclCommInitRank comm 0x36f82500 rank 0 nranks 2 cudaDev 0 nvmlDev 6 busId d0 commId 0x67ee5f83dd395178 - Init START |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Setting affinity for GPU 6 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO comm 0x36f82500 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 00/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 01/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 02/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 03/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 04/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 05/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 06/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 07/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 08/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 09/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 10/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 11/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 12/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 13/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 14/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 15/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 00/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 01/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 02/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 03/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 04/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 05/0 : 0[6] -> 1[7] via P2P/IPC/read |
| dsw-222255-668f79686ect file: No such file or directory : when loading libnccl-tuner.so |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO TUNER/Plugin: Using internal tuner plugin. |
| dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO ncclCommInitRank comm 0xfce99f0 rank 1 nranks 2 cudaDev 1 nvmlDev 7 busId e0 commId 0xb7bbad76a9287dbc - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO ncclCommInitRank comm 0x36f8b2f0 rank 1 nranks 2 cudaDev 1 nvmlDev 7 busId e0 commId 0x67ee5f83dd395178 - Init START |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Setting affinity for GPU 7 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO comm 0x36f8b2f0 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1 |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 00/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 01/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 02/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 03/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 04/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 05/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 06/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 07/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 08/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 09/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 10/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 11/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 12/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 13/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 14/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 15/0 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO ncclCommInitRank comm 0x36f8b2f0 rank 1 nranks 2 cudaDev 1 nvmlDev 7 busId e0 commId 0x67ee5f83dd395178 - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3572216:3573061 [1] NCCL INFO Channel 00/1 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573061 [1] NCCL INFO Channel 01/1 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573061 [1] NCCL INFO Channel 02/1 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573061 [1] NCCL INFO Channel 03/1 : 1[7] -> 0[6] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572216:3573061 [1] NCCL INF |
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:50:23 worker.py:267] Memory profiling takes 22.61 seconds |
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:50:23 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB |
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:50:23 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB. |
| INFO 04-29 00:50:23 worker.py:267] Memory profiling takes 22.65 seconds |
| INFO 04-29 00:50:23 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB |
| INFO 04-29 00:50:23 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB. |
| INFO 04-29 00:50:23 executor_base.py:111] # cuda blocks: 11901, # CPU blocks: 1638 |
| INFO 04-29 00:50:23 executor_base.py:116] Maximum concurrency for 32768 tokens per request: 5.81x |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 14/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 15/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-tuner.so |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO TUNER/Plugin: Using internal tuner plugin. |
| dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO ncclCommInitRank comm 0xfcae910 rank 0 nranks 2 cudaDev 0 nvmlDev 4 busId b0 commId 0x9c435407d5d72319 - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO ncclCommInitRank comm 0x36e33d50 rank 0 nranks 2 cudaDev 0 nvmlDev 4 busId b0 commId 0x9e29b5fcef03aeae - Init START |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Setting affinity for GPU 4 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO comm 0x36e33d50 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 00/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 01/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 02/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 03/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 04/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 05/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 06/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 07/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 08/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 09/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 10/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 11/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 12/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 13/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 14/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 15/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 00/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 01/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 02/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 03/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 04/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 05/0 : 0[4] -> 1[5] via P2P/IPC/read |
| dsw-222255-668f79686 |
| ect file: No such file or directory : when loading libnccl-tuner.so |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO TUNER/Plugin: Using internal tuner plugin. |
| dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO ncclCommInitRank comm 0xfcb0800 rank 1 nranks 2 cudaDev 1 nvmlDev 5 busId c0 commId 0x9c435407d5d72319 - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO ncclCommInitRank comm 0x36e3ac50 rank 1 nranks 2 cudaDev 1 nvmlDev 5 busId c0 commId 0x9e29b5fcef03aeae - Init START |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Setting affinity for GPU 5 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO comm 0x36e3ac50 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1 |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 00/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 01/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 02/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 03/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 04/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 05/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 06/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 07/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 08/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 09/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 10/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 11/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 12/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 13/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 14/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 15/0 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO ncclCommInitRank comm 0x36e3ac50 rank 1 nranks 2 cudaDev 1 nvmlDev 5 busId c0 commId 0x9e29b5fcef03aeae - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3572223:3573071 [1] NCCL INFO Channel 00/1 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573071 [1] NCCL INFO Channel 01/1 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573071 [1] NCCL INFO Channel 02/1 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573071 [1] NCCL INFO Channel 03/1 : 1[5] -> 0[4] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572223:3573071 [1] NCCL INF |
| 8f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 14/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 15/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-tuner.so |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO TUNER/Plugin: Using internal tuner plugin. |
| dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO ncclCommInitRank comm 0xfcadc20 rank 0 nranks 2 cudaDev 0 nvmlDev 0 busId 70 commId 0x107ef7d20723ace8 - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO ncclCommInitRank comm 0x36e33830 rank 0 nranks 2 cudaDev 0 nvmlDev 0 busId 70 commId 0x54df945661a68a33 - Init START |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Setting affinity for GPU 0 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO comm 0x36e33830 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 00/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 01/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 02/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 03/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 04/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 05/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 06/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 07/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 08/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 09/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 10/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 11/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 12/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 13/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 14/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 15/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 01/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 04/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 05/0 : 0[0] -> 1[1] via P2P/IPC/read |
| dsw-222255-668f79686 |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.39 seconds |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB. |
| INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.46 seconds |
| INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB |
| INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB. |
| ect file: No such file or directory : when loading libnccl-tuner.so |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO TUNER/Plugin: Using internal tuner plugin. |
| dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO ncclCommInitRank comm 0xfcaf480 rank 1 nranks 2 cudaDev 1 nvmlDev 1 busId 80 commId 0x107ef7d20723ace8 - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO ncclCommInitRank comm 0x36e388d0 rank 1 nranks 2 cudaDev 1 nvmlDev 1 busId 80 commId 0x54df945661a68a33 - Init START |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Setting affinity for GPU 1 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO comm 0x36e388d0 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1 |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 00/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 01/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 02/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 03/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 04/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 05/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 06/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 07/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 08/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 09/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 10/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 11/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 12/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 13/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 14/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 15/0 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO ncclCommInitRank comm 0x36e388d0 rank 1 nranks 2 cudaDev 1 nvmlDev 1 busId 80 commId 0x54df945661a68a33 - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3572220:3573090 [1] NCCL INFO Channel 00/1 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573090 [1] NCCL INFO Channel 01/1 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573090 [1] NCCL INFO Channel 02/1 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573090 [1] NCCL INFO Channel 03/1 : 1[1] -> 0[0] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572220:3573090 [1] NCCL INF |
| 8f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 14/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 15/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-tuner.so |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO TUNER/Plugin: Using internal tuner plugin. |
| dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO ncclCommInitRank comm 0xfcae060 rank 0 nranks 2 cudaDev 0 nvmlDev 2 busId 90 commId 0x2c8677096eda1f49 - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO ncclCommInitRank comm 0x36e330a0 rank 0 nranks 2 cudaDev 0 nvmlDev 2 busId 90 commId 0xacaabfeeee980308 - Init START |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Setting affinity for GPU 2 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO comm 0x36e330a0 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 00/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 01/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 02/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 03/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 04/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 05/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 06/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 07/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 08/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 09/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 10/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 11/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 12/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 13/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 14/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 15/16 : 0 1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 00/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 01/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 02/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 03/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 04/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 05/0 : 0[2] -> 1[3] via P2P/IPC/read |
| dsw-222255-668f79686 |
| ect file: No such file or directory : when loading libnccl-tuner.so |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO TUNER/Plugin: Using internal tuner plugin. |
| dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO ncclCommInitRank comm 0xfce82a0 rank 1 nranks 2 cudaDev 1 nvmlDev 3 busId a0 commId 0x2c8677096eda1f49 - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Using non-device net plugin version 0 |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Using network IB |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO ncclCommInitRank comm 0x36e3b2d0 rank 1 nranks 2 cudaDev 1 nvmlDev 3 busId a0 commId 0xacaabfeeee980308 - Init START |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Setting affinity for GPU 3 to ffff,ffffffff |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO comm 0x36e3b2d0 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0 |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1 |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO P2P Chunksize set to 524288 |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 00/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 01/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 02/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 03/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 04/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 05/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 06/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 07/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 08/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 09/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 10/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 11/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 12/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 13/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 14/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 15/0 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Connected all rings |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Connected all trees |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO threadThresholds 8/8/64 | 16/8/64 | 512 | 512 |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer |
| dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO ncclCommInitRank comm 0x36e3b2d0 rank 1 nranks 2 cudaDev 1 nvmlDev 3 busId a0 commId 0xacaabfeeee980308 - Init COMPLETE |
| dsw-222255-668f79686f-2vnl5:3572112:3573093 [1] NCCL INFO Channel 00/1 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573093 [1] NCCL INFO Channel 01/1 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573093 [1] NCCL INFO Channel 02/1 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573093 [1] NCCL INFO Channel 03/1 : 1[3] -> 0[2] via P2P/IPC/read |
| dsw-222255-668f79686f-2vnl5:3572112:3573093 [1] NCCL INF |
| INFO 04-29 00:50:24 executor_base.py:111] # cuda blocks: 11901, # CPU blocks: 1638 |
| INFO 04-29 00:50:24 executor_base.py:116] Maximum concurrency for 32768 tokens per request: 5.81x |
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.75 seconds |
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB |
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB. |
| INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.79 seconds |
| INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB |
| INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB. |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.54 seconds |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB. |
| INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.58 seconds |
| INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB |
| INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB. |
| INFO 04-29 00:50:24 executor_base.py:111] # cuda blocks: 11901, # CPU blocks: 1638 |
| INFO 04-29 00:50:24 executor_base.py:116] Maximum concurrency for 32768 tokens per request: 5.81x |
| INFO 04-29 00:50:24 executor_base.py:111] # cuda blocks: 11901, # CPU blocks: 1638 |
| INFO 04-29 00:50:24 executor_base.py:116] Maximum concurrency for 32768 tokens per request: 5.81x |
| INFO 04-29 00:50:26 llm_engine.py:436] init engine (profile, create kv cache, warmup model) took 26.30 seconds |
|
Processed prompts: 0%| | 0/10 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]
| INFO 04-29 00:50:27 llm_engine.py:436] init engine (profile, create kv cache, warmup model) took 23.47 seconds |
|
Processed prompts: 0%| | 0/10 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]
| INFO 04-29 00:50:28 llm_engine.py:436] init engine (profile, create kv cache, warmup model) took 23.97 seconds |
| INFO 04-29 00:50:28 llm_engine.py:436] init engine (profile, create kv cache, warmup model) took 23.77 seconds |
|
Processed prompts: 0%| | 0/10 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]
Processed prompts: 0%| | 0/10 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]
Processed prompts: 10%|█ | 1/10 [00:07<01:11, 7.93s/it, est. speed input: 139.56 toks/s, output: 6.05 toks/s]
Processed prompts: 40%|███ | 4/10 [00:08<00:09, 1.56s/it, est. speed input: 543.18 toks/s, output: 24.62 toks/s]
Processed prompts: 60%|█████ | 6/10 [00:08<00:03, 1.03it/s, est. speed input: 774.07 toks/s, output: 37.05 toks/s]
Processed prompts: 70%|█████ | 7/10 [00:08<00:02, 1.28it/s, est. speed input: 887.12 toks/s, output: 43.85 toks/s]
Processed prompts: 10%|█ | 1/10 [00:08<01:13, 8.11s/it, est. speed input: 136.45 toks/s, output: 6.16 toks/s]
Processed prompts: 20%|██ | 2/10 [00:08<00:27, 3.50s/it, est. speed input: 265.96 toks/s, output: 12.65 toks/s]
Processed prompts: 10%|█ | 1/10 [00:07<01:10, 7.85s/it, est. speed input: 140.59 toks/s, output: 5.86 toks/s]
Processed prompts: 30%|██ | 3/10 [00:08<00:13, 1.96s/it, est. speed input: 399.32 toks/s, output: 19.38 toks/s]
Processed prompts: 30%|██ | 3/10 [00:08<00:14, 2.10s/it, est. speed input: 426.80 toks/s, output: 17.95 toks/s]
Processed prompts: 10%|█ | 1/10 [00:08<01:13, 8.12s/it, est. speed input: 132.71 toks/s, output: 6.40 toks/s]
Processed prompts: 70%|█████ | 7/10 [00:08<00:01, 1.76it/s, est. speed input: 904.74 toks/s, output: 47.29 toks/s]
Processed prompts: 100%|█████| 10/10 [00:09<00:00, 1.83it/s, est. speed input: 1154.39 toks/s, output: 61.43 toks/s]
Processed prompts: 100%|█████| 10/10 [00:09<00:00, 1.03it/s, est. speed input: 1154.39 toks/s, output: 61.43 toks/s] |
|
Processed prompts: 60%|█████ | 6/10 [00:08<00:03, 1.18it/s, est. speed input: 823.58 toks/s, output: 36.85 toks/s]
Processed prompts: 30%|██ | 3/10 [00:08<00:15, 2.16s/it, est. speed input: 398.04 toks/s, output: 19.50 toks/s]
Processed prompts: 60%|█████ | 6/10 [00:08<00:03, 1.15it/s, est. speed input: 787.71 toks/s, output: 39.37 toks/s]
| Inference complete Total Finish:10 |
| batch time cost: 9.761106252670288s |
| [Memory] CPU: 7136.72 MB |
| [Memory] GPU: 66294.75 MB |
| INFO 04-29 00:50:37 multiproc_worker_utils.py:141] Terminating local vLLM worker processes |
| (VllmWorkerProcess pid=3572216) INFO 04-29 00:50:37 multiproc_worker_utils.py:253] Worker exiting |
|
Processed prompts: 80%|█████ | 8/10 [00:08<00:01, 1.71it/s, est. speed input: 1068.07 toks/s, output: 49.79 toks/s]
Processed prompts: 90%|██████| 9/10 [00:08<00:00, 2.06it/s, est. speed input: 1189.21 toks/s, output: 56.41 toks/s]
Processed prompts: 100%|█████| 10/10 [00:08<00:00, 1.17it/s, est. speed input: 1314.94 toks/s, output: 63.64 toks/s] |
|
Processed prompts: 90%|██████| 9/10 [00:09<00:00, 2.13it/s, est. speed input: 1099.14 toks/s, output: 60.35 toks/s]
Processed prompts: 80%|█████ | 8/10 [00:08<00:01, 1.64it/s, est. speed input: 1011.19 toks/s, output: 52.29 toks/s]
| Inference complete Total Finish:10 |
| batch time cost: 8.617689609527588s |
| [Memory] CPU: 7125.53 MB |
| [Memory] GPU: 66294.76 MB |
| INFO 04-29 00:50:37 multiproc_worker_utils.py:141] Terminating local vLLM worker processes |
| (VllmWorkerProcess pid=3572112) INFO 04-29 00:50:37 multiproc_worker_utils.py:253] Worker exiting |
| /share/liangzy/miniconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown |
| warnings.warn('resource_tracker: There appear to be %d ' |
|
Processed prompts: 100%|█████| 10/10 [00:10<00:00, 1.05s/it, est. speed input: 1073.40 toks/s, output: 60.81 toks/s] |
|
Processed prompts: 100%|█████| 10/10 [00:10<00:00, 1.55it/s, est. speed input: 1099.19 toks/s, output: 60.35 toks/s]
Processed prompts: 100%|█████| 10/10 [00:10<00:00, 1.02s/it, est. speed input: 1099.19 toks/s, output: 60.35 toks/s] |
| Inference complete Total Finish:10 |
| batch time cost: 10.546858549118042s |
| [Memory] CPU: 7134.02 MB |
| [Memory] GPU: 66294.75 MB |
| INFO 04-29 00:50:38 multiproc_worker_utils.py:141] Terminating local vLLM worker processes |
| (VllmWorkerProcess pid=3572223) INFO 04-29 00:50:38 multiproc_worker_utils.py:253] Worker exiting |
| Inference complete Total Finish:10 |
| batch time cost: 10.227976083755493s |
| [Memory] CPU: 7132.92 MB |
| [Memory] GPU: 66294.75 MB |
| INFO 04-29 00:50:39 multiproc_worker_utils.py:141] Terminating local vLLM worker processes |
| (VllmWorkerProcess pid=3572220) INFO 04-29 00:50:39 multiproc_worker_utils.py:253] Worker exiting |
| /share/liangzy/miniconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown |
| warnings.warn('resource_tracker: There appear to be %d ' |
| /share/liangzy/miniconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown |
| warnings.warn('resource_tracker: There appear to be %d ' |
| /share/liangzy/miniconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown |
| warnings.warn('resource_tracker: There appear to be %d ' |
| Did it OOM? |
| Total size: 40 Total time cost: 142.67037987709045s |
|
|