nohup: ignoring input
+ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
+ DATA_ROOT=/jizhicfs/bojoli
+ VISION_TOWER_NAME=clip-vit-large-patch14-336
+ MODEL_NAME=vicuna-7b-1.5
+ VISION_TOWER=/jizhicfs/bojoli/clip-vit-large-patch14-336
+ MODEL_PATH=/jizhicfs/bojoli/vicuna-7b-1.5
+ BASE_RUN_NAME=mmpe_vicuna-7b-1.5_clip-vit-large-patch14-336
+ RUN_NAME=final_mmpe_finetune_vicuna-7b-1.5_clip-vit-large-patch14-336
+ PROMPT_VERSION=v1
+ EVAL_MODEL=../checkpoints/final_mmpe_finetune_vicuna-7b-1.5_clip-vit-large-patch14-336
+ BENCHMARKS=mmbench_en
+ python3 -m accelerate.commands.launch --num_processes=8 --main_process_port 12899 -m lmms_eval --model llava --model_args pretrained=../checkpoints/final_mmpe_finetune_vicuna-7b-1.5_clip-vit-large-patch14-336,conv_template=v1 --tasks mmbench_en --batch_size 1 --log_samples --log_samples_suffix final_mmpe_finetune_vicuna-7b-1.5_clip-vit-large-patch14-336 --output_path ./logs/
The following values were not passed to `accelerate launch` and had defaults used instead:
	More than one GPU was found, enabling multi-GPU training.
	If this was unintended please pass in `--num_processes=1`.
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
2025-05-19 01:44:28.140 | INFO  | __main__:cli_evaluate:294 - Verbosity set to INFO 2025-05-19 01:44:28.206 | INFO  | __main__:cli_evaluate:294 - Verbosity set to INFO 2025-05-19 01:44:28.914 | INFO  | __main__:cli_evaluate:294 - Verbosity set to INFO 2025-05-19 01:44:28.916 | INFO  | __main__:cli_evaluate:294 - Verbosity set to INFO 2025-05-19 01:44:28.922 | INFO  | __main__:cli_evaluate:294 - Verbosity set to INFO 2025-05-19 01:44:28.923 | INFO  | __main__:cli_evaluate:294 - Verbosity set to INFO 2025-05-19 01:44:28.931 | INFO  | __main__:cli_evaluate:294 - Verbosity set to INFO 2025-05-19 01:44:28.941 | INFO  | __main__:cli_evaluate:294 - Verbosity set to INFO Detected kernel version 5.4.119, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher. 2025-05-19 01:44:31.949 | INFO  | __main__:cli_evaluate_single:377 - Evaluation tracker args: {'output_path': './logs/'} 2025-05-19 01:44:31.950 | INFO  | __main__:cli_evaluate_single:466 - Selected Tasks: ['mmbench_en'] 2025-05-19 01:44:31.955 | INFO  | lmms_eval.evaluator:simple_evaluate:155 - Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 2025-05-19 01:44:32.253 | INFO  | __main__:cli_evaluate_single:377 - Evaluation tracker args: {'output_path': './logs/'} 2025-05-19 01:44:32.253 | INFO  | __main__:cli_evaluate_single:466 - Selected Tasks: ['mmbench_en'] 2025-05-19 01:44:32.260 | INFO  | lmms_eval.evaluator:simple_evaluate:155 - Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 2025-05-19 01:44:32.902 | INFO  | __main__:cli_evaluate_single:377 - Evaluation tracker args: {'output_path': './logs/'} 2025-05-19 01:44:32.902 | INFO  | __main__:cli_evaluate_single:466 - Selected Tasks: ['mmbench_en'] 2025-05-19 01:44:32.907 | INFO  | lmms_eval.evaluator:simple_evaluate:155 - Setting random seed to 0 | Setting numpy seed to 1234 | Setting 
torch manual seed to 1234 2025-05-19 01:44:32.960 | INFO  | __main__:cli_evaluate_single:377 - Evaluation tracker args: {'output_path': './logs/'} 2025-05-19 01:44:32.960 | INFO  | __main__:cli_evaluate_single:466 - Selected Tasks: ['mmbench_en'] 2025-05-19 01:44:32.964 | INFO  | __main__:cli_evaluate_single:377 - Evaluation tracker args: {'output_path': './logs/'} 2025-05-19 01:44:32.965 | INFO  | __main__:cli_evaluate_single:466 - Selected Tasks: ['mmbench_en'] 2025-05-19 01:44:32.966 | INFO  | lmms_eval.evaluator:simple_evaluate:155 - Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 2025-05-19 01:44:32.969 | INFO  | __main__:cli_evaluate_single:377 - Evaluation tracker args: {'output_path': './logs/'} 2025-05-19 01:44:32.970 | INFO  | __main__:cli_evaluate_single:466 - Selected Tasks: ['mmbench_en'] 2025-05-19 01:44:32.971 | INFO  | lmms_eval.evaluator:simple_evaluate:155 - Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 2025-05-19 01:44:32.972 | INFO  | __main__:cli_evaluate_single:377 - Evaluation tracker args: {'output_path': './logs/'} 2025-05-19 01:44:32.972 | INFO  | __main__:cli_evaluate_single:377 - Evaluation tracker args: {'output_path': './logs/'} 2025-05-19 01:44:32.973 | INFO  | __main__:cli_evaluate_single:466 - Selected Tasks: ['mmbench_en'] 2025-05-19 01:44:32.973 | INFO  | __main__:cli_evaluate_single:466 - Selected Tasks: ['mmbench_en'] 2025-05-19 01:44:32.977 | INFO  | lmms_eval.evaluator:simple_evaluate:155 - Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 2025-05-19 01:44:32.982 | INFO  | lmms_eval.evaluator:simple_evaluate:155 - Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 2025-05-19 01:44:32.982 | INFO  | lmms_eval.evaluator:simple_evaluate:155 - Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 You are using a model of type llama to 
instantiate a model of type llava_llama. This is not supported for all configurations of models and can yield errors. Loading checkpoint shards: 0%| | 0/3 [00:00= 1.5 and < 2.0 but detected 2.3  [WARNING]  using untested triton version (2.3.1), only 1.0.0 is known to be compatible Loading checkpoint shards: 100%|██████████| 3/3 [00:14<00:00, 4.62s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:14<00:00, 4.70s/it] [2025-05-19 01:45:01,621] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)  [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.  [WARNING]  async_io: please install the libaio-dev package with apt  [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.  [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH 2025-05-19 01:45:02.002 | INFO  | lmms_eval.api.task:build_all_requests:425 - Building contexts for mmbench_en_dev on rank 2...  [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3  [WARNING]  using untested triton version (2.3.1), only 1.0.0 is known to be compatible 2025-05-19 01:45:03.150 | INFO  | lmms_eval.api.task:build_all_requests:425 - Building contexts for mmbench_en_dev on rank 6... 
Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.11s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.08s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.10s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.06s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.13s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.10s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.15s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.13s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.16s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.15s/it] [2025-05-19 01:45:03,657] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-05-19 01:45:03,662] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-05-19 01:45:03,670] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect) Rank 0: Model Class: LlavaLlamaForCausalLM [2025-05-19 01:45:03,857] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect) [2025-05-19 01:45:03,970] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)  [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.  [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.  [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.  [WARNING]  async_io: please install the libaio-dev package with apt  [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.  
[WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH  [WARNING]  async_io: please install the libaio-dev package with apt  [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.  [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH  [WARNING]  async_io: please install the libaio-dev package with apt  [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.  [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH  [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.  [WARNING]  async_io: please install the libaio-dev package with apt  [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.  [WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH  [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.  [WARNING]  async_io: please install the libaio-dev package with apt  [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.  
[WARNING]  Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH  [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3  [WARNING]  using untested triton version (2.3.1), only 1.0.0 is known to be compatible  [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3  [WARNING]  using untested triton version (2.3.1), only 1.0.0 is known to be compatible  [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3  [WARNING]  using untested triton version (2.3.1), only 1.0.0 is known to be compatible  [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3  [WARNING]  using untested triton version (2.3.1), only 1.0.0 is known to be compatible  [WARNING]  sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3  [WARNING]  using untested triton version (2.3.1), only 1.0.0 is known to be compatible Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.11s/it] Loading checkpoint shards: 100%|██████████| 3/3 [00:15<00:00, 5.10s/it] [2025-05-19 01:45:05,606] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect) 0%| | 0/541 [00:00= 1.5 and < 2.0 but detected 2.3  [WARNING]  using untested triton version (2.3.1), only 1.0.0 is known to be compatible 0%| | 0/541 [00:00 TENCENT64:79953:79953 [0] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation TENCENT64:79953:79953 [0] NCCL INFO cudaDriverVersion 12020 NCCL version 2.20.5+cuda12.4 TENCENT64:79959:79959 [6] NCCL INFO cudaDriverVersion 12020 TENCENT64:79959:79959 [6] NCCL INFO Bootstrap : Using bond1:30.207.96.97<0> TENCENT64:79958:79958 [5] NCCL INFO cudaDriverVersion 12020 TENCENT64:79955:79955 [2] NCCL INFO cudaDriverVersion 12020 TENCENT64:79954:79954 [1] NCCL INFO cudaDriverVersion 12020 TENCENT64:79956:79956 [3] NCCL 
INFO cudaDriverVersion 12020 TENCENT64:79957:79957 [4] NCCL INFO cudaDriverVersion 12020 TENCENT64:79958:79958 [5] NCCL INFO Bootstrap : Using bond1:30.207.96.97<0> TENCENT64:79955:79955 [2] NCCL INFO Bootstrap : Using bond1:30.207.96.97<0> TENCENT64:79954:79954 [1] NCCL INFO Bootstrap : Using bond1:30.207.96.97<0> TENCENT64:79957:79957 [4] NCCL INFO Bootstrap : Using bond1:30.207.96.97<0> TENCENT64:79956:79956 [3] NCCL INFO Bootstrap : Using bond1:30.207.96.97<0> TENCENT64:79959:79959 [6] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation TENCENT64:79958:79958 [5] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation TENCENT64:79955:79955 [2] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation TENCENT64:79956:79956 [3] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation TENCENT64:79954:79954 [1] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation TENCENT64:79957:79957 [4] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation TENCENT64:79953:80265 [0] NCCL INFO NET/IB : Using [0]mlx5_bond_0:1/RoCE [1]mlx5_bond_1:1/RoCE [2]mlx5_bond_2:1/RoCE [3]mlx5_bond_3:1/RoCE [4]mlx5_bond_4:1/RoCE [5]mlx5_bond_5:1/RoCE [6]mlx5_bond_6:1/RoCE [7]mlx5_bond_7:1/RoCE [8]mlx5_bond_8:1/RoCE [RO]; OOB bond1:30.207.96.97<0> TENCENT64:79953:80265 [0] NCCL INFO Using non-device net plugin version 0 
TENCENT64:79953:80265 [0] NCCL INFO Using network IB TENCENT64:79959:80266 [6] NCCL INFO NET/IB : Using [0]mlx5_bond_0:1/RoCE [1]mlx5_bond_1:1/RoCE [2]mlx5_bond_2:1/RoCE [3]mlx5_bond_3:1/RoCE [4]mlx5_bond_4:1/RoCE [5]mlx5_bond_5:1/RoCE [6]mlx5_bond_6:1/RoCE [7]mlx5_bond_7:1/RoCE [8]mlx5_bond_8:1/RoCE [RO]; OOB bond1:30.207.96.97<0> TENCENT64:79959:80266 [6] NCCL INFO Using non-device net plugin version 0 TENCENT64:79959:80266 [6] NCCL INFO Using network IB TENCENT64:79958:80267 [5] NCCL INFO NET/IB : Using [0]mlx5_bond_0:1/RoCE [1]mlx5_bond_1:1/RoCE [2]mlx5_bond_2:1/RoCE [3]mlx5_bond_3:1/RoCE [4]mlx5_bond_4:1/RoCE [5]mlx5_bond_5:1/RoCE [6]mlx5_bond_6:1/RoCE [7]mlx5_bond_7:1/RoCE [8]mlx5_bond_8:1/RoCE [RO]; OOB bond1:30.207.96.97<0> TENCENT64:79958:80267 [5] NCCL INFO Using non-device net plugin version 0 TENCENT64:79958:80267 [5] NCCL INFO Using network IB TENCENT64:79954:80270 [1] NCCL INFO NET/IB : Using [0]mlx5_bond_0:1/RoCE [1]mlx5_bond_1:1/RoCE [2]mlx5_bond_2:1/RoCE [3]mlx5_bond_3:1/RoCE [4]mlx5_bond_4:1/RoCE [5]mlx5_bond_5:1/RoCE [6]mlx5_bond_6:1/RoCE [7]mlx5_bond_7:1/RoCE [8]mlx5_bond_8:1/RoCE [RO]; OOB bond1:30.207.96.97<0> TENCENT64:79954:80270 [1] NCCL INFO Using non-device net plugin version 0 TENCENT64:79954:80270 [1] NCCL INFO Using network IB TENCENT64:79956:80269 [3] NCCL INFO NET/IB : Using [0]mlx5_bond_0:1/RoCE [1]mlx5_bond_1:1/RoCE [2]mlx5_bond_2:1/RoCE [3]mlx5_bond_3:1/RoCE [4]mlx5_bond_4:1/RoCE [5]mlx5_bond_5:1/RoCE [6]mlx5_bond_6:1/RoCE [7]mlx5_bond_7:1/RoCE [8]mlx5_bond_8:1/RoCE [RO]; OOB bond1:30.207.96.97<0> TENCENT64:79956:80269 [3] NCCL INFO Using non-device net plugin version 0 TENCENT64:79956:80269 [3] NCCL INFO Using network IB TENCENT64:79957:80271 [4] NCCL INFO NET/IB : Using [0]mlx5_bond_0:1/RoCE [1]mlx5_bond_1:1/RoCE [2]mlx5_bond_2:1/RoCE [3]mlx5_bond_3:1/RoCE [4]mlx5_bond_4:1/RoCE [5]mlx5_bond_5:1/RoCE [6]mlx5_bond_6:1/RoCE [7]mlx5_bond_7:1/RoCE [8]mlx5_bond_8:1/RoCE [RO]; OOB bond1:30.207.96.97<0> TENCENT64:79955:80268 [2] NCCL 
INFO NET/IB : Using [0]mlx5_bond_0:1/RoCE [1]mlx5_bond_1:1/RoCE [2]mlx5_bond_2:1/RoCE [3]mlx5_bond_3:1/RoCE [4]mlx5_bond_4:1/RoCE [5]mlx5_bond_5:1/RoCE [6]mlx5_bond_6:1/RoCE [7]mlx5_bond_7:1/RoCE [8]mlx5_bond_8:1/RoCE [RO]; OOB bond1:30.207.96.97<0> TENCENT64:79957:80271 [4] NCCL INFO Using non-device net plugin version 0 TENCENT64:79957:80271 [4] NCCL INFO Using network IB TENCENT64:79955:80268 [2] NCCL INFO Using non-device net plugin version 0 TENCENT64:79955:80268 [2] NCCL INFO Using network IB 0%| | 0/541 [00:00 TENCENT64:79960:79960 [7] NCCL INFO NET/Plugin : dlerror=libnccl-net.so: cannot open shared object file: No such file or directory No plugin found (libnccl-net.so), using internal implementation TENCENT64:79960:80336 [7] NCCL INFO NET/IB : Using [0]mlx5_bond_0:1/RoCE [1]mlx5_bond_1:1/RoCE [2]mlx5_bond_2:1/RoCE [3]mlx5_bond_3:1/RoCE [4]mlx5_bond_4:1/RoCE [5]mlx5_bond_5:1/RoCE [6]mlx5_bond_6:1/RoCE [7]mlx5_bond_7:1/RoCE [8]mlx5_bond_8:1/RoCE [RO]; OOB bond1:30.207.96.97<0> TENCENT64:79960:80336 [7] NCCL INFO Using non-device net plugin version 0 TENCENT64:79960:80336 [7] NCCL INFO Using network IB TENCENT64:79956:80269 [3] NCCL INFO comm 0xcd6dc30 rank 3 nranks 8 cudaDev 3 nvmlDev 3 busId 4d000 commId 0x402b4d0350a71a1b - Init START TENCENT64:79955:80268 [2] NCCL INFO comm 0xb2ff1f0 rank 2 nranks 8 cudaDev 2 nvmlDev 2 busId 49000 commId 0x402b4d0350a71a1b - Init START TENCENT64:79954:80270 [1] NCCL INFO comm 0x1d2d43f0 rank 1 nranks 8 cudaDev 1 nvmlDev 1 busId 16000 commId 0x402b4d0350a71a1b - Init START TENCENT64:79953:80265 [0] NCCL INFO comm 0xd3ff140 rank 0 nranks 8 cudaDev 0 nvmlDev 0 busId 10000 commId 0x402b4d0350a71a1b - Init START TENCENT64:79960:80336 [7] NCCL INFO comm 0xe630280 rank 7 nranks 8 cudaDev 7 nvmlDev 7 busId c9000 commId 0x402b4d0350a71a1b - Init START TENCENT64:79957:80271 [4] NCCL INFO comm 0xf018850 rank 4 nranks 8 cudaDev 4 nvmlDev 4 busId 89000 commId 0x402b4d0350a71a1b - Init START TENCENT64:79958:80267 [5] NCCL INFO comm 
0x11e9ea30 rank 5 nranks 8 cudaDev 5 nvmlDev 5 busId 8e000 commId 0x402b4d0350a71a1b - Init START TENCENT64:79959:80266 [6] NCCL INFO comm 0xf3037b0 rank 6 nranks 8 cudaDev 6 nvmlDev 6 busId c5000 commId 0x402b4d0350a71a1b - Init START TENCENT64:79954:80270 [1] NCCL INFO Setting affinity for GPU 1 to 0fff,ffffff00,0000000f,ffffffff TENCENT64:79954:80270 [1] NCCL INFO NVLS multicast support is not available on dev 1 TENCENT64:79955:80268 [2] NCCL INFO Setting affinity for GPU 2 to 0fff,ffffff00,0000000f,ffffffff TENCENT64:79955:80268 [2] NCCL INFO NVLS multicast support is not available on dev 2 TENCENT64:79953:80265 [0] NCCL INFO Setting affinity for GPU 0 to 0fff,ffffff00,0000000f,ffffffff TENCENT64:79953:80265 [0] NCCL INFO NVLS multicast support is not available on dev 0 TENCENT64:79957:80271 [4] NCCL INFO Setting affinity for GPU 4 to ffff,fffff000,000000ff,fffffff0,00000000 TENCENT64:79957:80271 [4] NCCL INFO NVLS multicast support is not available on dev 4 TENCENT64:79956:80269 [3] NCCL INFO Setting affinity for GPU 3 to 0fff,ffffff00,0000000f,ffffffff TENCENT64:79956:80269 [3] NCCL INFO NVLS multicast support is not available on dev 3 TENCENT64:79960:80336 [7] NCCL INFO Setting affinity for GPU 7 to ffff,fffff000,000000ff,fffffff0,00000000 TENCENT64:79960:80336 [7] NCCL INFO NVLS multicast support is not available on dev 7 TENCENT64:79958:80267 [5] NCCL INFO Setting affinity for GPU 5 to ffff,fffff000,000000ff,fffffff0,00000000 TENCENT64:79959:80266 [6] NCCL INFO Setting affinity for GPU 6 to ffff,fffff000,000000ff,fffffff0,00000000 TENCENT64:79959:80266 [6] NCCL INFO NVLS multicast support is not available on dev 6 TENCENT64:79958:80267 [5] NCCL INFO NVLS multicast support is not available on dev 5 TENCENT64:79957:80271 [4] NCCL INFO comm 0xf018850 rank 4 nRanks 8 nNodes 1 localRanks 8 localRank 4 MNNVL 0 TENCENT64:79956:80269 [3] NCCL INFO comm 0xcd6dc30 rank 3 nRanks 8 nNodes 1 localRanks 8 localRank 3 MNNVL 0 TENCENT64:79959:80266 [6] NCCL INFO comm 
0xf3037b0 rank 6 nRanks 8 nNodes 1 localRanks 8 localRank 6 MNNVL 0 TENCENT64:79958:80267 [5] NCCL INFO comm 0x11e9ea30 rank 5 nRanks 8 nNodes 1 localRanks 8 localRank 5 MNNVL 0 TENCENT64:79955:80268 [2] NCCL INFO comm 0xb2ff1f0 rank 2 nRanks 8 nNodes 1 localRanks 8 localRank 2 MNNVL 0 TENCENT64:79960:80336 [7] NCCL INFO comm 0xe630280 rank 7 nRanks 8 nNodes 1 localRanks 8 localRank 7 MNNVL 0 TENCENT64:79954:80270 [1] NCCL INFO comm 0x1d2d43f0 rank 1 nRanks 8 nNodes 1 localRanks 8 localRank 1 MNNVL 0 TENCENT64:79953:80265 [0] NCCL INFO comm 0xd3ff140 rank 0 nRanks 8 nNodes 1 localRanks 8 localRank 0 MNNVL 0 TENCENT64:79957:80271 [4] NCCL INFO Trees [0] 5/-1/-1->4->3 [1] 5/-1/-1->4->3 [2] 5/-1/-1->4->3 [3] 5/-1/-1->4->3 [4] 5/-1/-1->4->3 [5] 5/-1/-1->4->3 [6] 5/-1/-1->4->3 [7] 5/-1/-1->4->3 [8] 5/-1/-1->4->3 [9] 5/-1/-1->4->3 [10] 5/-1/-1->4->3 [11] 5/-1/-1->4->3 [12] 5/-1/-1->4->3 [13] 5/-1/-1->4->3 [14] 5/-1/-1->4->3 [15] 5/-1/-1->4->3 TENCENT64:79956:80269 [3] NCCL INFO Trees [0] 4/-1/-1->3->2 [1] 4/-1/-1->3->2 [2] 4/-1/-1->3->2 [3] 4/-1/-1->3->2 [4] 4/-1/-1->3->2 [5] 4/-1/-1->3->2 [6] 4/-1/-1->3->2 [7] 4/-1/-1->3->2 [8] 4/-1/-1->3->2 [9] 4/-1/-1->3->2 [10] 4/-1/-1->3->2 [11] 4/-1/-1->3->2 [12] 4/-1/-1->3->2 [13] 4/-1/-1->3->2 [14] 4/-1/-1->3->2 [15] 4/-1/-1->3->2 TENCENT64:79956:80269 [3] NCCL INFO P2P Chunksize set to 524288 TENCENT64:79959:80266 [6] NCCL INFO Trees [0] 7/-1/-1->6->5 [1] 7/-1/-1->6->5 [2] 7/-1/-1->6->5 [3] 7/-1/-1->6->5 [4] 7/-1/-1->6->5 [5] 7/-1/-1->6->5 [6] 7/-1/-1->6->5 [7] 7/-1/-1->6->5 [8] 7/-1/-1->6->5 [9] 7/-1/-1->6->5 [10] 7/-1/-1->6->5 [11] 7/-1/-1->6->5 [12] 7/-1/-1->6->5 [13] 7/-1/-1->6->5 [14] 7/-1/-1->6->5 [15] 7/-1/-1->6->5 TENCENT64:79958:80267 [5] NCCL INFO Trees [0] 6/-1/-1->5->4 [1] 6/-1/-1->5->4 [2] 6/-1/-1->5->4 [3] 6/-1/-1->5->4 [4] 6/-1/-1->5->4 [5] 6/-1/-1->5->4 [6] 6/-1/-1->5->4 [7] 6/-1/-1->5->4 [8] 6/-1/-1->5->4 [9] 6/-1/-1->5->4 [10] 6/-1/-1->5->4 [11] 6/-1/-1->5->4 [12] 6/-1/-1->5->4 [13] 6/-1/-1->5->4 [14] 
6/-1/-1->5->4 [15] 6/-1/-1->5->4 TENCENT64:79958:80267 [5] NCCL INFO P2P Chunksize set to 524288 TENCENT64:79955:80268 [2] NCCL INFO Trees [0] 3/-1/-1->2->1 [1] 3/-1/-1->2->1 [2] 3/-1/-1->2->1 [3] 3/-1/-1->2->1 [4] 3/-1/-1->2->1 [5] 3/-1/-1->2->1 [6] 3/-1/-1->2->1 [7] 3/-1/-1->2->1 [8] 3/-1/-1->2->1 [9] 3/-1/-1->2->1 [10] 3/-1/-1->2->1 [11] 3/-1/-1->2->1 [12] 3/-1/-1->2->1 [13] 3/-1/-1->2->1 [14] 3/-1/-1->2->1 [15] 3/-1/-1->2->1 TENCENT64:79955:80268 [2] NCCL INFO P2P Chunksize set to 524288 TENCENT64:79960:80336 [7] NCCL INFO Trees [0] -1/-1/-1->7->6 [1] -1/-1/-1->7->6 [2] -1/-1/-1->7->6 [3] -1/-1/-1->7->6 [4] -1/-1/-1->7->6 [5] -1/-1/-1->7->6 [6] -1/-1/-1->7->6 [7] -1/-1/-1->7->6 [8] -1/-1/-1->7->6 [9] -1/-1/-1->7->6 [10] -1/-1/-1->7->6 [11] -1/-1/-1->7->6 [12] -1/-1/-1->7->6 [13] -1/-1/-1->7->6 [14] -1/-1/-1->7->6 [15] -1/-1/-1->7->6 TENCENT64:79960:80336 [7] NCCL INFO P2P Chunksize set to 524288 TENCENT64:79954:80270 [1] NCCL INFO Trees [0] 2/-1/-1->1->0 [1] 2/-1/-1->1->0 [2] 2/-1/-1->1->0 [3] 2/-1/-1->1->0 [4] 2/-1/-1->1->0 [5] 2/-1/-1->1->0 [6] 2/-1/-1->1->0 [7] 2/-1/-1->1->0 [8] 2/-1/-1->1->0 [9] 2/-1/-1->1->0 [10] 2/-1/-1->1->0 [11] 2/-1/-1->1->0 [12] 2/-1/-1->1->0 [13] 2/-1/-1->1->0 [14] 2/-1/-1->1->0 [15] 2/-1/-1->1->0 TENCENT64:79953:80265 [0] NCCL INFO Channel 00/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 01/16 : 0 1 2 3 4 5 6 7 TENCENT64:79957:80271 [4] NCCL INFO P2P Chunksize set to 524288 TENCENT64:79959:80266 [6] NCCL INFO P2P Chunksize set to 524288 TENCENT64:79954:80270 [1] NCCL INFO P2P Chunksize set to 524288 TENCENT64:79953:80265 [0] NCCL INFO Channel 02/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 03/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 04/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 05/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 06/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 07/16 : 0 1 2 3 4 5 6 7 
TENCENT64:79953:80265 [0] NCCL INFO Channel 08/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 09/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 10/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 11/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 12/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 13/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 14/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Channel 15/16 : 0 1 2 3 4 5 6 7 TENCENT64:79953:80265 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] 1/-1/-1->0->-1 [5] 1/-1/-1->0->-1 [6] 1/-1/-1->0->-1 [7] 1/-1/-1->0->-1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] 1/-1/-1->0->-1 [13] 1/-1/-1->0->-1 [14] 1/-1/-1->0->-1 [15] 1/-1/-1->0->-1 TENCENT64:79953:80265 [0] NCCL INFO P2P Chunksize set to 524288 TENCENT64:79956:80269 [3] NCCL INFO Channel 00/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 00/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 00/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 00/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 00/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 00/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 00/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 01/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 01/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 01/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 01/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 
01/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 01/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 01/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 01/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 02/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 02/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 02/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 02/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 02/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 02/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 02/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 03/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 03/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 03/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 03/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 03/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 03/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 03/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 04/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 04/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 04/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 04/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 04/0 : 
4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 04/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 04/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 04/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 05/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 05/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 05/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 05/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 05/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 05/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 05/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 05/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 06/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 06/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 06/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 06/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 06/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 06/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 06/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 06/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 07/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 07/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 07/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 07/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 07/0 : 4[4] -> 
5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 07/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 07/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 07/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 08/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 08/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 08/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 08/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 08/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 08/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 08/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 08/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 09/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 09/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 09/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 09/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 09/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 09/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 09/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 09/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 10/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 10/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 10/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 10/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 10/0 : 4[4] -> 5[5] 
via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 10/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 10/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 10/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 11/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 11/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 11/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 11/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 11/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 11/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 11/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 11/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 12/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 12/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 12/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 12/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 12/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 12/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 12/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 12/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 13/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 13/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 13/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 13/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 13/0 : 4[4] -> 5[5] via 
P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 13/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 13/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 13/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 14/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 14/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 14/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 14/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 14/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 14/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 14/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 14/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 15/0 : 3[3] -> 4[4] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 15/0 : 5[5] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 15/0 : 2[2] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 15/0 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 15/0 : 4[4] -> 5[5] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 15/0 : 6[6] -> 7[7] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 15/0 : 1[1] -> 2[2] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Channel 15/0 : 0[0] -> 1[1] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Connected all rings TENCENT64:79954:80270 [1] NCCL INFO Connected all rings TENCENT64:79953:80265 [0] NCCL INFO Connected all rings TENCENT64:79956:80269 [3] NCCL INFO Connected all rings TENCENT64:79957:80271 [4] NCCL INFO Connected all rings TENCENT64:79960:80336 [7] NCCL INFO Connected all rings TENCENT64:79958:80267 [5] NCCL INFO Connected all rings 
TENCENT64:79959:80266 [6] NCCL INFO Connected all rings TENCENT64:79960:80336 [7] NCCL INFO Channel 00/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 01/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 02/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 03/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 04/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 05/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 06/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 07/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 00/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 08/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 01/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 09/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 02/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 10/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 03/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 11/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 04/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 12/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 05/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 00/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 13/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 06/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 01/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL 
INFO Channel 00/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 14/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 07/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 02/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 01/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 00/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79960:80336 [7] NCCL INFO Channel 15/0 : 7[7] -> 6[6] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 08/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 00/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 00/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 03/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 02/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 01/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 09/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 01/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 01/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 04/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 03/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 02/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 10/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 02/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 02/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 05/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 04/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO 
Channel 03/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 11/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 03/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 03/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 06/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 05/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 04/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 12/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 04/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 04/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 07/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 06/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 05/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 13/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 05/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 05/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 08/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 07/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 06/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 14/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 06/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 06/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 09/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 08/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 
07/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79955:80268 [2] NCCL INFO Channel 15/0 : 2[2] -> 1[1] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 07/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 07/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 10/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 09/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 08/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 08/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 08/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 11/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 10/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 09/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 09/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 09/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 12/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 11/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 10/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 10/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 10/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 13/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 12/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 11/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 11/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 11/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 14/0 : 
1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 13/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 12/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 12/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 12/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79954:80270 [1] NCCL INFO Channel 15/0 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 14/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 13/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 13/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 13/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79956:80269 [3] NCCL INFO Channel 15/0 : 3[3] -> 2[2] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 14/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 14/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 14/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79957:80271 [4] NCCL INFO Channel 15/0 : 4[4] -> 3[3] via P2P/CUMEM/read TENCENT64:79958:80267 [5] NCCL INFO Channel 15/0 : 5[5] -> 4[4] via P2P/CUMEM/read TENCENT64:79959:80266 [6] NCCL INFO Channel 15/0 : 6[6] -> 5[5] via P2P/CUMEM/read TENCENT64:79953:80265 [0] NCCL INFO Connected all trees TENCENT64:79953:80265 [0] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512 TENCENT64:79953:80265 [0] NCCL INFO 16 coll channels, 0 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer TENCENT64:79954:80270 [1] NCCL INFO Connected all trees TENCENT64:79954:80270 [1] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512 TENCENT64:79954:80270 [1] NCCL INFO 16 coll channels, 0 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer TENCENT64:79955:80268 [2] NCCL INFO Connected all trees TENCENT64:79955:80268 [2] NCCL INFO 
threadThresholds 8/8/64 | 64/8/64 | 512 | 512 TENCENT64:79955:80268 [2] NCCL INFO 16 coll channels, 0 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer TENCENT64:79956:80269 [3] NCCL INFO Connected all trees TENCENT64:79956:80269 [3] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512 TENCENT64:79956:80269 [3] NCCL INFO 16 coll channels, 0 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer TENCENT64:79957:80271 [4] NCCL INFO Connected all trees TENCENT64:79957:80271 [4] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512 TENCENT64:79957:80271 [4] NCCL INFO 16 coll channels, 0 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer TENCENT64:79958:80267 [5] NCCL INFO Connected all trees TENCENT64:79959:80266 [6] NCCL INFO Connected all trees TENCENT64:79960:80336 [7] NCCL INFO Connected all trees TENCENT64:79958:80267 [5] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512 TENCENT64:79958:80267 [5] NCCL INFO 16 coll channels, 0 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer TENCENT64:79959:80266 [6] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512 TENCENT64:79960:80336 [7] NCCL INFO threadThresholds 8/8/64 | 64/8/64 | 512 | 512 TENCENT64:79959:80266 [6] NCCL INFO 16 coll channels, 0 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer TENCENT64:79960:80336 [7] NCCL INFO 16 coll channels, 0 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer TENCENT64:79957:80271 [4] NCCL INFO comm 0xf018850 rank 4 nranks 8 cudaDev 4 nvmlDev 4 busId 89000 commId 0x402b4d0350a71a1b - Init COMPLETE TENCENT64:79960:80336 [7] NCCL INFO comm 0xe630280 rank 7 nranks 8 cudaDev 7 nvmlDev 7 busId c9000 commId 0x402b4d0350a71a1b - Init COMPLETE TENCENT64:79958:80267 [5] NCCL INFO comm 0x11e9ea30 rank 5 nranks 8 cudaDev 5 nvmlDev 5 busId 8e000 commId 0x402b4d0350a71a1b - Init COMPLETE TENCENT64:79956:80269 [3] NCCL INFO comm 
0xcd6dc30 rank 3 nranks 8 cudaDev 3 nvmlDev 3 busId 4d000 commId 0x402b4d0350a71a1b - Init COMPLETE
TENCENT64:79959:80266 [6] NCCL INFO comm 0xf3037b0 rank 6 nranks 8 cudaDev 6 nvmlDev 6 busId c5000 commId 0x402b4d0350a71a1b - Init COMPLETE
TENCENT64:79955:80268 [2] NCCL INFO comm 0xb2ff1f0 rank 2 nranks 8 cudaDev 2 nvmlDev 2 busId 49000 commId 0x402b4d0350a71a1b - Init COMPLETE
TENCENT64:79953:80265 [0] NCCL INFO comm 0xd3ff140 rank 0 nranks 8 cudaDev 0 nvmlDev 0 busId 10000 commId 0x402b4d0350a71a1b - Init COMPLETE
TENCENT64:79954:80270 [1] NCCL INFO comm 0x1d2d43f0 rank 1 nranks 8 cudaDev 1 nvmlDev 1 busId 16000 commId 0x402b4d0350a71a1b - Init COMPLETE
2025-05-19 01:45:13.639 | INFO  | lmms_eval.api.task:build_all_requests:425 - Building contexts for mmbench_en_test on rank 0...
2025-05-19 01:45:13.640 | INFO  | lmms_eval.api.task:build_all_requests:425 - Building contexts for mmbench_en_test on rank 7...
2025-05-19 01:45:13.640 | INFO  | lmms_eval.api.task:build_all_requests:425 - Building contexts for mmbench_en_test on rank 2...
2025-05-19 01:45:13.640 | INFO  | lmms_eval.api.task:build_all_requests:425 - Building contexts for mmbench_en_test on rank 6...
2025-05-19 01:45:13.640 | INFO  | lmms_eval.api.task:build_all_requests:425 - Building contexts for mmbench_en_test on rank 3...
2025-05-19 01:45:13.640 | INFO  | lmms_eval.api.task:build_all_requests:425 - Building contexts for mmbench_en_test on rank 4...
2025-05-19 01:45:13.640 | INFO  | lmms_eval.api.task:build_all_requests:425 - Building contexts for mmbench_en_test on rank 1...
2025-05-19 01:45:13.640 | INFO  | lmms_eval.api.task:build_all_requests:425 - Building contexts for mmbench_en_test on rank 5...
0%| | 0/833 [00:00 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 00/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 00/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 00/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 00/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 00/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 00/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 01/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 01/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 01/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 01/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 01/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 01/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 01/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 02/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 02/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 02/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 02/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 02/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 02/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 02/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 03/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 03/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 03/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 
03/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 03/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 03/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 03/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 04/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 04/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 04/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 04/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 04/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 04/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 04/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 05/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 05/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 05/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 05/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 05/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 05/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 05/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 06/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 06/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 06/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 06/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 06/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 06/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 06/1 : 
1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 07/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 07/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 07/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 07/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 07/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 07/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 07/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 08/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 08/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 08/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 08/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 08/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 08/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 08/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 09/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 09/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 09/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 09/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 09/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 09/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 09/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 10/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 10/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 10/1 : 5[5] -> 
0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 10/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 10/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 10/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 10/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 11/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 11/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 11/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 11/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 11/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 11/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 11/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 12/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 12/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 12/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 12/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 12/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 12/1 : 7[7] -> 0[0] via P2P/CUMEM/read TENCENT64:79954:80380 [1] NCCL INFO Channel 12/1 : 1[1] -> 0[0] via P2P/CUMEM/read TENCENT64:79959:80373 [6] NCCL INFO Channel 13/1 : 6[6] -> 0[0] via P2P/CUMEM/read TENCENT64:79956:80374 [3] NCCL INFO Channel 13/1 : 3[3] -> 0[0] via P2P/CUMEM/read TENCENT64:79958:80375 [5] NCCL INFO Channel 13/1 : 5[5] -> 0[0] via P2P/CUMEM/read TENCENT64:79957:80377 [4] NCCL INFO Channel 13/1 : 4[4] -> 0[0] via P2P/CUMEM/read TENCENT64:79955:80376 [2] NCCL INFO Channel 13/1 : 2[2] -> 0[0] via P2P/CUMEM/read TENCENT64:79960:80378 [7] NCCL INFO Channel 13/1 : 7[7] -> 0[0] 
via P2P/CUMEM/read
TENCENT64:79954:80380 [1] NCCL INFO Channel 13/1 : 1[1] -> 0[0] via P2P/CUMEM/read
TENCENT64:79959:80373 [6] NCCL INFO Channel 14/1 : 6[6] -> 0[0] via P2P/CUMEM/read
TENCENT64:79956:80374 [3] NCCL INFO Channel 14/1 : 3[3] -> 0[0] via P2P/CUMEM/read
TENCENT64:79958:80375 [5] NCCL INFO Channel 14/1 : 5[5] -> 0[0] via P2P/CUMEM/read
TENCENT64:79957:80377 [4] NCCL INFO Channel 14/1 : 4[4] -> 0[0] via P2P/CUMEM/read
TENCENT64:79955:80376 [2] NCCL INFO Channel 14/1 : 2[2] -> 0[0] via P2P/CUMEM/read
TENCENT64:79960:80378 [7] NCCL INFO Channel 14/1 : 7[7] -> 0[0] via P2P/CUMEM/read
TENCENT64:79954:80380 [1] NCCL INFO Channel 14/1 : 1[1] -> 0[0] via P2P/CUMEM/read
TENCENT64:79959:80373 [6] NCCL INFO Channel 15/1 : 6[6] -> 0[0] via P2P/CUMEM/read
TENCENT64:79956:80374 [3] NCCL INFO Channel 15/1 : 3[3] -> 0[0] via P2P/CUMEM/read
TENCENT64:79958:80375 [5] NCCL INFO Channel 15/1 : 5[5] -> 0[0] via P2P/CUMEM/read
TENCENT64:79957:80377 [4] NCCL INFO Channel 15/1 : 4[4] -> 0[0] via P2P/CUMEM/read
TENCENT64:79955:80376 [2] NCCL INFO Channel 15/1 : 2[2] -> 0[0] via P2P/CUMEM/read
TENCENT64:79960:80378 [7] NCCL INFO Channel 15/1 : 7[7] -> 0[0] via P2P/CUMEM/read
TENCENT64:79954:80380 [1] NCCL INFO Channel 15/1 : 1[1] -> 0[0] via P2P/CUMEM/read
============= MMBench-EN(Dev) Detailed Results =============
Category Acc. (dev):
  action_recognition: 88.889
  attribute_comparison: 63.636
  attribute_recognition: 89.189
  celebrity_recognition: 80.808
  function_reasoning: 73.418
  future_prediction: 42.500
  identity_reasoning: 97.778
  image_emotion: 80.000
  image_quality: 52.830
  image_scene: 96.154
  image_style: 86.792
  image_topic: 77.778
  nature_relation: 56.250
  object_localization: 46.914
  ocr: 66.667
  physical_property_reasoning: 49.333
  physical_relation: 41.667
  social_relation: 93.023
  spatial_relationship: 20.000
  structuralized_imagetext_understanding: 30.769
L2-category Acc. (dev):
  attribute_reasoning: 69.849
  coarse_perception: 81.757
  finegrained_perception (cross-instance): 59.441
  finegrained_perception (instance-level): 71.672
  logic_reasoning: 34.746
  relation_reasoning: 66.957
2025-05-19 01:50:54.491 | INFO  | en_utils:mmbench_aggregate_dev_results_submission:121 - Saved results to /jizhicfs/bojoli/mmpe/mmpe-main/eval/logs/submissions/mmbench_en_dev_results.xlsx
2025-05-19 01:50:55.933 | INFO  | en_utils:mmbench_aggregate_test_results:129 - Saved results to /jizhicfs/bojoli/mmpe/mmpe-main/eval/logs/submissions/mmbench_en_test_results.xlsx
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2025-05-19 01:50:56.096 | INFO  | lmms_eval.loggers.evaluation_tracker:save_results_aggregated:188 - Saving results aggregated
2025-05-19 01:50:56.105 | INFO  | lmms_eval.loggers.evaluation_tracker:save_results_samples:255 - Saving per-sample results for: mmbench_en_dev
2025-05-19 01:50:57.564 | INFO  | lmms_eval.loggers.evaluation_tracker:save_results_samples:255 - Saving per-sample results for: mmbench_en_test
llava (pretrained=../checkpoints/final_mmpe_finetune_vicuna-7b-1.5_clip-vit-large-patch14-336,conv_template=v1), gen_kwargs: (), limit: None, num_fewshot: None, batch_size: 1
|      Tasks       |Version|Filter|n-shot|    Metric    |   | Value |   |Stderr|
|------------------|-------|------|-----:|--------------|---|-------|---|------|
|mmbench_en        |    N/A|      |      |              |   |       |   |      |
| - mmbench_en_dev |Yaml   |none  |     0|gpt_eval_score|↑  |68.2131|±  |   N/A|
| - mmbench_en_dev |Yaml   |none  |     0|submission    |↑  |N/A    |±  |   N/A|
| - mmbench_en_test|Yaml   |none  |     0|submission    |↑  |N/A    |±  |   N/A|