synthesis / ObjectRecognition /logs /object_0_40_mira.log

Upload folder using huggingface_hub

55500d6 verified 12 months ago

110 kB

	/share/liangzy/miniconda3/envs/vllm/lib/python3.10/site-packages/_distutils_hack/__init__.py:54: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
	warnings.warn(
	Total Video Size: 31436
	0%\| \| 0/31436 [00:00<?, ?it/s] 100%\|██████████████████████████████████████████████████████████████████████\| 31436/31436 [00:00<00:00, 801488.92it/s]
	Total Clips Size: 37658
	Start: 0, End: 40
	to process size: 40
	Total size: 40
	Sample show: <\|im_start\|>system
	You are an AI assistant tasked with generating high-quality object recognition questions based on a video snippet description from a long video.

	## TASK:
	Generate one high-quality object recognition question that requires identifying visible objects, such as people, vehicles, animals, furniture, tools, electronic devices, clothing, food, household items, etc.

	You must also provide 4 answer options (A–D), with only one correct answer, which is clearly supported by the visual or narrative content of the video description.

	## INSTRUCTIONS:
	- Focus on Visual Entities: The question must test the model’s ability to recognize objects.
	- Ground in Visuals: All answers must be verifiable by pausing a single frame or short clip. Avoid actions, motivations, or temporal reasoning.
	- Ground in the Description: Ensure that the answer is grounded in the video's description, not general knowledge or external information.
	- Avoid Extraneous Information: Do not rely on subtitles, voiceovers, or audio cues unless explicitly mentioned in the description.
	- Clear and Logical Phrasing: Keep the question clear, specific, and logically phrased to avoid ambiguity.
	- Output Format: Format the output as a list of dictionaries with the following keys:
	- `'Q'`: The question.
	- `'Options'`: A list of four answer options labeled 'A', 'B', 'C', and 'D'.
	- `'Answer'`: The correct answer (e.g., `'A'`, `'B'`, etc.).

	## EXAMPLES:
	1. {'Q': 'What does Jon Snow use to fight with Ramsay Bolton?',
	'Options': [
	'A. A shield.',
	'B. A sword.',
	'C. An Axe.',
	'D. A spear.'
	],
	'Answer': 'A. A shield'}

	2. {'Q': 'Which cellular structure is responsible for receiving proteins according to the video?',
	'Options': [
	'A. Golgi apparatus (Golgi body).',
	'B. Nucleus.',
	'C. Ribosome.',
	'D. Mitochondrion.'
	],
	'Answer': 'A. Golgi apparatus (Golgi body).'}

	3. {'Q': 'What sport are the two teams of athletes playing?',
	'Options': [
	'A. Ice hockey.',
	'B. Soccer.',
	'C. Rugby.',
	'D. Basketball.'
	],
	'Answer': 'C'}

	## GUIDELINES FOR CREATING QUESTIONS:
	- Specificity: Ask about singular, clearly defined object.
	- Visual Certainty: Ensure the correct answer is unambiguous.
	- Description Grounding: Base all questions and answers on the video description.
	- Plausible Distractors: Wrong options should be visually similar (e.g., other kitchen tools if asking about a pan).
	- No Implicit Knowledge: Avoid questions requiring domain knowledge (e.g., 'What brand is the car?' is invalid unless the logo is visible).

	## OUTPUT FORMAT:
	[{'Q': 'Your question here...', 'Options': ['A. ...', 'B. ...', 'C. ...', 'D. ...'], 'Answer': 'Correct answer here...'}]<\|im_end\|>
	<\|im_start\|>user
	I have provided you with three different aspect description of a specific clip in a video. Below is these description:

	Dense Description:
	A person interacting with a collection of toy minions, focusing on one particular minion holding a guitar. The setting is a well-lit indoor environment, possibly a room or a studio, where the person demonstrates the features of the toy. The minion is detailed, with vibrant colors and a playful design, featuring a single large eye and a cheerful expression. Throughout the video, the person manipulates the toy, showing its flexibility and the different poses it can achieve, enhancing the viewer's understanding of the toy's design and functionality.

	Background Description:
	Simple and unobtrusive, featuring a reflective glass table that provides a clear view of the toy and its reflections. Other minions are arranged in the background, creating a playful and colorful array that adds depth to the scene. The indoor lighting is bright, ensuring that all details of the toy are visible and enhancing the visual appeal of the colorful toys.

	Main Object Description:
	A person's hands, is seen handling a toy minion with a guitar. Initially, the toy is picked up from a glass surface reflecting an array of other minions. The person rotates the toy, displaying its front and back, and adjusts its limbs and guitar, showcasing its articulation and design features. The actions are gentle and precise, indicating a demonstration or review of the toy's characteristics.

	Based on these description and the system instructions, generate one high-quality object recognition question-and-answer pair.

	## REQUIREMENTS:
	- The question must focus on identifying visible objects, such as people, vehicles, animals, furniture, tools, electronic devices, clothing, food, household items, etc.
	- The answer must be directly observable in the description without any reasoning or inference.

	## OUTPUT FORMAT:
	[{'Q': 'Your question here...', 'Options': ['A. ...', 'B. ...', 'C. ...', 'D. ...'], 'Answer': 'Correct answer here...'}]

	Only return the QA pair in the specified JSON list format.<\|im_end\|>
	<\|im_start\|>assistant

	DP rank 0 needs to process 10 prompts
	INFO 04-29 00:48:20 __init__.py:207] Automatically detected platform cuda.
	DP rank 1 needs to process 10 prompts
	INFO 04-29 00:48:20 __init__.py:207] Automatically detected platform cuda.
	DP rank 3 needs to process 10 prompts
	INFO 04-29 00:48:20 __init__.py:207] Automatically detected platform cuda.
	DP rank 2 needs to process 10 prompts
	INFO 04-29 00:48:20 __init__.py:207] Automatically detected platform cuda.
	INFO 04-29 00:48:37 config.py:549] This model supports multiple tasks: {'classify', 'embed', 'generate', 'reward', 'score'}. Defaulting to 'generate'.
	INFO 04-29 00:48:37 config.py:549] This model supports multiple tasks: {'classify', 'embed', 'generate', 'reward', 'score'}. Defaulting to 'generate'.
	INFO 04-29 00:48:37 config.py:549] This model supports multiple tasks: {'classify', 'embed', 'generate', 'reward', 'score'}. Defaulting to 'generate'.
	INFO 04-29 00:48:37 config.py:549] This model supports multiple tasks: {'classify', 'embed', 'generate', 'reward', 'score'}. Defaulting to 'generate'.
	INFO 04-29 00:48:39 gptq_marlin.py:143] The model is convertible to gptq_marlin during runtime. Using gptq_marlin kernel.
	INFO 04-29 00:48:39 gptq_marlin.py:143] The model is convertible to gptq_marlin during runtime. Using gptq_marlin kernel.
	INFO 04-29 00:48:39 gptq_marlin.py:143] The model is convertible to gptq_marlin during runtime. Using gptq_marlin kernel.
	INFO 04-29 00:48:39 gptq_marlin.py:143] The model is convertible to gptq_marlin during runtime. Using gptq_marlin kernel.
	INFO 04-29 00:48:39 config.py:1382] Defaulting to use mp for distributed inference
	WARNING 04-29 00:48:39 cuda.py:95] To see benefits of async output processing, enable CUDA graph. Since, enforce-eager is enabled, async output processor cannot be used
	WARNING 04-29 00:48:39 config.py:685] Async output processing is not supported on the current platform type cuda.
	INFO 04-29 00:48:39 config.py:1382] Defaulting to use mp for distributed inference
	WARNING 04-29 00:48:39 cuda.py:95] To see benefits of async output processing, enable CUDA graph. Since, enforce-eager is enabled, async output processor cannot be used
	WARNING 04-29 00:48:39 config.py:685] Async output processing is not supported on the current platform type cuda.
	INFO 04-29 00:48:39 config.py:1382] Defaulting to use mp for distributed inference
	WARNING 04-29 00:48:39 cuda.py:95] To see benefits of async output processing, enable CUDA graph. Since, enforce-eager is enabled, async output processor cannot be used
	WARNING 04-29 00:48:39 config.py:685] Async output processing is not supported on the current platform type cuda.
	INFO 04-29 00:48:39 config.py:1382] Defaulting to use mp for distributed inference
	WARNING 04-29 00:48:39 cuda.py:95] To see benefits of async output processing, enable CUDA graph. Since, enforce-eager is enabled, async output processor cannot be used
	WARNING 04-29 00:48:39 config.py:685] Async output processing is not supported on the current platform type cuda.
	INFO 04-29 00:48:39 llm_engine.py:234] Initializing a V0 LLM engine (v0.7.3) with config: model='/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8', speculative_config=None, tokenizer='/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=32768, download_dir=None, load_format=auto, tensor_parallel_size=2, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=gptq_marlin, enforce_eager=True, kv_cache_dtype=auto, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='xgrammar'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8, num_scheduler_steps=1, multi_step_stream_outputs=True, enable_prefix_caching=False, chunked_prefill_enabled=False, use_async_output_proc=False, disable_mm_preprocessor_cache=False, mm_processor_kwargs=None, pooler_config=None, compilation_config={"splitting_ops":[],"compile_sizes":[],"cudagraph_capture_sizes":[],"max_capture_size":0}, use_cached_outputs=False,
	INFO 04-29 00:48:39 llm_engine.py:234] Initializing a V0 LLM engine (v0.7.3) with config: model='/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8', speculative_config=None, tokenizer='/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=32768, download_dir=None, load_format=auto, tensor_parallel_size=2, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=gptq_marlin, enforce_eager=True, kv_cache_dtype=auto, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='xgrammar'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8, num_scheduler_steps=1, multi_step_stream_outputs=True, enable_prefix_caching=False, chunked_prefill_enabled=False, use_async_output_proc=False, disable_mm_preprocessor_cache=False, mm_processor_kwargs=None, pooler_config=None, compilation_config={"splitting_ops":[],"compile_sizes":[],"cudagraph_capture_sizes":[],"max_capture_size":0}, use_cached_outputs=False,
	INFO 04-29 00:48:39 llm_engine.py:234] Initializing a V0 LLM engine (v0.7.3) with config: model='/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8', speculative_config=None, tokenizer='/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=32768, download_dir=None, load_format=auto, tensor_parallel_size=2, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=gptq_marlin, enforce_eager=True, kv_cache_dtype=auto, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='xgrammar'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8, num_scheduler_steps=1, multi_step_stream_outputs=True, enable_prefix_caching=False, chunked_prefill_enabled=False, use_async_output_proc=False, disable_mm_preprocessor_cache=False, mm_processor_kwargs=None, pooler_config=None, compilation_config={"splitting_ops":[],"compile_sizes":[],"cudagraph_capture_sizes":[],"max_capture_size":0}, use_cached_outputs=False,
	INFO 04-29 00:48:39 llm_engine.py:234] Initializing a V0 LLM engine (v0.7.3) with config: model='/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8', speculative_config=None, tokenizer='/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=32768, download_dir=None, load_format=auto, tensor_parallel_size=2, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=gptq_marlin, enforce_eager=True, kv_cache_dtype=auto, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='xgrammar'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=/share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8, num_scheduler_steps=1, multi_step_stream_outputs=True, enable_prefix_caching=False, chunked_prefill_enabled=False, use_async_output_proc=False, disable_mm_preprocessor_cache=False, mm_processor_kwargs=None, pooler_config=None, compilation_config={"splitting_ops":[],"compile_sizes":[],"cudagraph_capture_sizes":[],"max_capture_size":0}, use_cached_outputs=False,
	WARNING 04-29 00:48:39 multiproc_worker_utils.py:300] Reducing Torch parallelism from 48 threads to 1 to avoid unnecessary CPU contention. Set OMP_NUM_THREADS in the external environment to tune this value as needed.
	INFO 04-29 00:48:39 custom_cache_manager.py:19] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager
	WARNING 04-29 00:48:39 multiproc_worker_utils.py:300] Reducing Torch parallelism from 48 threads to 1 to avoid unnecessary CPU contention. Set OMP_NUM_THREADS in the external environment to tune this value as needed.
	WARNING 04-29 00:48:39 multiproc_worker_utils.py:300] Reducing Torch parallelism from 48 threads to 1 to avoid unnecessary CPU contention. Set OMP_NUM_THREADS in the external environment to tune this value as needed.
	WARNING 04-29 00:48:40 multiproc_worker_utils.py:300] Reducing Torch parallelism from 48 threads to 1 to avoid unnecessary CPU contention. Set OMP_NUM_THREADS in the external environment to tune this value as needed.
	INFO 04-29 00:48:40 custom_cache_manager.py:19] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:48:40 multiproc_worker_utils.py:229] Worker ready; awaiting tasks
	INFO 04-29 00:48:40 custom_cache_manager.py:19] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager
	INFO 04-29 00:48:40 custom_cache_manager.py:19] Setting Triton cache manager to: vllm.triton_utils.custom_cache_manager:CustomCacheManager
	[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:48:40 multiproc_worker_utils.py:229] Worker ready; awaiting tasks
	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:48:40 multiproc_worker_utils.py:229] Worker ready; awaiting tasks
	[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:48:40 multiproc_worker_utils.py:229] Worker ready; awaiting tasks
	INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend.
	INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend.
	INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend.
	INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend.
	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend.
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend.
	[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend.
	[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:48:40 cuda.py:229] Using Flash Attention backend.
	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2
	INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2
	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5
	INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Bootstrap : Using net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so)
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO NET/Plugin: Using internal network plugin.
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO cudaDriverVersion 12040
	NCCL version 2.21.5+cuda12.4
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2
	INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5
	INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Bootstrap : Using net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so)
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO NET/Plugin: Using internal network plugin.
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO cudaDriverVersion 12040
	NCCL version 2.21.5+cuda12.4
	INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2
	INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5
	[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2
	[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Bootstrap : Using net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so)
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO NET/Plugin: Using internal network plugin.
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO cudaDriverVersion 12040
	NCCL version 2.21.5+cuda12.4
	INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2
	[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:48:41 utils.py:916] Found nccl from library libnccl.so.2
	INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5
	[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:48:41 pynccl.py:69] vLLM is using nccl==2.21.5
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Bootstrap : Using net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so)
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO NET/Plugin: Using internal network plugin.
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO cudaDriverVersion 12040
	NCCL version 2.21.5+cuda12.4
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO ncclCommInitRank comm 0xfcadc20 rank 0 nranks 2 cudaDev 0 nvmlDev 0 busId 70 commId 0x107ef7d20723ace8 - Init START
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Setting affinity for GPU 0 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO comm 0xfcadc20 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 00/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 01/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 02/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 03/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 04/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 05/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 06/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 07/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 08/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 09/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 10/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 11/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 12/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 13/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 14/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 15/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 01/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 04/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 05/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 06/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 07/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 08/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 09/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 10/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 11/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 12/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 13/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-66dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO cudaDriverVersion 12040
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Bootstrap : Using net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so)
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO NET/Plugin: Using internal network plugin.
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO ncclCommInitRank comm 0xfcaf480 rank 1 nranks 2 cudaDev 1 nvmlDev 1 busId 80 commId 0x107ef7d20723ace8 - Init START
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Setting affinity for GPU 1 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO comm 0xfcaf480 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 00/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 01/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 02/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 03/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 04/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 05/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 06/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 07/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 08/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 09/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 10/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 11/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 12/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 13/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 14/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Channel 15/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared objINFO 04-29 00:48:41 custom_all_reduce_utils.py:206] generating GPU P2P access cache in /root/.cache/vllm/gpu_p2p_access_cache_for_0,1.json
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO ncclCommInitRank comm 0xfcae060 rank 0 nranks 2 cudaDev 0 nvmlDev 2 busId 90 commId 0x2c8677096eda1f49 - Init START
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Setting affinity for GPU 2 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO comm 0xfcae060 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 00/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 01/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 02/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 03/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 04/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 05/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 06/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 07/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 08/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 09/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 10/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 11/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 12/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 13/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 14/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 15/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 00/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 01/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 02/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 03/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 04/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 05/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 06/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 07/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 08/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 09/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 10/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 11/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 12/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 13/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-66dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO ncclCommInitRank comm 0xfcae910 rank 0 nranks 2 cudaDev 0 nvmlDev 4 busId b0 commId 0x9c435407d5d72319 - Init START
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Setting affinity for GPU 4 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO comm 0xfcae910 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 00/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 01/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 02/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 03/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 04/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 05/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 06/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 07/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 08/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 09/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 10/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 11/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 12/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 13/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 14/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 15/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 00/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 01/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 02/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 03/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 04/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 05/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 06/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 07/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 08/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 09/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 10/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 11/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 12/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 13/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-66dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO cudaDriverVersion 12040
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Bootstrap : Using net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so)
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO NET/Plugin: Using internal network plugin.
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO ncclCommInitRank comm 0xfce82a0 rank 1 nranks 2 cudaDev 1 nvmlDev 3 busId a0 commId 0x2c8677096eda1f49 - Init START
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Setting affinity for GPU 3 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO comm 0xfce82a0 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 00/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 01/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 02/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 03/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 04/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 05/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 06/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 07/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 08/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 09/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 10/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 11/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 12/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 13/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 14/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Channel 15/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared objINFO 04-29 00:48:41 custom_all_reduce_utils.py:206] generating GPU P2P access cache in /root/.cache/vllm/gpu_p2p_access_cache_for_2,3.json
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO ncclCommInitRank comm 0xfce9680 rank 0 nranks 2 cudaDev 0 nvmlDev 6 busId d0 commId 0xb7bbad76a9287dbc - Init START
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Setting affinity for GPU 6 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO comm 0xfce9680 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 00/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 01/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 02/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 03/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 04/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 05/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 06/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 07/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 08/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 09/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 10/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 11/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 12/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 13/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 14/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 15/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 00/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 01/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 02/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 03/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 04/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 05/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 06/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 07/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 08/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 09/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 10/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 11/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 12/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 13/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-66dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO cudaDriverVersion 12040
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Bootstrap : Using net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so)
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO NET/Plugin: Using internal network plugin.
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO ncclCommInitRank comm 0xfcb0800 rank 1 nranks 2 cudaDev 1 nvmlDev 5 busId c0 commId 0x9c435407d5d72319 - Init START
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Setting affinity for GPU 5 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO comm 0xfcb0800 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 00/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 01/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 02/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 03/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 04/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 05/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 06/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 07/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 08/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 09/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 10/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 11/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 12/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 13/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 14/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Channel 15/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared objINFO 04-29 00:48:41 custom_all_reduce_utils.py:206] generating GPU P2P access cache in /root/.cache/vllm/gpu_p2p_access_cache_for_4,5.json
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO cudaDriverVersion 12040
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Bootstrap : Using net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO NET/Plugin: No plugin found (libnccl-net.so)
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO NET/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-net.so
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO NET/Plugin: Using internal network plugin.
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO NET/IB : Using [0]mlx5_0:1/RoCE [1]mlx5_1:1/RoCE [2]mlx5_2:1/RoCE [3]mlx5_3:1/RoCE [RO]; OOB net0:192.168.16.244<0>
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO ncclCommInitRank comm 0xfce99f0 rank 1 nranks 2 cudaDev 1 nvmlDev 7 busId e0 commId 0xb7bbad76a9287dbc - Init START
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO NCCL_CUMEM_ENABLE set by environment to 0.
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Setting affinity for GPU 7 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO comm 0xfce99f0 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 00/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 01/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 02/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 03/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 04/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 05/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 06/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 07/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 08/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 09/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 10/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 11/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 12/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 13/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 14/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Channel 15/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared objINFO 04-29 00:48:41 custom_all_reduce_utils.py:206] generating GPU P2P access cache in /root/.cache/vllm/gpu_p2p_access_cache_for_6,7.json
	INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_4,5.json
	[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_4,5.json
	INFO 04-29 00:49:11 shm_broadcast.py:258] vLLM message queue communication handle: Handle(connect_ip='127.0.0.1', local_reader_ranks=[1], buffer_handle=(1, 4194304, 6, 'psm_8a09a9f1'), local_subscribe_port=36381, remote_subscribe_port=None)
	INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8...
	[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8...
	INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_6,7.json
	[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_6,7.json
	INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod
	[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod
	INFO 04-29 00:49:11 shm_broadcast.py:258] vLLM message queue communication handle: Handle(connect_ip='127.0.0.1', local_reader_ranks=[1], buffer_handle=(1, 4194304, 6, 'psm_61344bd8'), local_subscribe_port=56743, remote_subscribe_port=None)
	INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8...
	[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8...
	INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod
	[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod
	INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_2,3.json
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_2,3.json
	INFO 04-29 00:49:11 shm_broadcast.py:258] vLLM message queue communication handle: Handle(connect_ip='127.0.0.1', local_reader_ranks=[1], buffer_handle=(1, 4194304, 6, 'psm_133de269'), local_subscribe_port=40037, remote_subscribe_port=None)
	INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_0,1.json
	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:49:11 custom_all_reduce_utils.py:244] reading GPU P2P access cache from /root/.cache/vllm/gpu_p2p_access_cache_for_0,1.json
	INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8...
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8...
	INFO 04-29 00:49:11 shm_broadcast.py:258] vLLM message queue communication handle: Handle(connect_ip='127.0.0.1', local_reader_ranks=[1], buffer_handle=(1, 4194304, 6, 'psm_27adf207'), local_subscribe_port=51079, remote_subscribe_port=None)
	INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod
	INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8...
	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:49:11 model_runner.py:1110] Starting to load model /share/minghao/Models/Qwen2.5-72B-Instruct-GPTQ-Int8...
	INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod
	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:49:11 gptq_marlin.py:235] Using MarlinLinearKernel for GPTQMarlinLinearMethod
	Loading safetensors checkpoint shards: 0% Completed \| 0/20 [00:00<?, ?it/s]
	Loading safetensors checkpoint shards: 0% Completed \| 0/20 [00:00<?, ?it/s]
	Loading safetensors checkpoint shards: 0% Completed \| 0/20 [00:00<?, ?it/s]
	Loading safetensors checkpoint shards: 0% Completed \| 0/20 [00:00<?, ?it/s]
	Loading safetensors checkpoint shards: 5% Completed \| 1/20 [00:02<00:44, 2.36s/it]
	Loading safetensors checkpoint shards: 10% Completed \| 2/20 [00:04<00:42, 2.39s/it]
	Loading safetensors checkpoint shards: 5% Completed \| 1/20 [00:04<01:32, 4.89s/it]
	Loading safetensors checkpoint shards: 5% Completed \| 1/20 [00:04<01:34, 4.98s/it]
	Loading safetensors checkpoint shards: 5% Completed \| 1/20 [00:04<01:32, 4.86s/it]
	Loading safetensors checkpoint shards: 15% Completed \| 3/20 [00:07<00:41, 2.46s/it]
	Loading safetensors checkpoint shards: 10% Completed \| 2/20 [00:07<01:09, 3.84s/it]
	Loading safetensors checkpoint shards: 10% Completed \| 2/20 [00:08<01:09, 3.86s/it]
	Loading safetensors checkpoint shards: 10% Completed \| 2/20 [00:08<01:10, 3.90s/it]
	Loading safetensors checkpoint shards: 20% Completed \| 4/20 [00:09<00:38, 2.43s/it]
	Loading safetensors checkpoint shards: 15% Completed \| 3/20 [00:10<00:55, 3.25s/it]
	Loading safetensors checkpoint shards: 15% Completed \| 3/20 [00:10<00:54, 3.23s/it]
	Loading safetensors checkpoint shards: 15% Completed \| 3/20 [00:10<00:54, 3.23s/it]
	Loading safetensors checkpoint shards: 25% Completed \| 5/20 [00:12<00:36, 2.47s/it]
	Loading safetensors checkpoint shards: 20% Completed \| 4/20 [00:13<00:47, 2.98s/it]
	Loading safetensors checkpoint shards: 20% Completed \| 4/20 [00:13<00:47, 2.97s/it]
	Loading safetensors checkpoint shards: 20% Completed \| 4/20 [00:13<00:47, 2.97s/it]
	Loading safetensors checkpoint shards: 30% Completed \| 6/20 [00:14<00:34, 2.46s/it]
	Loading safetensors checkpoint shards: 25% Completed \| 5/20 [00:15<00:42, 2.83s/it]
	Loading safetensors checkpoint shards: 25% Completed \| 5/20 [00:15<00:42, 2.84s/it]
	Loading safetensors checkpoint shards: 25% Completed \| 5/20 [00:15<00:42, 2.83s/it]
	Loading safetensors checkpoint shards: 35% Completed \| 7/20 [00:17<00:32, 2.50s/it]
	Loading safetensors checkpoint shards: 30% Completed \| 6/20 [00:18<00:37, 2.71s/it]
	Loading safetensors checkpoint shards: 30% Completed \| 6/20 [00:18<00:37, 2.70s/it]
	Loading safetensors checkpoint shards: 30% Completed \| 6/20 [00:18<00:37, 2.70s/it]
	Loading safetensors checkpoint shards: 40% Completed \| 8/20 [00:19<00:30, 2.51s/it]
	Loading safetensors checkpoint shards: 35% Completed \| 7/20 [00:20<00:34, 2.65s/it]
	Loading safetensors checkpoint shards: 35% Completed \| 7/20 [00:20<00:34, 2.65s/it]
	Loading safetensors checkpoint shards: 35% Completed \| 7/20 [00:20<00:34, 2.65s/it]
	Loading safetensors checkpoint shards: 45% Completed \| 9/20 [00:20<00:22, 2.08s/it]
	Loading safetensors checkpoint shards: 40% Completed \| 8/20 [00:23<00:33, 2.81s/it]
	Loading safetensors checkpoint shards: 40% Completed \| 8/20 [00:23<00:33, 2.81s/it]
	Loading safetensors checkpoint shards: 40% Completed \| 8/20 [00:23<00:33, 2.81s/it]
	Loading safetensors checkpoint shards: 50% Completed \| 10/20 [00:23<00:23, 2.38s/it]
	Loading safetensors checkpoint shards: 45% Completed \| 9/20 [00:25<00:25, 2.30s/it]
	Loading safetensors checkpoint shards: 45% Completed \| 9/20 [00:24<00:25, 2.30s/it]
	Loading safetensors checkpoint shards: 45% Completed \| 9/20 [00:24<00:25, 2.30s/it]
	Loading safetensors checkpoint shards: 55% Completed \| 11/20 [00:26<00:22, 2.50s/it]
	Loading safetensors checkpoint shards: 50% Completed \| 10/20 [00:27<00:22, 2.29s/it]
	Loading safetensors checkpoint shards: 50% Completed \| 10/20 [00:27<00:22, 2.29s/it]
	Loading safetensors checkpoint shards: 50% Completed \| 10/20 [00:27<00:22, 2.29s/it]
	Loading safetensors checkpoint shards: 60% Completed \| 12/20 [00:29<00:20, 2.58s/it]
	Loading safetensors checkpoint shards: 55% Completed \| 11/20 [00:29<00:21, 2.35s/it]
	Loading safetensors checkpoint shards: 55% Completed \| 11/20 [00:29<00:21, 2.34s/it]
	Loading safetensors checkpoint shards: 55% Completed \| 11/20 [00:29<00:21, 2.35s/it]
	Loading safetensors checkpoint shards: 65% Completed \| 13/20 [00:32<00:18, 2.57s/it]
	Loading safetensors checkpoint shards: 60% Completed \| 12/20 [00:32<00:20, 2.51s/it]
	Loading safetensors checkpoint shards: 60% Completed \| 12/20 [00:32<00:20, 2.51s/it]
	Loading safetensors checkpoint shards: 60% Completed \| 12/20 [00:32<00:20, 2.51s/it]
	Loading safetensors checkpoint shards: 70% Completed \| 14/20 [00:34<00:15, 2.54s/it]
	Loading safetensors checkpoint shards: 65% Completed \| 13/20 [00:35<00:17, 2.54s/it]
	Loading safetensors checkpoint shards: 65% Completed \| 13/20 [00:35<00:17, 2.54s/it]
	Loading safetensors checkpoint shards: 65% Completed \| 13/20 [00:35<00:17, 2.54s/it]
	Loading safetensors checkpoint shards: 75% Completed \| 15/20 [00:35<00:10, 2.19s/it]
	Loading safetensors checkpoint shards: 70% Completed \| 14/20 [00:37<00:15, 2.55s/it]
	Loading safetensors checkpoint shards: 70% Completed \| 14/20 [00:37<00:15, 2.55s/it]
	Loading safetensors checkpoint shards: 70% Completed \| 14/20 [00:37<00:15, 2.55s/it]
	Loading safetensors checkpoint shards: 80% Completed \| 16/20 [00:38<00:08, 2.24s/it]
	Loading safetensors checkpoint shards: 75% Completed \| 15/20 [00:39<00:11, 2.25s/it]
	Loading safetensors checkpoint shards: 75% Completed \| 15/20 [00:39<00:11, 2.25s/it]
	Loading safetensors checkpoint shards: 75% Completed \| 15/20 [00:39<00:11, 2.25s/it]
	Loading safetensors checkpoint shards: 85% Completed \| 17/20 [00:40<00:06, 2.33s/it]
	Loading safetensors checkpoint shards: 80% Completed \| 16/20 [00:41<00:09, 2.31s/it]
	Loading safetensors checkpoint shards: 80% Completed \| 16/20 [00:41<00:09, 2.31s/it]
	Loading safetensors checkpoint shards: 80% Completed \| 16/20 [00:41<00:09, 2.31s/it]
	Loading safetensors checkpoint shards: 90% Completed \| 18/20 [00:43<00:04, 2.33s/it]
	Loading safetensors checkpoint shards: 85% Completed \| 17/20 [00:44<00:07, 2.41s/it]
	Loading safetensors checkpoint shards: 85% Completed \| 17/20 [00:44<00:07, 2.41s/it]
	Loading safetensors checkpoint shards: 85% Completed \| 17/20 [00:44<00:07, 2.41s/it]
	Loading safetensors checkpoint shards: 95% Completed \| 19/20 [00:45<00:02, 2.38s/it]
	Loading safetensors checkpoint shards: 90% Completed \| 18/20 [00:46<00:04, 2.42s/it]
	Loading safetensors checkpoint shards: 90% Completed \| 18/20 [00:46<00:04, 2.42s/it]
	Loading safetensors checkpoint shards: 90% Completed \| 18/20 [00:46<00:04, 2.42s/it]
	Loading safetensors checkpoint shards: 100% Completed \| 20/20 [00:48<00:00, 2.41s/it]
	Loading safetensors checkpoint shards: 100% Completed \| 20/20 [00:48<00:00, 2.41s/it]

	[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:50:00 model_runner.py:1115] Loading model weights took 35.6627 GB
	INFO 04-29 00:50:00 model_runner.py:1115] Loading model weights took 35.6627 GB
	[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:50:00 model_runner.py:1115] Loading model weights took 35.6627 GB
	Loading safetensors checkpoint shards: 95% Completed \| 19/20 [00:49<00:02, 2.43s/it]
	Loading safetensors checkpoint shards: 95% Completed \| 19/20 [00:49<00:02, 2.43s/it]
	Loading safetensors checkpoint shards: 95% Completed \| 19/20 [00:49<00:02, 2.43s/it]
	Loading safetensors checkpoint shards: 100% Completed \| 20/20 [00:51<00:00, 2.49s/it]
	Loading safetensors checkpoint shards: 100% Completed \| 20/20 [00:52<00:00, 2.49s/it]
	Loading safetensors checkpoint shards: 100% Completed \| 20/20 [00:51<00:00, 2.49s/it]
	Loading safetensors checkpoint shards: 100% Completed \| 20/20 [00:51<00:00, 2.60s/it]

	Loading safetensors checkpoint shards: 100% Completed \| 20/20 [00:52<00:00, 2.60s/it]

	Loading safetensors checkpoint shards: 100% Completed \| 20/20 [00:51<00:00, 2.60s/it]

	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:50:04 model_runner.py:1115] Loading model weights took 35.6627 GB
	INFO 04-29 00:50:04 model_runner.py:1115] Loading model weights took 35.6627 GB
	INFO 04-29 00:50:04 model_runner.py:1115] Loading model weights took 35.6627 GB
	INFO 04-29 00:50:04 model_runner.py:1115] Loading model weights took 35.6627 GB
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:50:04 model_runner.py:1115] Loading model weights took 35.6627 GB
	8f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 14/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Channel 15/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-tuner.so
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO TUNER/Plugin: Using internal tuner plugin.
	dsw-222255-668f79686f-2vnl5:3571749:3571749 [0] NCCL INFO ncclCommInitRank comm 0xfce9680 rank 0 nranks 2 cudaDev 0 nvmlDev 6 busId d0 commId 0xb7bbad76a9287dbc - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO ncclCommInitRank comm 0x36f82500 rank 0 nranks 2 cudaDev 0 nvmlDev 6 busId d0 commId 0x67ee5f83dd395178 - Init START
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Setting affinity for GPU 6 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO comm 0x36f82500 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 00/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 01/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 02/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 03/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 04/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 05/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 06/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 07/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 08/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 09/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 10/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 11/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 12/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 13/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 14/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 15/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 00/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 01/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 02/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 03/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 04/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571749:3573052 [0] NCCL INFO Channel 05/0 : 0[6] -> 1[7] via P2P/IPC/read
	dsw-222255-668f79686ect file: No such file or directory : when loading libnccl-tuner.so
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO TUNER/Plugin: Using internal tuner plugin.
	dsw-222255-668f79686f-2vnl5:3572216:3572216 [1] NCCL INFO ncclCommInitRank comm 0xfce99f0 rank 1 nranks 2 cudaDev 1 nvmlDev 7 busId e0 commId 0xb7bbad76a9287dbc - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO ncclCommInitRank comm 0x36f8b2f0 rank 1 nranks 2 cudaDev 1 nvmlDev 7 busId e0 commId 0x67ee5f83dd395178 - Init START
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Setting affinity for GPU 7 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO comm 0x36f8b2f0 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 00/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 01/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 02/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 03/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 04/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 05/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 06/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 07/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 08/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 09/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 10/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 11/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 12/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 13/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 14/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Channel 15/0 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3572216:3573053 [1] NCCL INFO ncclCommInitRank comm 0x36f8b2f0 rank 1 nranks 2 cudaDev 1 nvmlDev 7 busId e0 commId 0x67ee5f83dd395178 - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3572216:3573061 [1] NCCL INFO Channel 00/1 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573061 [1] NCCL INFO Channel 01/1 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573061 [1] NCCL INFO Channel 02/1 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573061 [1] NCCL INFO Channel 03/1 : 1[7] -> 0[6] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572216:3573061 [1] NCCL INF[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:50:23 worker.py:267] Memory profiling takes 22.61 seconds
	[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:50:23 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB
	[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:50:23 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB.
	INFO 04-29 00:50:23 worker.py:267] Memory profiling takes 22.65 seconds
	INFO 04-29 00:50:23 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB
	INFO 04-29 00:50:23 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB.
	INFO 04-29 00:50:23 executor_base.py:111] # cuda blocks: 11901, # CPU blocks: 1638
	INFO 04-29 00:50:23 executor_base.py:116] Maximum concurrency for 32768 tokens per request: 5.81x
	8f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 14/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Channel 15/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-tuner.so
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO TUNER/Plugin: Using internal tuner plugin.
	dsw-222255-668f79686f-2vnl5:3571748:3571748 [0] NCCL INFO ncclCommInitRank comm 0xfcae910 rank 0 nranks 2 cudaDev 0 nvmlDev 4 busId b0 commId 0x9c435407d5d72319 - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO ncclCommInitRank comm 0x36e33d50 rank 0 nranks 2 cudaDev 0 nvmlDev 4 busId b0 commId 0x9e29b5fcef03aeae - Init START
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Setting affinity for GPU 4 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO comm 0x36e33d50 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 00/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 01/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 02/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 03/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 04/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 05/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 06/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 07/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 08/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 09/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 10/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 11/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 12/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 13/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 14/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 15/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 00/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 01/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 02/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 03/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 04/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571748:3573063 [0] NCCL INFO Channel 05/0 : 0[4] -> 1[5] via P2P/IPC/read
	dsw-222255-668f79686ect file: No such file or directory : when loading libnccl-tuner.so
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO TUNER/Plugin: Using internal tuner plugin.
	dsw-222255-668f79686f-2vnl5:3572223:3572223 [1] NCCL INFO ncclCommInitRank comm 0xfcb0800 rank 1 nranks 2 cudaDev 1 nvmlDev 5 busId c0 commId 0x9c435407d5d72319 - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO ncclCommInitRank comm 0x36e3ac50 rank 1 nranks 2 cudaDev 1 nvmlDev 5 busId c0 commId 0x9e29b5fcef03aeae - Init START
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Setting affinity for GPU 5 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO comm 0x36e3ac50 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 00/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 01/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 02/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 03/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 04/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 05/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 06/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 07/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 08/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 09/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 10/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 11/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 12/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 13/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 14/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Channel 15/0 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3572223:3573064 [1] NCCL INFO ncclCommInitRank comm 0x36e3ac50 rank 1 nranks 2 cudaDev 1 nvmlDev 5 busId c0 commId 0x9e29b5fcef03aeae - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3572223:3573071 [1] NCCL INFO Channel 00/1 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573071 [1] NCCL INFO Channel 01/1 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573071 [1] NCCL INFO Channel 02/1 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573071 [1] NCCL INFO Channel 03/1 : 1[5] -> 0[4] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572223:3573071 [1] NCCL INF8f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 14/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Channel 15/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-tuner.so
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO TUNER/Plugin: Using internal tuner plugin.
	dsw-222255-668f79686f-2vnl5:3571746:3571746 [0] NCCL INFO ncclCommInitRank comm 0xfcadc20 rank 0 nranks 2 cudaDev 0 nvmlDev 0 busId 70 commId 0x107ef7d20723ace8 - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO ncclCommInitRank comm 0x36e33830 rank 0 nranks 2 cudaDev 0 nvmlDev 0 busId 70 commId 0x54df945661a68a33 - Init START
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Setting affinity for GPU 0 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO comm 0x36e33830 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 00/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 01/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 02/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 03/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 04/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 05/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 06/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 07/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 08/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 09/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 10/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 11/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 12/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 13/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 14/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 15/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 00/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 01/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 02/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 03/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 04/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571746:3573074 [0] NCCL INFO Channel 05/0 : 0[0] -> 1[1] via P2P/IPC/read
	dsw-222255-668f79686[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.39 seconds
	[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB
	[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB.
	INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.46 seconds
	INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB
	INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB.
	ect file: No such file or directory : when loading libnccl-tuner.so
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO TUNER/Plugin: Using internal tuner plugin.
	dsw-222255-668f79686f-2vnl5:3572220:3572220 [1] NCCL INFO ncclCommInitRank comm 0xfcaf480 rank 1 nranks 2 cudaDev 1 nvmlDev 1 busId 80 commId 0x107ef7d20723ace8 - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO ncclCommInitRank comm 0x36e388d0 rank 1 nranks 2 cudaDev 1 nvmlDev 1 busId 80 commId 0x54df945661a68a33 - Init START
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Setting affinity for GPU 1 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO comm 0x36e388d0 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 00/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 01/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 02/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 03/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 04/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 05/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 06/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 07/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 08/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 09/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 10/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 11/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 12/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 13/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 14/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Channel 15/0 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3572220:3573075 [1] NCCL INFO ncclCommInitRank comm 0x36e388d0 rank 1 nranks 2 cudaDev 1 nvmlDev 1 busId 80 commId 0x54df945661a68a33 - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3572220:3573090 [1] NCCL INFO Channel 00/1 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573090 [1] NCCL INFO Channel 01/1 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573090 [1] NCCL INFO Channel 02/1 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573090 [1] NCCL INFO Channel 03/1 : 1[1] -> 0[0] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572220:3573090 [1] NCCL INF8f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 14/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Channel 15/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO TUNER/Plugin: Plugin load returned 2 : libnccl-net.so: cannot open shared object file: No such file or directory : when loading libnccl-tuner.so
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO TUNER/Plugin: Using internal tuner plugin.
	dsw-222255-668f79686f-2vnl5:3571747:3571747 [0] NCCL INFO ncclCommInitRank comm 0xfcae060 rank 0 nranks 2 cudaDev 0 nvmlDev 2 busId 90 commId 0x2c8677096eda1f49 - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO ncclCommInitRank comm 0x36e330a0 rank 0 nranks 2 cudaDev 0 nvmlDev 2 busId 90 commId 0xacaabfeeee980308 - Init START
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Setting affinity for GPU 2 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO comm 0x36e330a0 rank 0 nRanks 2 nNodes 1 localRanks 2 localRank 0 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 00/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 01/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 02/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 03/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 04/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 05/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 06/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 07/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 08/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 09/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 10/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 11/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 12/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 13/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 14/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 15/16 : 0 1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Trees [0] 1/-1/-1->0->-1 [1] 1/-1/-1->0->-1 [2] 1/-1/-1->0->-1 [3] 1/-1/-1->0->-1 [4] -1/-1/-1->0->1 [5] -1/-1/-1->0->1 [6] -1/-1/-1->0->1 [7] -1/-1/-1->0->1 [8] 1/-1/-1->0->-1 [9] 1/-1/-1->0->-1 [10] 1/-1/-1->0->-1 [11] 1/-1/-1->0->-1 [12] -1/-1/-1->0->1 [13] -1/-1/-1->0->1 [14] -1/-1/-1->0->1 [15] -1/-1/-1->0->1
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 00/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 01/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 02/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 03/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 04/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3571747:3573081 [0] NCCL INFO Channel 05/0 : 0[2] -> 1[3] via P2P/IPC/read
	dsw-222255-668f79686ect file: No such file or directory : when loading libnccl-tuner.so
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO TUNER/Plugin: Using internal tuner plugin.
	dsw-222255-668f79686f-2vnl5:3572112:3572112 [1] NCCL INFO ncclCommInitRank comm 0xfce82a0 rank 1 nranks 2 cudaDev 1 nvmlDev 3 busId a0 commId 0x2c8677096eda1f49 - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Using non-device net plugin version 0
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Using network IB
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO ncclCommInitRank comm 0x36e3b2d0 rank 1 nranks 2 cudaDev 1 nvmlDev 3 busId a0 commId 0xacaabfeeee980308 - Init START
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Setting affinity for GPU 3 to ffff,ffffffff
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO comm 0x36e3b2d0 rank 1 nRanks 2 nNodes 1 localRanks 2 localRank 1 MNNVL 0
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Trees [0] -1/-1/-1->1->0 [1] -1/-1/-1->1->0 [2] -1/-1/-1->1->0 [3] -1/-1/-1->1->0 [4] 0/-1/-1->1->-1 [5] 0/-1/-1->1->-1 [6] 0/-1/-1->1->-1 [7] 0/-1/-1->1->-1 [8] -1/-1/-1->1->0 [9] -1/-1/-1->1->0 [10] -1/-1/-1->1->0 [11] -1/-1/-1->1->0 [12] 0/-1/-1->1->-1 [13] 0/-1/-1->1->-1 [14] 0/-1/-1->1->-1 [15] 0/-1/-1->1->-1
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO P2P Chunksize set to 524288
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 00/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 01/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 02/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 03/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 04/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 05/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 06/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 07/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 08/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 09/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 10/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 11/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 12/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 13/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 14/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Channel 15/0 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Connected all rings
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO Connected all trees
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO threadThresholds 8/8/64 \| 16/8/64 \| 512 \| 512
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO 16 coll channels, 16 collnet channels, 0 nvls channels, 16 p2p channels, 16 p2p channels per peer
	dsw-222255-668f79686f-2vnl5:3572112:3573082 [1] NCCL INFO ncclCommInitRank comm 0x36e3b2d0 rank 1 nranks 2 cudaDev 1 nvmlDev 3 busId a0 commId 0xacaabfeeee980308 - Init COMPLETE
	dsw-222255-668f79686f-2vnl5:3572112:3573093 [1] NCCL INFO Channel 00/1 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573093 [1] NCCL INFO Channel 01/1 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573093 [1] NCCL INFO Channel 02/1 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573093 [1] NCCL INFO Channel 03/1 : 1[3] -> 0[2] via P2P/IPC/read
	dsw-222255-668f79686f-2vnl5:3572112:3573093 [1] NCCL INFINFO 04-29 00:50:24 executor_base.py:111] # cuda blocks: 11901, # CPU blocks: 1638
	INFO 04-29 00:50:24 executor_base.py:116] Maximum concurrency for 32768 tokens per request: 5.81x
	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.75 seconds
	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB
	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB.
	INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.79 seconds
	INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB
	INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB.
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.54 seconds
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB.
	INFO 04-29 00:50:24 worker.py:267] Memory profiling takes 19.58 seconds
	INFO 04-29 00:50:24 worker.py:267] the current vLLM instance can use total_gpu_memory (79.32GiB) x gpu_memory_utilization (0.90) = 71.39GiB
	INFO 04-29 00:50:24 worker.py:267] model weights take 35.66GiB; non_torch_memory takes 1.01GiB; PyTorch activation peak memory takes 5.66GiB; the rest of the memory reserved for KV Cache is 29.06GiB.
	INFO 04-29 00:50:24 executor_base.py:111] # cuda blocks: 11901, # CPU blocks: 1638
	INFO 04-29 00:50:24 executor_base.py:116] Maximum concurrency for 32768 tokens per request: 5.81x
	INFO 04-29 00:50:24 executor_base.py:111] # cuda blocks: 11901, # CPU blocks: 1638
	INFO 04-29 00:50:24 executor_base.py:116] Maximum concurrency for 32768 tokens per request: 5.81x
	INFO 04-29 00:50:26 llm_engine.py:436] init engine (profile, create kv cache, warmup model) took 26.30 seconds
	Processed prompts: 0%\| \| 0/10 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]INFO 04-29 00:50:27 llm_engine.py:436] init engine (profile, create kv cache, warmup model) took 23.47 seconds
	Processed prompts: 0%\| \| 0/10 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s]INFO 04-29 00:50:28 llm_engine.py:436] init engine (profile, create kv cache, warmup model) took 23.97 seconds
	INFO 04-29 00:50:28 llm_engine.py:436] init engine (profile, create kv cache, warmup model) took 23.77 seconds
	Processed prompts: 0%\| \| 0/10 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s] Processed prompts: 0%\| \| 0/10 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s] Processed prompts: 10%\|▊ \| 1/10 [00:07<01:11, 7.93s/it, est. speed input: 139.56 toks/s, output: 6.05 toks/s] Processed prompts: 40%\|██▊ \| 4/10 [00:08<00:09, 1.56s/it, est. speed input: 543.18 toks/s, output: 24.62 toks/s] Processed prompts: 60%\|████▏ \| 6/10 [00:08<00:03, 1.03it/s, est. speed input: 774.07 toks/s, output: 37.05 toks/s] Processed prompts: 70%\|████▉ \| 7/10 [00:08<00:02, 1.28it/s, est. speed input: 887.12 toks/s, output: 43.85 toks/s] Processed prompts: 10%\|▊ \| 1/10 [00:08<01:13, 8.11s/it, est. speed input: 136.45 toks/s, output: 6.16 toks/s] Processed prompts: 20%\|█▍ \| 2/10 [00:08<00:27, 3.50s/it, est. speed input: 265.96 toks/s, output: 12.65 toks/s] Processed prompts: 10%\|▊ \| 1/10 [00:07<01:10, 7.85s/it, est. speed input: 140.59 toks/s, output: 5.86 toks/s] Processed prompts: 30%\|██ \| 3/10 [00:08<00:13, 1.96s/it, est. speed input: 399.32 toks/s, output: 19.38 toks/s] Processed prompts: 30%\|██ \| 3/10 [00:08<00:14, 2.10s/it, est. speed input: 426.80 toks/s, output: 17.95 toks/s] Processed prompts: 10%\|▊ \| 1/10 [00:08<01:13, 8.12s/it, est. speed input: 132.71 toks/s, output: 6.40 toks/s] Processed prompts: 70%\|████▉ \| 7/10 [00:08<00:01, 1.76it/s, est. speed input: 904.74 toks/s, output: 47.29 toks/s] Processed prompts: 100%\|█████\| 10/10 [00:09<00:00, 1.83it/s, est. speed input: 1154.39 toks/s, output: 61.43 toks/s] Processed prompts: 100%\|█████\| 10/10 [00:09<00:00, 1.03it/s, est. speed input: 1154.39 toks/s, output: 61.43 toks/s]
	Processed prompts: 60%\|████▏ \| 6/10 [00:08<00:03, 1.18it/s, est. speed input: 823.58 toks/s, output: 36.85 toks/s] Processed prompts: 30%\|██ \| 3/10 [00:08<00:15, 2.16s/it, est. speed input: 398.04 toks/s, output: 19.50 toks/s] Processed prompts: 60%\|████▏ \| 6/10 [00:08<00:03, 1.15it/s, est. speed input: 787.71 toks/s, output: 39.37 toks/s]推理完成 Total Finish:10
	batch time cost: 9.761106252670288s
	[Memory] CPU: 7136.72 MB
	[Memory] GPU: 66294.75 MB
	INFO 04-29 00:50:37 multiproc_worker_utils.py:141] Terminating local vLLM worker processes
	[1;36m(VllmWorkerProcess pid=3572216)[0;0m INFO 04-29 00:50:37 multiproc_worker_utils.py:253] Worker exiting
	Processed prompts: 80%\|████▊ \| 8/10 [00:08<00:01, 1.71it/s, est. speed input: 1068.07 toks/s, output: 49.79 toks/s] Processed prompts: 90%\|█████▍\| 9/10 [00:08<00:00, 2.06it/s, est. speed input: 1189.21 toks/s, output: 56.41 toks/s] Processed prompts: 100%\|█████\| 10/10 [00:08<00:00, 1.17it/s, est. speed input: 1314.94 toks/s, output: 63.64 toks/s]
	Processed prompts: 90%\|█████▍\| 9/10 [00:09<00:00, 2.13it/s, est. speed input: 1099.14 toks/s, output: 60.35 toks/s] Processed prompts: 80%\|████▊ \| 8/10 [00:08<00:01, 1.64it/s, est. speed input: 1011.19 toks/s, output: 52.29 toks/s]推理完成 Total Finish:10
	batch time cost: 8.617689609527588s
	[Memory] CPU: 7125.53 MB
	[Memory] GPU: 66294.76 MB
	INFO 04-29 00:50:37 multiproc_worker_utils.py:141] Terminating local vLLM worker processes
	[1;36m(VllmWorkerProcess pid=3572112)[0;0m INFO 04-29 00:50:37 multiproc_worker_utils.py:253] Worker exiting
	/share/liangzy/miniconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
	warnings.warn('resource_tracker: There appear to be %d '
	Processed prompts: 100%\|█████\| 10/10 [00:10<00:00, 1.05s/it, est. speed input: 1073.40 toks/s, output: 60.81 toks/s]
	Processed prompts: 100%\|█████\| 10/10 [00:10<00:00, 1.55it/s, est. speed input: 1099.19 toks/s, output: 60.35 toks/s] Processed prompts: 100%\|█████\| 10/10 [00:10<00:00, 1.02s/it, est. speed input: 1099.19 toks/s, output: 60.35 toks/s]
	推理完成 Total Finish:10
	batch time cost: 10.546858549118042s
	[Memory] CPU: 7134.02 MB
	[Memory] GPU: 66294.75 MB
	INFO 04-29 00:50:38 multiproc_worker_utils.py:141] Terminating local vLLM worker processes
	[1;36m(VllmWorkerProcess pid=3572223)[0;0m INFO 04-29 00:50:38 multiproc_worker_utils.py:253] Worker exiting
	推理完成 Total Finish:10
	batch time cost: 10.227976083755493s
	[Memory] CPU: 7132.92 MB
	[Memory] GPU: 66294.75 MB
	INFO 04-29 00:50:39 multiproc_worker_utils.py:141] Terminating local vLLM worker processes
	[1;36m(VllmWorkerProcess pid=3572220)[0;0m INFO 04-29 00:50:39 multiproc_worker_utils.py:253] Worker exiting
	/share/liangzy/miniconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
	warnings.warn('resource_tracker: There appear to be %d '
	/share/liangzy/miniconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
	warnings.warn('resource_tracker: There appear to be %d '
	/share/liangzy/miniconda3/envs/vllm/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown
	warnings.warn('resource_tracker: There appear to be %d '
	OOM了没有？
	Total size: 40 Total time cost: 142.67037987709045s