nohup: ignoring input ***************************************** Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. ***************************************** Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/syxin_old/Specforge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( Set TORCH_CUDA_ARCH_LIST to 9.0 Set TORCH_CUDA_ARCH_LIST to 9.0 Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/syxin_old/Specforge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( Set TORCH_CUDA_ARCH_LIST to 9.0 Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/syxin_old/Specforge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( /workspace/hanrui/syxin_old/Specforge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( /workspace/hanrui/syxin_old/Specforge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( /workspace/hanrui/syxin_old/Specforge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( Set TORCH_CUDA_ARCH_LIST to 9.0 Set TORCH_CUDA_ARCH_LIST to 9.0 /workspace/hanrui/syxin_old/Specforge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( /workspace/hanrui/syxin_old/Specforge/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. warnings.warn( INFO:specforge.utils:rank 6: bind to device 6 INFO:specforge.utils:rank 4: bind to device 4 INFO:specforge.utils:rank 2: bind to device 2 INFO:specforge.utils:rank 5: bind to device 5 INFO:specforge.utils:rank 4: device mesh: DeviceMesh((dp=8, tp=1), device: 'cuda', stride: (1, 1)) INFO:specforge.utils:rank 3: bind to device 3 INFO:specforge.utils:rank 6: device mesh: DeviceMesh((dp=8, tp=1), device: 'cuda', stride: (1, 1)) INFO:specforge.utils:rank 4: Initialized distributed INFO:specforge.utils:rank 2: device mesh: DeviceMesh((dp=8, tp=1), device: 'cuda', stride: (1, 1)) INFO:specforge.utils:rank 6: Initialized distributed INFO:specforge.utils:rank 2: Initialized distributed INFO:specforge.utils:rank 1: bind to device 1 INFO:specforge.utils:rank 7: bind to device 7 INFO:specforge.utils:rank 5: device mesh: DeviceMesh((dp=8, tp=1), device: 'cuda', stride: (1, 1)) `torch_dtype` is deprecated! Use `dtype` instead! INFO:specforge.utils:rank 0: bind to device 0 `torch_dtype` is deprecated! Use `dtype` instead! `torch_dtype` is deprecated! Use `dtype` instead! INFO:specforge.utils:rank 5: Initialized distributed `torch_dtype` is deprecated! Use `dtype` instead! INFO:specforge.utils:rank 3: device mesh: DeviceMesh((dp=8, tp=1), device: 'cuda', stride: (1, 1)) INFO:specforge.utils:rank 3: Initialized distributed `torch_dtype` is deprecated! Use `dtype` instead! INFO:specforge.utils:rank 7: device mesh: DeviceMesh((dp=8, tp=1), device: 'cuda', stride: (1, 1)) INFO:specforge.utils:rank 1: device mesh: DeviceMesh((dp=8, tp=1), device: 'cuda', stride: (1, 1)) The following generation flags are not valid and may be ignored: ['output_hidden_states']. Set `TRANSFORMERS_VERBOSITY=info` for more details. INFO:specforge.utils:rank 0: device mesh: DeviceMesh((dp=8, tp=1), device: 'cuda', stride: (1, 1)) The following generation flags are not valid and may be ignored: ['output_hidden_states']. Set `TRANSFORMERS_VERBOSITY=info` for more details. The following generation flags are not valid and may be ignored: ['output_hidden_states']. Set `TRANSFORMERS_VERBOSITY=info` for more details. The following generation flags are not valid and may be ignored: ['output_hidden_states']. Set `TRANSFORMERS_VERBOSITY=info` for more details. INFO:specforge.utils:rank 7: Initialized distributed INFO:specforge.utils:rank 1: Initialized distributed INFO:specforge.utils:rank 0: Initialized distributed INFO:specforge.utils:Loading target model from /workspace/models/Qwen3-8B using hf backend The following generation flags are not valid and may be ignored: ['output_hidden_states']. Set `TRANSFORMERS_VERBOSITY=info` for more details. `torch_dtype` is deprecated! Use `dtype` instead! `torch_dtype` is deprecated! Use `dtype` instead! `torch_dtype` is deprecated! Use `dtype` instead! The following generation flags are not valid and may be ignored: ['output_hidden_states']. Set `TRANSFORMERS_VERBOSITY=info` for more details. The following generation flags are not valid and may be ignored: ['output_hidden_states']. Set `TRANSFORMERS_VERBOSITY=info` for more details. The following generation flags are not valid and may be ignored: ['output_hidden_states']. Set `TRANSFORMERS_VERBOSITY=info` for more details. Loading checkpoint shards: 0%| | 0/5 [00:00