| nohup: ignoring input |
|
|
| ============================================ |
| Running DFlash eval: denoise_steps=1 |
| GPUs: 8, Samples: 500 |
| ============================================ |
| W0405 13:06:29.225000 14266 site-packages/torch/distributed/run.py:803] |
| W0405 13:06:29.225000 14266 site-packages/torch/distributed/run.py:803] ***************************************** |
| W0405 13:06:29.225000 14266 site-packages/torch/distributed/run.py:803] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. |
| W0405 13:06:29.225000 14266 site-packages/torch/distributed/run.py:803] ***************************************** |
| Set TORCH_CUDA_ARCH_LIST to 9.0 |
| /workspace/hanrui/idea1/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. |
| warnings.warn( |
| <frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.cudart module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.runtime module instead. |
| <frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.nvrtc module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.nvrtc module instead. |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 50.90it/s] |
| ============================================================ |
| DFlash Evaluation (Multi-GPU Data Parallel) |
| ============================================================ |
| Target model: /workspace/models/Qwen3-8B |
| Draft model: /workspace/models/Qwen3-8B-DFlash-b16 |
| Dataset: math500 |
| Max samples: 500 |
| Max new tokens: 512 |
| Denoise steps: 1 |
| Temperature: 0.0 |
| GPUs: 8 |
| Dtype: bfloat16 |
| ============================================================ |
|
|
| [1/4] Loading tokenizer... |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 45.18it/s] |
| [2/4] Loading target model on 8 GPUs... |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 50.65it/s] |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 48.56it/s] |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 50.65it/s] |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 48.77it/s] |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 47.86it/s] |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 47.47it/s] |
| W0405 13:07:11.623000 14266 site-packages/torch/distributed/elastic/agent/server/api.py:725] Received 15 death signal, shutting down workers |
| W0405 13:07:11.628000 14266 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 14438 closing signal SIGTERM |
| W0405 13:07:11.628000 14266 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 14439 closing signal SIGTERM |
| W0405 13:07:11.628000 14266 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 14440 closing signal SIGTERM |
| W0405 13:07:11.628000 14266 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 14441 closing signal SIGTERM |
| W0405 13:07:11.628000 14266 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 14442 closing signal SIGTERM |
| W0405 13:07:11.629000 14266 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 14443 closing signal SIGTERM |
| W0405 13:07:11.629000 14266 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 14444 closing signal SIGTERM |
| W0405 13:07:11.629000 14266 site-packages/torch/distributed/elastic/multiprocessing/api.py:908] Sending process 14445 closing signal SIGTERM |
| Traceback (most recent call last): |
| File "/workspace/miniconda3/envs/specforge/bin/torchrun", line 6, in <module> |
| sys.exit(main()) |
| ^^^^^^ |
| File "/workspace/miniconda3/envs/specforge/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 357, in wrapper |
| return f(*args, **kwargs) |
| ^^^^^^^^^^^^^^^^^^ |
| File "/workspace/miniconda3/envs/specforge/lib/python3.11/site-packages/torch/distributed/run.py", line 936, in main |
| run(args) |
| File "/workspace/miniconda3/envs/specforge/lib/python3.11/site-packages/torch/distributed/run.py", line 927, in run |
| elastic_launch( |
| File "/workspace/miniconda3/envs/specforge/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 156, in __call__ |
| return launch_agent(self._config, self._entrypoint, list(args)) |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| File "/workspace/miniconda3/envs/specforge/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 284, in launch_agent |
| result = agent.run() |
| ^^^^^^^^^^^ |
| File "/workspace/miniconda3/envs/specforge/lib/python3.11/site-packages/torch/distributed/elastic/metrics/api.py", line 138, in wrapper |
| result = f(*args, **kwargs) |
| ^^^^^^^^^^^^^^^^^^ |
| File "/workspace/miniconda3/envs/specforge/lib/python3.11/site-packages/torch/distributed/elastic/agent/server/api.py", line 717, in run |
| result = self._invoke_run(role) |
| ^^^^^^^^^^^^^^^^^^^^^^ |
| File "/workspace/miniconda3/envs/specforge/lib/python3.11/site-packages/torch/distributed/elastic/agent/server/api.py", line 881, in _invoke_run |
| time.sleep(monitor_interval) |
| File "/workspace/miniconda3/envs/specforge/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/api.py", line 85, in _terminate_process_handler |
| raise SignalException(f"Process {os.getpid()} got signal: {sigval}", sigval=sigval) |
| torch.distributed.elastic.multiprocessing.api.SignalException: Process 14266 got signal: 15 |
|
|
| ============================================ |
| Running DFlash eval: denoise_steps=2 |
| GPUs: 8, Samples: 500 |
| ============================================ |
|
|
| ============================================ |
| Running DFlash eval: denoise_steps=3 |
| GPUs: 8, Samples: 500 |
| ============================================ |
| W0405 13:07:18.843000 14859 site-packages/torch/distributed/run.py:803] |
| W0405 13:07:18.843000 14859 site-packages/torch/distributed/run.py:803] ***************************************** |
| W0405 13:07:18.843000 14859 site-packages/torch/distributed/run.py:803] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. |
| W0405 13:07:18.843000 14859 site-packages/torch/distributed/run.py:803] ***************************************** |
| Set TORCH_CUDA_ARCH_LIST to 9.0 |
| /workspace/hanrui/idea1/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. |
| warnings.warn( |
| <frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.cudart module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.runtime module instead. |
| <frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.nvrtc module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.nvrtc module instead. |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 58.03it/s] |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 145.22it/s] |
| <frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.cudart module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.runtime module instead. |
| <frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.nvrtc module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.nvrtc module instead. |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 66.98it/s] |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 63.53it/s] |
| ============================================================ |
| DFlash Evaluation (Multi-GPU Data Parallel) |
| ============================================================ |
| Target model: /workspace/models/Qwen3-8B |
| Draft model: /workspace/models/Qwen3-8B-DFlash-b16 |
| Dataset: math500 |
| Max samples: 500 |
| Max new tokens: 512 |
| Denoise steps: 3 |
| Temperature: 0.0 |
| GPUs: 8 |
| Dtype: bfloat16 |
| ============================================================ |
|
|
| [1/4] Loading tokenizer... |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 141.15it/s] |
| [2/4] Loading target model on 8 GPUs... |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 57.57it/s] |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 59.77it/s] |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 57.44it/s] |
| [3/4] Loading draft model on 8 GPUs... |
| Set TORCH_CUDA_ARCH_LIST to 9.0 |
| Set TORCH_CUDA_ARCH_LIST to 9.0 |
| /workspace/hanrui/idea1/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend. |
| warnings.warn( |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 57.86it/s] |
| <frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.cudart module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.runtime module instead. |
| <frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.nvrtc module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.nvrtc module instead. |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 58.16it/s] |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 142.41it/s] |
| <frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.cudart module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.runtime module instead. |
| <frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.nvrtc module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.nvrtc module instead. |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 57.29it/s] |
| `torch_dtype` is deprecated! Use `dtype` instead! |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 55.72it/s] |
| Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 60.22it/s] |
| [3/4] Loading draft model on 8 GPUs... |
| |