/workspace/miniconda3/envs/dflash/bin/python3: can't open file '/workspace/hanrui/ ': [Errno 2] No such file or directory
E0317 16:57:14.100000 140364991186752 torch/distributed/elastic/multiprocessing/api.py:833] failed (exitcode: 2) local_rank: 0 (pid: 14058) of binary: /workspace/miniconda3/envs/dflash/bin/python3
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/run.py", line 905, in <module>
    main()
  File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 348, in wrapper
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/run.py", line 901, in main
    run(args)
  File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/run.py", line 892, in run
    elastic_launch(
  File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 133, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/miniconda3/envs/dflash/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2026-03-17_16:57:14
  host      : job-006ce80a7c47-20260302193512-5dcd4c9bbd-gfjsn
  rank      : 0 (local_rank: 0)
  exitcode  : 2 (pid: 14058)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

usage: run.py [-h] [--nnodes NNODES] [--nproc-per-node NPROC_PER_NODE]
              [--rdzv-backend RDZV_BACKEND] [--rdzv-endpoint RDZV_ENDPOINT]
              [--rdzv-id RDZV_ID] [--rdzv-conf RDZV_CONF] [--standalone]
              [--max-restarts MAX_RESTARTS]
              [--monitor-interval MONITOR_INTERVAL]
              [--start-method {spawn,fork,forkserver}] [--role ROLE] [-m]
              [--no-python] [--run-path] [--log-dir LOG_DIR] [-r REDIRECTS]
              [-t TEE] [--local-ranks-filter LOCAL_RANKS_FILTER]
              [--node-rank NODE_RANK] [--master-addr MASTER_ADDR]
              [--master-port MASTER_PORT] [--local-addr LOCAL_ADDR]
              [--logs-specs LOGS_SPECS]
              training_script ...
run.py: error: the following arguments are required: training_script, training_script_args