Delete lm_eval/test_eval2.log
- lm_eval/test_eval2.log +0 -130
lm_eval/test_eval2.log
DELETED
@@ -1,130 +0,0 @@
-The following values were not passed to `accelerate launch` and had defaults used instead:
-`--num_processes` was set to a value of `2`
-More than one GPU was found, enabling multi-GPU training.
-If this was unintended please pass in `--num_processes=1`.
-`--num_machines` was set to a value of `1`
-`--mixed_precision` was set to a value of `'no'`
-`--dynamo_backend` was set to a value of `'no'`
-To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
-2026-03-19:16:52:56 INFO [_cli.run:375] Including path: ./
-2026-03-19:16:52:56 INFO [_cli.run:376] Selected Tasks: ['arc_easy_mi', 'arc_challenge_mi', 'hellaswag', 'piqa']
-2026-03-19:16:52:56 INFO [evaluator:211] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
-2026-03-19:16:52:56 INFO [evaluator:236] Initializing cloverlm model, with arguments: {'pretrained': 'daslab-testing/CloverLM', 'dtype': 'bfloat16', 'quartet_2_impl': 'quartet2', 'attn_backend': 'pytorch', 'trust_remote_code': True}
-2026-03-19:16:52:56 INFO [models.huggingface:178] Using `accelerate launch` or `parallelize=True`, device 'cuda:0' will be overridden when placing model.
-2026-03-19:16:52:56 INFO [_cli.run:375] Including path: ./
-2026-03-19:16:52:56 INFO [_cli.run:376] Selected Tasks: ['arc_easy_mi', 'arc_challenge_mi', 'hellaswag', 'piqa']
-2026-03-19:16:52:56 INFO [evaluator:211] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
-2026-03-19:16:52:56 INFO [evaluator:236] Initializing cloverlm model, with arguments: {'pretrained': 'daslab-testing/CloverLM', 'dtype': 'bfloat16', 'quartet_2_impl': 'quartet2', 'attn_backend': 'pytorch', 'trust_remote_code': True}
-2026-03-19:16:52:57 INFO [models.huggingface:178] Using `accelerate launch` or `parallelize=True`, device 'cuda:0' will be overridden when placing model.
-Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
-2026-03-19:16:52:57 INFO [models.huggingface:548] Model type cannot be determined. Using default model type 'causal'
-2026-03-19:16:52:57 INFO [models.huggingface:548] Model type cannot be determined. Using default model type 'causal'
-Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
-[rank1]: Traceback (most recent call last):
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/eval.py", line 65, in <module>
-[rank1]: cli_evaluate()
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/__main__.py", line 10, in cli_evaluate
-[rank1]: parser.execute(args)
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/_cli/harness.py", line 60, in execute
-[rank1]: args.func(args)
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/_cli/run.py", line 379, in _execute
-[rank1]: results = simple_evaluate(
-[rank1]: ^^^^^^^^^^^^^^^^
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/utils.py", line 498, in _wrapper
-[rank1]: return fn(*args, **kwargs)
-[rank1]: ^^^^^^^^^^^^^^^^^^^
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/evaluator.py", line 239, in simple_evaluate
-[rank1]: lm = lm_eval.api.registry.get_model(model).create_from_arg_obj(
-[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/api/model.py", line 180, in create_from_arg_obj
-[rank1]: return cls(**arg_dict, **additional_config)
-[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/eval.py", line 11, in __init__
-[rank1]: super().__init__(**kwargs)
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/models/huggingface.py", line 204, in __init__
-[rank1]: self._create_tokenizer(
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/models/huggingface.py", line 793, in _create_tokenizer
-[rank1]: self.tokenizer = transformers.AutoTokenizer.from_pretrained(
-[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 732, in from_pretrained
-[rank1]: tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
-[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-[rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/transformers/dynamic_module_utils.py", line 567, in get_class_from_dynamic_module
-[rank1]: module_file, class_name = class_reference.split(".")
-[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^
-[rank1]: ValueError: not enough values to unpack (expected 2, got 1)
-[rank0]: Traceback (most recent call last):
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/eval.py", line 65, in <module>
-[rank0]: cli_evaluate()
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/__main__.py", line 10, in cli_evaluate
-[rank0]: parser.execute(args)
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/_cli/harness.py", line 60, in execute
-[rank0]: args.func(args)
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/_cli/run.py", line 379, in _execute
-[rank0]: results = simple_evaluate(
-[rank0]: ^^^^^^^^^^^^^^^^
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/utils.py", line 498, in _wrapper
-[rank0]: return fn(*args, **kwargs)
-[rank0]: ^^^^^^^^^^^^^^^^^^^
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/evaluator.py", line 239, in simple_evaluate
-[rank0]: lm = lm_eval.api.registry.get_model(model).create_from_arg_obj(
-[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/api/model.py", line 180, in create_from_arg_obj
-[rank0]: return cls(**arg_dict, **additional_config)
-[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/eval.py", line 11, in __init__
-[rank0]: super().__init__(**kwargs)
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/models/huggingface.py", line 204, in __init__
-[rank0]: self._create_tokenizer(
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/models/huggingface.py", line 793, in _create_tokenizer
-[rank0]: self.tokenizer = transformers.AutoTokenizer.from_pretrained(
-[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 732, in from_pretrained
-[rank0]: tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
-[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-[rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/transformers/dynamic_module_utils.py", line 567, in get_class_from_dynamic_module
-[rank0]: module_file, class_name = class_reference.split(".")
-[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^
-[rank0]: ValueError: not enough values to unpack (expected 2, got 1)
-[rank0]:[W319 16:52:58.069226968 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
-W0319 16:52:59.444000 1490612 torch/distributed/elastic/multiprocessing/api.py:1010] Sending process 1490848 closing signal SIGTERM
-E0319 16:52:59.508000 1490612 torch/distributed/elastic/multiprocessing/api.py:984] failed (exitcode: 1) local_rank: 0 (pid: 1490847) of binary: /home/matin/convert_dir/CloverLM/lm_eval/.venv/bin/python
-Traceback (most recent call last):
-File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/bin/accelerate", line 10, in <module>
-sys.exit(main())
-^^^^^^
-File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/accelerate/commands/accelerate_cli.py", line 50, in main
-args.func(args)
-File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 1396, in launch_command
-multi_gpu_launcher(args)
-File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 1023, in multi_gpu_launcher
-distrib_run.run(args)
-File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/torch/distributed/run.py", line 982, in run
-elastic_launch(
-File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 170, in __call__
-return launch_agent(self._config, self._entrypoint, list(args))
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 317, in launch_agent
-raise ChildFailedError(
-torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
-============================================================
-eval.py FAILED
-------------------------------------------------------------
-Failures:
-[1]:
-time : 2026-03-19_16:52:59
-host : b300-eval.datacrunch.io
-rank : 1 (local_rank: 1)
-exitcode : 1 (pid: 1490848)
-error_file: <N/A>
-traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
-------------------------------------------------------------
-Root Cause (first observed failure):
-[0]:
-time : 2026-03-19_16:52:59
-host : b300-eval.datacrunch.io
-rank : 0 (local_rank: 0)
-exitcode : 1 (pid: 1490847)
-error_file: <N/A>
-traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
-============================================================