mansaripo committed on
Commit ac62373 · verified · 1 Parent(s): 5de1d42

Delete lm_eval/test_eval2.log

Files changed (1)
  1. lm_eval/test_eval2.log +0 -130
lm_eval/test_eval2.log DELETED
@@ -1,130 +0,0 @@
- The following values were not passed to `accelerate launch` and had defaults used instead:
- `--num_processes` was set to a value of `2`
- More than one GPU was found, enabling multi-GPU training.
- If this was unintended please pass in `--num_processes=1`.
- `--num_machines` was set to a value of `1`
- `--mixed_precision` was set to a value of `'no'`
- `--dynamo_backend` was set to a value of `'no'`
- To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
- 2026-03-19:16:52:56 INFO [_cli.run:375] Including path: ./
- 2026-03-19:16:52:56 INFO [_cli.run:376] Selected Tasks: ['arc_easy_mi', 'arc_challenge_mi', 'hellaswag', 'piqa']
- 2026-03-19:16:52:56 INFO [evaluator:211] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
- 2026-03-19:16:52:56 INFO [evaluator:236] Initializing cloverlm model, with arguments: {'pretrained': 'daslab-testing/CloverLM', 'dtype': 'bfloat16', 'quartet_2_impl': 'quartet2', 'attn_backend': 'pytorch', 'trust_remote_code': True}
- 2026-03-19:16:52:56 INFO [models.huggingface:178] Using `accelerate launch` or `parallelize=True`, device 'cuda:0' will be overridden when placing model.
- 2026-03-19:16:52:56 INFO [_cli.run:375] Including path: ./
- 2026-03-19:16:52:56 INFO [_cli.run:376] Selected Tasks: ['arc_easy_mi', 'arc_challenge_mi', 'hellaswag', 'piqa']
- 2026-03-19:16:52:56 INFO [evaluator:211] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
- 2026-03-19:16:52:56 INFO [evaluator:236] Initializing cloverlm model, with arguments: {'pretrained': 'daslab-testing/CloverLM', 'dtype': 'bfloat16', 'quartet_2_impl': 'quartet2', 'attn_backend': 'pytorch', 'trust_remote_code': True}
- 2026-03-19:16:52:57 INFO [models.huggingface:178] Using `accelerate launch` or `parallelize=True`, device 'cuda:0' will be overridden when placing model.
- Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
- 2026-03-19:16:52:57 INFO [models.huggingface:548] Model type cannot be determined. Using default model type 'causal'
- 2026-03-19:16:52:57 INFO [models.huggingface:548] Model type cannot be determined. Using default model type 'causal'
- Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
- [rank1]: Traceback (most recent call last):
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/eval.py", line 65, in <module>
- [rank1]: cli_evaluate()
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/__main__.py", line 10, in cli_evaluate
- [rank1]: parser.execute(args)
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/_cli/harness.py", line 60, in execute
- [rank1]: args.func(args)
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/_cli/run.py", line 379, in _execute
- [rank1]: results = simple_evaluate(
- [rank1]: ^^^^^^^^^^^^^^^^
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/utils.py", line 498, in _wrapper
- [rank1]: return fn(*args, **kwargs)
- [rank1]: ^^^^^^^^^^^^^^^^^^^
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/evaluator.py", line 239, in simple_evaluate
- [rank1]: lm = lm_eval.api.registry.get_model(model).create_from_arg_obj(
- [rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/api/model.py", line 180, in create_from_arg_obj
- [rank1]: return cls(**arg_dict, **additional_config)
- [rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/eval.py", line 11, in __init__
- [rank1]: super().__init__(**kwargs)
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/models/huggingface.py", line 204, in __init__
- [rank1]: self._create_tokenizer(
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/models/huggingface.py", line 793, in _create_tokenizer
- [rank1]: self.tokenizer = transformers.AutoTokenizer.from_pretrained(
- [rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 732, in from_pretrained
- [rank1]: tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
- [rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- [rank1]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/transformers/dynamic_module_utils.py", line 567, in get_class_from_dynamic_module
- [rank1]: module_file, class_name = class_reference.split(".")
- [rank1]: ^^^^^^^^^^^^^^^^^^^^^^^
- [rank1]: ValueError: not enough values to unpack (expected 2, got 1)
- [rank0]: Traceback (most recent call last):
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/eval.py", line 65, in <module>
- [rank0]: cli_evaluate()
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/__main__.py", line 10, in cli_evaluate
- [rank0]: parser.execute(args)
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/_cli/harness.py", line 60, in execute
- [rank0]: args.func(args)
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/_cli/run.py", line 379, in _execute
- [rank0]: results = simple_evaluate(
- [rank0]: ^^^^^^^^^^^^^^^^
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/utils.py", line 498, in _wrapper
- [rank0]: return fn(*args, **kwargs)
- [rank0]: ^^^^^^^^^^^^^^^^^^^
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/evaluator.py", line 239, in simple_evaluate
- [rank0]: lm = lm_eval.api.registry.get_model(model).create_from_arg_obj(
- [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/api/model.py", line 180, in create_from_arg_obj
- [rank0]: return cls(**arg_dict, **additional_config)
- [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/eval.py", line 11, in __init__
- [rank0]: super().__init__(**kwargs)
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/models/huggingface.py", line 204, in __init__
- [rank0]: self._create_tokenizer(
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/lm_eval/models/huggingface.py", line 793, in _create_tokenizer
- [rank0]: self.tokenizer = transformers.AutoTokenizer.from_pretrained(
- [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/transformers/models/auto/tokenization_auto.py", line 732, in from_pretrained
- [rank0]: tokenizer_class = get_class_from_dynamic_module(class_ref, pretrained_model_name_or_path, **kwargs)
- [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- [rank0]: File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/transformers/dynamic_module_utils.py", line 567, in get_class_from_dynamic_module
- [rank0]: module_file, class_name = class_reference.split(".")
- [rank0]: ^^^^^^^^^^^^^^^^^^^^^^^
- [rank0]: ValueError: not enough values to unpack (expected 2, got 1)
- [rank0]:[W319 16:52:58.069226968 ProcessGroupNCCL.cpp:1553] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
- W0319 16:52:59.444000 1490612 torch/distributed/elastic/multiprocessing/api.py:1010] Sending process 1490848 closing signal SIGTERM
- E0319 16:52:59.508000 1490612 torch/distributed/elastic/multiprocessing/api.py:984] failed (exitcode: 1) local_rank: 0 (pid: 1490847) of binary: /home/matin/convert_dir/CloverLM/lm_eval/.venv/bin/python
- Traceback (most recent call last):
- File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/bin/accelerate", line 10, in <module>
- sys.exit(main())
- ^^^^^^
- File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/accelerate/commands/accelerate_cli.py", line 50, in main
- args.func(args)
- File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 1396, in launch_command
- multi_gpu_launcher(args)
- File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/accelerate/commands/launch.py", line 1023, in multi_gpu_launcher
- distrib_run.run(args)
- File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/torch/distributed/run.py", line 982, in run
- elastic_launch(
- File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 170, in __call__
- return launch_agent(self._config, self._entrypoint, list(args))
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- File "/home/matin/convert_dir/CloverLM/lm_eval/.venv/lib/python3.11/site-packages/torch/distributed/launcher/api.py", line 317, in launch_agent
- raise ChildFailedError(
- torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
- ============================================================
- eval.py FAILED
- ------------------------------------------------------------
- Failures:
- [1]:
- time : 2026-03-19_16:52:59
- host : b300-eval.datacrunch.io
- rank : 1 (local_rank: 1)
- exitcode : 1 (pid: 1490848)
- error_file: <N/A>
- traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
- ------------------------------------------------------------
- Root Cause (first observed failure):
- [0]:
- time : 2026-03-19_16:52:59
- host : b300-eval.datacrunch.io
- rank : 0 (local_rank: 0)
- exitcode : 1 (pid: 1490847)
- error_file: <N/A>
- traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
- ============================================================
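
Note on the failure recorded in the deleted log: both ranks stop at `module_file, class_name = class_reference.split(".")` with `ValueError: not enough values to unpack (expected 2, got 1)`. The log does not show the value of `class_reference`. One input that produces this exact message is a reference string with no dot in it, for example a bare class name where a `module.ClassName` pair is expected. The sketch below reproduces only that unpacking step in plain Python. The helper `split_class_reference` and the names `tokenization_clover` and `CloverTokenizer` are hypothetical and are not taken from the repository or from `transformers`.

```python
def split_class_reference(class_reference: str) -> tuple[str, str]:
    """Hypothetical helper that repeats the unpacking step seen in the traceback."""
    module_file, class_name = class_reference.split(".")
    return module_file, class_name


# A reference of the form "module.ClassName" splits into exactly two parts.
ok = split_class_reference("tokenization_clover.CloverTokenizer")  # hypothetical names

# A reference without a dot splits into one part, so the two-name unpacking fails.
try:
    split_class_reference("CloverTokenizer")  # hypothetical bare class name
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

If this is what happened, the place to look would be the tokenizer's custom-code class reference in the model repository's configuration. That is an assumption based on the traceback alone, not something the log confirms.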