| INFO 10-26 08:02:51 [__init__.py:235] Automatically detected platform cuda. | |
| [2025-10-26 08:02:53,740] [INFO]: --- INIT SEEDS --- (pipeline.py:249) | |
| [2025-10-26 08:02:53,741] [INFO]: --- LOADING TASKS --- (pipeline.py:210) | |
| [2025-10-26 08:02:58,493] [INFO]: --- LOADING MODEL --- (pipeline.py:177) | |
| `torch_dtype` is deprecated! Use `dtype` instead! | |
| [2025-10-26 08:03:06,104] [INFO]: Using max model len 32768 (config.py:1604) | |
| [2025-10-26 08:03:06,859] [INFO]: Chunked prefill is enabled with max_num_batched_tokens=2048. (config.py:2434) | |
| INFO 10-26 08:03:11 [__init__.py:235] Automatically detected platform cuda. | |
| INFO 10-26 08:03:13 [core.py:572] Waiting for init message from front-end. | |
| INFO 10-26 08:03:13 [core.py:71] Initializing a V1 LLM engine (v0.10.0) with config: model='/mnt/public/wucanhui/outputs/Qwen3-4B-math-reasoning/checkpoint-2562', speculative_config=None, tokenizer='/mnt/public/wucanhui/outputs/Qwen3-4B-math-reasoning/checkpoint-2562', skip_tokenizer_init=False, tokenizer_mode=auto, revision=main, override_neuron_config={}, tokenizer_revision=main, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=32768, download_dir=None, load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=True, kv_cache_dtype=auto, device_config=cuda, decoding_config=DecodingConfig(backend='auto', disable_fallback=False, disable_any_whitespace=False, disable_additional_properties=False, reasoning_backend=''), observability_config=ObservabilityConfig(show_hidden_metrics_for_version=None, otlp_traces_endpoint=None, collect_detailed_traces=None), seed=1234, served_model_name=/mnt/public/wucanhui/outputs/Qwen3-4B-math-reasoning/checkpoint-2562, num_scheduler_steps=1, multi_step_stream_outputs=True, enable_prefix_caching=True, chunked_prefill_enabled=True, use_async_output_proc=True, pooler_config=None, compilation_config={"level":0,"debug_dump_path":"","cache_dir":"","backend":"","custom_ops":[],"splitting_ops":[],"use_inductor":true,"compile_sizes":[],"inductor_compile_config":{"enable_auto_functionalized_v2":false},"inductor_passes":{},"use_cudagraph":true,"cudagraph_num_of_warmups":0,"cudagraph_capture_sizes":[],"cudagraph_copy_inputs":false,"full_cuda_graph":false,"max_capture_size":0,"local_cache_dir":null} | |
| INFO 10-26 08:03:17 [parallel_state.py:1102] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0, EP rank 0 | |
| WARNING 10-26 08:03:17 [topk_topp_sampler.py:59] FlashInfer is not available. Falling back to the PyTorch-native implementation of top-p & top-k sampling. For the best performance, please install FlashInfer. | |
| INFO 10-26 08:03:17 [gpu_model_runner.py:1843] Starting to load model /mnt/public/wucanhui/outputs/Qwen3-4B-math-reasoning/checkpoint-2562... | |
| INFO 10-26 08:03:17 [gpu_model_runner.py:1875] Loading model from scratch... | |
| INFO 10-26 08:03:18 [cuda.py:290] Using Flash Attention backend on V1 engine. | |
| Loading safetensors checkpoint shards: 0% Completed | 0/2 [00:00<?, ?it/s] | |
| Loading safetensors checkpoint shards: 50% Completed | 1/2 [00:31<00:31, 31.46s/it] | |
| Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:52<00:00, 25.50s/it] | |
| Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:52<00:00, 26.40s/it] | |
| INFO 10-26 08:04:11 [default_loader.py:262] Loading weights took 53.22 seconds | |
| INFO 10-26 08:04:11 [gpu_model_runner.py:1892] Model loading took 7.5552 GiB and 53.331223 seconds | |
| INFO 10-26 08:04:12 [gpu_worker.py:255] Available KV cache memory: 117.60 GiB | |
| INFO 10-26 08:04:12 [kv_cache_utils.py:833] GPU KV cache size: 856,336 tokens | |
| INFO 10-26 08:04:12 [kv_cache_utils.py:837] Maximum concurrency for 32,768 tokens per request: 26.13x | |
| INFO 10-26 08:04:13 [core.py:193] init engine (profile, create kv cache, warmup model) took 1.39 seconds | |
| [2025-10-26 08:04:13,629] [INFO]: [CACHING] Initializing data cache (cache_management.py:105) | |
| [2025-10-26 08:04:13,636] [INFO]: --- RUNNING MODEL --- (pipeline.py:330) | |
| [2025-10-26 08:04:13,638] [INFO]: Running SamplingMethod.GENERATIVE requests (pipeline.py:313) | |
| [2025-10-26 08:04:53,073] [INFO]: Cache: Starting to process 1319/1319 samples (not found in cache) for tasks lighteval|gsm8k|0 (f61acff0a464569c, GENERATIVE) (cache_management.py:399) | |
| [2025-10-26 08:04:53,076] [WARNING]: You cannot select the number of dataset splits for a generative evaluation at the moment. Automatically inferring. (data.py:206) | |
| Splits: 0%| | 0/1 [00:00<?, ?it/s] | |
| [2025-10-26 08:04:53,293] [WARNING]: context_size + max_new_tokens=32975 which is greater than self.max_length=32768. Truncating context to 0 tokens. (vllm_model.py:367) | |
| Adding requests: 100%|██████████| 1319/1319 [00:00<00:00, 19148.16it/s] | |
| Processed prompts: 0%| | 0/1319 [00:00<?, ?it/s, est. speed input: 0.00 toks/s, output: 0.00 toks/s] | |
| Processed prompts: 99%|ββββββββββ| 1302/1319 [00:34<00:00, 41.33it/s, est. speed input: 2999.05 toks/s, output: 8398.72 toks/s][A | |
| Processed prompts: 99%|ββββββββββ| 1307/1319 [00:35<00:00, 32.78it/s, est. speed input: 2985.51 toks/s, output: 8373.86 toks/s][A | |
| Processed prompts: 99%|ββββββββββ| 1311/1319 [00:35<00:00, 23.60it/s, est. speed input: 2964.00 toks/s, output: 8324.81 toks/s][A | |
| Processed prompts: 100%|ββββββββββ| 1314/1319 [00:35<00:00, 19.03it/s, est. speed input: 2946.15 toks/s, output: 8325.84 toks/s][A | |
| Processed prompts: 100%|ββββββββββ| 1317/1319 [00:44<00:01, 1.39it/s, est. speed input: 2350.44 toks/s, output: 6739.34 toks/s][A | |
| Processed prompts: 100%|ββββββββββ| 1317/1319 [00:59<00:01, 1.39it/s, est. speed input: 2350.44 toks/s, output: 6739.34 toks/s][A | |
| Processed prompts: 100%|ββββββββββ| 1318/1319 [04:53<00:24, 24.45s/it, est. speed input: 358.85 toks/s, output: 1139.04 toks/s] [A | |
| Processed prompts: 100%|ββββββββββ| 1319/1319 [04:54<00:00, 21.58s/it, est. speed input: 358.53 toks/s, output: 1247.85 toks/s][A | |
| Processed prompts: 100%|██████████| 1319/1319 [04:54<00:00, 4.48it/s, est. speed input: 358.53 toks/s, output: 1247.85 toks/s] | |
| Splits: 100%|██████████| 1/1 [04:54<00:00, 294.74s/it] | |
| Creating parquet from Arrow format: 100%|██████████| 1/1 [00:00<00:00, 21.39ba/s] | |
| [2025-10-26 08:10:01,838] [INFO]: Cached 1319 samples of lighteval|gsm8k|0 (f61acff0a464569c, GENERATIVE) at /mnt/public/wucanhui/outputs/Qwen3-4B-math-reasoning/checkpoint-2562/0619260e1176b049/lighteval|gsm8k|0/f61acff0a464569c/GENERATIVE.parquet. (cache_management.py:345) | |
| Generating train split: 1319 examples [00:00, 41881.12 examples/s] | |
| [rank0]:[W1026 08:10:17.725679844 ProcessGroupNCCL.cpp:1479] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator()) | |
| [2025-10-26 08:10:18,640] [INFO]: --- POST-PROCESSING MODEL RESPONSES --- (pipeline.py:344) | |
| [2025-10-26 08:10:18,643] [INFO]: --- COMPUTING METRICS --- (pipeline.py:371) | |
| [2025-10-26 08:10:19,808] [INFO]: --- DISPLAYING RESULTS --- (pipeline.py:432) | |
| [2025-10-26 08:10:19,818] [INFO]: --- SAVING AND PUSHING RESULTS --- (pipeline.py:422) | |
| [2025-10-26 08:10:19,819] [INFO]: Saving experiment tracker (evaluation_tracker.py:246) | |
| [2025-10-26 08:10:20,195] [INFO]: Saving results to /mnt/public/wucanhui/lighteval/results/results/mnt/public/wucanhui/outputs/Qwen3-4B-math-reasoning/checkpoint-2562/results_2025-10-26T08-10-19.820518.json (evaluation_tracker.py:310) | |
| | Task |Version| Metric |Value | |Stderr| | |
| |-----------------|-------|----------------|-----:|---|-----:| | |
| |all | |extractive_match|0.9128|± |0.0078| | |
| |lighteval:gsm8k:0| |extractive_match|0.9128|± |0.0078| | |