	Set TORCH_CUDA_ARCH_LIST to 9.0
	/workspace/hanrui/idea1/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend.
	warnings.warn(
	<frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.cudart module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.runtime module instead.
	<frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.nvrtc module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.nvrtc module instead.
	============================================================
	DFlash Evaluation (Multi-GPU Data Parallel)
	============================================================
	Target model: /workspace/models/Qwen3-8B
	Draft model: /workspace/models/Qwen3-8B-DFlash-b16
	Dataset: math500
	Max samples: 2
	Max new tokens: 64
	Denoise steps: 2
	Temperature: 0.0
	GPUs: 1
	Dtype: bfloat16
	============================================================

	[1/4] Loading tokenizer...
	[2/4] Loading target model on 1 GPUs...
	`torch_dtype` is deprecated! Use `dtype` instead!
	Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
	Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 151.62it/s]
	[3/4] Loading draft model on 1 GPUs...
	Draft layers: 5
	Draft block_size: 16
	Draft mask_token: 151669
	Draft layer_ids: [1, 9, 17, 25, 33]
	[4/4] Loading evaluation data...
	Using the latest cached version of the dataset since HuggingFaceH4/MATH-500 couldn't be found on the Hugging Face Hub (offline mode is enabled).
	WARNING:datasets.load:Using the latest cached version of the dataset since HuggingFaceH4/MATH-500 couldn't be found on the Hugging Face Hub (offline mode is enabled).
	Found the latest cached dataset configuration 'default' at /workspace/hanrui/datasets/HuggingFaceH4___math-500/default/0.0.0/6e4ed1a2a79af7d8630a6b768ec859cb5af4d3be (last modified on Tue Mar 17 13:17:15 2026).
	WARNING:datasets.packaged_modules.cache.cache:Found the latest cached dataset configuration 'default' at /workspace/hanrui/datasets/HuggingFaceH4___math-500/default/0.0.0/6e4ed1a2a79af7d8630a6b768ec859cb5af4d3be (last modified on Tue Mar 17 13:17:15 2026).
	Total prompts: 2, ~2 per GPU

	============================================================
	Running evaluation...
	============================================================
	[GPU 0] Sample 1/2 | tokens=64 | tau=1.56 | time=2.2s | <think> Okay, so I need to convert the rectangular coordinates (0, 3) to polar c...
	[GPU 0] Sample 2/2 | tokens=64 | tau=1.76 | time=1.4s | <think> Okay, so I need to find a way to express the double sum $\sum_{j = 1}^\i...

	============================================================
	RESULTS SUMMARY
	============================================================
	Denoise steps: 2
	GPUs used: 1
	Samples evaluated: 2
	Total blocks: 78
	Total generated tokens: 128
	Total GPU-time: 3.58s
	Wall-clock time (approx): 2.15s
	---
	Avg acceptance length (tau): 1.65
	Median acceptance length: 1.0
	Per-sample avg tau: ['1.56', '1.76']
	Min per-sample tau: 1.56
	Max per-sample tau: 1.76
	============================================================

	Results saved to /workspace/hanrui/idea1/results/dflash_eval/math500_steps2.json