Set TORCH_CUDA_ARCH_LIST to 9.0
/workspace/hanrui/idea1/specforge/modeling/draft/llama3_eagle.py:29: UserWarning: flash_attn is not found, falling back to flex_attention. Please install flash_attn if you want to use the flash attention backend.
warnings.warn(
<frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.cudart module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.runtime module instead.
<frozen importlib._bootstrap_external>:1241: FutureWarning: The cuda.nvrtc module is deprecated and will be removed in a future release, please switch to use the cuda.bindings.nvrtc module instead.
============================================================
DFlash Evaluation (Multi-GPU Data Parallel)
============================================================
Target model: /workspace/models/Qwen3-8B
Draft model: /workspace/models/Qwen3-8B-DFlash-b16
Dataset: math500
Max samples: 2
Max new tokens: 64
Denoise steps: 2
Temperature: 0.0
GPUs: 1
Dtype: bfloat16
============================================================
[1/4] Loading tokenizer...
[2/4] Loading target model on 1 GPUs...
`torch_dtype` is deprecated! Use `dtype` instead!
Loading checkpoint shards:   0%|          | 0/5 [00:00<?, ?it/s]
Loading checkpoint shards: 100%|██████████| 5/5 [00:00<00:00, 151.62it/s]
[3/4] Loading draft model on 1 GPUs...
Draft layers: 5
Draft block_size: 16
Draft mask_token: 151669
Draft layer_ids: [1, 9, 17, 25, 33]
[4/4] Loading evaluation data...
Using the latest cached version of the dataset since HuggingFaceH4/MATH-500 couldn't be found on the Hugging Face Hub (offline mode is enabled).
Found the latest cached dataset configuration 'default' at /workspace/hanrui/datasets/HuggingFaceH4___math-500/default/0.0.0/6e4ed1a2a79af7d8630a6b768ec859cb5af4d3be (last modified on Tue Mar 17 13:17:15 2026).
Total prompts: 2, ~2 per GPU
============================================================
Running evaluation...
============================================================
[GPU 0] Sample 1/2 | tokens=64 | tau=1.56 | time=2.2s | <think> Okay, so I need to convert the rectangular coordinates (0, 3) to polar c...
[GPU 0] Sample 2/2 | tokens=64 | tau=1.76 | time=1.4s | <think> Okay, so I need to find a way to express the double sum $\sum_{j = 1}^\i...
============================================================
RESULTS SUMMARY
============================================================
Denoise steps: 2
GPUs used: 1
Samples evaluated: 2
Total blocks: 78
Total generated tokens: 128
Total GPU-time: 3.58s
Wall-clock time (approx): 2.15s
---
Avg acceptance length (tau): 1.65
Median acceptance length: 1.0
Per-sample avg tau: ['1.56', '1.76']
Min per-sample tau: 1.56
Max per-sample tau: 1.76
============================================================
Results saved to /workspace/hanrui/idea1/results/dflash_eval/math500_steps2.json
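
The per-sample lines in this log can be re-aggregated offline. The following is a minimal sketch, not the evaluation script's own code: the regular expression, the helper names `parse_samples` and `summarize`, and the output keys are assumptions made for illustration, inferred only from the `[GPU n] Sample i/N | tokens=... | tau=... | time=...s` line format shown above.

```python
import re
from statistics import mean

# Assumed pattern for per-sample lines, inferred from the log format above.
SAMPLE_RE = re.compile(
    r"\[GPU (?P<gpu>\d+)\] Sample (?P<idx>\d+)/(?P<total>\d+)"
    r" \| tokens=(?P<tokens>\d+) \| tau=(?P<tau>[\d.]+) \| time=(?P<time>[\d.]+)s"
)


def parse_samples(log_text):
    """Return one dict per per-sample line found in the log text."""
    rows = []
    for m in SAMPLE_RE.finditer(log_text):
        rows.append(
            {
                "gpu": int(m["gpu"]),
                "index": int(m["idx"]),
                "tokens": int(m["tokens"]),
                "tau": float(m["tau"]),
                "seconds": float(m["time"]),
            }
        )
    return rows


def summarize(rows):
    """Aggregate the parsed rows (hypothetical helper, illustrative keys)."""
    taus = [r["tau"] for r in rows]
    return {
        "samples": len(rows),
        "total_tokens": sum(r["tokens"] for r in rows),
        "gpu_seconds": sum(r["seconds"] for r in rows),
        "min_tau": min(taus),
        "max_tau": max(taus),
        "mean_per_sample_tau": mean(taus),
    }


# Raw string: the sample text contains backslashes from LaTeX.
log_text = r"""
[GPU 0] Sample 1/2 | tokens=64 | tau=1.56 | time=2.2s | <think> Okay, so I need to convert the rectangular coordinates (0, 3) to polar c...
[GPU 0] Sample 2/2 | tokens=64 | tau=1.76 | time=1.4s | <think> Okay, so I need to find a way to express the double sum $\sum_{j = 1}^\i...
"""

rows = parse_samples(log_text)
summary = summarize(rows)
print(summary)
```

The unweighted mean of the per-sample tau values computed here need not match the logged "Avg acceptance length (tau)", which is presumably averaged over blocks rather than over samples. Likewise, summing the rounded per-sample times only approximates the logged "Total GPU-time".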