ruipeterpan committed
Commit a4de607 · verified · 1 Parent(s): 451404a

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -59,7 +59,7 @@ python -m sglang.launch_server --model Qwen/Qwen2.5-7B-Instruct \
  ```
 
 
- ### Performance Evaluation
+ ### vLLM Performance Evaluation
 
  We run our evaluations on two NVIDIA A6000-48GB GPUs connected via PCIe 4.0 x16. We conducted an extensive hyperparameter search of `num_speculative_tokens` from 3 to 20. In each entry, we report the best speedup across different speculation lengths. The following table reports the TPT speedup over vanilla decoding.
 