Update instruction in Readme
README.md (CHANGED)
@@ -45,6 +45,8 @@ vllm serve \
### Evaluation
From vllm-bench, the acceptance length (AL) on the HumanEval dataset with different speculation lengths K is shown below.

+ We use instruction-formatted prompts following standard practice for instruct models (similar to the [DeepSeek-Coder evaluation](https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/Evaluation/HumanEval/eval_instruct.py) and the [Llama 3.1 8B Instruct evaluation](https://huggingface.co/datasets/meta-llama/Llama-3.1-8B-Instruct-evals/viewer/Llama-3.1-8B-Instruct-evals__human_eval__details?row=1)). The instruction we prepend to each prompt is ```Complete the following Python function. Only output the code, no explanations.```
+
| K | Acceptance Length |
|---|-------------------|
| 4 | 4.30 |

@@ -60,7 +62,7 @@ vllm bench serve \
  --model Qwen/Qwen3-Coder-30B-A3B-Instruct \
  --dataset-name custom \
  --dataset-path /home/ubuntu/eval_datasets/humaneval_qwen3coder_bench.jsonl \
- --custom-output-len
+ --custom-output-len 256 \
  --num-prompts 80 \
  --max-concurrency 1 \
  --temperature 0 \
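
For context, below is a minimal sketch of how an instruction-prefixed HumanEval JSONL like the one referenced by `--dataset-path` could be produced for the `--dataset-name custom` benchmark. This is an illustrative assumption rather than the script used in this repo: the per-line `prompt` field name and the use of the Hugging Face `openai_humaneval` dataset are assumptions, and the output path should match whatever is passed to `--dataset-path`.

```python
# Illustrative sketch only: builds an instruction-prefixed HumanEval JSONL.
# Assumptions: the custom-dataset JSONL is read with one "prompt" field per
# line, and the instruction text matches the one quoted in the README above.
import json

from datasets import load_dataset  # Hugging Face `datasets` package

INSTRUCTION = (
    "Complete the following Python function. "
    "Only output the code, no explanations."
)


def main() -> None:
    # HumanEval ships a single "test" split with a "prompt" field per problem.
    humaneval = load_dataset("openai_humaneval", split="test")
    with open("humaneval_qwen3coder_bench.jsonl", "w") as f:
        for example in humaneval:
            record = {"prompt": f"{INSTRUCTION}\n{example['prompt']}"}
            f.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    main()
```

The file is generated once and then passed via `--dataset-path` to the `vllm bench serve` command shown in the diff above.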