update
Browse files- evaluation/intro.txt +2 -1
evaluation/intro.txt
CHANGED
|
@@ -19,6 +19,8 @@ In most papers, 200 candidate program completions are sampled, and pass@1, pass@
|
|
| 19 |
We can load the HumanEval dataset and the pass@k metric from the hub:
|
| 20 |
|
| 21 |
```python
|
|
|
|
|
|
|
| 22 |
human_eval = load_dataset("openai_humaneval")
|
| 23 |
code_eval_metric = load_metric("code_eval")
|
| 24 |
```
|
|
@@ -26,7 +28,6 @@ code_eval_metric = load_metric("code_eval")
|
|
| 26 |
We can easily compute the pass@k for a problem that asks for the implementation of a function that sums two integers:
|
| 27 |
|
| 28 |
```python
|
| 29 |
-
from datasets import load_metric
|
| 30 |
test_cases = ["assert add(2,3)==5"]
|
| 31 |
candidates = [["def add(a,b): return a*b", "def add(a, b): return a+b"]]
|
| 32 |
pass_at_k, results = code_eval_metric.compute(references=test_cases, predictions=candidates, k=[1, 2])
|
|
|
|
| 19 |
We can load the HumanEval dataset and the pass@k metric from the hub:
|
| 20 |
|
| 21 |
```python
|
| 22 |
+
from datasets import load_dataset, load_metric
|
| 23 |
+
|
| 24 |
human_eval = load_dataset("openai_humaneval")
|
| 25 |
code_eval_metric = load_metric("code_eval")
|
| 26 |
```
|
|
|
|
| 28 |
We can easily compute the pass@k for a problem that asks for the implementation of a function that sums two integers:
|
| 29 |
|
| 30 |
```python
|
|
|
|
| 31 |
test_cases = ["assert add(2,3)==5"]
|
| 32 |
candidates = [["def add(a,b): return a*b", "def add(a, b): return a+b"]]
|
| 33 |
pass_at_k, results = code_eval_metric.compute(references=test_cases, predictions=candidates, k=[1, 2])
|