Spaces: gabeorlanski/bc_eval
gabeorlanski committed on Jul 18, 2023

Commit 9610edf (unverified) · 1 Parent(s): 506dd90

Fix

Files changed (1): README.md

@@ -60,9 +60,9 @@ metrics, results = metric.compute(
 The `bc_eval` metric outputs two things:
-* `metrics`: a dictionary with the pass rates for each k value defined in the arguments and the mean percent of tests passed per question. The keys are formatted as `{LANGUAGE NAME}/{METRIC NAME}`
-* `results`: a list of dictionaries with the results from each individual prediction.
+`metrics`: a dictionary with the pass rates for each k value defined in the arguments and the mean percent of tests passed per question. The keys are formatted as `{LANGUAGE NAME}/{METRIC NAME}`
+`results`: a list of dictionaries with the results from each individual prediction.
 #### Values from Popular Papers
 [PaLM-2](https://arxiv.org/pdf/2305.10403.pdf) Performance on BC-HumanEval (`pass@1` with greedy decoding):
@@ -87,7 +87,7 @@ The `bc_eval` metric outputs two things:
 Full example with inputs that fail tests, time out, have an error, and pass.
 #### Passing Example
-```python
+```Python
 import evaluate
 from datasets import load_dataset
 import os