Kewen Zhao
committed on
Commit
·
86bf33d
1
Parent(s):
f5dea60
update readme
README.md CHANGED
@@ -11,16 +11,16 @@ tags:
 - evaluate
 - metric
 description: >-
-
-  described in the paper "Evaluating Large Language Models Trained on Code"
-  (https://arxiv.org/abs/2107.03374).
+  The stdio version of the ["code eval"](https://huggingface.co/spaces/evaluate-metric/code_eval) metrics, which handles Python programs that read inputs from STDIN and print answers to STDOUT, which is common in competitive programming (e.g. Codeforces, USACO)
 ---
 
 # Metric Card for Code Eval StdIO
 
 ## Metric description
 
-
+This metric implements the evaluation harness for the HumanEval problem solving dataset
+described in the paper "Evaluating Large Language Models Trained on Code"
+(https://arxiv.org/abs/2107.03374).
 
 The CodeEval metric estimates the pass@k metric for code synthesis.
 
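To make the README's two ideas concrete, here is a minimal sketch, not this Space's actual implementation. It runs a candidate Python program with a given STDIN, compares its STDOUT to the expected answer, and then applies the unbiased pass@k estimator from the cited paper, 1 - C(n-c, k) / C(n, k), where n is the number of samples and c the number that pass. The function names `passes_stdio_case` and `pass_at_k`, the timeout, and the whitespace-stripping comparison are all assumptions made for illustration.

```python
import subprocess
import sys
from math import comb


def passes_stdio_case(source: str, stdin_text: str, expected_stdout: str,
                      timeout: float = 5.0) -> bool:
    """Run `source` as a Python program, feed it STDIN, and compare STDOUT.

    Hypothetical helper. It runs the code with no sandboxing, so it should
    only be used on programs you trust.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", source],
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    # Assumption: leading/trailing whitespace differences are ignored.
    return (result.returncode == 0
            and result.stdout.strip() == expected_stdout.strip())


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for n samples of which c are correct."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset must contain at least one correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Two candidate solutions to "read two integers, print their sum".
    candidates = [
        "a, b = map(int, input().split())\nprint(a + b)",
        "print(0)",
    ]
    results = [passes_stdio_case(src, "2 3\n", "5\n") for src in candidates]
    print(pass_at_k(n=len(results), c=sum(results), k=1))  # 0.5
```

With one of the two candidates passing, the estimate for k=1 is 1 - C(1, 1) / C(2, 1) = 0.5.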