Update README.md
README.md CHANGED
@@ -64,8 +64,9 @@ Debugged vibecoder dataset
 |-----------------|---------|------------------|--------|--------------|-----------------------------|------------|-------------|
 | gsm8k_cot | 3 | flexible-extract | 3 | exact_match ↑ | 0.8452+(0.7667) | 0.78 | 0.82 |
 | humaneval | 1 | create_test | 0 | exact_match ↑ | 0.933+( 0.8) | 0.73 | 0.92 |
-| mmlu_college_biology| 1 | create_test | 0 | exact_match ↑ | 1.0
-| mmlu_high_school_computer_science| 1 | create_test | 0
+| mmlu_college_biology| 1 | create_test | 0 | exact_match ↑ | 1.0 | | |
+| mmlu_high_school_computer_science| 1 | create_test | 0 | exact_match ↑ | 1.0+(0.9) | | |
+|computer_security| 1 | none | 2 | acc ↑ |0.8528+(0.700)| | |
 
 ## Example Usage
 