Commit cefca95 · 1 parent: 702785a
:pencil: include more evaluations
README.md
CHANGED
@@ -64,14 +64,17 @@ On December 25, 2022, in New York City, the sun will rise at 07:16 AM and set at
 Evaluations are on part with Qwen3:
 
 ```
+hf (pretrained=pool-water/script-kiddie,dtype=bfloat16), gen_kwargs: (None), limit: None, num_fewshot: 2, batch_size: auto (40)
 | Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
 |---------|------:|------|-----:|--------|---|-----:|---|-----:|
-
-
+|boolq | 2|none | 2|acc |_ |0.6939|_ |0.0081|
+|hellaswag| 1|none | 2|acc |_ |0.3961|_ |0.0049|
+| | |none | 2|acc_norm|_ |0.4963|_ |0.0050|
+|piqa | 1|none | 2|acc |_ |0.6757|_ |0.0109|
+| | |none | 2|acc_norm|_ |0.6741|_ |0.0109|
+|rte | 1|none | 2|acc |_ |0.6751|_ |0.0282|
 ```
 
-_Currently running more evaluatiomns_
-
 ## Usage
 
 Suggested use is: