Update README.md
README.md
CHANGED
@@ -310,7 +310,7 @@ This checkpoint has strong zero-shot validation performance on many tasks (e.g.
 | anli/a2 | 47.2 |
 | anli/a3 | 49.4 |
 | nli_fever | 79.4 |
-
+| FOLIO | 61.8 |
 | ConTRoL-nli | 63.3 |
 | cladder | 71.1 |
 | zero-shot-label-nli | 74.4 |
@@ -318,6 +318,8 @@ This checkpoint has strong zero-shot validation performance on many tasks (e.g.
 | oasst2_pairwise_rlhf_reward | 73.9 |
 | doc-nli | 90.0 |
 
+Zero-shot GPT-4 scores 61% on FOLIO (logical reasoning), 62% on cladder (probabilistic reasoning) and 56.4% on ConTRoL (long context NLI).
+
 # [ZS] Zero-shot classification pipeline
 ```python
 from transformers import pipeline