### Benchmarks

Below are F1 scores on several text classification datasets. None of the tested models were fine-tuned on these datasets; all were evaluated in a zero-shot setting.

#### Multilingual benchmarks

| Dataset                  | gliclass-x-base | gliclass-base-v3.0 | gliclass-large-v3.0 |
| ------------------------ | --------------- | ------------------ | ------------------- |
| FredZhang7/toxi-text-3M  | 0.5972          | 0.5072             | 0.6118              |
| SetFit/xglue\_nc         | 0.5014          | 0.5348             | 0.5378              |
| Davlan/sib200\_14classes | 0.4663          | 0.2867             | 0.3173              |
| uhhlt/GermEval2017       | 0.3999          | 0.4010             | 0.4299              |
| dolfsai/toxic\_es        | 0.1250          | 0.1399             | 0.1412              |
| **Average**              | **0.41796**     | **0.37392**        | **0.4076**          |

#### General benchmarks

| Dataset                      | gliclass-x-base | gliclass-base-v3.0 | gliclass-large-v3.0 |
| ---------------------------- | --------------- | ------------------ | ------------------- |
| SetFit/CR                    | 0.8630          | 0.9398             | 0.9400              |
| SetFit/sst2                  | 0.8554          | 0.9192             | 0.9192              |
| SetFit/sst5                  | 0.3287          | 0.4606             | 0.4606              |
| AmazonScience/massive        | 0.2611          | 0.5649             | 0.5650              |
| stanfordnlp/imdb             | 0.8840          | 0.9366             | 0.9366              |
| SetFit/20\_newsgroups        | 0.4116          | 0.5958             | 0.5958              |
| SetFit/enron\_spam           | 0.5929          | 0.7584             | 0.7584              |
| PolyAI/banking77             | 0.3098          | 0.5574             | 0.5574              |
| takala/financial\_phrasebank | 0.7851          | 0.9000             | 0.9000              |
| ag\_news                     | 0.6815          | 0.7181             | 0.7181              |
| dair-ai/emotion              | 0.3667          | 0.4506             | 0.4510              |
| MoritzLaurer/cap\_sotu       | 0.3935          | 0.4589             | 0.6118              |
| cornell/rotten\_tomatoes     | 0.8411          | 0.8411             | 0.8411              |
| **Average**                  | **0.5902**      | **0.7001**         | **0.7120**          |
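As a reminder of what the scores above measure, here is a minimal pure-Python sketch of per-class and macro-averaged F1. The toy labels are illustrative only (they are not from the benchmark datasets), and the tables above may use a different averaging scheme (e.g. weighted) than the macro average shown here.

```python
def f1_per_class(y_true, y_pred, label):
    """F1 for a single class: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes in y_true."""
    labels = sorted(set(y_true))
    return sum(f1_per_class(y_true, y_pred, lab) for lab in labels) / len(labels)

# Toy 3-class sentiment predictions (hypothetical, for illustration only).
y_true = ["pos", "pos", "neg", "neg", "neu", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos"]
print(round(macro_f1(y_true, y_pred), 4))  # → 0.6556
```

Macro averaging treats every class equally regardless of support, which is why scores on heavily imbalanced datasets (e.g. toxicity corpora) can look low even when overall accuracy is high.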