Commit 02e1772 (parent: 779a575): Update README.md

README.md CHANGED
@@ -63,7 +63,7 @@ widget:
 - [Intended uses and limitations](#intended-uses-and-limitations)
 - [How to use](#how-to-use)
 - [Limitations and bias](#limitations-and-bias)
-- [Training](#training)
+- [Training](#training)
 - [Evaluation](#evaluation)
 - [Additional information](#additional-information)

@@ -200,12 +200,49 @@ The following is a list of evaluation areas and their respective datasets:
 - Commonsense Reasoning: [COPA](https://people.ict.usc.edu/~gordon/copa.html) and its translation to Catalan ([COPA-ca](https://huggingface.co/datasets/projecte-aina/COPA-ca))
 - Translation: [FLoRes](https://huggingface.co/datasets/flores)
 
-
-
-
-
-
-
+### Reading Comprehension and Question Answering
+
+| Model | Belebele-ca | Belebele-es | Belebele-en | XQuAD-ca | XQuAD-es | XQuAD-en | CatalanQA | CoQCat |
+|-------|:-----------:|:-----------:|:-----------:|:--------:|:--------:|:--------:|:---------:|:------:|
+| Random | 25.00 | 25.00 | 25.00 | - | - | - | - | - |
+| mGPT-1.3B | 26.64 | 25.82 | 28.07 | 0.33 | 0.67 | 0.17 | 0.65 | 0.78 |
+| GPT-Neo-1.3B | 39.55 | 37.50 | 42.83 | 19.75 | 29.77 | 51.53 | 22.34 | 23.57 |
+| Pythia-1.4B | 38.32 | 36.89 | 44.26 | 26.19 | 34.13 | 52.98 | 27.47 | 25.38 |
+| OPT-1.3B | 35.86 | 37.09 | 45.49 | 23.53 | 31.85 | 52.95 | 26.58 | 20.18 |
+| Falcon-rw-1.3B | 34.84 | 35.66 | **50.61** | 5.93 | 19.25 | **58.60** | 6.91 | 15.61 |
+| Cerebras-GPT-1.3B | 32.79 | 31.76 | 35.04 | 8.56 | 19.98 | 36.00 | 10.87 | 14.12 |
+| BLOOM-1.1B | 39.34 | 38.32 | 41.19 | 36.81 | 36.98 | 44.10 | 44.65 | 34.57 |
+| FLOR-760M | **41.19** | **39.55** | 36.68 | **41.10** | **41.11** | 40.20 | **51.01** | **41.34** |
+
+
+### Natural Language Inference and Paraphrase Identification
+
+| Model | XNLI-ca | XNLI-es | XNLI-en | TECA-ca | PAWS-X-ca | PAWS-X-es | PAWS-X-en | Parafraseja |
+|-------|:-------:|:-------:|:-------:|:-------:|:---------:|:---------:|:---------:|:-----------:|
+| Random | 33.33 | 33.33 | 33.33 | 33.33 | 50.00 | 50.00 | 50.00 | 50.00 |
+| mGPT-1.3B | 40.06 | 43.81 | 45.67 | 37.03 | 51.00 | 52.30 | 56.15 | 51.32 |
+| GPT-Neo-1.3B | 41.44 | 45.57 | 49.92 | 35.38 | 54.65 | 53.40 | 54.60 | 51.70 |
+| Pythia-1.4B | 42.46 | 45.61 | 51.00 | 37.46 | 54.15 | 52.50 | **57.70** | 55.23 |
+| OPT-1.3B | 40.08 | 44.53 | **52.48** | 36.14 | 54.10 | 52.55 | 55.90 | 53.23 |
+| Falcon-rw-1.3B | 34.53 | 35.85 | 45.73 | 34.96 | 54.25 | **54.05** | 53.65 | 50.60 |
+| Cerebras-GPT-1.3B | 36.83 | 38.88 | 47.25 | 35.62 | 52.40 | 52.20 | 55.95 | 52.05 |
+| BLOOM-1.1B | **47.19** | **46.39** | 49.44 | 41.38 | **55.05** | 54.05 | 54.75 | 55.65 |
+| FLOR-760M | 46.93 | 46.03 | 46.11 | **42.14** | 52.35 | 52.50 | 54.85 | **56.55** |
+
+
+### Commonsense Reasoning and Translation
+
+| Model | XStoryCloze-es | XStoryCloze-en | COPA-ca | COPA-en | FLoRes (ca->es) | FLoRes (es->ca) | FLoRes (ca->en) | FLoRes (en->ca) | FLoRes (es->en) | FLoRes (en->es) |
+|-------|:--------------:|:--------------:|:-------:|:-------:|:---------------:|:---------------:|:---------------:|:---------------:|:---------------:|:---------------:|
+| Random | 50.00 | 50.00 | 50.00 | 50.00 | - | - | - | - | - | - |
+| mGPT-1.3B | 55.33 | 60.09 | 52.20 | 63.40 | 3.25 | 2.96 | 9.25 | 3.79 | 17.75 | 15.34 |
+| GPT-Neo-1.3B | 51.42 | 66.58 | 53.40 | 74.80 | 3.27 | 3.80 | 17.77 | 5.49 | 17.70 | 12.04 |
+| Pythia-1.4B | 54.14 | 68.37 | 52.20 | 78.60 | 9.68 | 5.74 | 24.03 | 11.10 | 21.50 | 15.04 |
+| OPT-1.3B | 53.94 | 69.95 | 52.60 | 76.20 | 3.14 | 3.52 | 15.39 | 2.00 | 16.33 | 6.53 |
+| Falcon-rw-1.3B | 51.09 | **71.34** | 52.40 | **79.60** | 3.03 | 3.59 | 8.89 | 3.01 | 14.17 | 6.50 |
+| Cerebras-GPT-1.3B | 49.11 | 60.62 | 51.40 | 66.80 | 2.42 | 1.81 | 2.69 | 0.82 | 3.36 | 1.77 |
+| BLOOM-1.1B | 57.91 | 62.48 | 62.80 | 66.40 | 21.62 | 15.28 | 31.16 | 21.28 | **20.92** | 16.84 |
+| FLOR-760M | **61.42** | 61.42 | **65.40** | 64.20 | **22.62** | **15.77** | **32.26** | **26.04** | 20.91 | **18.08** |
 
 
 ## Additional information