gonzalez-agirre committed
Commit 02e1772 · 1 Parent(s): 779a575

Update README.md

  1. README.md +44 -7
README.md CHANGED
@@ -63,7 +63,7 @@ widget:
  - [Intended uses and limitations](#intended-uses-and-limitations)
  - [How to use](#how-to-use)
  - [Limitations and bias](#limitations-and-bias)
- - [Training](#training)
  - [Evaluation](#evaluation)
  - [Additional information](#additional-information)
 
@@ -200,12 +200,49 @@ The following is a list of evaluation areas and their respective datasets:
  - Commonsense Reasoning: [COPA](https://people.ict.usc.edu/~gordon/copa.html) and its translation to Catalan ([COPA-ca](https://huggingface.co/datasets/projecte-aina/COPA-ca))
  - Translation: [FLoRes](https://huggingface.co/datasets/flores)

-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/635ba692dc371b8f91005172/nKvFF6Ap7ocdAtSBQyD6Q.png)
-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/635ba692dc371b8f91005172/OcCNfkKyGB4zXi2pXjbB4.png)
-
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/635ba692dc371b8f91005172/d3iW68sAubt1uU0-le5hX.png)


  ## Additional information
 
  - [Intended uses and limitations](#intended-uses-and-limitations)
  - [How to use](#how-to-use)
  - [Limitations and bias](#limitations-and-bias)
+ - [Training](#training)
  - [Evaluation](#evaluation)
  - [Additional information](#additional-information)
 
 
  - Commonsense Reasoning: [COPA](https://people.ict.usc.edu/~gordon/copa.html) and its translation to Catalan ([COPA-ca](https://huggingface.co/datasets/projecte-aina/COPA-ca))
  - Translation: [FLoRes](https://huggingface.co/datasets/flores)

+ ### Reading Comprehension and Question Answering
+
+ | Model | Belebele-ca | Belebele-es | Belebele-en | XQuAD-ca | XQuAD-es | XQuAD-en | CatalanQA | CoQCat |
+ | ------|:-----------:|:-----------:|:-----------:|:--------:|:--------:|:--------:|:---------:|:------:|
+ | Random | 25.00 | 25.00 | 25.00 | - | - | - | - | - |
+ | mGPT-1.3B | 26.64 | 25.82 | 28.07 | 0.33 | 0.67 | 0.17 | 0.65 | 0.78 |
+ | GPT-Neo-1.3B | 39.55 | 37.50 | 42.83 | 19.75 | 29.77 | 51.53 | 22.34 | 23.57 |
+ | Pythia-1.4B | 38.32 | 36.89 | 44.26 | 26.19 | 34.13 | 52.98 | 27.47 | 25.38 |
+ | OPT-1.3B | 35.86 | 37.09 | 45.49 | 23.53 | 31.85 | 52.95 | 26.58 | 20.18 |
+ | Falcon-rw-1.3B | 34.84 | 35.66 | **50.61** | 5.93 | 19.25 | **58.60** | 6.91 | 15.61 |
+ | Cerebras-GPT-1.3B | 32.79 | 31.76 | 35.04 | 8.56 | 19.98 | 36.00 | 10.87 | 14.12 |
+ | BLOOM-1.1B | 39.34 | 38.32 | 41.19 | 36.81 | 36.98 | 44.10 | 44.65 | 34.57 |
+ | FLOR-760M | **41.19** | **39.55** | 36.68 | **41.10** | **41.11** | 40.20 | **51.01** | **41.34** |
+
+
+ ### Natural Language Inference and Paraphrase Identification
+
+ | Model | XNLI-ca | XNLI-es | XNLI-en | TECA-ca | PAWS-X-ca | PAWS-X-es | PAWS-X-en | Parafraseja |
+ | ------|:-------:|:-------:|:-------:|:-------:|:---------:|:---------:|:---------:|:-----------:|
+ | Random | 33.33 | 33.33 | 33.33 | 33.33 | 50.00 | 50.00 | 50.00 | 50.00 |
+ | mGPT-1.3B | 40.06 | 43.81 | 45.67 | 37.03 | 51.00 | 52.30 | 56.15 | 51.32 |
+ | GPT-Neo-1.3B | 41.44 | 45.57 | 49.92 | 35.38 | 54.65 | 53.40 | 54.60 | 51.70 |
+ | Pythia-1.4B | 42.46 | 45.61 | 51.00 | 37.46 | 54.15 | 52.50 | **57.70** | 55.23 |
+ | OPT-1.3B | 40.08 | 44.53 | **52.48** | 36.14 | 54.10 | 52.55 | 55.90 | 53.23 |
+ | Falcon-rw-1.3B | 34.53 | 35.85 | 45.73 | 34.96 | 54.25 | **54.05** | 53.65 | 50.60 |
+ | Cerebras-GPT-1.3B | 36.83 | 38.88 | 47.25 | 35.62 | 52.40 | 52.20 | 55.95 | 52.05 |
+ | BLOOM-1.1B | **47.19** | **46.39** | 49.44 | 41.38 | **55.05** | 54.05 | 54.75 | 55.65 |
+ | FLOR-760M | 46.93 | 46.03 | 46.11 | **42.14** | 52.35 | 52.50 | 54.85 | **56.55** |
+
+
+ ### Commonsense Reasoning and Translation
+
+ | Model | XStoryCloze-es | XStoryCloze-en | COPA-ca | COPA-en | FLoRes (ca->es) | FLoRes (es->ca) | FLoRes (ca->en) | FLoRes (en->ca) | FLoRes (es->en) | FLoRes (en->es) |
+ | ------|:--------------:|:--------------:|:-------:|:-------:|:---------------:|:---------------:|:---------------:|:---------------:|:---------------:|:---------------:|
+ | Random | 50.00 | 50.00 | 50.00 | 50.00 | - | - | - | - | - | - |
+ | mGPT-1.3B | 55.33 | 60.09 | 52.20 | 63.40 | 3.25 | 2.96 | 9.25 | 3.79 | 17.75 | 15.34 |
+ | GPT-Neo-1.3B | 51.42 | 66.58 | 53.40 | 74.80 | 3.27 | 3.80 | 17.77 | 5.49 | 17.70 | 12.04 |
+ | Pythia-1.4B | 54.14 | 68.37 | 52.20 | 78.60 | 9.68 | 5.74 | 24.03 | 11.10 | 21.50 | 15.04 |
+ | OPT-1.3B | 53.94 | 69.95 | 52.60 | 76.20 | 3.14 | 3.52 | 15.39 | 2.00 | 16.33 | 6.53 |
+ | Falcon-rw-1.3B | 51.09 | **71.34** | 52.40 | **79.60** | 3.03 | 3.59 | 8.89 | 3.01 | 14.17 | 6.50 |
+ | Cerebras-GPT-1.3B | 49.11 | 60.62 | 51.40 | 66.80 | 2.42 | 1.81 | 2.69 | 0.82 | 3.36 | 1.77 |
+ | BLOOM-1.1B | 57.91 | 62.48 | 62.80 | 66.40 | 21.62 | 15.28 | 31.16 | 21.28 | **20.92** | 16.84 |
+ | FLOR-760M | **61.42** | 61.42 | **65.40** | 64.20 | **22.62** | **15.77** | **32.26** | **26.04** | 20.91 | **18.08** |


  ## Additional information