Commit 02e1772 (parent: 779a575): Update README.md

README.md CHANGED
@@ -63,7 +63,7 @@ widget:
 - [Intended uses and limitations](#intended-uses-and-limitations)
 - [How to use](#how-to-use)
 - [Limitations and bias](#limitations-and-bias)
-- [Training](#training)
+- [Training](#training)
 - [Evaluation](#evaluation)
 - [Additional information](#additional-information)

@@ -200,12 +200,49 @@ The following is a list of evaluation areas and their respective datasets:
 - Commonsense Reasoning: [COPA](https://people.ict.usc.edu/~gordon/copa.html) and its translation to Catalan ([COPA-ca](https://huggingface.co/datasets/projecte-aina/COPA-ca))
 - Translation: [FLoRes](https://huggingface.co/datasets/flores)
 
-
-
-
-
-
-
+### Reading Comprehension and Question Answering
+
+| Model | Belebele-ca | Belebele-es | Belebele-en | XQuAD-ca | XQuAD-es | XQuAD-en | CatalanQA | CoQCat |
+|-------|:-----------:|:-----------:|:-----------:|:--------:|:--------:|:--------:|:---------:|:------:|
+| Random | 25.00 | 25.00 | 25.00 | - | - | - | - | - |
+| mGPT-1.3B | 26.64 | 25.82 | 28.07 | 0.33 | 0.67 | 0.17 | 0.65 | 0.78 |
+| GPT-Neo-1.3B | 39.55 | 37.50 | 42.83 | 19.75 | 29.77 | 51.53 | 22.34 | 23.57 |
+| Pythia-1.4B | 38.32 | 36.89 | 44.26 | 26.19 | 34.13 | 52.98 | 27.47 | 25.38 |
+| OPT-1.3B | 35.86 | 37.09 | 45.49 | 23.53 | 31.85 | 52.95 | 26.58 | 20.18 |
+| Falcon-rw-1.3B | 34.84 | 35.66 | **50.61** | 5.93 | 19.25 | **58.60** | 6.91 | 15.61 |
+| Cerebras-GPT-1.3B | 32.79 | 31.76 | 35.04 | 8.56 | 19.98 | 36.00 | 10.87 | 14.12 |
+| BLOOM-1.1B | 39.34 | 38.32 | 41.19 | 36.81 | 36.98 | 44.10 | 44.65 | 34.57 |
+| FLOR-760M | **41.19** | **39.55** | 36.68 | **41.10** | **41.11** | 40.20 | **51.01** | **41.34** |
+
+
+### Natural Language Inference and Paraphrase Identification
+
+| Model | XNLI-ca | XNLI-es | XNLI-en | TECA-ca | PAWS-X-ca | PAWS-X-es | PAWS-X-en | Parafraseja |
+|-------|:-------:|:-------:|:-------:|:-------:|:---------:|:---------:|:---------:|:-----------:|
+| Random | 33.33 | 33.33 | 33.33 | 33.33 | 50.00 | 50.00 | 50.00 | 50.00 |
+| mGPT-1.3B | 40.06 | 43.81 | 45.67 | 37.03 | 51.00 | 52.30 | 56.15 | 51.32 |
+| GPT-Neo-1.3B | 41.44 | 45.57 | 49.92 | 35.38 | 54.65 | 53.40 | 54.60 | 51.70 |
+| Pythia-1.4B | 42.46 | 45.61 | 51.00 | 37.46 | 54.15 | 52.50 | **57.70** | 55.23 |
+| OPT-1.3B | 40.08 | 44.53 | **52.48** | 36.14 | 54.10 | 52.55 | 55.90 | 53.23 |
+| Falcon-rw-1.3B | 34.53 | 35.85 | 45.73 | 34.96 | 54.25 | **54.05** | 53.65 | 50.60 |
+| Cerebras-GPT-1.3B | 36.83 | 38.88 | 47.25 | 35.62 | 52.40 | 52.20 | 55.95 | 52.05 |
+| BLOOM-1.1B | **47.19** | **46.39** | 49.44 | 41.38 | **55.05** | 54.05 | 54.75 | 55.65 |
+| FLOR-760M | 46.93 | 46.03 | 46.11 | **42.14** | 52.35 | 52.50 | 54.85 | **56.55** |
+
+
+### Commonsense Reasoning and Translation
+
+| Model | XStoryCloze-es | XStoryCloze-en | COPA-ca | COPA-en | FLoRes (ca->es) | FLoRes (es->ca) | FLoRes (ca->en) | FLoRes (en->ca) | FLoRes (es->en) | FLoRes (en->es) |
+|-------|:--------------:|:--------------:|:-------:|:-------:|:---------------:|:---------------:|:---------------:|:---------------:|:---------------:|:---------------:|
+| Random | 50.00 | 50.00 | 50.00 | 50.00 | - | - | - | - | - | - |
+| mGPT-1.3B | 55.33 | 60.09 | 52.20 | 63.40 | 3.25 | 2.96 | 9.25 | 3.79 | 17.75 | 15.34 |
+| GPT-Neo-1.3B | 51.42 | 66.58 | 53.40 | 74.80 | 3.27 | 3.80 | 17.77 | 5.49 | 17.70 | 12.04 |
+| Pythia-1.4B | 54.14 | 68.37 | 52.20 | 78.60 | 9.68 | 5.74 | 24.03 | 11.10 | 21.50 | 15.04 |
+| OPT-1.3B | 53.94 | 69.95 | 52.60 | 76.20 | 3.14 | 3.52 | 15.39 | 2.00 | 16.33 | 6.53 |
+| Falcon-rw-1.3B | 51.09 | **71.34** | 52.40 | **79.60** | 3.03 | 3.59 | 8.89 | 3.01 | 14.17 | 6.50 |
+| Cerebras-GPT-1.3B | 49.11 | 60.62 | 51.40 | 66.80 | 2.42 | 1.81 | 2.69 | 0.82 | 3.36 | 1.77 |
+| BLOOM-1.1B | 57.91 | 62.48 | 62.80 | 66.40 | 21.62 | 15.28 | 31.16 | 21.28 | **20.92** | 16.84 |
+| FLOR-760M | **61.42** | 61.42 | **65.40** | 64.20 | **22.62** | **15.77** | **32.26** | **26.04** | 20.91 | **18.08** |
 
 
 ## Additional information