mapama247 committed
Commit 54fee6e · 1 Parent(s): aa98172

Update README.md

Files changed (1): README.md (+12, -12)
README.md CHANGED
@@ -18,7 +18,7 @@ widget:
 - text: "Catalunya és una referència en <mask> a nivell europeu."
 ---

-# [TODO: MODEL_NAME]
+# DistilBerta-base

 ## Table of Contents
 <details>
@@ -42,10 +42,10 @@ widget:
 </details>

 ## Overview
-- **Architecture:** [TODO: roberta-base/roberta-large/bart...]
-- **Language:** [TODO: Catalan/Spanish...]
-- **Task:** [TODO: Text Classification / Extractive QA...]
-- **Data:** [TODO: BNE/Tecla/Teca/Pharmaconer...]
+- **Architecture:** DistilRoBERTa
+- **Language:** Catalan
+- **Task:** Fill-Mask
+- **Data:** Crawling


 ## Model description
@@ -61,8 +61,7 @@ widget:
 ## How to use

 ```python
-from transformers import AutoModel, AutoTokenizer, pipeline
-from datasets import load_dataset
+from transformers import pipeline
 [TODO: Add minimal code here]
 ```

@@ -105,24 +104,25 @@ The training corpus consists of several corpora gathered from web crawling and p

 ### Evaluation results

+This model has been fine-tuned on the downstream tasks of the Catalan Language Understanding Evaluation benchmark (CLUB).
+
 | Task | NER (F1) | POS (F1) | STS-ca (Comb) | TeCla (Acc.) | TEca (Acc.) | VilaQuAD (F1/EM)| ViquiQuAD (F1/EM) | CatalanQA (F1/EM) | XQuAD-ca <sup>1</sup> (F1/EM) |
 | ------------|:-------------:| -----:|:------|:------|:-------|:------|:----|:----|:----|
-| RoBERTa-large-ca-v2 | **89.82** | **99.02** | **83.41** | **75.46** | **83.61** | **89.34/75.50** | **89.20**/75.77 | **90.72/79.06** | **73.79**/55.34 |
-| RoBERTa-base-ca-v2 | 89.29 | 98.96 | 79.07 | 74.26 | 83.14 | 87.74/72.58 | 88.72/**75.91** | 89.50/76.63 | 73.64/**55.42** |
+| RoBERTa-large-ca-v2 | 89.82 | 99.02 | 83.41 | 75.46 | 83.61 | 89.34/75.50 | 89.20/75.77 | 90.72/79.06 | 73.79/55.34 |
+| RoBERTa-base-ca-v2 | 89.29 | 98.96 | 79.07 | 74.26 | 83.14 | 87.74/72.58 | 88.72/75.91 | 89.50/76.63 | 73.64/55.42 |
 | DistilRoBERTa-base-ca-v2| xx.xx | xx.xx | xx.xx | xx.xx | xx.xx | xx.xx/xx.xx | xx.xx/xx.xx | xx.xx/xx.xx | xx.xx/xx.xx |

 <sup>1</sup> : Trained on CatalanQA, tested on XQuAD-ca.

-
 ## Additional Information

 ### Authors

-Text Mining Unit (TeMU) at the Barcelona Supercomputing Center (bsc-temu@bsc.es).
+The Text Mining Unit (TeMU) from Barcelona Supercomputing Center ([bsc-temu@bsc.es](mailto:bsc-temu@bsc.es)).

 ### Contact information

-For further information, send an email to aina@bsc.es.
+For further information, send an email to [aina@bsc.es](mailto:aina@bsc.es).

 ## Copyright
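The "How to use" section of the new README still carries a `[TODO: Add minimal code here]` placeholder after `from transformers import pipeline`. A minimal fill-mask sketch that the placeholder could eventually hold might look like the following — the checkpoint id `projecte-aina/distilroberta-base-ca-v2` is an assumption here (substitute the repository's actual model id), and the example sentence is taken from the card's own widget:

```python
from transformers import pipeline

# Assumed model id -- replace with the actual Hub id of this checkpoint.
MODEL_ID = "projecte-aina/distilroberta-base-ca-v2"

def top_predictions(text: str, model_id: str = MODEL_ID, k: int = 5):
    """Return the top-k fill-mask predictions for a sentence containing <mask>."""
    unmasker = pipeline("fill-mask", model=model_id, top_k=k)
    return unmasker(text)

if __name__ == "__main__":
    # Sentence from the model card's widget metadata.
    sentence = "Catalunya és una referència en <mask> a nivell europeu."
    for pred in top_predictions(sentence):
        # Each prediction is a dict with 'token_str', 'score', and 'sequence' keys.
        print(f"{pred['token_str']!r}: {pred['score']:.3f}")
```

The `fill-mask` pipeline downloads the tokenizer and model on first use; each returned dict contains the predicted token, its score, and the completed sequence.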