Update README.md
README.md (CHANGED)
```diff
@@ -32,7 +32,6 @@ widget:
 - [Training data](#training-data)
 - [Training procedure](#training-procedure)
 - [Evaluation](#evaluation)
-  - [Variable and metrics](#variable-and-metrics)
   - [Evaluation benchmark](#evaluation-benchmark)
   - [Evaluation results](#evaluation-results)
 - [Additional information](#additional-information)
@@ -114,16 +113,11 @@ As an example, the distilled version of BERT has 40% fewer parameters and runs 6
 
 ## Evaluation
 
-### Variable and metrics
-
-[TODO]
-
 ### Evaluation benchmark
 
 This model has been fine-tuned on the downstream tasks of the Catalan Language Understanding Evaluation benchmark (CLUB).
 
 Here are the train/dev/test splits of each dataset:
-
 | Dataset | Task | Total | Train | Dev | Test |
 |:--|:--|:--|:--|:--|:--|
 | Ancora | NER |13,581 | 10,628 | 1,427 | 1,526 |
@@ -138,7 +132,6 @@ Here are the train/dev/test splits of each dataset:
 ### Evaluation results
 
 This is how it compares to the teacher model when fine-tuned on the same downstream tasks:
-
 | Model \ Task| NER (F1) | POS (F1) | STS-ca (Comb) | TeCla (Acc.) | TEca (Acc.) | VilaQuAD (F1/EM)| ViquiQuAD (F1/EM) | CatalanQA (F1/EM) | XQuAD-ca <sup>1</sup> (F1/EM) |
 | ------------|:-------------:| -----:|:------|:------|:-------|:------|:----|:----|:----|
 | RoBERTa-large-ca-v2 | 89.82 | 99.02 | 83.41 | 75.46 | 83.61 | 89.34/75.50 | 89.20/75.77 | 90.72/79.06 | 73.79/55.34 |
```
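The evaluation section this diff cleans up compares the distilled student against its teacher, RoBERTa-large-ca-v2, on fine-tuned CLUB tasks. As a minimal sketch of how one of these fine-tuned checkpoints can be queried with the Hugging Face `transformers` pipeline API (the model id below is an assumed placeholder for illustration, not one named in this diff):

```python
# Minimal sketch: querying a CLUB-style fine-tuned NER checkpoint with the
# Hugging Face `transformers` pipeline API.
# NOTE: "projecte-aina/roberta-base-ca-v2-cased-ner" is an assumed,
# illustrative model id; substitute the checkpoint actually linked
# from the model card.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="projecte-aina/roberta-base-ca-v2-cased-ner",
    aggregation_strategy="simple",  # merge sub-word pieces into whole entities
)

for entity in ner("La Generalitat de Catalunya té la seu a Barcelona."):
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
```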