Update README.md
README.md CHANGED
@@ -23,22 +23,20 @@ Galician mBERT for Semantic Role Labeling (SRL) is a transformers model, leverag
 - Identify up to 13 verbal roots within a sentence.
 - Identify available arguments for each verbal root. Due to scarcity of data, this model focused solely on the identification of arguments 0, 1, and 2.
 
-Labels are produced as the following: r#:tag
-- where r# links the token to a specific verbal root of index #
-- and tag identifies the token as the verbal root (root) or an individual argument (arg0/arg1/arg2)
+Labels are produced as the following: r#:tag, where r# links the token to a specific verbal root of index #, and tag identifies the token as the verbal root (root) or an individual argument (arg0/arg1/arg2)
 
 - **Developed by:** [Micaella Bruton](mailto:micaellabruton@gmail.com)
 - **Model type:** Transformers
 - **Language(s) (NLP):** Galician (gl)
 - **License:** Apache 2.0
-- **Finetuned from model:** [
+- **Finetuned from model:** [multilingual BERT](https://huggingface.co/bert-base-multilingual-cased)
 
 ### Model Sources [optional]
 
 <!-- Provide the basic links for the model. -->
 
 - **Repository:** [GalicianSRL](https://github.com/mbruton0426/GalicianSRL)
-- **Paper
+- **Paper:** To be updated
 
 ## Uses
 
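For illustration, the `r#:tag` scheme described in the hunk above might label a two-verb sentence as in the sketch below. The sentence, its role assignments, and the placeholder tag for role-less tokens are illustrative assumptions, not model output or dataset examples.

```python
# Hypothetical illustration of the r#:tag labeling scheme (not actual model
# output): each token is tied to a verbal root index (r0, r1, ...) and a
# role tag (root, arg0, arg1, arg2).
tokens = ["Ela", "mercou", "unha", "casa", "e", "vendeu", "o", "coche"]
labels = ["r0:arg0", "r0:root", "r0:arg1", "r0:arg1", "O",
          "r1:root", "r1:arg1", "r1:arg1"]
# "mercou" (bought) is root 0, with "Ela" as its arg0 and "unha casa" as its
# arg1; "vendeu" (sold) is root 1, with "o coche" as its arg1. "O" stands in
# for a token with no role; the exact null tag is an assumption, not from the card.
```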
@@ -46,7 +44,7 @@ This model is intended to be used to develop and improve natural language proces
 
 ## Bias, Risks, and Limitations
 
-Galician is a low-resource language which prior to this project lacked a semantic role labeling dataset. As such, the dataset used to train this model is extremely limited and could benefit from the inclusion of additional
+Galician is a low-resource language which prior to this project lacked a semantic role labeling dataset. As such, the dataset used to train this model is extremely limited and could benefit from the inclusion of additional sentences and manual validation by native speakers.
 
 
 ## Training Details
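The Training Details context in the hunk below notes the model was trained on the "train" portion of the GalicianSRL dataset. A minimal loading sketch follows, assuming the dataset is hosted on the Hugging Face Hub; the dataset id is an assumption based on the project name.

```python
# A minimal sketch, assuming the GalicianSRL dataset is on the Hugging Face
# Hub; the dataset id below is hypothetical and must be replaced.
from datasets import load_dataset

dataset = load_dataset("mbruton/galician_srl")  # hypothetical id
train_split = dataset["train"]  # the card says training used the "train" portion
print(train_split[0])           # inspect one tokenized, labeled example
```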
@@ -74,15 +72,15 @@ This model was trained on the "train" portion of the [GalicianSRL](https://huggi
 It supplies scoring both overall and per label type.
 
 Overall:
-`accuracy`: the average [accuracy](https://huggingface.co/metrics/accuracy), on a scale between 0.0 and 1.0.
-`precision`: the average [precision](https://huggingface.co/metrics/precision), on a scale between 0.0 and 1.0.
-`recall`: the average [recall](https://huggingface.co/metrics/recall), on a scale between 0.0 and 1.0.
-`f1`: the average [F1 score](https://huggingface.co/metrics/f1), which is the harmonic mean of the precision and recall. It also has a scale of 0.0 to 1.0.
+- `accuracy`: the average [accuracy](https://huggingface.co/metrics/accuracy), on a scale between 0.0 and 1.0.
+- `precision`: the average [precision](https://huggingface.co/metrics/precision), on a scale between 0.0 and 1.0.
+- `recall`: the average [recall](https://huggingface.co/metrics/recall), on a scale between 0.0 and 1.0.
+- `f1`: the average [F1 score](https://huggingface.co/metrics/f1), which is the harmonic mean of the precision and recall. It also has a scale of 0.0 to 1.0.
 
 Per label type:
-`precision`: the average [precision](https://huggingface.co/metrics/precision), on a scale between 0.0 and 1.0.
-`recall`: the average [recall](https://huggingface.co/metrics/recall), on a scale between 0.0 and 1.0.
-`f1`: the average [F1 score](https://huggingface.co/metrics/f1), on a scale between 0.0 and 1.0.
+- `precision`: the average [precision](https://huggingface.co/metrics/precision), on a scale between 0.0 and 1.0.
+- `recall`: the average [recall](https://huggingface.co/metrics/recall), on a scale between 0.0 and 1.0.
+- `f1`: the average [F1 score](https://huggingface.co/metrics/f1), on a scale between 0.0 and 1.0.
 
 ### Results
 
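Since the card reports both overall and per-label scores, the sketch below shows one way such scores could be computed. The macro averaging and the use of scikit-learn are assumptions; the card does not specify the evaluation code.

```python
# A minimal scoring sketch over flat per-token label lists; macro averaging
# for the overall precision/recall/F1 is an assumption.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = ["r0:root", "r0:arg0", "r0:arg1", "O"]  # toy reference labels
y_pred = ["r0:root", "r0:arg0", "O", "O"]        # toy predictions

# Overall scores.
acc = accuracy_score(y_true, y_pred)
p, r, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)
print(f"accuracy={acc:.2f} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")

# Per-label scores.
labels = sorted(set(y_true) | set(y_pred))
p_l, r_l, f1_l, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=labels, zero_division=0)
for lab, pi, ri, fi in zip(labels, p_l, r_l, f1_l):
    print(f"{lab}: precision={pi:.2f} recall={ri:.2f} f1={fi:.2f}")
```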
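Finally, for the Uses section, a minimal inference sketch, assuming the checkpoint is published on the Hugging Face Hub as a token-classification model; the repository id is hypothetical and must be replaced with the actual one.

```python
# A minimal inference sketch; the model id below is an assumption, not
# confirmed by this card.
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

model_id = "mbruton/galician_srl_model"  # hypothetical id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

srl = pipeline("token-classification", model=model, tokenizer=tokenizer)
for pred in srl("Ela mercou unha casa."):
    print(pred["word"], pred["entity"], round(pred["score"], 3))  # token, r#:tag label, confidence
```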