---
license: mit
language:
- de
metrics:
- accuracy
tags:
- twitter
---

# T-GBERT

This is a [GBERT-base](https://huggingface.co/deepset/gbert-base) model with continued pretraining on roughly 33 million deduplicated German X/Twitter posts from 2020. The pretraining follows the task-adaptive pretraining setup suggested by [Gururangan et al. (2020)](https://aclanthology.org/2020.acl-main.740). In total, the model was trained for 10 epochs. I am sharing this model as it might be useful to some of you, and initial results suggest (some) improvements compared to [GBERT-base](https://huggingface.co/deepset/gbert-base) (which is a common choice for supervised fine-tuning).
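
A minimal sketch for loading the model with the `transformers` library (the repo id below is a placeholder for wherever this model is hosted):

```python
from transformers import pipeline

# Placeholder repo id; replace it with the actual Hugging Face path of this model.
fill_mask = pipeline("fill-mask", model="<your-namespace>/T-GBERT")

# GBERT uses the standard BERT [MASK] token.
print(fill_mask("Das Wetter ist heute wirklich [MASK]."))
```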

## Performance

| | [GermEval-2017](https://sites.google.com/view/germeval2017-absa/home)<br>(subtask B, synchronic test set) | [SB10k](https://aclanthology.org/W17-1106/) |
|:----------:|:-------------:|:-----:|
| GBERT-base | 79.77% | 82.29% |
| T-GBERT | 81.50% | 82.88% |

*Results report the accuracy (micro F1-score) on the test set of the respective dataset. The results represent the average of five runs with different seeds for data shuffling and parameter initialization.*
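
The numbers above come from supervised fine-tuning. A minimal fine-tuning sketch with the Hugging Face `Trainer` might look like the following; the repo id, dataset files, label count, and hyperparameters are placeholders, not the exact setup behind the reported results:

```python
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments, set_seed)

set_seed(42)  # the reported numbers average runs over five different seeds

model_name = "<your-namespace>/T-GBERT"  # placeholder repo id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Any German sentence-level sentiment dataset with 'text' and 'label' columns works here;
# the CSV file names are placeholders.
dataset = load_dataset("csv", data_files={"train": "train.csv", "test": "test.csv"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="t-gbert-sentiment",
                           num_train_epochs=3,
                           per_device_train_batch_size=32),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())
```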

## Preprocessing

Web links in posts were replaced by 'https' and user mentions were replaced by '@user'.
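
A rough sketch of this preprocessing with regular expressions (the exact patterns used for pretraining are an assumption):

```python
import re

def preprocess_post(text: str) -> str:
    # Replace web links with a generic 'https' token (pattern is an assumption).
    text = re.sub(r"https?://\S+", "https", text)
    # Replace user mentions with '@user'.
    text = re.sub(r"@\w+", "@user", text)
    return text

print(preprocess_post("@maxmuster schau mal https://example.com/artikel an!"))
# -> "@user schau mal https an!"
```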