---
datasets:
- samirmsallem/argument_mining_de
language:
- de
metrics:
- accuracy
base_model:
- deepset/gbert-base
pipeline_tag: text-classification
library_name: transformers
model-index:
- name: checkpoints
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: samirmsallem/argument_mining_de
      type: samirmsallem/argument_mining_de
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9657534246575342
---

## Text classification model for argument mining and detection

**gbert-base-argument_mining** is a German text classification model for the scientific domain, fine-tuned from [gbert-base](https://huggingface.co/deepset/gbert-base).
It was trained on a [synthetically created, annotated dataset](https://huggingface.co/datasets/samirmsallem/argument_mining_de) containing different sentence types that occur in the conclusions of scientific theses and papers.

### Training

The model was fine-tuned for 10 epochs. This repository contains the checkpoint from the fourth epoch, which reached the highest accuracy and the lowest loss:

| epoch | accuracy | loss |
|-------|----------|------|
| 1 | 0.9315 | 0.3872 |
| 2 | 0.9178 | 0.2987 |
| 3 | 0.9589 | 0.1519 |
| 4 | **0.9658** | **0.1162** |
| 5 | 0.9521 | 0.2100 |
| 6 | 0.9521 | 0.1979 |
| 7 | 0.9521 | 0.2453 |
| 8 | 0.9521 | 0.2251 |
| 9 | 0.9452 | 0.2225 |
| 10 | 0.9521 | 0.2286 |

On this dataset, the model learns to distinguish effectively between the sentence classes listed in the tag table below. However, after epoch 4 the loss rises (from 0.1162 to 0.2100 at epoch 5) and the accuracy does not recover, which indicates overfitting and may point to an imbalanced or noisy dataset. Future work should train the model on more robust data to obtain more reliable evaluation results.

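The checkpoint choice described above can be reproduced from the training table. The following is a minimal sketch, not the author's training code: the `HISTORY` list simply copies the table rows, and the helper names `best_epoch` and `lowest_loss_epoch` are hypothetical.

```python
# (epoch, accuracy, loss) rows copied from the training table above.
HISTORY = [
    (1, 0.9315, 0.3872),
    (2, 0.9178, 0.2987),
    (3, 0.9589, 0.1519),
    (4, 0.9658, 0.1162),
    (5, 0.9521, 0.2100),
    (6, 0.9521, 0.1979),
    (7, 0.9521, 0.2453),
    (8, 0.9521, 0.2251),
    (9, 0.9452, 0.2225),
    (10, 0.9521, 0.2286),
]


def best_epoch(history):
    """Return the epoch with the highest accuracy; ties go to the lower loss."""
    return max(history, key=lambda row: (row[1], -row[2]))[0]


def lowest_loss_epoch(history):
    """Return the epoch with the lowest loss."""
    return min(history, key=lambda row: row[2])[0]
```

Both criteria select the same epoch here, so the choice does not depend on whether accuracy or loss is used to pick the checkpoint.
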
### Text Classification Tags

| Text Classification Tag | Text Classification Label |
| :----: | :----: |
| 0 | CLAIM |
| 1 | COUNTERCLAIM |
| 2 | LINK |
| 3 | CONC |
| 4 | FUT |
| 5 | OTH |

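
The tag table above can be used to turn model predictions into readable labels. The sketch below is one possible way to run the model with the `transformers` text-classification pipeline. The repository id in `MODEL_ID` is an assumption derived from the model and dataset names and may need adjusting, and the helpers `normalize_label` and `classify` are hypothetical names introduced for illustration.

```python
# Tag-to-label mapping copied from the table above.
ID2LABEL = {
    0: "CLAIM",
    1: "COUNTERCLAIM",
    2: "LINK",
    3: "CONC",
    4: "FUT",
    5: "OTH",
}

# Assumption: the Hub repository id is not stated in this card; adjust as needed.
MODEL_ID = "samirmsallem/gbert-base-argument_mining"


def normalize_label(raw_label):
    """Map a generic 'LABEL_<n>' name to the label from the tag table.

    If the checkpoint config already stores readable names, the raw label
    is returned unchanged.
    """
    if raw_label.startswith("LABEL_"):
        return ID2LABEL[int(raw_label.split("_", 1)[1])]
    return raw_label


def classify(sentences, model_id=MODEL_ID):
    """Classify German sentences and return (label, score) pairs.

    The import is local so that the mapping helpers above can be used
    without transformers installed. Calling this function downloads the model.
    """
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    predictions = classifier(list(sentences))
    return [(normalize_label(p["label"]), p["score"]) for p in predictions]
```

A call such as `classify(["Zukünftige Arbeiten sollten größere Datensätze untersuchen."])` returns one `(label, score)` pair per input sentence.
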