---
datasets:
- samirmsallem/argument_mining_de
language:
- de
metrics:
- accuracy
base_model:
- deepset/gbert-base
pipeline_tag: text-classification
library_name: transformers
model-index:
- name: checkpoints
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: samirmsallem/argument_mining_de
      type: samirmsallem/argument_mining_de
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9657534246575342
---

## Text classification model for argument mining and detection

**gbert-base-argument_mining** is a German text classification model for the scientific domain, fine-tuned from [gbert-base](https://huggingface.co/deepset/gbert-base). It was trained on a [synthetically created, annotated dataset](https://huggingface.co/datasets/samirmsallem/argument_mining_de) containing different sentence types occurring in the conclusions of scientific theses and papers.

### Training

The model was fine-tuned for 10 epochs. This repository contains the checkpoint from the fourth epoch, which reached the highest accuracy and the lowest loss:

| epoch | accuracy   | loss       |
|-------|------------|------------|
| 1.0   | 0.9315     | 0.3872     |
| 2.0   | 0.9178     | 0.2987     |
| 3.0   | 0.9589     | 0.1519     |
| 4.0   | **0.9658** | **0.1162** |
| 5.0   | 0.9521     | 0.2100     |
| 6.0   | 0.9521     | 0.1979     |
| 7.0   | 0.9521     | 0.2453     |
| 8.0   | 0.9521     | 0.2251     |
| 9.0   | 0.9452     | 0.2225     |
| 10.0  | 0.9521     | 0.2286     |

On this dataset, the model shows that it can effectively learn to distinguish between the annotated sentence classes. However, accuracy drops and loss rises after epoch 4, which indicates overfitting and may point to an imbalanced or noisy dataset. Future work should train the model on more robust data to obtain more reliable evaluation results.

### Text Classification Tags

| Text Classification Tag | Text Classification Label |
| :----: | :----: |
| 0 | CLAIM |
| 1 | COUNTERCLAIM |
| 2 | LINK |
| 3 | CONC |
| 4 | FUT |
| 5 | OTH |
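
The tag table above can be applied at inference time roughly as follows. This is a minimal sketch, not the author's reference code: the helper names `to_label_name` and `classify` are hypothetical, the `model_id` argument must be supplied by the caller (the full Hub repository id is not stated in this card), and the `LABEL_<n>` handling assumes the checkpoint may expose the generic label names that `transformers` falls back to when no custom `id2label` is configured.

```python
from typing import Dict, List

# Tag-to-label mapping copied from the "Text Classification Tags" table.
ID2LABEL: Dict[int, str] = {
    0: "CLAIM",
    1: "COUNTERCLAIM",
    2: "LINK",
    3: "CONC",
    4: "FUT",
    5: "OTH",
}


def to_label_name(raw_label: str) -> str:
    """Map a pipeline label to a label name from the table.

    Converts the generic "LABEL_<n>" form into the table's name for tag n.
    Any other string (for example an already named label) passes through.
    """
    if raw_label.startswith("LABEL_"):
        return ID2LABEL[int(raw_label.split("_", 1)[1])]
    return raw_label


def classify(sentences: List[str], model_id: str) -> List[str]:
    """Classify German sentences with a text-classification pipeline.

    model_id is an assumption supplied by the caller: the Hub id or local
    path of the checkpoint described in this card.
    """
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    return [to_label_name(result["label"]) for result in classifier(sentences)]
```

A call might look like `classify(["Zukünftige Arbeiten sollten größere Datensätze untersuchen."], model_id="<namespace>/gbert-base-argument_mining")`, where `<namespace>` is a placeholder for the repository owner. Keeping the mapping in a plain dictionary makes the result independent of how the checkpoint's configuration names its labels.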