samirmsallem/gbert-base-argument_mining

---
datasets:
- samirmsallem/argument_mining_de
language:
- de
metrics:
- accuracy
base_model:
- deepset/gbert-base
pipeline_tag: text-classification
library_name: transformers
model-index:
- name: checkpoints
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: samirmsallem/argument_mining_de
      type: samirmsallem/argument_mining_de
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9657534246575342
---

## Text classification model for argument mining and detection

gbert-base-argument_mining is a German text classification model for the scientific domain, fine-tuned from [gbert-base](https://huggingface.co/deepset/gbert-base).
It was trained on a [synthetically created, annotated dataset](https://huggingface.co/datasets/samirmsallem/argument_mining_de) containing the different sentence types that occur in the conclusions of scientific theses and papers.


### Training

The model was fine-tuned for 10 epochs. This repository contains the checkpoint from the fourth epoch, which achieved the best accuracy:

| Epoch | Accuracy | Loss |
|------:|---------:|-------:|
| 1 | 0.9315 | 0.3872 |
| 2 | 0.9178 | 0.2987 |
| 3 | 0.9589 | 0.1519 |
| 4 | 0.9658 | 0.1162 |
| 5 | 0.9521 | 0.2100 |
| 6 | 0.9521 | 0.1979 |
| 7 | 0.9521 | 0.2453 |
| 8 | 0.9521 | 0.2251 |
| 9 | 0.9452 | 0.2225 |
| 10 | 0.9521 | 0.2286 |



On this dataset, the model learns to distinguish the six sentence classes listed under Text Classification Tags. However, accuracy drops and loss rises after epoch 4, which indicates overfitting and suggests that the dataset may be imbalanced or noisy. Future work should train the model on more robust data to obtain more reliable evaluation results.
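The checkpoint choice can be reproduced from the table alone. The following is a minimal sketch, not the author's training code: the helper names `best_epoch` and `later_epochs_worse` are hypothetical, and the numbers are copied from the table as reported.

```python
# Per-epoch results copied from the training table: (epoch, accuracy, loss).
RESULTS = [
    (1, 0.9315, 0.3872),
    (2, 0.9178, 0.2987),
    (3, 0.9589, 0.1519),
    (4, 0.9658, 0.1162),
    (5, 0.9521, 0.2100),
    (6, 0.9521, 0.1979),
    (7, 0.9521, 0.2453),
    (8, 0.9521, 0.2251),
    (9, 0.9452, 0.2225),
    (10, 0.9521, 0.2286),
]


def best_epoch(results):
    """Return the (epoch, accuracy, loss) row with the highest accuracy.

    Ties are broken in favour of the lower loss.
    """
    return max(results, key=lambda row: (row[1], -row[2]))


def later_epochs_worse(results, epoch):
    """True if every later epoch has both lower accuracy and higher loss."""
    _, acc, loss = next(row for row in results if row[0] == epoch)
    later = [row for row in results if row[0] > epoch]
    return all(a < acc and l > loss for _, a, l in later)


best = best_epoch(RESULTS)
print(best)                                  # the selected checkpoint row
print(later_epochs_worse(RESULTS, best[0]))  # overfitting signal after it
```

Selecting on accuracy with loss as a tie-breaker picks epoch 4, and every subsequent epoch is worse on both measures, which is the pattern the paragraph above describes as overfitting.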

### Text Classification Tags

| Tag | Label |
| :----: | :----: |
| 0 | CLAIM |
| 1 | COUNTERCLAIM |
| 2 | LINK |
| 3 | CONC |
| 4 | FUT |
| 5 | OTH |
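
The tag table can be applied to model output as sketched below. This is a sketch under assumptions rather than an official usage snippet: the repository id is taken from the page header, `normalize_label` and `classify` are hypothetical helpers, and it is not confirmed whether the checkpoint's config returns the label names from the table or the generic `LABEL_<n>` names that `transformers` uses by default, so the helper accepts both.

```python
import re

# Assumed Hub repository id, taken from the header of this model card.
MODEL_ID = "samirmsallem/gbert-base-argument_mining"

# Tag-to-label mapping copied from the table above.
ID2LABEL = {
    0: "CLAIM",
    1: "COUNTERCLAIM",
    2: "LINK",
    3: "CONC",
    4: "FUT",
    5: "OTH",
}


def normalize_label(label):
    """Map a predicted label string to a label name from the table.

    Accepts generic names such as "LABEL_3" as well as names that
    already match the table. Raises ValueError for anything else.
    """
    match = re.fullmatch(r"LABEL_(\d+)", label)
    if match and int(match.group(1)) in ID2LABEL:
        return ID2LABEL[int(match.group(1))]
    if label in ID2LABEL.values():
        return label
    raise ValueError(f"unknown label: {label!r}")


def classify(sentences, model_id=MODEL_ID):
    """Classify a list of German sentences; downloads the model on first use."""
    # Imported lazily so the mapping helpers work without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    predictions = classifier(sentences)  # one {"label", "score"} dict per sentence
    return [
        {"text": text, "label": normalize_label(pred["label"]), "score": pred["score"]}
        for text, pred in zip(sentences, predictions)
    ]
```

Calling `classify([...])` with a list of sentences from a thesis conclusion returns one dictionary per sentence containing the text, the label name, and the score reported by the pipeline.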