
	---
	license: cc-by-nc-sa-4.0
	language:
	- de
	---

	## Model description
	This model is a fine-tuned version of [bert-base-german-cased by deepset](https://huggingface.co/bert-base-german-cased) that classifies German-language user comments as deliberative or not deliberative.

	## How to use

	You can use the model with the following code.

	```python
	# Install the dependencies first, e.g.: pip install transformers torch

	from transformers import AutoModelForSequenceClassification, AutoTokenizer, TextClassificationPipeline

	# Load the fine-tuned tokenizer and model from the Hugging Face Hub
	model_path = "ankekat1000/deliberative-bert-german"
	tokenizer = AutoTokenizer.from_pretrained(model_path)
	model = AutoModelForSequenceClassification.from_pretrained(model_path)

	# Build a text-classification pipeline and classify a single German comment
	pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)
	print(pipeline('Tolle Idee. Ich denke, dass dieses Projekt Teil des Stadtforums werden sollte, damit wir darüber weiter nachdenken können!'))
	```
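To classify several comments at once, a small wrapper can be convenient. The following is a minimal sketch, not part of the original card: the helper name `classify_comments` and its defaults are hypothetical, and it assumes the classifier behaves like a `transformers` text-classification pipeline, i.e. it accepts a list of strings plus `truncation`, `max_length`, and `batch_size` keyword arguments and returns one `{'label': ..., 'score': ...}` dict per input. The `max_length` default mirrors the 512-token limit used in training.

```python
from typing import Callable, Dict, List


def classify_comments(
    classifier: Callable[..., List[Dict]],
    comments: List[str],
    max_length: int = 512,  # matches the max. token length used in training
    batch_size: int = 32,   # arbitrary default for inference; tune to your hardware
) -> List[Dict]:
    """Classify comments and pair each prediction with its source text.

    `classifier` is assumed to behave like a transformers
    TextClassificationPipeline (hypothetical helper, not part of the model repo).
    """
    if not comments:
        return []
    predictions = classifier(
        comments,
        truncation=True,        # cut comments longer than max_length tokens
        max_length=max_length,
        batch_size=batch_size,
    )
    return [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(comments, predictions)
    ]
```

With the `pipeline` object created above, this could be called as `classify_comments(pipeline, ["Erster Kommentar", "Zweiter Kommentar"])`. The label strings returned depend on the model configuration, so inspect one result before mapping labels to "deliberative" and "not deliberative".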


	## Training

	The pre-trained [bert-base-german-cased model by deepset](https://huggingface.co/bert-base-german-cased) was fine-tuned on a crowd-annotated data set of 14,000 user comments labeled for deliberation, framed as a binary classification task.

	We defined as deliberative those comments that, in whole or in part, enrich and add value to a deliberative discussion, such as comments that add arguments, suggestions, or new perspectives, or that other users otherwise find stimulating or appreciative.

	- Language model: bert-base-german-cased (~ 12GB)
	- Language: German
	- Labels: Deliberative (binary classification)
	- Training data: User comments posted to websites and Facebook pages of German news media, and user comments posted to online participation platforms (~ 14,000)
	- Labeling procedure: Crowd annotation
	- Batch size: 32
	- Epochs: 4
	- Max. token length: 512
	- Infrastructure: 1x Quadro RTX 8000
	- Published: Oct 24th, 2023

	## Evaluation results

	Accuracy: 86%

	Macro avg. F1: 86%



	| Label | Precision | Recall | F1 | No. of comments in test set |
	| ----------- | ----------- | ----------- | ----------- | ----------- |
	| not deliberative | 0.87 | 0.84 | 0.86 | 701 |
	| deliberative | 0.84 | 0.87 | 0.85 | 667 |
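
As a sanity check, the summary figures can be approximately reproduced from the per-class rows. The sketch below is an illustration rather than the authors' evaluation script: it takes the rounded table values as input, computes the macro-averaged F1 as the unweighted mean of the two class F1 scores, and estimates accuracy as the support-weighted mean of the per-class recalls. Because the inputs are rounded to two decimals, the results only approximate the reported 86%.

```python
# Per-class values copied from the evaluation table (rounded to two decimals).
report = {
    "not deliberative": {"precision": 0.87, "recall": 0.84, "f1": 0.86, "support": 701},
    "deliberative": {"precision": 0.84, "recall": 0.87, "f1": 0.85, "support": 667},
}


def macro_f1(rows: dict) -> float:
    """Unweighted mean of the per-class F1 scores."""
    return sum(r["f1"] for r in rows.values()) / len(rows)


def accuracy_from_recall(rows: dict) -> float:
    """Accuracy equals the support-weighted mean of per-class recall."""
    total = sum(r["support"] for r in rows.values())
    correct = sum(r["recall"] * r["support"] for r in rows.values())
    return correct / total


print(f"Macro F1: {macro_f1(report):.3f}")
print(f"Accuracy (estimated): {accuracy_from_recall(report):.3f}")
```

Both values come out at roughly 0.855, which is in line with the rounded summary figures.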