---
library_name: transformers
license: apache-2.0
language:
- de
base_model:
- google-bert/bert-base-german-cased
pipeline_tag: token-classification
---

# C-BERT

CausalBERT (C-BERT) is a multi-task fine-tuned German BERT model that extracts causal attributions from text.

## Model details

- **Model architecture**: BERT-base-German-cased with token- and relation-classification heads
- **Fine-tuned on**: a German environmental causal attribution corpus
- **Tasks**:
  1. Token classification (BIO tags for INDICATOR / ENTITY)
  2. Relation classification (CAUSE, EFFECT, INTERDEPENDENCY)

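To make the BIO scheme above concrete, here is a minimal, illustrative decoder that groups BIO-tagged tokens into labeled spans. The tag sequence shown is a hand-written example, not actual model output, and the decoder is a sketch rather than the library's implementation:

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (label, text) spans."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span, closing any open one.
            if current:
                spans.append(current)
            current = (tag[2:], [token])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            # An I- tag continues the open span of the same label.
            current[1].append(token)
        else:
            # O tag (or a stray I-) closes any open span.
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(words)) for label, words in spans]

tokens = ["Autoverkehr", "verursacht", "Bienensterben", "."]
tags = ["B-ENTITY", "B-INDICATOR", "B-ENTITY", "O"]
print(decode_bio(tokens, tags))
# [('ENTITY', 'Autoverkehr'), ('INDICATOR', 'verursacht'), ('ENTITY', 'Bienensterben')]
```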
## Usage

Install the companion [causalbert](https://github.com/norygami/causalbert) library. Once installed, run inference like so:

```python
from causalbert.infer import load_model, analyze_sentence_with_confidence

model, tokenizer, config, device = load_model("norygano/C-BERT")
result = analyze_sentence_with_confidence(
    model, tokenizer, config, "Autoverkehr verursacht Bienensterben.", []
)
```
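Since the model reports confidences, you may want to filter low-confidence relation predictions downstream. The data structure below is an assumed example, not the library's documented output schema, and `filter_relations` is a hypothetical helper:

```python
def filter_relations(relations, threshold=0.8):
    """Keep only relation predictions at or above the confidence threshold.

    `relations` is assumed to be a list of dicts with "label" and
    "confidence" keys; adapt to the actual output of the library.
    """
    return [r for r in relations if r["confidence"] >= threshold]

relations = [
    {"label": "CAUSE", "confidence": 0.95},
    {"label": "INTERDEPENDENCY", "confidence": 0.42},
]
print(filter_relations(relations))
# [{'label': 'CAUSE', 'confidence': 0.95}]
```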

## Training

- **Base model**: `google-bert/bert-base-german-cased`
- **Epochs**: 3, **learning rate**: 2e-5, **batch size**: 8
- See [train.py](https://github.com/norygami/causalbert/blob/main/causalbert/train.py) for the full training script.

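Multi-task fine-tuning of this kind typically optimizes a weighted sum of the per-task losses. A schematic sketch follows; the equal weighting and the `joint_loss` helper are illustrative assumptions, not taken from the training script:

```python
def joint_loss(token_loss: float, relation_loss: float, alpha: float = 0.5) -> float:
    """Combine token- and relation-classification losses with weight alpha."""
    return alpha * token_loss + (1.0 - alpha) * relation_loss

# With alpha = 0.5, both heads contribute equally to the gradient signal.
print(joint_loss(0.8, 0.4))
```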
## Limitations

- German only.
- Sentence-level: cross-sentence causality is not handled.
- Relation classification depends on the detected spans, so token-tagging errors propagate.