C-EBERT: Multi-Task Causal Extraction

A multi-task model to extract causal attributions from German text.

Model Details

  • Multi-Task Architecture: Extends EuroBERT-610m with token and relation classification heads, trained with the joint loss $\mathcal{L}_{\text{total}} = \alpha \mathcal{L}_{\text{token}} + \beta \mathcal{L}_{\text{relation}}$:

| Task | Output Type | Labels / Classes |
|---|---|---|
| 1. Token Classification | Sequence Labeling (BIO) | 5 span labels: O, B-INDICATOR, I-INDICATOR, B-ENTITY, I-ENTITY |
| 2. Relation Classification | Sentence-Pair Classification | 14 relation labels (e.g., MONO_POS_CAUSE, DIST_NEG_EFFECT) |
  • Dataset: 4,540 manually annotated relations (see the excerpt Bundestag Causal Attribution).
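
The joint objective above can be illustrated with a minimal sketch (toy logits only; the cross-entropy helper, shapes, and default `alpha`/`beta` values here are assumptions for illustration, not the released training code):

```python
import numpy as np

def cross_entropy(logits, labels):
    # Numerically stable softmax cross-entropy, averaged over examples.
    logits = logits - logits.max(axis=-1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def joint_loss(token_logits, token_labels, rel_logits, rel_labels,
               alpha=1.0, beta=1.0):
    # L_total = alpha * L_token + beta * L_relation
    return (alpha * cross_entropy(token_logits, token_labels)
            + beta * cross_entropy(rel_logits, rel_labels))

# Toy example: 4 tokens over the 5 BIO labels, one sentence pair over 14 relations.
rng = np.random.default_rng(0)
tok_logits = rng.standard_normal((4, 5))
tok_labels = np.array([0, 1, 2, 3])
rel_logits = rng.standard_normal((1, 14))
rel_labels = np.array([5])
loss = joint_loss(tok_logits, tok_labels, rel_logits, rel_labels, alpha=0.5, beta=0.5)
```

Both heads share the encoder; `alpha` and `beta` simply trade off how strongly each task's gradient shapes the shared representation.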

Usage

Find the implementation library here.

```python
from causalbert.infer import load_model, sentence_analysis

model, tokenizer, config, device = load_model("pdjohn/C-EBERT")
sentences = ["Autoverkehr verursacht Bienensterben."]  # "Car traffic causes bee mortality."

analysis = sentence_analysis(model, tokenizer, config, sentences, batch_size=8)
print(analysis[0]['derived_relations'])
# Output: [(['Autoverkehr', 'verursacht'], ['Bienensterben']), {'label': 'MONO_POS_CAUSE', 'confidence': 0.954}]
```


Technical Highlights

  • Custom Input Formatting: Utilizes a custom <|parallel_sep|> token to handle sentence-pair classification for relation extraction.
  • Engineered for Scale: Applied to large-scale corpora, processing 22M sentences and extracting 1.6M unique causal triples.
  • Imbalance Mitigation: Implements dynamic class weighting via normalized inverse frequencies to handle the long-tail distribution of 14 distinct causal relation types.
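
Normalized inverse-frequency weighting can be sketched as follows (one standard recipe; the exact normalization used in train.py may differ):

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by 1/frequency, then rescale so the weights
    sum to the number of classes (average weight 1.0)."""
    counts = Counter(labels)
    raw = {c: 1.0 / n for c, n in counts.items()}
    scale = len(raw) / sum(raw.values())
    return {c: w * scale for c, w in raw.items()}

# Toy long-tail distribution over three relation types.
labels = (["MONO_POS_CAUSE"] * 80
          + ["DIST_NEG_EFFECT"] * 15
          + ["NO_RELATION"] * 5)
weights = inverse_frequency_weights(labels)
# Rare classes receive proportionally larger loss weights.
```

These per-class weights are then passed to the relation head's cross-entropy loss, so errors on rare relation types contribute more to the gradient.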

Training

  • Base model: EuroBERT-610m
  • Training Parameters:
    • Epochs: 8
    • Learning Rate: 1e-4
    • Batch size: 32
    • PEFT/LoRA: Enabled with r = 16

See train.py for the full configuration details.

Evaluation

Evaluated on a stratified held-out test set of environmental discourse data (1,135 relations).

| Task | Accuracy | F1 |
|---|---|---|
| Token Classification (BIO) | 0.879 | 0.783 (micro) |
| Relation Classification | 0.732 | 0.425 (macro) |
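
For context on the metrics: macro F1 averages per-class F1 scores equally, so the many rare relation types weigh as heavily as the frequent ones, while micro F1 pools all decisions (and, for single-label classification, equals accuracy). A minimal sketch of both, with hypothetical labels:

```python
def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label classification."""
    classes = sorted(set(y_true) | set(y_pred))
    per_class, tp_all = [], 0
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        tp_all += tp
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
    macro = sum(per_class) / len(per_class)
    micro = tp_all / len(y_true)  # single-label: micro F1 == accuracy
    return macro, micro

# Toy example with two classes.
macro, micro = f1_scores(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

The gap between accuracy (0.732) and macro F1 (0.425) on relation classification reflects the long-tail label distribution mentioned above.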