# RoBERTa Causal Span Extractor
This model is a fine-tuned version of roberta-base for causal span extraction
(token classification). It identifies cause and effect text spans in sentences.
## Model Description
- Base Model: roberta-base
- Task: Token classification (BIO tagging)
- Labels: O, B-CAUSE, I-CAUSE, B-EFFECT, I-EFFECT
- Training Data: CausalNewsCorpus V2 (sentences with exactly 1 causal pair)
- Training Samples: 1105
- Dev Samples: 133
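To illustrate the BIO tagging scheme listed above, here is a hypothetical word-level labeling of the sentence used in the usage example, together with a minimal validity check (the labels shown are for illustration only, not actual model output):

```python
# Hypothetical BIO labeling for illustration only (not actual model output)
example = [
    ("The", "B-CAUSE"), ("heavy", "I-CAUSE"), ("rain", "I-CAUSE"),
    ("caused", "O"),
    ("flooding", "B-EFFECT"), ("in", "I-EFFECT"),
    ("the", "I-EFFECT"), ("city.", "I-EFFECT"),
]

def is_valid_bio(tags):
    """Check that each I-X tag continues a B-X or I-X tag of the same type."""
    prev = "O"
    for tag in tags:
        if tag.startswith("I-") and prev[2:] != tag[2:]:
            return False
        prev = tag
    return True

print(is_valid_bio([tag for _, tag in example]))  # a well-formed sequence
```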
## Training Results
See the training notebook for detailed metrics.
## Usage
```python
from transformers import RobertaTokenizerFast, RobertaForTokenClassification
import torch

model_name = "causal-narrative/roberta-causal-span-extractor"
tokenizer = RobertaTokenizerFast.from_pretrained(model_name, add_prefix_space=True)
model = RobertaForTokenClassification.from_pretrained(model_name)
model.eval()

text = "The heavy rain caused flooding in the city."
words = text.split()
inputs = tokenizer(words, is_split_into_words=True, return_tensors="pt",
                   truncation=True, padding=True)

with torch.no_grad():
    outputs = model(**inputs)
preds = torch.argmax(outputs.logits, dim=2)[0]

id2label = model.config.id2label
word_ids = inputs.word_ids()  # maps each subtoken back to its source word
prev = None
for idx, wid in enumerate(word_ids):
    if wid is not None and wid != prev:
        # label each word by the prediction of its first subtoken
        print(f"{words[wid]:20s} {id2label[preds[idx].item()]}")
    prev = wid
```
## Labels
| Label | Description |
|---|---|
| O | Non-causal token |
| B-CAUSE | Beginning of cause span |
| I-CAUSE | Inside cause span |
| B-EFFECT | Beginning of effect span |
| I-EFFECT | Inside effect span |
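Word-level BIO labels like those printed by the usage example can be grouped into complete cause and effect spans. The following is a minimal sketch (the `extract_spans` helper is not part of the released model; it assumes a list of `(word, label)` pairs as input):

```python
def extract_spans(word_labels):
    """Group word-level BIO labels into (span_type, text) tuples.

    `word_labels` is a list of (word, label) pairs, e.g. collected from
    the word-level loop in the usage example.
    """
    spans = []
    current_type, current_words = None, []
    for word, label in word_labels:
        if label.startswith("B-"):
            # close any open span, then start a new one
            if current_type:
                spans.append((current_type, " ".join(current_words)))
            current_type, current_words = label[2:], [word]
        elif label.startswith("I-") and current_type == label[2:]:
            current_words.append(word)  # continue the open span
        else:
            # "O" or an inconsistent I- tag closes the open span
            if current_type:
                spans.append((current_type, " ".join(current_words)))
            current_type, current_words = None, []
    if current_type:
        spans.append((current_type, " ".join(current_words)))
    return spans

pairs = [("The", "B-CAUSE"), ("heavy", "I-CAUSE"), ("rain", "I-CAUSE"),
         ("caused", "O"), ("flooding", "B-EFFECT"), ("in", "I-EFFECT"),
         ("the", "I-EFFECT"), ("city.", "I-EFFECT")]
print(extract_spans(pairs))
# → [('CAUSE', 'The heavy rain'), ('EFFECT', 'flooding in the city.')]
```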