teagrjohnson/narrative-llm-annotations
RoBERTa-base fine-tuned for 9-dimensional narrative Likert regression (agency: focalization, emotion, cognition, change_of_state, conflict; setting: concreteness, temporal_grounding, spatial_grounding, sensory). Trained on LLM pseudo-labels (Gemma-4-31B) with held-out human gold evaluation.
Note: Full model card with training details coming soon.
Download model.pt and tokenizer/ from this repo, then:
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer


class NarrativeRoBERTa(nn.Module):
    """RoBERTa backbone with one linear regression head per dimension."""

    def __init__(self, model_name, n_dims):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(model_name)
        hidden = self.backbone.config.hidden_size
        self.heads = nn.ModuleList([nn.Linear(hidden, 1) for _ in range(n_dims)])

    def forward(self, input_ids, attention_mask):
        # Use the first token's (<s>) hidden state as the sequence representation.
        cls = self.backbone(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state[:, 0, :]
        # Concatenate the per-dimension scores into a (batch, n_dims) tensor.
        return torch.cat([h(cls) for h in self.heads], dim=1)


tokenizer = AutoTokenizer.from_pretrained("tokenizer/")
model = NarrativeRoBERTa("roberta-base", n_dims=9)
model.load_state_dict(torch.load("model.pt", map_location="cpu", weights_only=True))
model.eval()
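Once the model is loaded, scoring a passage might look like the sketch below. It is not part of the original card: `name_scores` and `predict` are hypothetical helpers, the order of `DIMS` is assumed to follow the order in the description above (check it against the training config before relying on it), and `max_len=256` is taken from the config shown on this page.

```python
# Assumed output order: the nine dimensions as listed in the description.
DIMS = [
    "focalization", "emotion", "cognition", "change_of_state", "conflict",
    "concreteness", "temporal_grounding", "spatial_grounding", "sensory",
]


def name_scores(row, dims=DIMS):
    """Pair one row of raw model outputs with dimension names."""
    if len(row) != len(dims):
        raise ValueError(f"expected {len(dims)} scores, got {len(row)}")
    return {name: float(value) for name, value in zip(dims, row)}


def predict(texts, model, tokenizer, max_len=256):
    """Score a list of strings with an already loaded model and tokenizer."""
    import torch  # imported here so name_scores stays usable without torch

    enc = tokenizer(
        texts,
        padding=True,
        truncation=True,
        max_length=max_len,
        return_tensors="pt",
    )
    with torch.no_grad():  # inference only, no gradients needed
        out = model(enc["input_ids"], enc["attention_mask"])
    return [name_scores(row) for row in out.tolist()]
```

With the `model` and `tokenizer` objects from the snippet above, `predict(["She opened the door and froze."], model, tokenizer)` would return one dictionary of nine raw regression scores per input text.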
{
  "model_name": "roberta-base",
  "max_len": 256,
  "dims": [
    "temporal_sequential",
    "causal"
  ],
  "data_source": "/projects/tejo9855/Projects/llm-narrative-annotations/event_relation/outputs/google_gemma-4-31B-it/20260518_143249",
  "n_train": 6219,
  "n_val": 690,
  "val_frac": 0.1,
  "best_epoch": 4,
  "seed": 42,
  "test_f1_gold": 0.805
}
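A quick way to sanity-check a config like this before loading weights is sketched below. The card does not say what file the config is stored in, so the relevant fields are embedded as a string here; `summarize` is a hypothetical helper, not something shipped with the repo.

```python
import json

# Subset of the config above, embedded as text because the card does not
# name the file it is stored in.
CONFIG_TEXT = """
{
  "max_len": 256,
  "dims": ["temporal_sequential", "causal"],
  "n_train": 6219,
  "n_val": 690,
  "val_frac": 0.1
}
"""


def summarize(cfg):
    """Derive the head count and the observed validation fraction."""
    total = cfg["n_train"] + cfg["n_val"]
    return {
        "n_heads": len(cfg["dims"]),
        "max_len": cfg["max_len"],
        "observed_val_frac": cfg["n_val"] / total,
    }


summary = summarize(json.loads(CONFIG_TEXT))
print(summary)
```

The observed split (690 of 6909 examples) agrees with `val_frac` of 0.1. This config lists two `dims`, while the loading snippet above passes `n_dims=9`; `load_state_dict` raises an error when the number of heads does not match the checkpoint, so pass the value that matches the checkpoint being loaded.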
Base model: FacebookAI/roberta-base