CIREE: Claim Identification for Research in Environmental Evidence

CIREE is a RoBERTa-based text classification model fine-tuned to detect claims in environmental documentary and media content. It predicts whether a text segment is a Claim or a Non-claim, which makes it useful as a first filtering step before deeper verification or fact-checking.

Intended Use

Primary use: Claim detection for environmental media analysis.
Typical input: Segmented transcripts.
Output: A predicted label plus a score distribution over the two classes.
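
To make the output format concrete, here is a minimal sketch of how a label and a score distribution could be derived from the model's two logits. It uses only the standard library so it runs without downloading the model. The label order (index 0 = Non-claim, index 1 = Claim) and the helper names softmax and to_output are assumptions made for illustration; the 0.635 threshold is the one quoted in the How to Use snippet.

```python
import math

# Assumed label order (index 0 = Non-claim, index 1 = Claim);
# verify against model.config.id2label before relying on it.
LABELS = ("Non-claim", "Claim")
THRESHOLD = 0.635  # threshold quoted in the How to Use snippet


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_output(logits, threshold=THRESHOLD):
    """Return the predicted label plus the per-class score distribution."""
    probs = softmax(logits)
    label = LABELS[1] if probs[1] >= threshold else LABELS[0]
    return {"label": label, "scores": dict(zip(LABELS, probs))}


# Example logits for a single text segment (made-up values).
print(to_output([-0.4, 1.1]))
```

Thresholding the Claim probability rather than taking the argmax means a segment is labelled Claim only when the model is reasonably confident, which trades some recall for precision.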

Model Details

Architecture: RoBERTa base (Hugging Face Transformers)
Training objective: sequence classification (claim vs non-claim)
License: MIT (users should verify that it is compatible with their own use)

How to Use

Python

pip install transformers torch pandas scikit-learn

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model = AutoModelForSequenceClassification.from_pretrained("ez6047/CIREE-fact-detector")
tokenizer = AutoTokenizer.from_pretrained("ez6047/CIREE-fact-detector")

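# Tokenize one text segment and run it through the classifier.
inputs = tokenizer("The Amazon river flows through Brazil.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
# Index 1 is taken to be the positive (Claim) class; check model.config.id2label to confirm.
prob = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"Claim probability: {prob:.3f}")  # threshold: 0.635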

For batch inference with CSV input, thresholding, and evaluation metrics, see run_inference.py in this repo.
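
As a rough illustration of what thresholding and evaluation over CSV input can look like, the sketch below scores each row, applies the 0.635 threshold, and computes precision, recall, and F1 using only the standard library. It is not the repo's run_inference.py: the column names text and label, the evaluate function, and the dummy_score stand-in are all hypothetical. In practice, score_fn would return the model's Claim probability for a segment.

```python
import csv
import io

THRESHOLD = 0.635  # threshold quoted in the usage snippet above


def evaluate(csv_file, score_fn, threshold=THRESHOLD):
    """Score each row, apply the threshold, and compute precision/recall/F1.

    Assumes hypothetical columns `text` and `label` (1 = claim, 0 = non-claim).
    `score_fn` maps a text string to a claim probability in [0, 1].
    """
    tp = fp = fn = 0
    for row in csv.DictReader(csv_file):
        predicted = score_fn(row["text"]) >= threshold
        actual = row["label"] == "1"
        tp += predicted and actual
        fp += predicted and not actual
        fn += actual and not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Stand-in scorer for demonstration only; replace with the model's claim probability.
def dummy_score(text):
    return 0.9 if "will" in text else 0.2


# In-memory CSV with made-up rows, used here instead of a file on disk.
sample = io.StringIO(
    "text,label\n"
    "Sea levels will rise by 2100.,1\n"
    "The narrator walks along the beach.,0\n"
    "Coral reefs are dying.,1\n"
)
print(evaluate(sample, dummy_score))
```

Passing the scorer in as a function keeps the thresholding and metric logic independent of how probabilities are produced, so the same code can be checked with a stub before wiring in the model.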

Model size: 0.1B params (Safetensors)
Tensor type: F32