SpanMarker-SciClimateBERT for Climate Research NER

This model is a SpanMarker model fine-tuned for fine-grained Named Entity Recognition (NER) in the climate change research domain, extracting 28 distinct entity types. It utilizes the domain-specific P0L3/sciclimatebert as the underlying encoder.

📌 Model Details

  • Model Type: SpanMarker
  • Encoder: P0L3/sciclimatebert
  • Maximum Sequence Length: 512 tokens
  • Maximum Entity Length: 14 words
  • Language: English
  • License: cc-by-sa-4.0
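The "Maximum Entity Length" setting matters because SpanMarker classifies candidate spans rather than individual tokens: every span of up to 14 words is a classification candidate. A minimal sketch of that enumeration (illustrative only, not the library's internal code):

```python
def enumerate_spans(tokens, max_length=14):
    """Enumerate all candidate (start, end) spans up to max_length tokens,
    mirroring how span-based NER models generate classification candidates."""
    spans = []
    for start in range(len(tokens)):
        # end index is exclusive; span length is capped at max_length
        for end in range(start + 1, min(start + max_length, len(tokens)) + 1):
            spans.append((start, end))
    return spans

tokens = ["Tropical", "Rainfall", "Measuring", "Mission"]
spans = enumerate_spans(tokens, max_length=3)
print(len(spans))  # 4 one-word + 3 two-word + 2 three-word spans = 9
```

Capping the span length keeps the number of candidates linear in sentence length rather than quadratic.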

Model Labels

| Label | Examples |
| --- | --- |
| Asset | "mental health", "water resources", "raw material" |
| Body Part | "plant leaves", "deep tissue compartment", "leaves" |
| Body of Water | "Dhaleshwari river", "rivers", "peripheral rivers" |
| Chemical | "marine algal toxin", "domoic acid", "cathode materials" |
| Disease | "acute neurologic signs", "chronic epileptic syndrome", "seizures" |
| Ecosystem | "cloud forests", "Tropical montane cloud forest", "polluted environment" |
| Energy Source | "battery cells", "fossil fuels", "12-cell series battery-pack prototype" |
| Field of Study | "study", "veterinary medicine", "reference laboratory" |
| Geographical Feature | "heterogenous topography", "low point", "mountainous regions" |
| Intellectual Artefact | "Veterinary medical records", "data", "Daily husbandry records" |
| Location | "wild", "Westbrook", "beaches" |
| Mathematical Expression | "Stepwise machine hour constraints", "difference", "gradient" |
| Measuring Device | "MRI scan", "station", "EEG" |
| Meteorological Phenomenon | "climatic variability", "climate change", "rainfall" |
| Method | "serum monitoring", "clinical efficacy", "dosing" |
| Natural Disaster | "seasonal air pollution", "heavy metal contamination", "environmental pollution" |
| Natural Phenomenon | "biochemical changes", "changing ocean conditions", "algal blooms" |
| Organism | "Zalophus californianus", "California sea lions", "species" |
| Organization | "NOAA National Marine Fisheries Service", "reference laboratory", "long-term care facility" |
| Other | "reports", "normal eating", "marine mammal health" |
| Person | "staff", "clinicians", "Clinicians" |
| Physical Artefact | "electric vehicle", "paved east – west road", "EVs" |
| Physical Phenomenon | "normal food intake", "seasonal changes", "structural abnormalities" |
| Policy | "safety", "pollution", "energy security" |
| Quantity | "energy density", "200 mAhg − 1", ">" |
| Satellite | "satellites", "TRMM", "Tropical Rainfall Measuring Mission" |
| System | "climate", "system structure", "global overturning circulation" |
| Time Period | "101 days", "periods of prolonged anorexia", "several decades" |

🚀 Main Results (Selected Checkpoint)

This repository provides the best-performing checkpoint selected from 5 runs with different random seeds. While the internal training logs tracked performance on the validation split of CliReNERsilver, the final model selection and the metrics below are evaluated on the independent, expert-annotated CliReNERgold dataset.

| Metric | Score |
| --- | --- |
| Precision | 56.49 |
| Recall | 49.63 |
| F1 | 52.84 |
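The strict F1 score is the harmonic mean of precision and recall, so the three figures above are mutually consistent:

```python
# Strict F1 is the harmonic mean of precision and recall
precision, recall = 56.49, 49.63
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 52.84
```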

This checkpoint corresponds to the seed with the highest strict F1 on the gold evaluation set (Seed 3 - 3012).


📊 Results Across Seeds

We fine-tuned the model using 5 different random seeds to assess the stability and robustness of the architecture on the domain-specific text.

| Seed | Precision | Recall | Strict F1 |
| --- | --- | --- | --- |
| 1 | 54.41 | 45.14 | 49.34 |
| 2 | 45.58 | 37.05 | 40.87 |
| 3 | 56.49 | 49.63 | 52.84 |
| 4 | 53.84 | 48.69 | 51.14 |
| 5 | 53.31 | 45.34 | 49.01 |

Summary:

  • F1: mean = 48.64, std = 4.60
  • Precision: mean = 52.72, std = 4.17
  • Recall: mean = 45.17, std = 4.96
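These summary statistics can be reproduced from the per-seed table using the sample standard deviation (ddof = 1); the reported figures agree with the recomputation up to rounding in the last digit:

```python
from statistics import mean, stdev

# Per-seed scores from the table above
f1        = [49.34, 40.87, 52.84, 51.14, 49.01]
precision = [54.41, 45.58, 56.49, 53.84, 53.31]
recall    = [45.14, 37.05, 49.63, 48.69, 45.34]

for name, scores in [("F1", f1), ("Precision", precision), ("Recall", recall)]:
    # stdev() uses the sample (n - 1) denominator
    print(f"{name}: mean = {mean(scores):.2f}, std = {stdev(scores):.2f}")
```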

Model Selection Strategy: The uploaded checkpoint is the single best seed (highest strict F1 on the gold dataset), ensuring strong performance and high-fidelity alignment with domain-expert consensus.


📂 Dataset & Evaluation

  • Training Dataset: CliReNERsilver
    • Splits used: Stratified 80:10:10 ratio (Train/Validation/Test). The 80% split was used for training.
  • Evaluation Dataset: CliReNERgold
    • Splits used: Evaluated on the combined 192 sentences (expert-annotated via Weighted Expert Voting).
  • Preprocessing:
    • Texts were tokenized using the tokenizer corresponding to the SciClimateBERT encoder.
    • The dataset utilizes a flat NER schema (nested entities are excluded, and overlapping entities are resolved to the most relevant span).
  • Metric Details:
    • F1 type: Strict F1 (Entity-level exact match).
    • Evaluation was performed ensuring entities match both the exact boundary span and the exact semantic label to be considered correct.
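Under strict matching, a prediction counts as a true positive only when both the span boundaries and the label exactly match a gold annotation. A minimal sketch of entity-level strict scoring (illustrative, not the evaluation script used for this model):

```python
def strict_scores(predicted, gold):
    """Entity-level strict P/R/F1: entities are (start, end, label) triples,
    and a prediction is correct only on an exact triple match."""
    pred_set, gold_set = set(predicted), set(gold)
    tp = len(pred_set & gold_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = [(0, 2, "Organism"), (5, 6, "Location"), (8, 10, "Chemical")]
pred = [(0, 2, "Organism"), (5, 6, "Ecosystem"), (8, 10, "Chemical")]
p, r, f = strict_scores(pred, gold)
# (5, 6, "Ecosystem") has the right boundaries but the wrong label,
# so it is not counted under strict matching
print(p, r, f)
```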

⚖️ Precision vs Recall Behavior

Across seeds, the model consistently favors precision over recall (mean precision 52.72 vs. mean recall 45.17): it misses some entities, but the spans it does predict are labeled correctly more often. The selected checkpoint shows the same tendency (precision 56.49 vs. recall 49.63).


⚙️ Usage

Direct Use for Inference

Because this model was trained using the SpanMarker framework, it requires the span_marker library for inference.

```bash
pip install span_marker
```

```python
from span_marker import SpanMarkerModel

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("P0L3/CliReNER-sciclimatebert")

# Run inference
text = "The effectiveness of these approaches has been demonstrated in high-resource domains, including biomedicine and chemistry (Lee et al. 2019; Fries et al. 2022; Morin et al. 2023; Wang et al. 2021). "
entities = model.predict(text)

for entity in entities:
    print(f"Entity: {entity['span']} | Label: {entity['label']} | Score: {entity['score']:.4f}")

# Entity: effectiveness | Label: Quantity | Score: 0.7880
# Entity: approach | Label: Method | Score: 0.8180
# Entity: high-resource domains | Label: Location | Score: 0.2892
# Entity: biomedicine | Label: Field of Study | Score: 0.2845
# Entity: chemistry | Label: Field of Study | Score: 0.7580
# Entity: Lee et al. | Label: Person | Score: 0.9177
# Entity: Fries et al. | Label: Person | Score: 0.9175
# Entity: 2022 | Label: Time Period | Score: 0.9332
# Entity: Morin et al. | Label: Person | Score: 0.9583
# Entity: Wang et al. | Label: Person | Score: 0.9435
# Entity: 2021 | Label: Time Period | Score: 0.8185
```
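As the example output shows, some predictions carry low confidence (e.g. 0.2892 for "high-resource domains"). If precision matters in your application, a simple post-hoc filter on the score field can help; the 0.5 threshold below is an illustrative choice, not a tuned value:

```python
def filter_entities(entities, min_score=0.5):
    """Keep only predictions whose confidence meets the threshold."""
    return [e for e in entities if e["score"] >= min_score]

# Example using two of the scores from the inference output above
entities = [
    {"span": "chemistry", "label": "Field of Study", "score": 0.7580},
    {"span": "high-resource domains", "label": "Location", "score": 0.2892},
]
print(filter_entities(entities))  # keeps only the "chemistry" prediction
```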

Downstream Use

You can easily continue fine-tuning this model on your own dataset.

```python
from span_marker import SpanMarkerModel, Trainer
from datasets import load_dataset

# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("your-huggingface-username/your-model-name")

# Specify a Dataset with "tokens" and "ner_tags" columns
dataset = load_dataset("your_custom_dataset")

# Initialize a Trainer using the pretrained model & dataset
trainer = Trainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
trainer.save_model("span_marker_model_id-finetuned")
```

📉 Training Details

Training Set Metrics

| Training set | Min | Median | Max |
| --- | --- | --- | --- |
| Sentence length | 3 | 31.4819 | 97 |
| Entities per sentence | 1 | 7.0100 | 22 |

Training Hyperparameters

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 3012
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
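For reproduction, these hyperparameters map directly onto a transformers TrainingArguments configuration, which the span_marker Trainer (a transformers Trainer subclass) accepts via its args parameter. A sketch, with a placeholder output directory:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="spanmarker-sciclimatebert",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # effective train batch size: 8 * 2 = 16
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=20,
    seed=3012,
)
```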

Training Results (CliReNERsilver Validation Split)

| Epoch | Step | Validation Loss | Validation Precision | Validation Recall | Validation F1 | Validation Accuracy |
| --- | --- | --- | --- | --- | --- | --- |
| 1.0 | 62 | 0.1522 | 0.0 | 0.0 | 0.0 | 0.6075 |
| 2.0 | 124 | 0.1065 | 0.0 | 0.0 | 0.0 | 0.6075 |
| 3.0 | 186 | 0.0703 | 0.4503 | 0.2209 | 0.2964 | 0.6975 |
| 4.0 | 248 | 0.0539 | 0.5494 | 0.3831 | 0.4514 | 0.7647 |
| 5.0 | 310 | 0.0499 | 0.5369 | 0.5222 | 0.5295 | 0.8056 |
| 6.0 | 372 | 0.0453 | 0.5947 | 0.5452 | 0.5689 | 0.8153 |
| 7.0 | 434 | 0.0461 | 0.6125 | 0.5897 | 0.6009 | 0.8316 |
| 8.0 | 496 | 0.0452 | 0.6033 | 0.5739 | 0.5882 | 0.8256 |
| 9.0 | 558 | 0.0483 | 0.5882 | 0.5882 | 0.5882 | 0.8283 |
| 10.0 | 620 | 0.0486 | 0.6175 | 0.5882 | 0.6025 | 0.8268 |
| 11.0 | 682 | 0.0491 | 0.5860 | 0.5868 | 0.5864 | 0.8234 |

Framework Versions

  • Python: 3.10.19
  • SpanMarker: 1.7.0
  • Transformers: 4.50.0
  • PyTorch: 2.9.1+cu126
  • Datasets: 3.0.0
  • Tokenizers: 0.21.4

📚 Citation

If you use this model or the CliReNER datasets in your research, please cite the project:

```bibtex
@misc{poleksic2026named,
  author       = {Poleksić, Andrija and Martinčić-Ipšić, Sanda},
  title        = {Named Entity Recognition for Climate Change Research},
  year         = {2026},
  howpublished = {Research Square},
  note         = {Preprint}
}
```

Please also acknowledge the SpanMarker framework:

```bibtex
@software{Aarsen_SpanMarker,
    author = {Aarsen, Tom},
    license = {Apache-2.0},
    title = {{SpanMarker for Named Entity Recognition}},
    url = {https://github.com/tomaarsen/SpanMarkerNER}
}
```
