GLiNER-Medium for Climate Research NER

This model is a GLiNER model fine-tuned for fine-grained Named Entity Recognition (NER) in the climate change research domain. It identifies 28 distinct entity types using a generalist span-based architecture.

📌 Model Details

  • Model Type: GLiNER
  • Base Model: gliner-community/gliner_medium-v2.5
  • Architecture: Bi-encoder trained with a focal loss objective.
  • Language: English
  • License: cc-by-sa-4.0

🏷️ Entity Typology (28 Classes)

The model is trained to recognize: Asset, Body Part, Body of Water, Chemical, Disease, Ecosystem, Energy Source, Field of Study, Geographical Feature, Intellectual Artefact, Location, Mathematical Expression, Measuring Device, Meteorological Phenomenon, Method, Natural Disaster, Natural Phenomenon, Organism, Organization, Other, Person, Physical Artefact, Physical Phenomenon, Policy, Quantity, Satellite, System, and Time Period.


🚀 Main Results (Selected Checkpoint)

This repository provides the best-performing checkpoint selected from 5 runs with different random seeds. While the internal training logs tracked performance on the validation split of CliReNERsilver, the final model selection and the metrics below are evaluated on the independent, expert-annotated CliReNERgold dataset.

| Metric    | Score |
|-----------|-------|
| Precision | 61.78 |
| Recall    | 62.54 |
| F1        | 62.16 |

This checkpoint corresponds to the seed with the highest strict F1 on the gold evaluation set (Seed 3).


📊 Results Across Seeds

We fine-tuned the model using 5 different random seeds to assess the stability and robustness of the architecture on the domain-specific text.

| Seed | Precision | Recall | Strict F1 |
|------|-----------|--------|-----------|
| 1    | 61.06     | 61.68  | 61.37     |
| 2    | 61.63     | 62.21  | 61.92     |
| 3    | 61.78     | 62.54  | 62.16     |
| 4    | 61.27     | 62.05  | 61.66     |
| 5    | 60.90     | 62.09  | 61.49     |

Summary:

  • F1: mean = 61.72, std = 0.32
  • Precision: mean = 61.33, std = 0.37
  • Recall: mean = 62.12, std = 0.31
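The summary statistics can be reproduced from the per-seed table above; the reported standard deviations correspond to sample standard deviations (values match the summary up to rounding):

```python
from statistics import mean, stdev

# Per-seed scores from the table above
precision = [61.06, 61.63, 61.78, 61.27, 60.90]
recall    = [61.68, 62.21, 62.54, 62.05, 62.09]
f1        = [61.37, 61.92, 62.16, 61.66, 61.49]

for name, scores in [("Precision", precision), ("Recall", recall), ("F1", f1)]:
    # stdev() is the sample standard deviation (ddof = 1)
    print(f"{name}: mean = {mean(scores):.2f}, std = {stdev(scores):.2f}")
```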

📂 Dataset & Evaluation

  • Training Dataset: CliReNERsilver (80% split).
  • Evaluation Dataset: CliReNERgold (expert consensus via Weighted Expert Voting).
  • Metric Details: Strict F1 (Entity-level exact span and label match).
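Strict entity-level F1 counts a prediction as correct only when both the span boundaries and the label match exactly. A minimal sketch of the computation (a hypothetical helper, not the official evaluation script):

```python
def strict_f1(predicted, gold):
    """Entity-level strict F1: a hit requires an exact (start, end, label) match."""
    pred, ref = set(predicted), set(gold)
    tp = len(pred & ref)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(ref) if ref else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: one exact match, one boundary mismatch, one label mismatch
gold = [(0, 12, "Time Period"), (20, 31, "Method"), (40, 51, "Person")]
pred = [(0, 12, "Time Period"), (20, 30, "Method"), (40, 51, "Organization")]
print(strict_f1(pred, gold))  # only the first prediction is a true positive
```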

⚙️ Usage

Direct Use for Inference

Requires the gliner library:

```bash
pip install gliner
```

```python
from gliner import GLiNER

# Load the fine-tuned model
model = GLiNER.from_pretrained(
    "P0L3/CliReNER-gliner_medium-v2.5",
    load_tokenizer=True
)

# Input text
text = """
In recent years, climate NLP has seen its nascent with the introduction of domain-specific models such as ClimateBERT (Webersinke et al. 2022), ClimateGPT (Thulke et al. 2024), and CliReBERT (Poleksić and Martinčić-Ipšić 2025), alongside related efforts (Bhattacharjee et al. 2024; Schimanski et al. 2024)
"""

# The 28 entity labels the model was trained on
labels = [
    "Ecosystem", "Energy Source", "Natural Disaster",
    "Meteorological Phenomenon", "Quantity", "Intellectual Artefact",
    "Body of Water", "Disease", "Location",
    "Physical Phenomenon", "Chemical", "Time Period",
    "Organization", "Natural Phenomenon", "Field of Study",
    "Mathematical Expression", "Measuring Device", "Geographical Feature",
    "System", "Satellite", "Organism",
    "Method", "Other", "Person",
    "Physical Artefact", "Body Part", "Asset",
    "Policy",
]

# Predict entities
entities = model.predict_entities(text, labels)

# Print each prediction in a SpanMarker-style format
for entity in entities:
    span = entity["text"]
    label = entity["label"]
    score = entity["score"]
    print(f"Entity: {span} | Label: {label} | Score: {score:.4f}")
```

Expected output:

```
Entity: recent years | Label: Time Period | Score: 0.8569
Entity: climate NLP | Label: Method | Score: 0.6270
Entity: domain-specific models | Label: Method | Score: 0.7560
Entity: ClimateBERT | Label: Method | Score: 0.8053
Entity: Webersinke et al. | Label: Person | Score: 0.7729
Entity: 2022 | Label: Time Period | Score: 0.8569
Entity: ClimateGPT | Label: Method | Score: 0.7935
Entity: Thulke et al. | Label: Person | Score: 0.7884
Entity: 2024 | Label: Time Period | Score: 0.8953
Entity: CliReBERT | Label: Method | Score: 0.8261
Entity: Poleksić and Martinčić-Ipšić | Label: Person | Score: 0.8301
Entity: 2025 | Label: Time Period | Score: 0.9154
Entity: related efforts | Label: Other | Score: 0.5821
Entity: Bhattacharjee et al. | Label: Person | Score: 0.7508
Entity: 2024 | Label: Time Period | Score: 0.8572
Entity: Schimanski et al. | Label: Person | Score: 0.7806
Entity: 2024 | Label: Time Period | Score: 0.9111
```
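The confidence scores let you trade recall for precision at inference time. Recent versions of gliner also expose a `threshold` argument on `predict_entities`; the same effect can be achieved by post-filtering (the entries below are illustrative, taken from the output above):

```python
# Filter predicted entities by confidence score (post-processing sketch).
# The sample list mirrors the output format shown above.
entities = [
    {"text": "recent years", "label": "Time Period", "score": 0.8569},
    {"text": "climate NLP", "label": "Method", "score": 0.6270},
    {"text": "related efforts", "label": "Other", "score": 0.5821},
]

THRESHOLD = 0.6  # keep only reasonably confident predictions
confident = [e for e in entities if e["score"] >= THRESHOLD]

for e in confident:
    print(f"{e['text']} -> {e['label']} ({e['score']:.2f})")
```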

📉 Training Hyperparameters

  • Learning Rate: 5e-6
  • Seed: 3012
  • Encoder Learning Rate: 1e-5
  • Focal Loss: α=0.75, γ=2
  • Batch Size: 8 (Gradient Accumulation: 2)
  • Epochs: 20
  • Warmup Ratio: 0.1
  • Optimizer: AdamW
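Focal loss down-weights easy, well-classified examples so training focuses on hard spans, with α balancing positive and negative classes. A sketch of the binary form with the hyperparameters above (an illustration, not the actual training code):

```python
import math

def focal_loss(p, y, alpha=0.75, gamma=2.0):
    """Binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p is the predicted probability of the positive class, y is 0 or 1.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# With gamma = 2, an easy example contributes far less than a hard one
print(focal_loss(0.95, 1))  # near zero
print(focal_loss(0.40, 1))  # much larger: hard examples dominate the gradient
```

Setting γ=0 recovers α-weighted cross-entropy; γ=2 multiplies the easy example's loss by (1 - p_t)², sharply suppressing it.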

📚 Citation

```bibtex
@misc{poleksic2026named,
  author       = {Poleksić, Andrija and Martinčić-Ipšić, Sanda},
  title        = {Named Entity Recognition for Climate Change Research},
  year         = {2026},
  howpublished = {Research Square},
  note         = {Preprint}
}
```