GLiNER-Medium for Climate Research NER
This model is a GLiNER model fine-tuned for fine-grained Named Entity Recognition (NER) in the climate change research domain. It identifies 28 distinct entity types using a generalist span-based architecture.
📌 Model Details
- Model Type: GLiNER
- Base Model: gliner-community/gliner_medium-v2.5
- Architecture: Bi-encoder, trained with a focal loss objective.
- Language: English
- License: cc-by-sa-4.0
Entity Typology (28 Classes)
The model is trained to recognize:
Asset, Body Part, Body of Water, Chemical, Disease, Ecosystem, Energy Source, Field of Study, Geographical Feature, Intellectual Artefact, Location, Mathematical Expression, Measuring Device, Meteorological Phenomenon, Method, Natural Disaster, Natural Phenomenon, Organism, Organization, Other, Person, Physical Artefact, Physical Phenomenon, Policy, Quantity, Satellite, System, and Time Period.
🚀 Main Results (Selected Checkpoint)
This repository provides the best-performing checkpoint selected from 5 runs with different random seeds. Although the internal training logs tracked performance on the validation split of CliReNERsilver, final model selection and the metrics below are based on the independent, expert-annotated CliReNERgold dataset.
| Metric | Score |
|---|---|
| Precision | 61.78 |
| Recall | 62.54 |
| F1 | 62.16 |
This checkpoint corresponds to the seed with the highest strict F1 on the gold evaluation set (Seed 3).
📊 Results Across Seeds
We fine-tuned the model using 5 different random seeds to assess the stability and robustness of the architecture on the domain-specific text.
| Seed | Precision | Recall | Strict F1 |
|---|---|---|---|
| 1 | 61.06 | 61.68 | 61.37 |
| 2 | 61.63 | 62.21 | 61.92 |
| 3 | 61.78 | 62.54 | 62.16 |
| 4 | 61.27 | 62.05 | 61.66 |
| 5 | 60.90 | 62.09 | 61.49 |
Summary:
- F1: mean = 61.72, std = 0.32
- Precision: mean = 61.33, std = 0.37
- Recall: mean = 62.12, std = 0.31
📂 Dataset & Evaluation
- Training Dataset: CliReNERsilver (80% split).
- Evaluation Dataset: CliReNERgold (expert consensus via Weighted Expert Voting).
- Metric Details: Strict F1 (Entity-level exact span and label match).
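Strict F1 means a predicted entity counts as correct only when both its span boundaries and its label exactly match a gold entity. A minimal sketch of that computation (illustrative, not the authors' evaluation script), treating each entity as a `(start, end, label)` triple:

```python
def strict_f1(gold, pred):
    """Entity-level strict F1: a prediction is a true positive only
    if its span boundaries AND label exactly match a gold entity."""
    gold_set = set(gold)  # items are (start, end, label) triples
    pred_set = set(pred)
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Example: one exact match, one boundary mismatch (19 vs 20)
gold = [(0, 12, "Organization"), (20, 24, "Time Period")]
pred = [(0, 12, "Organization"), (19, 24, "Time Period")]
print(strict_f1(gold, pred))  # (0.5, 0.5, 0.5)
```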
⚙️ Usage
Direct Use for Inference
Requires the gliner library:

```shell
pip install gliner
```
```python
from gliner import GLiNER

# Load model
model = GLiNER.from_pretrained(
    "P0L3/CliReNER-gliner_medium-v2.5",
    load_tokenizer=True
)

# Input text
text = """
In recent years, climate NLP has seen its nascent with the introduction of domain-specific models such as ClimateBERT (Webersinke et al. 2022), ClimateGPT (Thulke et al. 2024), and CliReBERT (Poleksić and Martinčić-Ipšić 2025), alongside related efforts (Bhattacharjee et al. 2024; Schimanski et al. 2024)
"""

# Labels (the 28 entity types listed above)
labels = [
    "Ecosystem", "Energy Source", "Natural Disaster",
    "Meteorological Phenomenon", "Quantity", "Intellectual Artefact",
    "Body of Water", "Disease", "Location",
    "Physical Phenomenon", "Chemical", "Time Period",
    "Organization", "Natural Phenomenon", "Field of Study",
    "Mathematical Expression", "Measuring Device", "Geographical Feature",
    "System", "Satellite", "Organism",
    "Method", "Other", "Person",
    "Physical Artefact", "Body Part", "Asset",
    "Policy",
]

# Predict entities
entities = model.predict_entities(text, labels)

# Print entities in a SpanMarker-style format
for entity in entities:
    span = entity["text"]
    label = entity["label"]
    score = entity["score"]
    print(f"Entity: {span} | Label: {label} | Score: {score:.4f}")

# Entity: recent years | Label: Time Period | Score: 0.8569
# Entity: climate NLP | Label: Method | Score: 0.6270
# Entity: domain-specific models | Label: Method | Score: 0.7560
# Entity: ClimateBERT | Label: Method | Score: 0.8053
# Entity: Webersinke et al. | Label: Person | Score: 0.7729
# Entity: 2022 | Label: Time Period | Score: 0.8569
# Entity: ClimateGPT | Label: Method | Score: 0.7935
# Entity: Thulke et al. | Label: Person | Score: 0.7884
# Entity: 2024 | Label: Time Period | Score: 0.8953
# Entity: CliReBERT | Label: Method | Score: 0.8261
# Entity: Poleksić and Martinčić-Ipšić | Label: Person | Score: 0.8301
# Entity: 2025 | Label: Time Period | Score: 0.9154
# Entity: related efforts | Label: Other | Score: 0.5821
# Entity: Bhattacharjee et al. | Label: Person | Score: 0.7508
# Entity: 2024 | Label: Time Period | Score: 0.8572
# Entity: Schimanski et al. | Label: Person | Score: 0.7806
# Entity: 2024 | Label: Time Period | Score: 0.9111
```
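The `score` field can be used to post-filter low-confidence spans such as "related efforts" (0.58); the 0.75 cutoff below is an illustrative choice, not a tuned value. A minimal sketch over the output format above:

```python
# Post-filter predictions by confidence score.
def filter_by_score(entities, min_score=0.75):
    return [e for e in entities if e["score"] >= min_score]

# Two sample rows in the predict_entities output format
entities = [
    {"text": "recent years", "label": "Time Period", "score": 0.8569},
    {"text": "related efforts", "label": "Other", "score": 0.5821},
]
kept = filter_by_score(entities)
print([e["text"] for e in kept])  # ['recent years']
```

Alternatively, gliner's `predict_entities` accepts a `threshold` argument that applies a confidence cutoff at decoding time.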
📉 Training Hyperparameters
- Learning Rate: 5e-6
- Seed: 3012
- Encoder Learning Rate: 1e-5
- Focal Loss: α=0.75, γ=2
- Batch Size: 8 (Gradient Accumulation: 2)
- Epochs: 20
- Warmup Ratio: 0.1
- Optimizer: AdamW
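The focal-loss settings above (α=0.75, γ=2) match the standard binary focal-loss formulation, which down-weights easy examples so training focuses on hard ones. An illustrative per-prediction sketch (not gliner's internal implementation):

```python
import math

def focal_loss(p, y, alpha=0.75, gamma=2.0):
    """Binary focal loss for one prediction.
    p: predicted probability of the positive class; y: 0/1 label."""
    pt = p if y == 1 else 1 - p          # probability of the true class
    at = alpha if y == 1 else 1 - alpha  # class weighting
    return -at * (1 - pt) ** gamma * math.log(pt)

# An easy positive (p=0.9) contributes far less than a hard one (p=0.3)
print(round(focal_loss(0.9, 1), 4))  # 0.0008
print(round(focal_loss(0.3, 1), 4))  # 0.4425
```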
📚 Citation
```bibtex
@misc{poleksic2026named,
  author       = {Poleksić, Andrija and Martinčić-Ipšić, Sanda},
  title        = {Named Entity Recognition for Climate Change Research},
  year         = {2026},
  howpublished = {Research Square},
  note         = {Preprint}
}
```