GLiNER-Small for Climate Research NER

This is a GLiNER model fine-tuned for fine-grained Named Entity Recognition (NER) on climate change research text. It recognizes 28 entity types using GLiNER's generalist span-based architecture.

📌 Model Details

  • Model Type: GLiNER
  • Base Model: gliner-community/gliner_small-v2.5
  • Architecture: Bi-encoder architecture with a focal loss objective.
  • Language: English
  • License: cc-by-sa-4.0

Entity Typology (28 Classes)

The model is trained to recognize: Asset, Body Part, Body of Water, Chemical, Disease, Ecosystem, Energy Source, Field of Study, Geographical Feature, Intellectual Artefact, Location, Mathematical Expression, Measuring Device, Meteorological Phenomenon, Method, Natural Disaster, Natural Phenomenon, Organism, Organization, Other, Person, Physical Artefact, Physical Phenomenon, Policy, Quantity, Satellite, System, and Time Period.


🚀 Main Results (Selected Checkpoint)

This repository provides the best-performing checkpoint from 5 runs with different random seeds. The training logs tracked performance on the validation split of CliReNERsilver, but both the checkpoint selection and the metrics below are based on the independent, expert-annotated CliReNERgold dataset.

Metric      Score
Precision   58.86
Recall      59.44
F1          59.15

This checkpoint is the run with the highest strict F1 on the gold evaluation set: run 4 of 5, trained with random seed value 3012.


📊 Results Across Seeds

We fine-tuned the model using 5 different random seeds to assess the stability and robustness of the architecture on the domain-specific text.

Seed  Precision  Recall  Strict F1
1     58.61      58.54   58.57
2     58.62      59.44   59.03
3     57.99      58.25   58.12
4     58.86      59.44   59.15
5     57.17      58.62   57.89

Summary:

  • F1: mean = 58.55, std = 0.55
  • Precision: mean = 58.25, std = 0.68
  • Recall: mean = 58.86, std = 0.55
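The summary figures can be reproduced from the per-seed table. The sketch below assumes the reported std is the sample standard deviation (n - 1 denominator), which is the variant that matches the values listed; the population formula would give a smaller F1 std of roughly 0.49. The helper name `summarize` is illustrative only.

```python
from statistics import mean, stdev

# Per-seed scores copied from the results table (seeds 1 to 5).
scores = {
    "Precision": [58.61, 58.62, 57.99, 58.86, 57.17],
    "Recall": [58.54, 59.44, 58.25, 59.44, 58.62],
    "F1": [58.57, 59.03, 58.12, 59.15, 57.89],
}


def summarize(values):
    """Return (mean, sample standard deviation), each rounded to 2 decimals."""
    return round(mean(values), 2), round(stdev(values), 2)


summary = {name: summarize(values) for name, values in scores.items()}

for name, (avg, sd) in summary.items():
    print(f"{name}: mean = {avg:.2f}, std = {sd:.2f}")
```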

📂 Dataset & Evaluation

  • Training Dataset: CliReNERsilver (80% split).
  • Evaluation Dataset: CliReNERgold (expert consensus via Weighted Expert Voting).
  • Metric Details: Strict F1 (Entity-level exact span and label match).
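To make the strict criterion concrete, here is a minimal sketch of entity-level exact matching. It is not the evaluation script used for the numbers above: the function name, the (start, end, label) tuple layout, the micro-averaging, and the sample annotations are all assumptions made for illustration.

```python
def strict_prf(gold, predicted):
    """Micro-averaged strict precision, recall and F1, in percent.

    Each entity is a (start, end, label) tuple. A prediction counts as
    correct only if all three fields match a gold entity exactly.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return 100 * precision, 100 * recall, 100 * f1


# Hypothetical character-offset annotations for a single sentence.
gold = [(0, 9, "Chemical"), (24, 35, "Location"), (40, 52, "Quantity")]
predicted = [
    (0, 9, "Chemical"),       # exact span and label: correct
    (24, 30, "Location"),     # right label, wrong span boundary: incorrect
    (40, 52, "Time Period"),  # right span, wrong label: incorrect
]

precision, recall, f1 = strict_prf(gold, predicted)
print(f"P = {precision:.2f}, R = {recall:.2f}, F1 = {f1:.2f}")
```

Under this criterion a partially overlapping span earns no credit, which is why strict scores are typically lower than relaxed or partial-match scores. For a multi-document corpus, a document identifier would need to be added to each tuple so that identical offsets in different documents do not collide.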

⚙️ Usage

Direct Use for Inference

Requires the gliner library:

pip install gliner

Then load the model and run inference in Python:

from gliner import GLiNER

# Load model
model = GLiNER.from_pretrained(
    "P0L3/CliReNER-gliner_small-v2.5",
    load_tokenizer=True
)

# Input text
text = """
Importantly, these advances are enabled not only by model architectures but also by the availability of high-quality, expert-annotated datasets, which remain limited in the climate change domain.
"""

# Labels
labels = [
    "Ecosystem", "Energy Source", "Natural Disaster", 
    "Meteorological Phenomenon", "Quantity", "Intellectual Artefact", 
    "Body of Water", "Disease", "Location", 
    "Physical Phenomenon", "Chemical", "Time Period", 
    "Organization", "Natural Phenomenon", "Field of Study", 
    "Mathematical Expression", "Measuring Device", "Geographical Feature", 
    "System", "Satellite", "Organism", 
    "Method", "Other", "Person", 
    "Physical Artefact", "Body Part", "Asset",
    "Policy",
]

# Predict entities
entities = model.predict_entities(text, labels)

# Match SpanMarker-style output
for entity in entities:
    span = entity["text"]
    label = entity["label"]
    score = entity["score"]  # model confidence for this span

    print(f"Entity: {span} | Label: {label} | Score: {score:.4f}")

# Example output:
# Entity: advances | Label: Other | Score: 0.7018
# Entity: model architectures | Label: Method | Score: 0.6993
# Entity: high-quality | Label: Quantity | Score: 0.5961
# Entity: expert-annotated datasets | Label: Intellectual Artefact | Score: 0.7253
# Entity: climate change domain | Label: Field of Study | Score: 0.6641
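The predictions are plain dictionaries, so they can be post-processed without the model. The sketch below groups entity texts by label and drops low-confidence spans; the helper name `group_by_label` and the 0.65 cut-off are hypothetical choices for illustration, and the sample entities are copied from the example output above.

```python
from collections import defaultdict


def group_by_label(entities, min_score=0.65):
    """Group entity texts by label, keeping only scores >= min_score."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Sample predictions in the same shape as the usage example's output.
entities = [
    {"text": "advances", "label": "Other", "score": 0.7018},
    {"text": "model architectures", "label": "Method", "score": 0.6993},
    {"text": "high-quality", "label": "Quantity", "score": 0.5961},
    {"text": "expert-annotated datasets", "label": "Intellectual Artefact", "score": 0.7253},
    {"text": "climate change domain", "label": "Field of Study", "score": 0.6641},
]

grouped = group_by_label(entities)
for label, texts in grouped.items():
    print(f"{label}: {texts}")
```

With this cut-off the weak "high-quality" / Quantity prediction is filtered out. To our knowledge, `predict_entities` in the gliner library also accepts a `threshold` argument, so a similar cut-off can be applied at prediction time instead.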

📉 Training Hyperparameters

  • Learning Rate: 5e-6
  • Seed: 3012
  • Encoder Learning Rate: 1e-5
  • Focal Loss: α=0.75, γ=2
  • Batch Size: 8 (Gradient Accumulation: 2)
  • Epochs: 20
  • Warmup Ratio: 0.1
  • Optimizer: AdamW
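For readers unfamiliar with the focal loss settings, the sketch below evaluates the standard binary focal loss, FL = -α_t (1 - p_t)^γ log(p_t), with α=0.75 and γ=2. It assumes training used this standard formulation; it is not the gliner library's implementation, and the function name is illustrative.

```python
import math


def binary_focal_loss(prob, target, alpha=0.75, gamma=2.0):
    """Focal loss for a single binary prediction.

    prob   -- predicted probability of the positive class, in (0, 1)
    target -- 1 for a positive example, 0 for a negative one
    """
    # Probability the model assigns to the true class.
    p_t = prob if target == 1 else 1.0 - prob
    # alpha weights positives, (1 - alpha) weights negatives.
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy_positive = binary_focal_loss(0.9, 1)  # confident and correct
hard_positive = binary_focal_loss(0.1, 1)  # confident and wrong
print(f"easy positive: {easy_positive:.5f}")
print(f"hard positive: {hard_positive:.5f}")
```

With γ=2, a confidently correct prediction contributes very little to the loss, so training concentrates on hard spans. With α=0.75, positive spans are weighted three times as heavily as negatives, which helps offset the large number of non-entity spans.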

📚 Citation

@misc{poleksic2026named,
  author       = {Poleksić, Andrija and Martinčić-Ipšić, Sanda},
  title        = {Named Entity Recognition for Climate Change Research},
  year         = {2026},
  howpublished = {Research Square},
  note         = {Preprint}
}