---
language: en
license: cc-by-sa-4.0
tags:
- span-marker
- token-classification
- ner
- named-entity-recognition
- generated_from_span_marker_trainer
- climate-change
- earth-science
widget:
- text: While a significant positive impact of solid-state cultivation using white
    rot fungi on enzymatic digestibility was reported in some studies [ 68 , 69 ]
    , a negative effect of fungal pretreatment on enzymatic hydrolysis was noted by
    investigators like Shi et al . ( 2009 ) [ 33 ] , who reported a glucose yield
    of 55 . 6 mg g − 1 of cotton stalks pretreated with P . chrysosporium , which
    was approximately 17 % lower than the yield of untreated cotton stalks after enzymatic
    hydrolysis in spite of significant lignin degradation .
- text: We quantify changes in the properties and amount of bottom water entering
    the basin by combining repeat hydrographic observations , direct velocity measurements
    and flow structure derived from a 0 . 1 ° global ocean sea-ice model that realistically
    simulates AABW formation sites and export pathways .
- text: The impact of these differences on cloud forcing can be signi or more . cant
    and as high as 30 W m In recent years , observations from satellite data have
    been revised considerably after significant development efforts , especially after
    utilizing new high-quality reference measurements from active sensors in space
    , and some datasets have also improved polar cloud detection .
- text: If the response is significant , how does the solar forcing impact the EASM
    rainfall variability ? In this study , we will address these questions based on
    the simulation results derived from one AD 850 control experiment ( CTRL ) and
    four solar-only forcing experiments [ spectral solar irradiance ( SSI ) experiments
    ] , which were conducted by the Community Earth System ( CESM-LME ) Model – Last
    Millennium Ensemble modeling project ( Otto-Bliesner et al . 2016 ) .
- text: Measurements from single moorings at each gateway reveal that the speed of
    bottom water flow into the Australian Antarctic Basin varies with location , season
    and density ( Fig . 3a , c , e ) .
pipeline_tag: token-classification
library_name: span-marker
metrics:
- precision
- recall
- f1
datasets:
- P0L3/CliReNER_v_1_1_28_SILVER
- P0L3/CliReNER_v_1_1_28_GOLD
base_model: FacebookAI/roberta-base
model-index:
- name: SpanMarker with FacebookAI/roberta-base
  results:
  - task:
      type: token-classification
      name: Named Entity Recognition
    dataset:
      name: CliReNER_silver
      type: P0L3/CliReNER_v_1_1_28_SILVER
      split: eval
    metrics:
    - type: f1
      value: 0.6300366300366301
      name: F1
    - type: precision
      value: 0.6437125748502994
      name: Precision
    - type: recall
      value: 0.6169296987087518
      name: Recall
---
# SpanMarker-RoBERTa for Climate Research NER
This is a [SpanMarker](https://github.com/tomaarsen/SpanMarkerNER) model fine-tuned for fine-grained Named Entity Recognition (NER) in the climate change research domain. It extracts 28 distinct entity types and uses [FacebookAI/roberta-base](https://huggingface.co/FacebookAI/roberta-base) as the underlying encoder.
## 📌 Model Details
- **Model Type:** SpanMarker
- **Encoder:** [FacebookAI/roberta-base](https://huggingface.co/FacebookAI/roberta-base)
- **Maximum Sequence Length:** 512 tokens
- **Maximum Entity Length:** 14 words
- **Language:** English
- **License:** cc-by-sa-4.0
### Model Labels
| Label | Examples |
|:--------------------------|:--------------------------------------------------------------------------------------------|
| Asset | "mental health", "water resources", "raw material" |
| Body Part | "leaves", "plant leaves", "deep tissue compartment" |
| Body of Water | "Dhaleshwari river", "rivers", "peripheral rivers" |
| Chemical | "domoic acid", "cathode materials", "marine algal toxin" |
| Disease | "seizures", "acute neurologic signs", "chronic epileptic syndrome" |
| Ecosystem | "cloud forests", "polluted environment", "Tropical montane cloud forest" |
| Energy Source | "12-cell series battery-pack prototype", "fossil fuels", "battery cells" |
| Field of Study | "veterinary medicine", "reference laboratory", "study" |
| Geographical Feature | "heterogenous topography", "mountainous regions", "low point" |
| Intellectual Artefact | "Daily husbandry records", "data", "Veterinary medical records" |
| Location | "wild", "Westbrook", "beaches" |
| Mathematical Expression | "gradient", "Stepwise machine hour constraints", "difference" |
| Measuring Device | "station", "EEG", "MRI scan" |
| Meteorological Phenomenon | "rainfall", "climate change", "climatic variability" |
| Method | "dosing", "serum monitoring", "clinical efficacy" |
| Natural Disaster | "heavy metal contamination", "seasonal air pollution", "environmental pollution" |
| Natural Phenomenon | "algal blooms", "biochemical changes", "changing ocean conditions" |
| Organism | "Zalophus californianus", "California sea lions", "species" |
| Organization | "reference laboratory", "long-term care facility", "NOAA National Marine Fisheries Service" |
| Other | "marine mammal health", "normal eating", "reports" |
| Person | "staff", "clinicians", "Clinicians" |
| Physical Artefact | "electric vehicle", "paved east – west road", "EVs" |
| Physical Phenomenon | "normal food intake", "structural abnormalities", "seasonal changes" |
| Policy | "energy security", "safety", "pollution" |
| Quantity | "200 mAhg − 1", ">", "energy density" |
| Satellite | "TRMM", "Tropical Rainfall Measuring Mission", "satellites" |
| System | "global overturning circulation", "system structure", "climate" |
| Time Period | "periods of prolonged anorexia", "101 days", "several decades" |
---
## 🚀 Main Results (Selected Checkpoint)
This repository provides the **best-performing checkpoint** selected from 5 runs with different random seeds. The training logs track performance on the validation split of **CliReNER<sub>silver</sub>**, whereas model selection and the metrics below use the independent, expert-annotated **CliReNER<sub>gold</sub>** dataset.
| Metric     | Score (%) |
|------------|-----------|
| Precision | 55.33 |
| Recall | 49.18 |
| F1 | 52.08 |
> This checkpoint corresponds to the **seed with the highest strict F1 on the gold evaluation set** (run 4 in the table below, random seed 33).
---
## 📊 Results Across Seeds
We fine-tuned the model using 5 different random seeds to assess the stability and robustness of the architecture on the domain-specific text.
| Seed | Precision (%) | Recall (%) | Strict F1 (%) |
|------|---------------|------------|---------------|
| 1 | 55.39 | 48.69 | 51.83 |
| 2 | 58.32 | 44.12 | 50.23 |
| 3 | 54.80 | 45.92 | 49.97 |
| 4 | 55.33 | 49.18 | 52.08 |
| 5 | 51.19 | 43.95 | 47.30 |
**Summary:**
- **F1:** mean = 50.28, std = 1.91
- **Precision:** mean = 55.01, std = 2.54
- **Recall:** mean = 46.37, std = 2.47
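The summary figures can be reproduced from the per-seed table. The sketch below assumes that "std" is the sample standard deviation (n − 1 denominator), which is the variant that matches the reported values; the variable names are illustrative and not taken from the project's code.

```python
from statistics import mean, stdev

# Per-seed gold-set scores copied from the table above:
# (precision, recall, strict F1)
seed_scores = {
    1: (55.39, 48.69, 51.83),
    2: (58.32, 44.12, 50.23),
    3: (54.80, 45.92, 49.97),
    4: (55.33, 49.18, 52.08),
    5: (51.19, 43.95, 47.30),
}

summary = {}
for idx, name in enumerate(("precision", "recall", "f1")):
    values = [scores[idx] for scores in seed_scores.values()]
    # statistics.stdev uses the sample (n - 1) estimator
    summary[name] = (round(mean(values), 2), round(stdev(values), 2))

for name, (avg, sd) in summary.items():
    print(f"{name}: mean = {avg:.2f}, std = {sd:.2f}")
```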
**Model Selection Strategy:**
The uploaded checkpoint is the **single best seed** (highest strict F1 on the gold dataset). Because the checkpoint was selected on the same gold set that is used for reporting, the mean across seeds above is the more conservative estimate of expected performance.
---
## 📂 Dataset & Evaluation
- **Training Dataset:** [CliReNER<sub>silver</sub>](https://huggingface.co/datasets/P0L3/CliReNER_v_1_1_28_SILVER)
- **Splits used:** Stratified 80:10:10 ratio (Train/Validation/Test). The 80% split was used for training.
- **Evaluation Dataset:** [CliReNER<sub>gold</sub>](https://huggingface.co/datasets/P0L3/CliReNER_v_1_1_28_GOLD)
- **Splits used:** Evaluated on the combined 192 sentences (expert-annotated via Weighted Expert Voting).
- **Preprocessing:**
- Texts were tokenized using the standard RoBERTa tokenizer.
- The dataset utilizes a flat NER schema (nested entities are excluded, and overlapping entities are resolved to the most relevant span).
- **Metric Details:**
- **F1 type:** Strict F1 (Entity-level exact match).
- Evaluation was performed ensuring entities match both the **exact boundary span and the exact semantic label** to be considered correct.
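As an illustration of the strict criterion, here is a minimal sketch of entity-level exact-match scoring. It is not the project's evaluation script; the function name `strict_prf` and the `(sentence_id, start, end, label)` tuple layout are assumptions made for this example.

```python
def strict_prf(gold, predicted):
    """Entity-level exact-match precision, recall and F1.

    Each entity is a (sentence_id, start, end, label) tuple. A prediction
    counts as correct only if all four fields match a gold entity.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


gold = [
    (0, 0, 2, "Meteorological Phenomenon"),
    (0, 5, 6, "Location"),
    (1, 3, 4, "Organism"),
]
predicted = [
    (0, 0, 2, "Meteorological Phenomenon"),  # exact match: correct
    (0, 5, 7, "Location"),                   # boundary differs: incorrect
    (1, 3, 4, "Chemical"),                   # label differs: incorrect
]

precision, recall, f1 = strict_prf(gold, predicted)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Under this criterion a near-miss on the boundary or the label earns no partial credit, which is one reason strict scores on a 28-label schema are lower than token-level accuracy.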
---
## ⚖️ Precision vs Recall Behavior
The model favors precision over recall. On CliReNER<sub>gold</sub>, precision exceeds recall for every seed (mean 55.01 vs. 46.37 across seeds; 55.33 vs. 49.18 for the selected checkpoint), so the model misses gold entities more often than it predicts spurious ones.
---
## ⚙️ Usage
### Direct Use for Inference
Because this model was trained using the SpanMarker framework, it requires the `span_marker` library for inference.
```bash
pip install span_marker
```
```python
from span_marker import SpanMarkerModel
# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("P0L3/CliReNER-roberta-base")
# Run inference
text = "Anthropogenic climate change is fundamentally altering weather patterns and climate extremes, causing widespread adverse impacts to both nature and human systems (IPCC 2023)."
entities = model.predict(text)
for entity in entities:
    print(f"Entity: {entity['span']} | Label: {entity['label']} | Score: {entity['score']:.4f}")
# Entity: climate change | Label: Meteorological Phenomenon | Score: 0.4065
# Entity: weather patterns | Label: Meteorological Phenomenon | Score: 0.6808
# Entity: climate extremes | Label: Meteorological Phenomenon | Score: 0.7115
# Entity: nature | Label: Other | Score: 0.4608
# Entity: human systems | Label: System | Score: 0.6562
# Entity: IPCC 2023 | Label: Other | Score: 0.4812
```
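Several of the scores above are below 0.5, so downstream code may want to drop low-confidence predictions. The following sketch post-processes a list of dictionaries with the `span`, `label` and `score` keys used in the example; the helper name `group_confident_entities` and the 0.5 threshold are illustrative assumptions, not recommendations from the authors.

```python
from collections import defaultdict


def group_confident_entities(entities, min_score=0.5):
    """Group predicted spans by label, dropping those below `min_score`."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["label"]].append(entity["span"])
    return dict(grouped)


# Stand-in for the output of model.predict(text), using the values shown above
entities = [
    {"span": "climate change", "label": "Meteorological Phenomenon", "score": 0.4065},
    {"span": "weather patterns", "label": "Meteorological Phenomenon", "score": 0.6808},
    {"span": "climate extremes", "label": "Meteorological Phenomenon", "score": 0.7115},
    {"span": "nature", "label": "Other", "score": 0.4608},
    {"span": "human systems", "label": "System", "score": 0.6562},
    {"span": "IPCC 2023", "label": "Other", "score": 0.4812},
]

confident = group_confident_entities(entities, min_score=0.5)
for label, spans in confident.items():
    print(f"{label}: {spans}")
```

Raising the threshold trades recall for precision, so a suitable value depends on the application.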
### Downstream Use
You can easily continue fine-tuning this model on your own dataset.
<details><summary>Click to expand</summary>
```python
from span_marker import SpanMarkerModel, Trainer
from datasets import load_dataset
# Download from the 🤗 Hub
model = SpanMarkerModel.from_pretrained("P0L3/CliReNER-roberta-base")
# Specify a Dataset with "tokens" and "ner_tags" columns
dataset = load_dataset("your_custom_dataset")
# Initialize a Trainer using the pretrained model & dataset
trainer = Trainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
trainer.save_model("span_marker_model_id-finetuned")
```
</details>
---
## 📉 Training Details
### Training Set Metrics
| Training set | Min | Median | Max |
|:----------------------|:----|:--------|:----|
| Sentence length | 3 | 31.4819 | 97 |
| Entities per sentence | 1 | 7.0100 | 22 |
### Training Hyperparameters
- **learning_rate:** 5e-05
- **train_batch_size:** 8
- **eval_batch_size:** 8
- **seed:** 33
- **gradient_accumulation_steps:** 2
- **total_train_batch_size:** 16
- **optimizer:** adamw_torch with betas=(0.9,0.999) and epsilon=1e-08
- **lr_scheduler_type:** linear
- **lr_scheduler_warmup_ratio:** 0.1
- **num_epochs:** 20
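To make the batch-size and scheduler settings concrete, the sketch below derives the effective batch size and evaluates a linear warmup-then-decay schedule at a few steps. It rests on assumptions: training on a single device, 62 optimizer steps per epoch (the Step column of the training results table in this card), and a schedule planned over all 20 epochs. The function `linear_schedule_lr` is written for this example and follows the usual definition of a linear schedule with warmup.

```python
def linear_schedule_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


train_batch_size = 8
gradient_accumulation_steps = 2
# Assumption: one device, so the effective batch is 8 * 2 = 16
effective_batch_size = train_batch_size * gradient_accumulation_steps

steps_per_epoch = 62   # assumption: read from the training results table
num_epochs = 20
total_steps = steps_per_epoch * num_epochs
warmup_steps = round(0.1 * total_steps)  # lr_scheduler_warmup_ratio = 0.1

print(f"effective batch size: {effective_batch_size}")
print(f"total steps: {total_steps}, warmup steps: {warmup_steps}")
for step in (0, 62, warmup_steps, total_steps):
    lr = linear_schedule_lr(step, 5e-05, warmup_steps, total_steps)
    print(f"step {step:>4}: lr = {lr:.2e}")
```

Under these assumptions the learning rate peaks at 5e-05 after two epochs of warmup and then decays linearly.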
### Training Results (CliReNER<sub>silver</sub> Validation Split)
| Epoch | Step | Validation Loss | Validation Precision | Validation Recall | Validation F1 | Validation Accuracy |
|:-----:|:----:|:---------------:|:--------------------:|:-----------------:|:-------------:|:-------------------:|
| 1.0 | 62 | 0.1324 | 0.0 | 0.0 | 0.0 | 0.6075 |
| 2.0 | 124 | 0.0839 | 0.3333 | 0.0273 | 0.0504 | 0.6166 |
| 3.0 | 186 | 0.0530 | 0.5845 | 0.4218 | 0.4900 | 0.7807 |
| 4.0 | 248 | 0.0460 | 0.6913 | 0.4433 | 0.5402 | 0.7971 |
| 5.0 | 310 | 0.0488 | 0.5965 | 0.6298 | 0.6127 | 0.8307 |
| 6.0 | 372 | 0.0447 | 0.6532 | 0.6026 | 0.6269 | 0.8340 |
| 7.0 | 434 | 0.0466 | 0.6365 | 0.6356 | 0.6360 | 0.8486 |
| 8.0 | 496 | 0.0522 | 0.6388 | 0.6370 | 0.6379 | 0.8468 |
| 9.0 | 558 | 0.0520 | 0.6437 | 0.6169 | 0.6300 | 0.8428 |
### Framework Versions
- **Python:** 3.10.19
- **SpanMarker:** 1.7.0
- **Transformers:** 4.50.0
- **PyTorch:** 2.9.1+cu126
- **Datasets:** 3.0.0
- **Tokenizers:** 0.21.4
---
## 📚 Citation
If you use this model or the CliReNER datasets in your research, please cite the project:
```bibtex
@misc{poleksic2026named,
    author = {Poleksić, Andrija and Martinčić-Ipšić, Sanda},
    title = {Named Entity Recognition for Climate Change Research},
    year = {2026},
    howpublished = {Research Square},
    note = {Preprint}
}
```
Please also acknowledge the SpanMarker framework:
```bibtex
@software{Aarsen_SpanMarker,
    author = {Aarsen, Tom},
    license = {Apache-2.0},
    title = {{SpanMarker for Named Entity Recognition}},
    url = {https://github.com/tomaarsen/SpanMarkerNER}
}
```