---
library_name: transformers
language:
- en
license: apache-2.0
tags:
- text-classification
- climate
- esg
- environment
- adaptation
- roberta
- binary-classification
pipeline_tag: text-classification
base_model: ESGBERT/EnvRoBERTa-base
datasets:
- custom
model-index:
- name: AdaptationBERT
  results: []
---

# AdaptationBERT

A fine-tuned RoBERTa model for binary classification of climate adaptation and resilience texts in the ESG/environmental domain.

Built on top of [ESGBERT/EnvRoBERTa-base](https://huggingface.co/ESGBERT/EnvRoBERTa-base), AdaptationBERT is additionally fine-tuned on a 2,000-sample adaptation dataset to detect whether a given text relates to **climate adaptation and resilience**.

## Model Details

### Model Description

AdaptationBERT is a domain-specific language model for the automatic classification of environmental texts. It identifies whether a text passage discusses climate adaptation topics such as resilience planning, adaptive capacity, vulnerability reduction, or climate risk management.

- **Model type:** RoBERTa-based binary text classifier (`RobertaForSequenceClassification`)
- **Language(s):** English
- **License:** Apache 2.0
- **Fine-tuned from:** [ESGBERT/EnvRoBERTa-base](https://huggingface.co/ESGBERT/EnvRoBERTa-base)

### Architecture

| Parameter | Value |
|---|---|
| Hidden size | 768 |
| Layers | 12 |
| Attention heads | 12 |
| Intermediate size | 3,072 |
| Vocabulary size | 50,265 |
| Max sequence length | 512 tokens |
| Parameters | ~125M |
| Model format | SafeTensors |

### Labels

| Label | Description |
|---|---|
| `0` | Non-adaptation-related |
| `1` | Adaptation-related |

## Uses

### Direct Use

AdaptationBERT is designed for classifying English text passages as related or unrelated to climate adaptation.
Typical use cases include:

- Screening corporate sustainability reports for adaptation-related disclosures
- Analyzing ESG filings and environmental policy documents
- Large-scale text mining of climate adaptation mentions across document corpora
- Supporting research on climate resilience discourse

### Recommended Pipeline

It is **highly recommended** to use a two-stage classification pipeline:

1. First, classify whether a text is "environmental" using the [EnvironmentalBERT-environmental](https://huggingface.co/ESGBERT/EnvironmentalBERT-environmental) model.
2. Then, apply **AdaptationBERT** only to texts classified as environmental to determine whether they are adaptation-related.

This two-stage approach improves precision by filtering out non-environmental texts before adaptation classification.

### Out-of-Scope Use

- Texts in languages other than English
- Non-environmental domains (e.g., finance-only, legal, medical) without the upstream environmental filter
- Real-time or safety-critical decision systems where misclassification could cause harm
- As a sole basis for regulatory compliance decisions

## How to Get Started with the Model

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="ClimateLouie/AdaptationBERT",
    tokenizer="ClimateLouie/AdaptationBERT",
)

text = "The city implemented a flood resilience plan to protect coastal infrastructure from rising sea levels."
result = classifier(text)
print(result)
# [{'label': 'adaptation-related', 'score': 0.98}]
```

Or load the model and tokenizer directly:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("ClimateLouie/AdaptationBERT")
model = AutoModelForSequenceClassification.from_pretrained("ClimateLouie/AdaptationBERT")

text = "Communities are developing drought-resistant farming techniques to adapt to changing rainfall patterns."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)

predictions = torch.softmax(outputs.logits, dim=-1)
predicted_label = torch.argmax(predictions, dim=-1).item()

label_map = {0: "non-adaptation-related", 1: "adaptation-related"}
print(f"Prediction: {label_map[predicted_label]} (confidence: {predictions[0][predicted_label]:.4f})")
```

For detailed tutorials, see these guides by Tobias Schimanski on Medium:

- [Model usage and large-scale analysis](https://medium.com/@schimanski.tobi/analyzing-esg-with-ai-and-nlp-tutorial-2-large-scale-analyses-of-environmental-actions-0735cc8dc9c2)
- [Fine-tuning your own models](https://medium.com/@schimanski.tobi/analyzing-esg-with-ai-and-nlp-tutorial-3-fine-tune-your-own-models-e3692fc0b3c0)

## Training Details

### Training Data

The model was fine-tuned on a curated dataset of approximately **2,000 text samples** annotated for climate adaptation relevance. The dataset contains examples from ESG reports, sustainability disclosures, and environmental policy texts, with binary labels indicating whether each sample discusses climate adaptation and resilience.

### Training Procedure

#### Base Model

Training starts from [ESGBERT/EnvRoBERTa-base](https://huggingface.co/ESGBERT/EnvRoBERTa-base), itself a RoBERTa model further pre-trained on environmental text corpora. This provides a strong domain-specific foundation for the adaptation classification task.

#### Training Hyperparameters

- **Training regime:** fp32
- **Problem type:** Single-label classification
- **Framework:** PyTorch + Hugging Face Transformers (v4.40.2)

## Bias, Risks, and Limitations

- **Training data size:** The model was fine-tuned on only ~2,000 samples, which may limit its ability to generalize across all types of adaptation-related text.
- **Language limitation:** The model supports English text only. Climate adaptation texts in other languages will not be classified correctly.
- **Domain specificity:** Performance is optimized for ESG/environmental text. Texts from other domains that discuss adaptation in non-climate senses (e.g., biological adaptation, software adaptation) may produce false positives.
- **Temporal bias:** The training data reflects adaptation terminology and framing as of the time of dataset creation. Emerging adaptation concepts or evolving terminology may not be captured.
- **Geographic bias:** The training corpus may over-represent adaptation discourse from certain regions or regulatory frameworks, potentially underperforming on texts from underrepresented geographies.

### Recommendations

- Always use the recommended two-stage pipeline (environmental filter + adaptation classification) for best results.
- Validate model outputs on your specific corpus before using them in production.
- Do not use model predictions as the sole input for policy or regulatory decisions.
- Supplement with human review, especially for high-stakes applications.

## Technical Specifications

### Model Architecture and Objective

RoBERTa (Robustly Optimized BERT Pretraining Approach) with a sequence classification head. The model uses 12 transformer layers with 12 attention heads each, a hidden size of 768, and GELU activation. Classification is performed via a linear head on top of the representation of the first token, `<s>` (RoBERTa's equivalent of BERT's `[CLS]`).
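The classification head outputs one logit per label; a minimal sketch of how logits map to the two labels (the logit values below are hypothetical, not outputs of the actual model):

```python
import torch

# Hypothetical logits from the classification head for two input texts
# (illustrative values only, not real model outputs).
logits = torch.tensor([[-1.2, 2.3], [1.8, -0.5]])

probs = torch.softmax(logits, dim=-1)  # per-text probabilities over the 2 labels
preds = torch.argmax(probs, dim=-1)    # 0 = non-adaptation, 1 = adaptation

label_map = {0: "non-adaptation-related", 1: "adaptation-related"}
print([label_map[i] for i in preds.tolist()])
# ['adaptation-related', 'non-adaptation-related']
```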
### Software

- **Transformers:** 4.40.2
- **Model format:** SafeTensors
- **Tokenizer:** RoBERTa BPE tokenizer (50,265 tokens)

## Citation

If you use this model in your research, please cite:

**BibTeX:**

```bibtex
@misc{adaptationbert,
  title={AdaptationBERT: A Fine-tuned Language Model for Climate Adaptation Text Classification},
  author={Woodall, Louie},
  note={Inspired by the work of Tobias Schimanski},
  year={2024},
  url={https://huggingface.co/ClimateLouie/AdaptationBERT}
}
```

## More Information

This model is part of the [ESGBERT](https://huggingface.co/ESGBERT) family of models for ESG and environmental text analysis. Related models include:

- [EnvRoBERTa-base](https://huggingface.co/ESGBERT/EnvRoBERTa-base) - Base environmental language model
- [EnvironmentalBERT-environmental](https://huggingface.co/ESGBERT/EnvironmentalBERT-environmental) - Environmental text classifier (recommended upstream filter)
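The two-stage pipeline recommended in this card can be wired up with a small helper. This is an illustrative sketch only: `two_stage_classify` is not part of the model release, and it assumes the upstream environmental model's positive label is `environmental` (check the actual label names your pipeline returns before relying on this):

```python
from typing import Callable, Dict, List

# A classifier takes a batch of texts and returns one {"label", "score"} dict per text,
# matching the output shape of a transformers text-classification pipeline.
Classifier = Callable[[List[str]], List[Dict]]

def two_stage_classify(
    texts: List[str],
    env_classifier: Classifier,
    adapt_classifier: Classifier,
    env_label: str = "environmental",  # assumed positive label of the upstream filter
) -> Dict[str, Dict]:
    """Stage 1: keep only texts the environmental model flags as positive.
    Stage 2: run the adaptation classifier on the survivors.
    Returns a mapping from surviving text to its adaptation prediction."""
    env_results = env_classifier(texts)
    survivors = [t for t, r in zip(texts, env_results) if r["label"] == env_label]
    adapt_results = adapt_classifier(survivors) if survivors else []
    return dict(zip(survivors, adapt_results))
```

In practice, pass `pipeline("text-classification", model="ESGBERT/EnvironmentalBERT-environmental")` and `pipeline("text-classification", model="ClimateLouie/AdaptationBERT")` as the two classifiers.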