---
library_name: transformers
language:
- en
license: apache-2.0
tags:
- text-classification
- climate
- esg
- environment
- adaptation
- roberta
- binary-classification
pipeline_tag: text-classification
base_model: ESGBERT/EnvRoBERTa-base
datasets:
- custom
model-index:
- name: AdaptationBERT
  results: []
---

# AdaptationBERT

A fine-tuned RoBERTa model for binary classification of climate adaptation and resilience texts in the ESG/environmental domain.

Built on top of [ESGBERT/EnvRoBERTa-base](https://huggingface.co/ESGBERT/EnvRoBERTa-base), AdaptationBERT is additionally fine-tuned on a 2,000-sample adaptation dataset to detect whether a given text is related to **climate adaptation and resilience**.

## Model Details

### Model Description

AdaptationBERT is a domain-specific language model designed for the automatic classification of environmental texts. It identifies whether a text passage discusses climate adaptation topics such as resilience planning, adaptive capacity, vulnerability reduction, or climate risk management.

- **Model type:** RoBERTa-based binary text classifier (`RobertaForSequenceClassification`)
- **Language(s):** English
- **License:** Apache 2.0
- **Fine-tuned from:** [ESGBERT/EnvRoBERTa-base](https://huggingface.co/ESGBERT/EnvRoBERTa-base)

### Architecture

| Parameter | Value |
|---|---|
| Hidden size | 768 |
| Layers | 12 |
| Attention heads | 12 |
| Intermediate size | 3,072 |
| Vocabulary size | 50,265 |
| Max sequence length | 512 tokens |
| Parameters | ~125M |
| Model format | SafeTensors |

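Because inputs are capped at 512 tokens, longer documents (e.g. full report sections) must be truncated or split before classification. A minimal sketch of an overlapping sliding-window chunker over token ids — a generic helper, not part of this repository; the default window leaves room for the tokenizer's special tokens:

```python
def chunk_token_ids(ids, max_len=510, stride=128):
    """Split a long token-id sequence into overlapping windows so each
    chunk fits the 512-token limit after <s> and </s> are added."""
    if len(ids) <= max_len:
        return [ids]
    chunks, start = [], 0
    step = max_len - stride  # each advance keeps `stride` tokens of overlap
    while start < len(ids):
        chunks.append(ids[start:start + max_len])
        if start + max_len >= len(ids):
            break
        start += step
    return chunks
```

Each chunk can then be decoded (or passed as ids) to the classifier; a common convention is to flag a document as adaptation-related if any of its chunks is.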
### Labels

| Label | Description |
|---|---|
| `0` | Non-adaptation-related |
| `1` | Adaptation-related |

## Uses

### Direct Use

AdaptationBERT is designed for classifying English text passages as related or unrelated to climate adaptation. Typical use cases include:

- Screening corporate sustainability reports for adaptation-related disclosures
- Analyzing ESG filings and environmental policy documents
- Large-scale text mining of climate adaptation mentions across document corpora
- Supporting research on climate resilience discourse

### Recommended Pipeline

It is **highly recommended** to use a two-stage classification pipeline:

1. First, classify whether a text is "environmental" using the [EnvironmentalBERT-environmental](https://huggingface.co/ESGBERT/EnvironmentalBERT-environmental) model.
2. Then, apply **AdaptationBERT** only to texts classified as environmental to determine whether they are adaptation-related.

This two-stage approach improves precision by filtering out non-environmental texts before adaptation classification.

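The two stages above can be sketched as one helper that gates AdaptationBERT behind the environmental filter. The label strings (`environmental`, `non-environmental`) are assumptions for illustration — check each model's `id2label` config before relying on them:

```python
def classify_adaptation(texts, env_clf, adapt_clf):
    """Stage 1: environmental filter; Stage 2: adaptation classifier,
    applied only to texts the filter marks as environmental.

    `env_clf` / `adapt_clf` are callables with the Hugging Face
    text-classification pipeline interface: text(s) in, list of
    {'label', 'score'} dicts out.
    """
    env_preds = env_clf(list(texts), truncation=True, max_length=512)
    results = []
    for text, env in zip(texts, env_preds):
        if env["label"] != "environmental":  # assumed stage-1 label string
            results.append({"label": "non-environmental", "score": env["score"]})
        else:
            results.append(adapt_clf(text, truncation=True, max_length=512)[0])
    return results
```

Here `env_clf` and `adapt_clf` would be built with `pipeline("text-classification", model=...)` for EnvironmentalBERT-environmental and AdaptationBERT, respectively.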
### Out-of-Scope Use

- Texts in languages other than English
- Non-environmental domains (e.g., finance-only, legal, medical) without the upstream environmental filter
- Real-time or safety-critical decision systems where misclassification could cause harm
- As a sole basis for regulatory compliance decisions

## How to Get Started with the Model

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="your-username/AdaptationBERT",
    tokenizer="your-username/AdaptationBERT",
)

text = "The city implemented a flood resilience plan to protect coastal infrastructure from rising sea levels."
result = classifier(text)
print(result)
# e.g. [{'label': 'adaptation-related', 'score': 0.98}]
# (exact label strings follow the model's id2label config)
```

Or load the model and tokenizer directly:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("your-username/AdaptationBERT")
model = AutoModelForSequenceClassification.from_pretrained("your-username/AdaptationBERT")

text = "Communities are developing drought-resistant farming techniques to adapt to changing rainfall patterns."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.softmax(outputs.logits, dim=-1)
    predicted_label = torch.argmax(predictions, dim=-1).item()

label_map = {0: "non-adaptation-related", 1: "adaptation-related"}
print(f"Prediction: {label_map[predicted_label]} (confidence: {predictions[0][predicted_label]:.4f})")
```

For detailed tutorials, see these guides by Tobias Schimanski on Medium:
- [Model usage and large-scale analysis](https://medium.com/@schimanski.tobi/analyzing-esg-with-ai-and-nlp-tutorial-2-large-scale-analyses-of-environmental-actions-0735cc8dc9c2)
- [Fine-tuning your own models](https://medium.com/@schimanski.tobi/analyzing-esg-with-ai-and-nlp-tutorial-3-fine-tune-your-own-models-e3692fc0b3c0)

130
+ ## Training Details
131
+
132
+ ### Training Data
133
+
134
+ The model was fine-tuned on a curated dataset of approximately **2,000 text samples** annotated for climate adaptation relevance. The dataset contains examples from ESG reports, sustainability disclosures, and environmental policy texts, with binary labels indicating whether each sample discusses climate adaptation and resilience.
135
+
136
+ ### Training Procedure
137
+
138
+ #### Base Model
139
+
140
+ Training starts from [ESGBERT/EnvRoBERTa-base](https://huggingface.co/ESGBERT/EnvRoBERTa-base), which is itself a RoBERTa model further pre-trained on environmental text corpora. This provides a strong domain-specific foundation for the adaptation classification task.
141
+
142
+ #### Training Hyperparameters
143
+
144
+ - **Training regime:** fp32
145
+ - **Problem type:** Single-label classification
146
+ - **Framework:** PyTorch + Hugging Face Transformers (v4.40.2)
147
+
148
+ ## Bias, Risks, and Limitations
149
+
150
+ - **Training data size:** The model was fine-tuned on only ~2,000 samples, which may limit its ability to generalize across all types of adaptation-related text.
151
+ - **Language limitation:** The model only supports English text. Climate adaptation texts in other languages will not be classified correctly.
152
+ - **Domain specificity:** Performance is optimized for ESG/environmental domain text. Texts from other domains discussing adaptation in non-climate contexts (e.g., biological adaptation, software adaptation) may produce false positives.
153
+ - **Temporal bias:** The training data reflects adaptation terminology and framing as of the time of dataset creation. Emerging adaptation concepts or evolving terminology may not be captured.
154
+ - **Geographic bias:** The training corpus may over-represent adaptation discourse from certain regions or regulatory frameworks, potentially underperforming on texts from underrepresented geographies.
155
+
156
+ ### Recommendations
157
+
158
+ - Always use the recommended two-stage pipeline (environmental filter + adaptation classification) for best results.
159
+ - Validate model outputs on your specific corpus before using in production.
160
+ - Do not use model predictions as the sole input for policy or regulatory decisions.
161
+ - Consider supplementing with human review, especially for high-stakes applications.
162
+
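To support the validation step above, per-class precision and recall on a small hand-labeled sample can be computed with plain Python (no dependencies; `1` = adaptation-related, matching the label table):

```python
def precision_recall(preds, golds, positive=1):
    """Precision and recall for the positive (adaptation-related) class."""
    tp = sum(p == positive and g == positive for p, g in zip(preds, golds))
    fp = sum(p == positive and g != positive for p, g in zip(preds, golds))
    fn = sum(p != positive and g == positive for p, g in zip(preds, golds))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Low precision on your corpus suggests tightening the upstream environmental filter; low recall suggests the domain or terminology differs from the training data.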
## Technical Specifications

### Model Architecture and Objective

RoBERTa (Robustly Optimized BERT Pretraining Approach) with a sequence classification head. The model uses 12 transformer layers with 12 attention heads each, a hidden size of 768, and GELU activation. Classification is performed via a classification head on top of the representation of the start-of-sequence token `<s>` (RoBERTa's equivalent of BERT's `[CLS]`).

### Software

- **Transformers:** 4.40.2
- **Model format:** SafeTensors
- **Tokenizer:** RoBERTa BPE tokenizer (50,265-token vocabulary)

## Citation

If you use this model in your research, please cite:

**BibTeX:**

```bibtex
@misc{adaptationbert,
  title={AdaptationBERT: A Fine-tuned Language Model for Climate Adaptation Text Classification},
  author={Tobias Schimanski},
  year={2024},
  url={https://huggingface.co/ESGBERT/AdaptationBERT}
}
```

## More Information

This model is part of the [ESGBERT](https://huggingface.co/ESGBERT) family of models for ESG and environmental text analysis. Related models include:

- [EnvRoBERTa-base](https://huggingface.co/ESGBERT/EnvRoBERTa-base) - Base environmental language model
- [EnvironmentalBERT-environmental](https://huggingface.co/ESGBERT/EnvironmentalBERT-environmental) - Environmental text classifier (recommended upstream filter)