---
license: mit
language:
- en
metrics:
- accuracy
- f1
base_model:
- climatebert/distilroberta-base-climate-detector
pipeline_tag: text-classification
tags:
- islamic finance
- islamic banks
- text classification
- climate
- binary classification
- NLP
- finance
---

# Islamic-FinClimateBERT: Fine-Tuned ClimateBERT for Islamic Finance Climate Discourse

A domain-adapted binary classifier fine-tuned to distinguish *climate-related* from *non-climate* sentences in Islamic finance corpora. The model is based on [`ClimateBERT`](https://huggingface.co/climatebert/distilroberta-base-climate-detector) and is specialized for detecting climate relevance in **Islamic financial narratives**.

## Model Summary

- **Base model**: [`ClimateBERT`](https://huggingface.co/climatebert/distilroberta-base-climate-detector)
- **Architecture**: distilled RoBERTa (DistilRoBERTa)
- **Task**: binary sentence classification
- **Domain**: Islamic finance and climate discourse
- **Labels**:
  - `0` → Not Climate-Relevant
  - `1` → Climate-Relevant
- **Language**: English (Islamic finance-specific vocabulary)
- **Training data size**: 1,132 manually annotated sentences

## Training Pipeline

- **Framework**: Hugging Face `transformers` + `datasets`
- **Tokenizer**: ClimateBERT tokenizer (BPE)
- **Training split**: stratified 80/20 (train/test)
- **Evaluation metrics**: accuracy, F1 (macro)
- **Optimizer**: AdamW with weight decay
- **Epochs**: 4
- **Batch size**: 16
- **Precision**: FP16 (mixed precision) enabled
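
The stratified 80/20 split listed above keeps the climate / non-climate label proportions identical in both partitions. A minimal stdlib sketch of that idea (`stratified_split` is a hypothetical helper for illustration; the actual pipeline used Hugging Face `datasets` tooling):

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.2, seed=42):
    """Return (train_idx, test_idx) so each label keeps the same share in both parts."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_label[lab].append(idx)
    train_idx, test_idx = [], []
    for lab, idxs in by_label.items():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_frac)  # per-label test quota
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx

# Toy labels mirroring the card's scheme (0 = not climate-relevant, 1 = climate-relevant)
labels = [0] * 60 + [1] * 40
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 80 20
```

Because the split is per-label, the 60/40 class ratio of the toy data is preserved exactly in the 20% held-out set.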
|
|
|
|
|
### Evaluation

| Metric    | Value  |
|-----------|--------|
| Accuracy  | 0.9868 |
| F1-score  | 0.9868 |
| Eval loss | 0.0553 |

---
## Evaluation & Domain Comparison

The **Islamic-FinClimateBERT** model was compared against the original [`ClimateBERT`](https://huggingface.co/climatebert/distilroberta-base-climate-detector) on **79,876** sentence-level samples extracted from 838 annual reports of 103 Islamic banks across 25 jurisdictions (2015–2024).

This comparative evaluation assesses how domain fine-tuning affects climate-relevance detection within Islamic finance discourse.

### Evaluation Summary

| Metric | Fine-Tuned | Original | Description |
|--------|-----------:|---------:|-------------|
| **Total sentences** | 79,876 | – | Sentences compared one-to-one |
| **Agreements** | 70,209 | – | Sentences on which both models agreed |
| **Disagreements** | 9,667 | – | Sentences with differing predictions |
| **Overall accuracy** | 0.88 | – | Agreement rate between the two models (70,209 / 79,876) |

### Classification Report (Fine-Tuned vs. Original)

The report below scores the fine-tuned model's predictions with the original model's labels as the reference:

| Label | Precision | Recall | F1-score | Support |
|:------|:---------:|:------:|:--------:|:-------:|
| **Climate** | 0.92 | 0.83 | 0.87 | 39,558 |
| **Non-Climate** | 0.85 | 0.93 | 0.89 | 40,318 |
| **Overall Accuracy** | – | – | **0.88** | 79,876 |
| **Macro Avg** | 0.88 | 0.88 | 0.88 | – |
### Confusion Matrix

|                        | **Fine-tuned = Climate** | **Fine-tuned = Non-Climate** |
|-----------------------:|-------------------------:|-----------------------------:|
| **Original = Climate** | 32,887 | 6,671 |
| **Original = Non-Climate** | 2,996 | 37,322 |

- The fine-tuned model shows **strong domain adaptation**, improving contextual sensitivity to Islamic finance climate narratives.
- It **labels fewer sentences as "climate-relevant"** than the base model (6,671 sentences flipped to non-climate vs. 2,996 flipped to climate), reflecting a **more conservative, context-aware** reading of climate-related terminology in Islamic finance reporting.
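
The agreement statistics above reduce to a pairwise comparison of the two models' label streams. A minimal stdlib sketch (the short label lists here are hypothetical placeholders for the 79,876 real predictions):

```python
from collections import Counter

def compare_predictions(orig, fine):
    """Return (agreement rate, 2x2 confusion counts keyed by (orig_label, fine_label))."""
    agree = sum(o == f for o, f in zip(orig, fine))
    cm = Counter(zip(orig, fine))  # e.g. (1, 0) = orig said climate, fine-tuned said non-climate
    return agree / len(orig), cm

# Hypothetical label streams (1 = climate-relevant, 0 = not climate-relevant)
orig = [1, 1, 0, 0, 1, 0]
fine = [1, 0, 0, 0, 1, 0]
rate, cm = compare_predictions(orig, fine)
print(round(rate, 2), cm[(1, 0)])  # 0.83 1
```

On the real data the same computation yields the 0.88 agreement rate and the confusion counts tabulated above.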
|
|
|
|
|
--- |
|
|
## GitHub Repository

The full project repository, including training notebooks, dataset scripts, and evaluation pipelines, is available at [https://github.com/bilalezafar/Islamic-FinClimateBERT](https://github.com/bilalezafar/Islamic-FinClimateBERT).

---
## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the fine-tuned model and tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("bilalzafar/Islamic-FinClimateBERT")
model = AutoModelForSequenceClassification.from_pretrained("bilalzafar/Islamic-FinClimateBERT")
model.eval()  # inference mode

def clf(text):
    """Classify a sentence as climate-relevant (Climate) or not (Not Climate)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():  # no gradients needed for inference
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    label = probs.argmax(dim=-1).item()
    score = probs.max().item()
    return [{"label": "Climate" if label == 1 else "Not Climate", "score": round(score, 4)}]

# Example usage
text = "The bank's green sukuk issuance aims to support renewable energy projects in the country."
print(clf(text)[0])

# Example output: {'label': 'Climate', 'score': 0.9995}
```
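
The softmax-then-argmax step inside `clf` can be sanity-checked without downloading the model. A stdlib sketch with hypothetical logits for the two labels:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)  # subtract the max to avoid overflow in exp
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits ordered as [Not Climate, Climate]
logits = [-2.1, 3.4]
probs = softmax(logits)
label = max(range(len(probs)), key=probs.__getitem__)  # argmax
print(label, round(max(probs), 4))  # 1 0.9959
```

The probabilities always sum to 1, and the predicted label index (`1` here, i.e. Climate) matches what `clf` would report for the same logits.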
|
|
|
|
|
--- |
|
|
## Citation

```bibtex
@article{zafar2026islamicfinclimatebert,
  title   = {Talk or Action? Unveiling the Nature and Depth of Climate Disclosures in Islamic Banks Using Machine Learning},
  author  = {Zafar, Muhammad Bilal},
  journal = {Borsa Istanbul Review},
  year    = {2026},
  doi     = {10.1016/j.bir.2026.100789}
}
```

Zafar, M. B. (2026). Talk or action? Unveiling the nature and depth of climate disclosures in Islamic banks using machine learning. *Borsa Istanbul Review*. https://doi.org/10.1016/j.bir.2026.100789