---
license: mit
language:
- en
metrics:
- accuracy
- f1
base_model:
- climatebert/distilroberta-base-climate-detector
pipeline_tag: text-classification
tags:
- islamic finance
- islamic banks
- text classification
- climate
- binary classification
- NLP
- finance
---


# Islamic-FinClimateBERT: Fine-Tuned ClimateBERT for Islamic Finance Climate Discourse

A domain-adapted binary classifier fine-tuned on *climate-related vs. non-climate* sentences from Islamic finance corpora. This model is based on [`ClimateBERT`](https://huggingface.co/climatebert/distilroberta-base-climate-detector) and is specialized for detecting climate relevance in **Islamic financial narratives**.


## Model Summary

- **Base model**: [`ClimateBERT`](https://huggingface.co/climatebert/distilroberta-base-climate-detector)
- **Architecture**: RoBERTa-based, distilled
- **Task**: Binary sentence classification
- **Domain**: Islamic Finance + Climate Discourse
- **Labels**: 
  - `0` → Not Climate-Relevant  
  - `1` → Climate-Relevant
- **Language**: English (Islamic finance-specific vocabulary)
- **Training Data Size**: 1,132 manually annotated sentences


## Training Pipeline
- **Framework**: Hugging Face `transformers` + `datasets`
- **Tokenizer**: ClimateBERT tokenizer (BPE)
- **Training split**: Stratified 80/20 (train/test)
- **Evaluation metric**: F1 (macro), accuracy
- **Optimizer**: AdamW with weight decay
- **Epochs**: 4
- **Batch size**: 16
- **Precision**: FP16 enabled
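
The stratified 80/20 split listed above keeps the share of each label the same in the training and test portions. The sketch below illustrates that idea in plain Python; it is not the project's training code, and `stratified_split`, the seed, and the toy label counts are hypothetical choices made only for this example.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx) so that each label keeps its share in both parts."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Take the same fraction from every class
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical toy labels (not the real dataset): 600 non-climate (0), 400 climate (1)
labels = [0] * 600 + [1] * 400
train_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(test_idx))
print(sum(labels[i] for i in test_idx) / len(test_idx))  # climate share in the test part
```

With these toy labels the climate share is 40% in the full set and stays 40% in the test portion, which is what makes metrics on a small held-out set comparable to the training distribution.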

### Evaluation
| Metric     | Value     |
|------------|-----------|
| Accuracy   | 0.9868    |
| F1 (macro) | 0.9868    |
| Eval loss  | 0.0553    |

---

## Evaluation & Domain Comparison

The **Islamic-FinClimateBERT** model was evaluated against the original [`ClimateBERT`](https://huggingface.co/climatebert/distilroberta-base-climate-detector) using **79,876** sentence-level samples extracted from 838 annual reports of 103 Islamic banks across 25 jurisdictions (2015–2024).

This comparative evaluation assesses how domain fine-tuning affects climate relevance detection within Islamic finance discourse.

### Evaluation Summary

| Metric | Value | Description |
|--------|-------|-------------|
| **Total Sentences** | 79,876 | Sentences compared 1-to-1 |
| **Agreements** | 70,209 | Sentences where both models agreed |
| **Disagreements** | 9,667 | Sentences with differing predictions |
| **Overall Accuracy** | 0.88 | Share of sentences on which the two models agree |


### Classification Report (Fine-Tuned vs. Original)

| Label | Precision | Recall | F1-score | Support |
|:------|:-----------:|:--------:|:----------:|:---------:|
| **Climate** | 0.92 | 0.83 | 0.87 | 39,558 |
| **Non-Climate** | 0.85 | 0.93 | 0.89 | 40,318 |
| **Overall Accuracy** | – | – | **0.88** | 79,876 |
| **Macro Avg** | 0.88 | 0.88 | 0.88 | – |

### Confusion Matrix

|                      | **Fine = Climate** | **Fine = Non-Climate** |
|----------------------:|------------------:|-----------------------:|
| **Orig = Climate**    | 32,887 | 6,671 |
| **Orig = Non-Climate**| 2,996 | 37,322 |

- The fine-tuned model shows **strong domain adaptation**, improving contextual sensitivity to Islamic finance climate narratives.  
- It **classifies fewer sentences as “climate-relevant”** than the base model (35,883 vs. 39,558), reflecting a **more conservative and context-aware** reading of climate-related terminology in Islamic finance reporting.
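
The figures in the classification report can be reproduced from the confusion matrix. The support column (39,558 and 40,318) matches the row totals for the original model, so the sketch below assumes the original model's predictions act as the reference labels; the scores therefore measure agreement with the base model rather than accuracy against human annotation. Variable and function names are illustrative only.

```python
# Counts from the confusion matrix (rows: original model, columns: fine-tuned model)
both_climate = 32_887        # Orig = Climate,     Fine = Climate
orig_only_climate = 6_671    # Orig = Climate,     Fine = Non-Climate
fine_only_climate = 2_996    # Orig = Non-Climate, Fine = Climate
both_non_climate = 37_322    # Orig = Non-Climate, Fine = Non-Climate

total = both_climate + orig_only_climate + fine_only_climate + both_non_climate
agreement = (both_climate + both_non_climate) / total


def precision_recall_f1(tp, fp, fn):
    """Standard per-class precision, recall, and F1 from raw counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Assumption: the original model's label is treated as the reference ("true") label
climate = precision_recall_f1(both_climate, fine_only_climate, orig_only_climate)
non_climate = precision_recall_f1(both_non_climate, orig_only_climate, fine_only_climate)
macro_f1 = (climate[2] + non_climate[2]) / 2

print(f"total sentences: {total}")
print(f"agreement:       {agreement:.2f}")
print("climate P/R/F1:     " + " ".join(f"{v:.2f}" for v in climate))
print("non-climate P/R/F1: " + " ".join(f"{v:.2f}" for v in non_climate))
print(f"macro F1:        {macro_f1:.2f}")
```

Rounded to two decimals, the results match the tables above: agreement 0.88, Climate 0.92 / 0.83 / 0.87, Non-Climate 0.85 / 0.93 / 0.89, and macro F1 0.88.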

---
## GitHub Repository

The full project repository, including training notebooks, dataset scripts, and evaluation pipelines, is available at [https://github.com/bilalezafar/Islamic-FinClimateBERT](https://github.com/bilalezafar/Islamic-FinClimateBERT).

---

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
MODEL_ID = "bilalzafar/Islamic-FinClimateBERT"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()  # make sure dropout is disabled for inference

# Classify a single sentence; returns the predicted label and its softmax probability
def clf(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():  # gradients are not needed for inference
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)[0]
    label = probs.argmax().item()  # 0 = Not Climate, 1 = Climate
    score = probs[label].item()
    return [{"label": "Climate" if label == 1 else "Not Climate", "score": round(score, 4)}]

# Example usage
text = "The bank’s green sukuk issuance aims to support renewable energy projects in the country."
print(clf(text)[0])

# Example output: {'label': 'Climate', 'score': 0.9995}
```

---
## Citation

```bibtex
@article{zafar2026islamicfinclimatebert,
  title   = {Talk or Action? Unveiling the Nature and Depth of Climate Disclosures in Islamic Banks Using Machine Learning},
  author  = {Zafar, Muhammad Bilal},
  journal = {Borsa Istanbul Review},
  year    = {2026},
  doi     = {10.1016/j.bir.2026.100789}
}
```

Zafar, M. B. (2026). Talk or action? Unveiling the nature and depth of climate disclosures in Islamic banks using machine learning. Borsa Istanbul Review. https://doi.org/10.1016/j.bir.2026.100789