---
language: vi
license: mit
library_name: transformers
pipeline_tag: text-classification
tags:
- esg
- esg-washing
- actionability
- banking
- vietnamese
- nlp
- sustainability
---

# ESG Actionability Classifier for Vietnamese Banking Reports

## Model description
This model is a Vietnamese text classification model fine-tuned from **PhoBERT-large** to classify **ESG-related sentences** in banking annual reports according to their **actionability level**.

The model is designed as **Module 3 (Actionability Classification)** in a multi-stage ESG-washing analysis framework. It does **not** assess factual correctness or ESG performance, but focuses on identifying whether a disclosure describes concrete actions, future plans, or vague commitments.

The model predicts one of three labels:
- **Implemented**: concrete actions or achieved results (often with time references or quantitative indicators)
- **Planning**: stated plans, targets, or future-oriented commitments
- **Indeterminate**: general or symbolic statements without specific actions or evidence

---

## Intended use

### Primary intended use
- Analyzing ESG disclosure quality in Vietnamese banking annual reports.
- Supporting ESG-washing risk analysis by distinguishing substantive actions from symbolic language.

### Example downstream usage
- Measuring the proportion of *Implemented* vs. *Indeterminate* ESG statements at the bank-year level.
- Serving as an intermediate module before evidence linking and ESG-washing risk scoring.
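As an illustration of the bank-year aggregation, assuming per-sentence predictions have been collected in a pandas DataFrame (the column names here are hypothetical, not part of this model's output format):

```python
import pandas as pd

# Hypothetical per-sentence predictions; "bank", "year", "label" are
# illustrative column names, not defined by the model.
preds = pd.DataFrame({
    "bank": ["A", "A", "A", "B", "B"],
    "year": [2023, 2023, 2023, 2023, 2023],
    "label": ["Implemented", "Indeterminate", "Indeterminate",
              "Implemented", "Planning"],
})

# Share of each actionability label per bank-year.
shares = (
    preds.groupby(["bank", "year"])["label"]
         .value_counts(normalize=True)
         .unstack(fill_value=0.0)
)
print(shares)
```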

### Out-of-scope use
- Determining the factual truthfulness of ESG claims.
- Legal, regulatory, or investment decision-making without human review.
- Application to non-banking or non-Vietnamese text without re-validation.

---

## Training data

The model was trained using a **hybrid labeling strategy**:
- **LLM-generated labels** as a semantic teacher for actionability
- **Weak labeling rules** based on linguistic and domain-specific patterns (e.g., time references, quantitative indicators)
- A **pseudo-gold set** sampled from high-confidence LLM labels for calibration and evaluation
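The concrete rule set is not published; the sketch below only illustrates the kinds of surface cues mentioned above (time references, quantitative indicators), with `weak_cues` as a hypothetical helper:

```python
import re

# Illustrative weak-labeling cues only; the actual rules used in training
# are not published. Patterns target Vietnamese time references
# ("năm 2023", "quý 2") and quantitative indicators ("15%", "1.000 tỷ").
TIME_REF = re.compile(r"(năm\s+20\d{2}|quý\s+[1-4])", re.IGNORECASE)
QUANT = re.compile(r"\d+([.,]\d+)?\s*(%|tỷ|triệu)", re.IGNORECASE)

def weak_cues(sentence: str) -> dict:
    """Report which surface cues fire for a sentence."""
    return {
        "has_time_ref": bool(TIME_REF.search(sentence)),
        "has_quantity": bool(QUANT.search(sentence)),
    }

print(weak_cues("Năm 2023, ngân hàng đã giảm 15% lượng khí thải CO2."))
```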

Training/validation data:
- Total labeled samples: **5,997**
- Train set: **5,097**
- Validation set: **900**

Label distribution (train):
- Implemented: ~37%
- Planning: ~3%
- Indeterminate: ~60%

Class imbalance was handled using **class-weighted loss** during training.
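The card does not state the exact weighting scheme; inverse-frequency weights are a common choice and are sketched below in PyTorch, using the approximate train-set shares given above:

```python
import torch

# Assumed scheme: inverse-frequency class weights, rescaled to mean 1.
# The actual weighting used in training is not specified in this card.
counts = torch.tensor([0.37, 0.03, 0.60])  # Implemented, Planning, Indeterminate

weights = 1.0 / counts
weights = weights / weights.sum() * len(counts)
loss_fn = torch.nn.CrossEntropyLoss(weight=weights)

# Dummy batch: 2 examples, 3 classes; the rare Planning class (index 1)
# contributes more per mistake than the frequent classes.
logits = torch.tensor([[2.0, 0.1, 0.3], [0.2, 1.5, 0.1]])
targets = torch.tensor([0, 1])
loss = loss_fn(logits, targets)
print(float(loss))
```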

---

## Training procedure
- Base model: **PhoBERT-large**
- Task: 3-class sentence-level classification
- Loss: Cross-entropy with class weights
- Evaluation metric: **Macro-F1**
- Input representation:
  - Narrative text: sentence with local context (previous + next sentence)
  - Tables/KPI-like text: sentence only
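The exact way context is concatenated is not specified in this card; the sketch below shows one plausible construction, with `build_input` as a hypothetical helper:

```python
# How neighbouring sentences are joined is not specified in the card;
# this sketch simply concatenates them with spaces as one plausible choice.
def build_input(sentences: list[str], i: int, is_table_like: bool) -> str:
    """Sentence plus local context for narrative text; sentence only for tables."""
    if is_table_like:
        return sentences[i]
    prev_s = sentences[i - 1] if i > 0 else ""
    next_s = sentences[i + 1] if i + 1 < len(sentences) else ""
    return " ".join(part for part in (prev_s, sentences[i], next_s) if part)

sents = ["Câu A.", "Câu B.", "Câu C."]
print(build_input(sents, 1, is_table_like=False))  # narrative: with context
print(build_input(sents, 1, is_table_like=True))   # table/KPI: sentence only
```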

---

## Evaluation results

### Validation set (900 samples)
- Accuracy: **0.839**
- Macro-F1: **0.734**

Per-class (validation):

| Label | Precision | Recall | F1 |
|---------------|-----------|--------|------|
| Implemented   | 0.79      | 0.82   | 0.81 |
| Planning      | 0.48      | 0.55   | 0.52 |
| Indeterminate | 0.89      | 0.86   | 0.88 |
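Macro-F1 is the unweighted mean of the per-class F1 scores, which can be checked directly from the table (the small gap to the reported 0.734 comes from rounding in the table):

```python
# Macro-F1 = unweighted mean of per-class F1 (values from the table above).
f1_per_class = {"Implemented": 0.81, "Planning": 0.52, "Indeterminate": 0.88}
macro_f1 = sum(f1_per_class.values()) / len(f1_per_class)
print(round(macro_f1, 3))  # → 0.737; reported 0.734 uses unrounded F1 values
```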

### Pseudo-gold test set (498 samples, balanced)
- Accuracy: **0.916**
- Macro-F1: **0.916**

> Note: The pseudo-gold set is derived from high-confidence LLM labels and is balanced across classes. It may not fully reflect real-world class distributions.

---

## How to use

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "huypham71/esg-action"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

labels = ["Implemented", "Planning", "Indeterminate"]

# Vietnamese: "In 2023, the bank reduced CO2 emissions by 15%."
text = "Năm 2023, ngân hàng đã giảm 15% lượng khí thải CO2."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1).squeeze()

pred = labels[int(probs.argmax())]
print(pred, float(probs.max()))
```

---

## Limitations

- The model captures linguistic actionability, not actual ESG performance.
- *Planning* statements are relatively rare, which may affect robustness on unseen corpora.
- Performance may degrade on domains outside Vietnamese banking reports.

## Ethical considerations

- Outputs should be interpreted as analytical signals, not definitive judgments.
- Automated classification may reflect biases present in disclosure styles or training data.