Biodiversity Sentiment Classifier for Corporate Disclosures
Model Overview: This multi-class text classification model analyzes the sentiment of biodiversity-related content in corporate sustainability reports. It categorizes paragraphs into three distinct sentiment classes:
Negative: Content describing actual or potential risks, losses, adverse impacts on the firm, or the firm's negative impacts on biodiversity; uses negative framing or adjectives to describe biodiversity-related developments
Positive: Content highlighting business opportunities from biodiversity, positive impacts, mitigation of negative impacts, or employing positive framing and adjectives
Neutral: Factual statements about biodiversity without positive or negative framing; objective reporting of data, statistics, or descriptive information
Model Architecture: Built on ClimateBERT, a DistilRoBERTa-based model pre-trained on climate-related text, this classifier was fine-tuned specifically for sentiment analysis of biodiversity disclosures in corporate contexts.
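As a rough illustration of how a three-class classification head yields one of the sentiment classes above, the sketch below converts raw logits into probabilities with a softmax and picks the most likely label. The label order in `LABELS`, the helper names, and the example logits are assumptions made for illustration; the model's actual id-to-label mapping is not stated here and should be read from its configuration.

```python
import math

# Assumed label order for illustration only; check the model's own config
# for the real id-to-label mapping.
LABELS = ["negative", "neutral", "positive"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Made-up logits for a single paragraph, one score per class.
label, confidence = classify([-1.2, 0.3, 2.1])
print(label, round(confidence, 3))
```

The probability returned alongside the label can serve as a simple confidence score, for example to flag low-confidence paragraphs for manual review.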
Training Data: The model was trained on a curated dataset of 2,000 manually annotated paragraphs extracted from sustainability reports of Fortune Global 500 companies, with sentiment labels capturing both linguistic tone and substantive framing of biodiversity-related content.
Performance Metrics: Average of 5-fold cross-validation
Weighted F1: 0.901
Weighted Precision: 0.902
Weighted Recall:
AUC-ROC: 0.968
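For readers unfamiliar with support-weighted metrics, the sketch below shows one common way such scores are computed: per-class precision, recall, and F1 are averaged with weights proportional to the number of true examples in each class, and the per-fold results are then averaged. This is a minimal sketch of the general technique, not the authors' evaluation code; the function names and the toy labels are hypothetical.

```python
def weighted_scores(y_true, y_pred, labels):
    """Support-weighted precision, recall, and F1 for single-label data."""
    n = len(y_true)
    precision = recall = f1 = 0.0
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        support = tp + fn  # number of true examples of class c
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / support if support else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = support / n
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"precision": precision, "recall": recall, "f1": f1}


def average_over_folds(fold_scores):
    """Average each metric across the cross-validation folds."""
    keys = fold_scores[0].keys()
    return {k: sum(s[k] for s in fold_scores) / len(fold_scores) for k in keys}


# Toy labels for one fold (made up, not taken from the training data).
y_true = ["negative", "negative", "positive", "neutral"]
y_pred = ["negative", "positive", "positive", "neutral"]
fold = weighted_scores(y_true, y_pred, ["negative", "neutral", "positive"])
print(average_over_folds([fold, fold]))
```

In the single-label setting, support-weighted recall equals overall accuracy, which is a useful sanity check when reproducing such numbers.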