---
license: apache-2.0
datasets:
- FinGPT/fingpt-sentiment-train
language:
- en
metrics:
- accuracy
- f1
- recall
- precision
base_model:
- ProsusAI/finbert
pipeline_tag: text-classification
tags:
- finance
- financial
- news
- sentiment-analysis
- finbert
- transformer
- text-classification
- financial-news
- financial-news-sentiment
library_name: transformers
---
# 📊 FinBERT Fine-Tuned on Financial News/Texts
A fine-tuned version of [`ProsusAI/finbert`](https://huggingface.co/ProsusAI/finbert) trained for **financial sentiment analysis** on financial news texts and headlines.
This fine-tuned model substantially improves on the original FinBERT, **outperforming it by over 38% (relative) in accuracy** on the fingpt-sentiment-train evaluation split.
---
## 🔧 Model Objective
The goal of this model is to classify financial texts and headlines as **positive**, **neutral**, or **negative**.
---
## 🗂️ Training Dataset
**Primary Dataset**: [`fingpt-sentiment-train`](https://huggingface.co/datasets/FinGPT/fingpt-sentiment-train) (~60,000 examples)
- Labeled financial text samples (positive / neutral / negative)
- Includes earnings statements, market commentary, and financial news headlines
- Only examples labeled **neutral**, **positive**, or **negative** were included.
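As a hedged sketch of that filtering step (the `input`/`output` column names, and the fine-grained label being dropped, are assumptions about the dataset layout, shown here on in-memory stand-in rows rather than the real dataset):

```python
# Toy stand-in rows; the actual card filters FinGPT/fingpt-sentiment-train,
# whose exact columns and fine-grained labels may differ from this sketch.
rows = [
    {"input": "Shares surged after strong earnings.", "output": "mildly positive"},
    {"input": "Revenue was flat year-over-year.", "output": "neutral"},
    {"input": "Guidance was cut sharply.", "output": "negative"},
    {"input": "The stock rallied on the news.", "output": "positive"},
]

keep = {"positive", "neutral", "negative"}
filtered = [r for r in rows if r["output"].strip().lower() in keep]
# Fine-grained labels such as "mildly positive" are dropped,
# leaving only the three classes the model is trained on.
```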
---
## 🧪 Benchmark Evaluation
The model was evaluated against **three benchmark datasets**:
- **[Financial PhraseBank (All Agree and All Combined)](https://www.researchgate.net/publication/251231364_FinancialPhraseBank-v10)**
- **[FiQA + PhraseBank Kaggle Merge](https://www.kaggle.com/datasets/sbhatti/financial-sentiment-analysis/data)**
- **[fingpt-sentiment-train (test split)](https://huggingface.co/datasets/FinGPT/fingpt-sentiment-train)**
Metrics used:
- **Accuracy**
- **F1 Score**
- **Precision**
- **Recall**
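These metrics can be computed with scikit-learn; a small self-contained illustration on toy labels (not the real benchmark outputs):

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Toy ground truth and predictions, purely for illustration.
y_true = ["positive", "neutral", "negative", "positive"]
y_pred = ["positive", "neutral", "positive", "positive"]

acc = accuracy_score(y_true, y_pred)
f1_macro = f1_score(y_true, y_pred, average="macro", zero_division=0)
f1_weighted = f1_score(y_true, y_pred, average="weighted", zero_division=0)
prec_macro = precision_score(y_true, y_pred, average="macro", zero_division=0)
rec_macro = recall_score(y_true, y_pred, average="macro", zero_division=0)
```

Macro averages weight each class equally, while weighted averages weight by class support, which is why both appear in the table below.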
We benchmarked this model against the original [`ProsusAI/finbert`](https://huggingface.co/ProsusAI/finbert) on multiple financial datasets:
| Dataset | Samples | Model | Accuracy | F1 (Macro) | F1 (Weighted) | Precision (Macro) | Precision (Weighted) | Recall (Macro) | Recall (Weighted) |
|------------------------------------|---------|--------------------------|---------------|---------------|----------------|--------------------|------------------------|----------------|--------------------|
| **fingpt-sentiment-train Eval** | 12511 | FinBERT | 0.7131 | 0.70 | 0.71 | 0.71 | 0.72 | 0.70 | 0.71 |
| | | **FinBERT-Finetuned (Ours)** | **0.9894 (+38.8%)** | **0.99 (+41.4%)** | **0.99 (+39.4%)** | **0.99 (+39.4%)** | **0.99 (+37.5%)** | **0.99 (+41.4%)** | **0.99 (+39.4%)** |
| **Financial Phrasebank (Agree)** | 2264 | FinBERT | 0.9717 | 0.96 | 0.97 | 0.95 | 0.97 | 0.98 | 0.97 |
| | | **FinBERT-Finetuned (Ours)** | **0.9912 (+2.0%)** | **0.99 (+3.1%)** | **0.99 (+2.1%)** | **0.99 (+4.2%)** | **0.99 (+2.1%)** | **0.99 (+1.0%)** | **0.99 (+2.1%)** |
| **Financial Phrasebank (Combined)**| 14780 | FinBERT | 0.9238 | 0.91 | 0.92 | 0.89 | 0.93 | 0.94 | 0.92 |
| | | **FinBERT-Finetuned (Ours)** | **0.9792 (+6.0%)** | **0.98 (+7.7%)** | **0.98 (+6.5%)** | **0.98 (+10.1%)** | **0.98 (+5.4%)** | **0.98 (+4.3%)** | **0.98 (+6.5%)** |
| **FiQA + PhraseBank (Kaggle)** | 5842 | FinBERT | 0.7581 | 0.74 | 0.77 | 0.73 | 0.79 | 0.77 | 0.76 |
| | | **FinBERT-Finetuned (Ours)** | **0.8879 (+17.1%)** | **0.87 (+17.6%)** | **0.89 (+15.6%)** | **0.85 (+16.4%)** | **0.92 (+16.5%)** | **0.92 (+19.5%)** | **0.89 (+17.1%)** |
> **Note:** All metrics measure classification performance on the listed evaluation sets; values in parentheses are the relative improvement of the fine-tuned model over base FinBERT.
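For clarity, the parenthesized numbers are *relative* gains, i.e. `(ours − base) / base`. For example, for accuracy on the fingpt-sentiment-train evaluation split (small rounding differences from the table are possible):

```python
base_acc = 0.7131   # original FinBERT, from the table above
ours_acc = 0.9894   # fine-tuned model, from the table above

relative_gain = (ours_acc - base_acc) / base_acc
print(f"+{relative_gain:.1%}")  # roughly the +38.8% shown in the table
```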
---
## 🧠 Text-Level Comparison: FinBERT vs FinBERT-Finetuned (Ours)
### 🔴 Texts FinBERT Failed On (as discussed in its [paper](https://arxiv.org/abs/1908.10063)), Correctly Predicted by Ours
| Text | Expected | FinBERT | Ours |
|-----------------------------------------------------------------------------------------------------------------------------|-----------|------------------------------|-------------------------------|
| Pre-tax loss totaled euro 0.3 million, compared to a loss of euro 2.2 million in the first quarter of 2005. | Positive | ❌ Negative (0.7223) | ✅ Positive (0.9997) |
| This implementation is very important to the operator, since it is about to launch its Fixed to Mobile convergence service | Neutral | ❌ Positive (0.7204) | ✅ Neutral (0.9998) |
| The situation of coated magazine printing paper will continue to be weak. | Negative | ✅ Negative (0.8811) | ✅ Negative (0.9996) |
### 🟡 FinBERT Incorrect, Ours Corrected It
| Text | Expected | FinBERT | Ours |
|----------------------------------------------------------------------------------------------------------------|-----------|------------------------------|-------------------------------|
| The debt-to-equity ratio was 1.15, flat quarter-over-quarter. | Neutral | ❌ Negative (0.6239) | ✅ Neutral (0.9998) |
| Earnings smashed expectations $AAPL posts $0.89 EPS vs $0.78 est. Bullish momentum incoming! | Positive | ❌ Neutral (0.4237) | ✅ Positive (0.9998) |
| $TSLA growth is slowing — but hey, at least Elon tweeted something funny today. #Tesla #markets | Negative | ❌ Neutral (0.5884) | ✅ Negative (0.7084) |
### ⚪ Out-of-Context Texts (FinBERT Misclassified, Ours Handled Properly)
| Text | Expected | FinBERT | Ours |
|--------------------------------------------------------------------------------------------|-----------|------------------------------|-------------------------------|
| Unexpected Snowstorm Hits Sahara Desert, Blanketing Sand Dunes | Neutral | ❌ Negative (0.8675) | ✅ Neutral (0.9993) |
| Virtual Reality Therapy Shows Promise for Treating PTSD | Neutral | ❌ Positive (0.8522) | ✅ Neutral (0.9997) |
> **Note**: These examples demonstrate improvements in real-world understanding, context handling, and sentiment differentiation with our FinBERT-finetuned model. Values in parentheses (e.g., `0.9485`) indicate the model’s confidence score for its predicted sentiment.
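The confidence values in parentheses are the softmax probability of the predicted class. A minimal sketch with made-up logits (real scores come from the model's classification head):

```python
import torch

# Made-up logits for one text, ordered [neutral, negative, positive]
# to match this model's label mapping.
logits = torch.tensor([[-1.2, 0.3, 2.9]])
probs = torch.softmax(logits, dim=-1)

label_map = {0: "neutral", 1: "negative", 2: "positive"}
idx = int(probs.argmax(dim=-1))
confidence = float(probs[0, idx])  # the value quoted in parentheses
```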
---
## ⚠️ Limitations & Failure Cases
While the model outperformed the base FinBERT across benchmarks, **some failure cases** were observed in statements involving **fine-grained numerical reasoning**, particularly when numerical comparison semantics are complex or subtle.
| Text | Expected | FinBERT | Ours |
|---------------------------------------------------------------------------------------------------------|-----------|------------------------------|-------------------------------|
| Net profit to euro 203 million from euro 172 million in the previous year. | Positive | ✅ Positive (0.9485) | ✅ Positive (0.9995) |
| Net profit to euro 103 million from euro 172 million in the previous year. | Negative | ❌ Positive (0.9486) | ❌ Positive (0.9994) |
| Pre-tax loss totaled euro 0.3 million, compared to a loss of euro 2.2 million in Q1 2005. | Positive | ❌ Negative (0.7223) | ✅ Positive (0.9997) |
| Pre-tax loss totaled euro 5.3 million, compared to a loss of euro 2.2 million in Q1 2005. | Negative | ✅ Negative (0.7205) | ❌ Positive (0.9997) |
| Net profit totaled euro 5.3 million, compared to euro 2.2 million in the previous quarter of 2005. | Positive | ❌ Negative (0.6347) | ❌ Negative (0.9996) |
| Net profit totaled euro 0.3 million, compared to euro 2.2 million in the previous quarter of 2005. | Negative | ✅ Negative (0.6320) | ✅ Negative (0.9996) |
> **Note**: Values in parentheses (e.g., `0.9485`) indicate the model’s confidence score for its predicted sentiment.
This suggests that **explicit numerical comparison reasoning** remains challenging without targeted pretraining or numerical-reasoning augmentation.
---
## Hyperparameters
During fine-tuning, the following hyperparameters were used to optimize model performance:
- **Learning Rate:** 2e-5
- **Batch Size:** 32
- **Number of Epochs:** 3
- **Max Sequence Length:** 128 tokens
- **Optimizer:** AdamW
- **Weight Decay:** 0.01
- **Evaluation Strategy:** Evaluation performed after each epoch
> **Note**: These settings were chosen to balance training efficiency and accuracy for financial news sentiment classification.
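Under the Hugging Face `Trainer` API, these settings would look roughly like the following (a sketch, not the exact training script; on older `transformers` versions the argument is named `evaluation_strategy`):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="finbert-finetune",   # illustrative path
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    num_train_epochs=3,
    weight_decay=0.01,               # AdamW is the Trainer's default optimizer
    eval_strategy="epoch",           # evaluate after each epoch
)

# The 128-token max sequence length is applied at tokenization time:
# tokenizer(texts, truncation=True, max_length=128, padding=True)
```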
---
## 💡 Summary
✅ **Better generalization** than FinBERT on both benchmark and noisy real-world samples
✅ **Strong accuracy and F1 scores**
⚠️ Room to improve on **numerical reasoning comparisons** — potential for integration with numerical-aware transformers or contrastive fine-tuning
---
## Usage
### Pipeline Approach
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

model_name = "project-aps/finbert-finetune"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Override the config's id2label and label2id
label_map = {0: "neutral", 1: "negative", 2: "positive"}
model.config.id2label = label_map
model.config.label2id = {v: k for k, v in label_map.items()}

pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)

text = "Earnings smashed expectations AAPL posts $0.89 EPS vs $0.78 est. Bullish momentum incoming! #EarningsSeason"
print(pipe(text))  # Output: [{'label': 'positive', 'score': 0.9997484087944031}]
```
### Simple Approach
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "project-aps/finbert-finetune"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

text = "Earnings smashed expectations AAPL posts $0.89 EPS vs $0.78 est. Bullish momentum incoming! #EarningsSeason"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    outputs = model(**inputs)
predicted_class = torch.argmax(outputs.logits, dim=1).item()

label_map = {0: "neutral", 1: "negative", 2: "positive"}
print(f"Text     : {text}")
print(f"Sentiment: {label_map[predicted_class]}")
```
---
## Acknowledgements
We gratefully acknowledge the creators and maintainers of the resources used in this project:
- **[ProsusAI/FinBERT](https://huggingface.co/ProsusAI/finbert)** – A pre-trained BERT model specifically designed for financial sentiment analysis, which served as the foundation for our fine-tuning efforts.
- **[FinGPT Sentiment Train Dataset](https://huggingface.co/datasets/FinGPT/fingpt-sentiment-train)** – The dataset used for fine-tuning, containing a large collection of finance-related news headlines and sentiment annotations.
- **[Financial PhraseBank Dataset](https://www.researchgate.net/publication/251231364_FinancialPhraseBank-v10)** – A widely used benchmark dataset for financial sentiment classification, including the *All Agree* and *All Combined* subsets.
- **[FiQA + PhraseBank Kaggle Merged Dataset](https://www.kaggle.com/datasets/sbhatti/financial-sentiment-analysis/data)** – A merged dataset combining FiQA and Financial PhraseBank entries, used for broader benchmarking of sentiment performance.
We thank these contributors for making their models and datasets publicly available, enabling high-quality research and development in financial NLP.
---