imkiasu committed
Commit 56190f0 · verified · 1 Parent(s): ce43f0e

Update README.md

Files changed (1)
  1. README.md +93 -66
README.md CHANGED
@@ -1,66 +1,93 @@
- # FinBERT Fine-Tuned for Financial Sentiment Analysis
-
- This repository contains a fine-tuned version of FinBERT (RoBERTa-based) for financial sentiment classification. The model predicts whether a financial news headline or sentence is **positive**, **neutral**, or **negative**.
-
- ## Model Overview
- - **Base model:** FinBERT (RoBERTa)
- - **Task:** Financial sentiment classification (3 classes)
- - **Training data:** Financial news headlines and sentences
- - **Dataset source:** [Kaggle - Finance News Sentiments](https://www.kaggle.com/datasets/antobenedetti/finance-news-sentiments/data?select=dataset.csv)
- - **Output labels:**
-   - 0: Negative
-   - 1: Neutral
-   - 2: Positive
-
- ## Evaluation Results
- - **Test Accuracy:** 0.7565
- - **Multiclass ROC AUC (macro-average):** 0.9096
-
- ## Model Folder Structure
- ```
- finbert_finetuned/
-     config.json
-     merges.txt
-     model.safetensors
-     special_tokens_map.json
-     tokenizer_config.json
-     tokenizer.json
-     vocab.json
- ```
- **Note:** Only the model files are stored in `finbert_finetuned_news_sentiment/`. Scripts and datasets are kept separate and are not included in this folder or in the model upload.
-
- ## How to Use the Fine-Tuned Model
-
- ### 1. Load and Use the Model in Python
- ```python
- import torch
- from transformers import AutoTokenizer, AutoModelForSequenceClassification
- # Directory of the model folder
- model_dir = "finbert_finetuned_news_sentiment"
- # read the model
- tokenizer = AutoTokenizer.from_pretrained(model_dir)
- model = AutoModelForSequenceClassification.from_pretrained(model_dir)
- model.eval()
-
- text = "Apple stock surges after strong earnings report."
- inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
- with torch.no_grad():
-     logits = model(**inputs).logits
-     pred = torch.argmax(logits, dim=1).item()
-
- label_map = {0: 'negative', 1: 'neutral', 2: 'positive'}
- print(f"Predicted sentiment: {label_map[pred]}")
- ```
-
- ## Notes
- - The model was trained and evaluated on data from the Kaggle dataset linked above.
- - The `finbert_finetuned_news_sentiment/` folder contains only the files needed for inference.
- - Scripts and datasets are not included in the model folder or in the model upload.
- - For best results, use a GPU for inference if available.
-
- ## Citation
- If you use this model, please cite the original [FinBERT paper](https://arxiv.org/abs/2006.08097) and the [Kaggle dataset](https://www.kaggle.com/datasets/antobenedetti/finance-news-sentiments/data?select=dataset.csv).
-
- ---
-
- **Date:** June 2025
+ ---
+ language: en
+ license: apache-2.0
+ tags:
+ - finance
+ - sentiment-analysis
+ - finbert
+ - roberta
+ - classification
+ model-index:
+ - name: FinBERT Fine-Tuned for Financial Sentiment Analysis
+   results:
+   - task:
+       type: text-classification
+       name: Sentiment Analysis
+     dataset:
+       name: Finance News Sentiments (Kaggle)
+       type: text
+     metrics:
+     - type: accuracy
+       value: 0.7565
+     - type: multiclass_roc_auc
+       value: 0.9096
+ base_model:
+ - FacebookAI/roberta-large
+ ---
+
+ # FinBERT Fine-Tuned for Financial Sentiment Analysis
+
+ This repository contains a fine-tuned version of FinBERT (RoBERTa-based) for financial sentiment classification. The model predicts whether a financial news headline or sentence is **positive**, **neutral**, or **negative**.
+
+ ## Model Overview
+ - **Base model:** FinBERT (RoBERTa)
+ - **Task:** Financial sentiment classification (3 classes)
+ - **Training data:** Financial news headlines and sentences
+ - **Dataset source:** [Kaggle - Finance News Sentiments](https://www.kaggle.com/datasets/antobenedetti/finance-news-sentiments/data?select=dataset.csv)
+ - **Output labels:**
+   - 0: Negative
+   - 1: Neutral
+   - 2: Positive
+
+ ## Evaluation Results
+ - **Test Accuracy:** 0.7565
+ - **Multiclass ROC AUC (macro-average):** 0.9096
+
+ ## Model Folder Structure
+ ```
+ finbert_finetuned/
+     config.json
+     merges.txt
+     model.safetensors
+     special_tokens_map.json
+     tokenizer_config.json
+     tokenizer.json
+     vocab.json
+ ```
+ **Note:** Only the model files are stored in `finbert_finetuned_news_sentiment/`. Scripts and datasets are kept separate and are not included in this folder or in the model upload.
+
+ ## How to Use the Fine-Tuned Model
+
+ ### 1. Load and Use the Model in Python
+ ```python
+ import torch
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ # Directory of the model folder
+ model_dir = "finbert_finetuned_news_sentiment"
+ # read the model
+ tokenizer = AutoTokenizer.from_pretrained(model_dir)
+ model = AutoModelForSequenceClassification.from_pretrained(model_dir)
+ model.eval()
+
+ text = "Apple stock surges after strong earnings report."
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
+ with torch.no_grad():
+     logits = model(**inputs).logits
+     pred = torch.argmax(logits, dim=1).item()
+
+ label_map = {0: 'negative', 1: 'neutral', 2: 'positive'}
+ print(f"Predicted sentiment: {label_map[pred]}")
+ ```
+
+ ## Notes
+ - The model was trained and evaluated on data from the Kaggle dataset linked above.
+ - The `finbert_finetuned_news_sentiment/` folder contains only the files needed for inference.
+ - Scripts and datasets are not included in the model folder or in the model upload.
+ - For best results, use a GPU for inference if available.
+
+ ## Citation
+ If you use this model, please cite the original [FinBERT paper](https://arxiv.org/abs/2006.08097) and the [Kaggle dataset](https://www.kaggle.com/datasets/antobenedetti/finance-news-sentiments/data?select=dataset.csv).
+
+ ---
+
+ **Date:** June 2025
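
Note (not part of the commit): the README's inference snippet reports only the argmax label. If a confidence score is also wanted, the logits can be passed through a softmax first. The following is a minimal sketch in plain Python, assuming the label order from the README's "Output labels" list; the helper names `softmax` and `describe` and the example logits are hypothetical and only illustrate the idea.

```python
import math

# Label order taken from the README's "Output labels" list.
LABEL_MAP = {0: "negative", 1: "neutral", 2: "positive"}


def softmax(logits):
    """Convert raw logits to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABEL_MAP[best], probs[best]


# Hypothetical logits for one headline; with the real model these would
# come from model(**inputs).logits[0].tolist().
example_logits = [-1.2, 0.3, 2.5]
label, confidence = describe(example_logits)
print(f"Predicted sentiment: {label} ({confidence:.2f})")
```

With PyTorch tensors, the same probabilities can be obtained directly with `torch.softmax(logits, dim=1)`.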