Commit dd0f040 (verified) by muhalwan · Parent: 87d4787
Create README.md

Files changed (1): README.md (+47 lines, new file)
---
license: mit
datasets:
- zeroshot/twitter-financial-news-sentiment
---

# Financial Sentiment Analysis with FinBERT

This repository contains a financial sentiment analysis model fine-tuned from `ProsusAI/finbert`. The model classifies financial text (such as tweets or news headlines) into three categories: **Bullish**, **Bearish**, or **Neutral**.

The project includes scripts for data preprocessing, model training with hyperparameter optimization, and a Streamlit web application for interactive predictions.

## Model Card

### Model Description

This model is a `BertForSequenceClassification` based on the `ProsusAI/finbert` architecture, fine-tuned to predict the sentiment of financial text. It was trained on a dataset of financial tweets and headlines and outputs one of three labels: `Bullish`, `Bearish`, or `Neutral`.

```python
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification

MODEL_PATH = "path/to/your/model"  # local checkpoint directory or Hub repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH)

pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Analyze sentiment
results = pipe("Adobe price target raised to $350 vs. $320 at Canaccord")
print(results)
# [{'label': 'Bullish', 'score': 0.9...}]
```

### Training Data

The model was trained on the [Twitter Financial News Sentiment](https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment) dataset. Before training, the text is cleaned by the preprocessing script (`data_preprocessing.py`).

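The exact cleaning steps in `data_preprocessing.py` are not listed in this card. As an illustration only, a minimal cleaner for financial tweets might look like the following (the specific steps here, URL removal, mention stripping, and whitespace normalization, are hypothetical, not the repository's actual pipeline):

```python
import re

def clean_tweet(text: str) -> str:
    """Hypothetical cleaning sketch; the real steps in
    data_preprocessing.py may differ."""
    text = re.sub(r"https?://\S+", "", text)   # drop URLs
    text = re.sub(r"@\w+", "", text)           # drop user mentions
    text = re.sub(r"\s+", " ", text).strip()   # normalize whitespace
    return text

print(clean_tweet("$ADBE target raised https://t.co/x by @analyst"))
# $ADBE target raised by
```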
### Training Procedure

The model was trained with the `transformers` library in PyTorch. The training script (`model_development.py`) includes the following features:

- **Hyperparameter Optimization**: Optuna was used to search for the best learning rate and batch size.
- **Optimizer**: AdamW with a linear learning-rate scheduler and warmup.
- **Early Stopping**: Training stops if validation accuracy does not improve for a set number of epochs.
- **Mixed-Precision Training**: `torch.amp` was used for faster training.
- **Gradient Accumulation**: Used to simulate a larger effective batch size.
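
Two of the features above can be sketched in isolation: the linear warmup-then-decay learning-rate shape, and accuracy-based early stopping. The names and signatures below are illustrative, not the actual `model_development.py` API:

```python
def warmup_linear_lr(step, total_steps, warmup_steps, base_lr):
    """Linear warmup to base_lr, then linear decay to 0 -- the shape
    produced by transformers' get_linear_schedule_with_warmup."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

class EarlyStopper:
    """Stop when validation accuracy has not improved for `patience` epochs."""
    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def should_stop(self, val_accuracy):
        if val_accuracy > self.best:
            self.best = val_accuracy
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In the real script these would be driven by the Optuna-selected hyperparameters; the early-stopping check runs once per epoch on the validation set.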