Update README.md
README.md
CHANGED
|
@@ -37,12 +37,34 @@ The preprocessing steps included:
|
|
| 37 |
Additionally, to fine-tune the model on your own data, the preprocessing step involves converting the new financial headlines into embeddings and feeding them into the RandomForest model.
|
| 38 |
|
| 39 |
### Model Evaluation
|
| 40 |
-
The model has been evaluated using metrics such as:
|
| 41 |
-
- **Accuracy**: The percentage of correctly classified headlines.
|
| 42 |
-
- **F1-score**: The harmonic mean of precision and recall, providing a better measure of model performance when dealing with imbalanced data.
|
| 43 |
-
- **Confusion Matrix**: Helps identify how well the model distinguishes between the different sentiment categories (positive, neutral, and negative).
|
| 44 |
|
| 45 |
-
|
| 46 |
|
| 47 |
### Usage
|
| 48 |
|
|
@@ -50,17 +72,3 @@ To use the model, first install the necessary dependencies:
|
|
| 50 |
|
| 51 |
```bash
|
| 52 |
pip install sentence-transformers scikit-learn
|
| 53 |
-
|
| 54 |
-
```
|
| 55 |
-
|
| 56 |
-
license: apache-2.0
|
| 57 |
-
datasets:
|
| 58 |
-
- NickyNicky/Finance_sentiment_and_topic_classification_En
|
| 59 |
-
language:
|
| 60 |
-
- en
|
| 61 |
-
metrics:
|
| 62 |
-
- accuracy
|
| 63 |
-
base_model:
|
| 64 |
-
- sentence-transformers/all-MiniLM-L6-v2
|
| 65 |
-
pipeline_tag: text-classification
|
| 66 |
-
---
|
|
|
|
| 37 |
Additionally, to fine-tune the model on your own data, the preprocessing step involves converting the new financial headlines into embeddings and feeding them into the RandomForest model.
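
The embed-then-fit step described here can be sketched as a small helper. This is only an illustration, not the repository's actual training code: `fit_on_headlines` is a hypothetical name, and the encoder and classifier are passed in as arguments so that the sketch does not assume how either one was configured.

```python
from typing import Callable, Sequence


def fit_on_headlines(
    headlines: Sequence[str],
    labels: Sequence[int],
    encode: Callable[[list], object],
    classifier,
):
    """Embed the headlines with `encode`, then fit `classifier` on the embeddings.

    `encode` is any callable that maps a list of strings to a 2-D array-like,
    and `classifier` is any object with a scikit-learn style `fit(X, y)` method.
    Both are assumptions of this sketch (hypothetical helper).
    """
    if len(headlines) != len(labels):
        raise ValueError("headlines and labels must have the same length")

    # Step 1: convert each headline into a fixed-size embedding vector.
    embeddings = encode(list(headlines))

    # Step 2: train the classifier on the embeddings and sentiment labels.
    classifier.fit(embeddings, list(labels))
    return classifier
```

In practice, `encode` would likely be the `encode` method of a `SentenceTransformer` loaded from `sentence-transformers/all-MiniLM-L6-v2` (the base model named in the model card metadata), and `classifier` a scikit-learn `RandomForestClassifier`.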
|
| 38 |
|
| 39 |
### Model Evaluation
|
| 40 |
|
| 41 |
+
|
| 42 |
+
On the test data, the model achieves an **accuracy of 61%**, with an **F1-score of 0.61**. This is not optimal, but it is acceptable given the model's simplicity and the small amount of data it was trained on.
|
| 43 |
+
|
| 44 |
+
#### Hyperparameters:
|
| 45 |
+
- **Number of Estimators (n_estimators)**: 200
|
| 46 |
+
- **Max Depth (max_depth)**: 20
|
| 47 |
+
- **Min Samples Split (min_samples_split)**: 5
|
| 48 |
+
- **Min Samples Leaf (min_samples_leaf)**: 1
|
| 49 |
+
- **Random State (random_state)**: 42
|
| 50 |
+
- **Max Features (max_features)**: 'sqrt' (default value for RandomForest)
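
As a rough sketch, the hyperparameters listed above could be passed to scikit-learn as follows. It assumes the model is a `sklearn.ensemble.RandomForestClassifier`, which the README implies but does not state explicitly; `RANDOM_FOREST_PARAMS` and `build_classifier` are hypothetical names introduced for illustration.

```python
# Values copied from the hyperparameter list above.
RANDOM_FOREST_PARAMS = {
    "n_estimators": 200,
    "max_depth": 20,
    "min_samples_split": 5,
    "min_samples_leaf": 1,
    "random_state": 42,
    "max_features": "sqrt",
}


def build_classifier(params=RANDOM_FOREST_PARAMS):
    """Return an unfitted random forest configured with `params` (hypothetical helper)."""
    # Imported lazily so the settings can be inspected without scikit-learn installed.
    # Assumption: the model is a RandomForestClassifier.
    from sklearn.ensemble import RandomForestClassifier

    return RandomForestClassifier(**params)
```

Fixing `random_state` makes the bootstrap sampling and feature selection repeatable across runs, which helps when comparing results on the same data split.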
|
| 51 |
+
|
| 52 |
+
#### Classification Report:
|
| 53 |
+
- **Precision**:
|
| 54 |
+
- Class 0: 0.66
|
| 55 |
+
- Class 1: 0.62
|
| 56 |
+
- Class 2: 0.55
|
| 57 |
+
- **Recall**:
|
| 58 |
+
- Class 0: 0.52
|
| 59 |
+
- Class 1: 0.80
|
| 60 |
+
- Class 2: 0.52
|
| 61 |
+
- **F1-Score**:
|
| 62 |
+
- Class 0: 0.58
|
| 63 |
+
- Class 1: 0.70
|
| 64 |
+
- Class 2: 0.54
|
| 65 |
+
- **Overall Accuracy**: 0.61
|
| 66 |
+
- **Macro Average**: 0.61 (Precision, Recall, F1-Score)
|
| 67 |
+
- **Weighted Average**: 0.61 (Precision, Recall, F1-Score)
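
To show how the figures in the report relate to each other, the sketch below recomputes the per-class F1-scores and the macro averages from the reported precision and recall. It is a worked example only: the inputs are already rounded to two decimals, so the recomputed values can differ from the report by about 0.01. The helper names are hypothetical.

```python
# Per-class precision and recall as reported above (rounded to two decimals).
precision = {0: 0.66, 1: 0.62, 2: 0.55}
recall = {0: 0.52, 1: 0.80, 2: 0.52}


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def macro_average(per_class: dict) -> float:
    """Unweighted mean over classes, so every class counts equally."""
    return sum(per_class.values()) / len(per_class)


# F1 per class, recomputed from the rounded precision and recall.
f1 = {label: f1_score(precision[label], recall[label]) for label in precision}

macro_precision = macro_average(precision)
macro_recall = macro_average(recall)
macro_f1 = macro_average(f1)

for name, value in [
    ("macro precision", macro_precision),
    ("macro recall", macro_recall),
    ("macro F1", macro_f1),
]:
    print(f"{name}: {value:.3f}")
```

The weighted average cannot be recomputed this way because the per-class support (number of test examples per class) is not given in the report.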
|
| 68 |
|
| 69 |
### Usage
|
| 70 |
|
|
|
|
| 72 |
|
| 73 |
```bash
|
| 74 |
pip install sentence-transformers scikit-learn
|