# FinancialNewsSentimentClassifier_DistilBERT

## 📰 Overview

This is a **DistilBERT** model fine-tuned for **sequence classification** of financial news headlines and short articles. It assigns each text to one of three classes, **Bullish**, **Neutral**, or **Bearish**, giving a quantifiable measure of market outlook derived from text. The model was trained on news articles from major financial publications, labeled by human experts.

## 🧠 Model Architecture

This model is built upon the **DistilBERT base uncased** architecture, a smaller, faster, and lighter version of BERT.

* **Base Model:** `distilbert-base-uncased`
* **Task:** Sequence Classification (`DistilBertForSequenceClassification`)
* **Input:** Tokenized financial news headlines or short-form texts (maximum sequence length: 512 tokens).
* **Output:** Logits for three classes:
    * `0`: Bullish (Positive market sentiment)
    * `1`: Neutral (No significant market impact)
    * `2`: Bearish (Negative market sentiment)
* **Training Details:** Fine-tuned for 3 epochs with a batch size of 16 using the AdamW optimizer. Achieved an F1-score of 0.89 on the validation set.
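
As a rough illustration of how the three output logits map to the class labels listed above, the sketch below applies a softmax and picks the highest-probability class. It assumes only the index-to-label mapping given in this card; the helper names `softmax` and `logits_to_label` and the logit values are hypothetical, not taken from the model.

```python
import math

# Index-to-label mapping as listed in this model card.
ID2LABEL = {0: "Bullish", 1: "Neutral", 2: "Bearish"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_label(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for illustration only (not real model output).
example_logits = [2.5, 0.3, -1.2]
label, score = logits_to_label(example_logits)
print(label, round(score, 3))
```

The `transformers` pipeline performs an equivalent step internally, so this is only needed when working with the raw model outputs directly.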


## 💡 Intended Use

* **Quantitative Finance:** Generating sentiment scores for stocks, sectors, or the entire market based on real-time news feeds.
* **Algorithmic Trading:** Using the sentiment output as an input feature for high-frequency trading models.
* **Market Research:** Tracking historical shifts in market sentiment towards specific companies or topics.
* **News Filtering:** Prioritizing news articles based on their potential market impact.
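
One way per-headline predictions could feed the use cases above is a net sentiment score per ticker, taken here as the mean of P(Bullish) minus P(Bearish) over that ticker's headlines. This is a sketch under stated assumptions: the scoring formula, the `net_sentiment` helper, and the sample probabilities are illustrative and are not part of the model or its training.

```python
from collections import defaultdict


def net_sentiment(predictions):
    """Average P(Bullish) - P(Bearish) per ticker.

    predictions: iterable of (ticker, probs) pairs, where probs maps
    each class label to its predicted probability.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ticker, probs in predictions:
        totals[ticker] += probs["Bullish"] - probs["Bearish"]
        counts[ticker] += 1
    return {ticker: totals[ticker] / counts[ticker] for ticker in totals}


# Hypothetical class probabilities for three headlines (not real model output).
sample = [
    ("TSLA", {"Bullish": 0.90, "Neutral": 0.08, "Bearish": 0.02}),
    ("TSLA", {"Bullish": 0.20, "Neutral": 0.50, "Bearish": 0.30}),
    ("AAPL", {"Bullish": 0.10, "Neutral": 0.30, "Bearish": 0.60}),
]
scores = net_sentiment(sample)
print(scores)
```

Scores fall in the range -1 (uniformly bearish) to +1 (uniformly bullish). To obtain probabilities for all three classes from the `transformers` pipeline, passing `top_k=None` when calling the classifier should return every label's score in recent library versions.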

### How to use

```python
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis", 
    model="[YOUR_HF_USERNAME]/FinancialNewsSentimentClassifier_DistilBERT",
    tokenizer="distilbert-base-uncased"
)

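# Example usage
result = classifier("Tesla stock surges 5% on better-than-expected Q4 earnings and new China factory plans.")
print(result)
# Expected output: [{'label': 'Bullish', 'score': 0.98...}]
# Note: label names come from the model's id2label config; if it is not set,
# the pipeline returns generic names such as LABEL_0 instead.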