---
license: mit
language:
- en
tags:
- financial-nlp
- sentiment-analysis
- topic-classification
- multitask-learning
- bert
- financial-news
library_name: transformers
pipeline_tag: text-classification
datasets:
- financial-news
metrics:
- accuracy
- f1
- precision
- recall
---

# Multi-Task BERT for Financial News Topic Classification and Sentiment Analysis

## Model Description

This model is a multi-task BERT-based architecture designed to simultaneously perform topic classification and sentiment analysis on financial news text. The model leverages shared representations to improve performance on both tasks through multi-task learning.

## Model Details

- **Model Type**: Multi-task BERT for text classification
- **Language**: English
- **License**: MIT
- **Tasks**:
  - Topic Classification (financial news categories)
  - Sentiment Analysis (positive, negative, neutral)

## Intended Uses

### Direct Use

This model can be used for:

- Analyzing sentiment in financial news articles
- Classifying financial news into relevant topics/categories
- Automated content analysis for financial research
- Risk assessment based on news sentiment

### Downstream Use

The model can be fine-tuned for:

- Specific financial domains (stocks, forex, commodities)
- Custom topic taxonomies
- Different sentiment granularities

## How to Use

```python
import pickle

import torch
from transformers import AutoTokenizer

# Load the pickled model
with open('multitask_bert_model.pkl', 'rb') as f:
    model = pickle.load(f)
model.eval()

# Load tokenizer (adjust model name as needed)
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Example usage
text = "Apple stock rises 5% after strong quarterly earnings report"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)

# Get predictions (adjust based on your model's output format)
with torch.no_grad():
    outputs = model(**inputs)
# Process outputs for topic and sentiment predictions
```
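
The exact output format depends on how the model object was saved. Assuming it returns one logit tensor per task, post-processing might look like the sketch below; the tensor values and class counts are illustrative placeholders, not real model outputs:

```python
import torch
import torch.nn.functional as F

# Hypothetical per-task logits, as if unpacked from `outputs`
topic_logits = torch.tensor([[2.1, 0.3, -1.0, 0.5]])   # e.g. 4 topic classes
sentiment_logits = torch.tensor([[0.2, -0.4, 1.8]])    # positive / negative / neutral

# Convert logits to probabilities, then take the highest-scoring class
topic_probs = F.softmax(topic_logits, dim=-1)
sentiment_probs = F.softmax(sentiment_logits, dim=-1)

topic_id = topic_probs.argmax(dim=-1).item()
sentiment_id = sentiment_probs.argmax(dim=-1).item()
```

Map the resulting indices back to label names using whatever label mapping was used during training.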

## Training Data

The model was trained on financial news data for multi-task learning. The training involved:

- Topic classification task
- Sentiment analysis task
- Joint optimization with shared BERT representations

## Training Procedure

### Training Hyperparameters

- **Training regime**: Multi-task learning with shared encoder
- **Model variants**:
  - `multitask_bert_model.pkl`: Base model
  - `multitask_bert_model_weight.pth`: Weighted version
  - `multitask_bert_model_imbalanced.pth`: Version trained on imbalanced data
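
The `.pth` variants are presumably PyTorch state dicts rather than pickled model objects, so they must be loaded into an instantiated model. A minimal round-trip sketch, using a stand-in module since the actual model class is not published here:

```python
import torch
import torch.nn as nn

# Stand-in for the multi-task model class (hypothetical; the real
# architecture is a shared BERT encoder with two classification heads).
model = nn.Linear(8, 3)

# Save and restore via a state-dict file, as one would with the
# published .pth checkpoints.
torch.save(model.state_dict(), "multitask_bert_model_weight.pth")

restored = nn.Linear(8, 3)
restored.load_state_dict(
    torch.load("multitask_bert_model_weight.pth", map_location="cpu")
)
```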

### Training Details

The model uses a shared BERT encoder with task-specific classification heads for topic classification and sentiment analysis. The multi-task approach allows the model to learn shared representations that benefit both tasks.
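
The shared-encoder design described above can be sketched as follows. Class and parameter names here are hypothetical, and a plain linear layer stands in for the BERT encoder so the sketch runs without pretrained weights:

```python
import torch
import torch.nn as nn

class MultiTaskClassifier(nn.Module):
    """Shared encoder with two task-specific heads (illustrative sketch)."""

    def __init__(self, hidden_size=768, num_topics=10, num_sentiments=3):
        super().__init__()
        # In the real model this would be a BERT encoder producing a
        # pooled [CLS] representation.
        self.encoder = nn.Linear(hidden_size, hidden_size)
        self.topic_head = nn.Linear(hidden_size, num_topics)
        self.sentiment_head = nn.Linear(hidden_size, num_sentiments)

    def forward(self, pooled):
        shared = torch.tanh(self.encoder(pooled))
        return self.topic_head(shared), self.sentiment_head(shared)

model = MultiTaskClassifier()
pooled = torch.randn(2, 768)  # batch of 2 pooled representations
topic_logits, sentiment_logits = model(pooled)

# Joint optimization: the total loss is the sum (or a weighted sum)
# of the two per-task cross-entropy losses.
loss = nn.CrossEntropyLoss()(topic_logits, torch.tensor([1, 3])) \
     + nn.CrossEntropyLoss()(sentiment_logits, torch.tensor([0, 2]))
```

Because both heads backpropagate through the same encoder, gradients from each task shape the shared representation.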

## Evaluation

### Testing Data & Metrics

The model should be evaluated on:

- **Topic Classification**: Accuracy, F1-score, Precision, Recall
- **Sentiment Analysis**: Accuracy, F1-score, Precision, Recall
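
The per-task metrics above can be computed with standard tooling. A minimal sketch using scikit-learn on hypothetical gold labels and predictions for one task:

```python
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)

# Hypothetical gold labels and predictions (placeholder values)
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 1, 0, 2]

metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred, average="macro"),
    "precision": precision_score(y_true, y_pred, average="macro"),
    "recall": recall_score(y_true, y_pred, average="macro"),
}
```

Macro averaging weights each class equally, which matters if the topic or sentiment distribution is imbalanced.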

### Results

| Task | Metric | Score |
|------|--------|-------|
| Topic Classification | Accuracy | 0.76 |
| Sentiment Analysis | Accuracy | 0.87 |

## Limitations and Bias

### Limitations

- Performance may vary on financial news from different time periods
- The model may not generalize well to non-financial text
- Limited to English-language text
- Performance depends on the quality and diversity of the training data

### Bias Considerations

- The model may reflect biases present in the financial news training data
- Sentiment predictions may be influenced by market conditions during the training period
- Topic classifications may favor financial sectors overrepresented in the training data

## Technical Specifications

### Model Architecture

- **Base Model**: BERT
- **Architecture**: Multi-task learning with shared encoder