---
license: mit
language:
- en
tags:
- financial-nlp
- sentiment-analysis
- topic-classification
- multitask-learning
- bert
- financial-news
library_name: transformers
pipeline_tag: text-classification
datasets:
- financial-news
metrics:
- accuracy
- f1
- precision
- recall
---

# Multi-Task BERT for Financial News Topic Classification and Sentiment Analysis

## Model Description

This model is a multi-task BERT-based architecture that performs topic classification and sentiment analysis on financial news text simultaneously. It leverages shared representations to improve performance on both tasks through multi-task learning.

## Model Details

- **Model Type**: Multi-task BERT for text classification
- **Language**: English
- **License**: MIT
- **Tasks**:
  - Topic Classification (financial news categories)
  - Sentiment Analysis (positive, negative, neutral)

## Intended Uses

### Direct Use

This model can be used for:

- Analyzing sentiment in financial news articles
- Classifying financial news into relevant topics/categories
- Automated content analysis for financial research
- Risk assessment based on news sentiment

### Downstream Use

The model can be fine-tuned for:

- Specific financial domains (stocks, forex, commodities)
- Custom topic taxonomies
- Different sentiment granularities

## How to Use

```python
import torch
import pickle
from transformers import AutoTokenizer

# Load the pickled model (only unpickle files from a source you trust)
with open('multitask_bert_model.pkl', 'rb') as f:
    model = pickle.load(f)
model.eval()

# Load the tokenizer (adjust the model name if a different BERT variant was used)
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Example usage
text = "Apple stock rises 5% after strong quarterly earnings report"
inputs = tokenizer(text, return_tensors="pt", padding=True,
                   truncation=True, max_length=512)

# Get predictions (adjust based on your model's output format)
with torch.no_grad():
    outputs = model(**inputs)
# Process outputs for topic and sentiment predictions
```

## Training Data

The model was trained on financial news data for multi-task learning. Training involved:

- A topic classification task
- A sentiment analysis task
- Joint optimization with shared BERT representations

## Training Procedure

### Training Hyperparameters

- **Training regime**: Multi-task learning with shared encoder
- **Model variants**:
  - `multitask_bert_model.pkl`: Base model
  - `multitask_bert_model_weight.pth`: Weighted version
  - `multitask_bert_model_imbalanced.pth`: Version trained on imbalanced data

### Training Details

The model uses a shared BERT encoder with task-specific classification heads for topic classification and sentiment analysis. The multi-task approach allows the model to learn shared representations that benefit both tasks, as sketched below.
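Restoring the pickled checkpoint requires a compatible class definition on the Python path, which this card does not include. The following is a minimal sketch of a shared-encoder, two-head module consistent with the description above; the class name `MultiTaskBert`, the label counts, and the use of the pooled `[CLS]` representation are illustrative assumptions, not details taken from the released files.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class MultiTaskBert(nn.Module):
    """Shared BERT encoder with two task-specific classification heads.

    Hypothetical sketch: the class name, label counts, and pooling
    strategy are assumptions, not taken from the released checkpoint.
    """

    def __init__(self, base_model="bert-base-uncased",
                 num_topics=10, num_sentiments=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(base_model)  # shared encoder
        hidden = self.encoder.config.hidden_size
        self.dropout = nn.Dropout(0.1)
        self.topic_head = nn.Linear(hidden, num_topics)          # topic classifier
        self.sentiment_head = nn.Linear(hidden, num_sentiments)  # sentiment classifier

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        out = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask,
                           token_type_ids=token_type_ids)
        pooled = self.dropout(out.pooler_output)  # [CLS]-based sentence representation
        return self.topic_head(pooled), self.sentiment_head(pooled)
```

Under this sketch, the `outputs` value from the usage example above is a `(topic_logits, sentiment_logits)` pair, so the two predictions can be read off independently:

```python
topic_logits, sentiment_logits = outputs
topic_id = topic_logits.argmax(dim=-1).item()          # predicted topic index
sentiment_id = sentiment_logits.argmax(dim=-1).item()  # predicted sentiment index
```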
## Evaluation

### Testing Data & Metrics

Both tasks are evaluated with standard classification metrics:

- **Topic Classification**: Accuracy, F1-score, Precision, Recall
- **Sentiment Analysis**: Accuracy, F1-score, Precision, Recall

### Results

| Task | Metric | Score |
|------|--------|-------|
| Topic Classification | Accuracy | 0.76 |
| Sentiment Analysis | Accuracy | 0.87 |

## Limitations and Bias

### Limitations

- Performance may vary on financial news from different time periods
- The model may not generalize well to non-financial text
- Limited to English-language text
- Performance depends on the quality and diversity of the training data

### Bias Considerations

- The model may reflect biases present in the financial news training data
- Sentiment predictions may be influenced by market conditions during the training period
- Topic classifications may favor financial sectors over-represented in the training data

## Technical Specifications

### Model Architecture

- **Base Model**: BERT
- **Architecture**: Multi-task learning with shared encoder, as illustrated in the training sketch below
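To make the joint-optimization setup concrete, here is a minimal training-step sketch under the same assumptions as the module above. The equal weighting of the two cross-entropy losses, the learning rate, and the batch keys are illustrative; the card does not document how the losses were combined for the released checkpoints.

```python
import torch
import torch.nn as nn

# Hypothetical joint training step for the two-head sketch above.
# Equal loss weighting is an assumption; passing a `weight` tensor to
# CrossEntropyLoss is one plausible way the "_weight" variant could
# differ from the base model, but that is also an assumption.
topic_loss_fn = nn.CrossEntropyLoss()
sentiment_loss_fn = nn.CrossEntropyLoss()

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

def train_step(batch):
    """One joint update over both task heads for a batch of examples."""
    optimizer.zero_grad()
    topic_logits, sentiment_logits = model(
        input_ids=batch["input_ids"],
        attention_mask=batch["attention_mask"],
    )
    # Sum the two task losses so gradients flow into the shared encoder
    loss = (topic_loss_fn(topic_logits, batch["topic_labels"])
            + sentiment_loss_fn(sentiment_logits, batch["sentiment_labels"]))
    loss.backward()
    optimizer.step()
    return loss.item()
```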