---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: modernbert_fingpt_results
  results: []
datasets:
- FinGPT/fingpt-sentiment-train
---

# ModernBERT Fine-tuned for Financial Text Sentiment Analysis

This project fine-tunes the **ModernBERT** model on the **FinGPT** sentiment dataset for financial text sentiment analysis.

## Dataset & Model

- **Model**: [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
- **Dataset**: [FinGPT/fingpt-sentiment-train](https://huggingface.co/datasets/FinGPT/fingpt-sentiment-train)
- **Task**: Multi-class sentiment classification (9 categories)
- **Domain**: Financial text analysis

### ModernBERT
ModernBERT is a modernized bidirectional encoder-only Transformer (BERT-style), pre-trained on 2 trillion tokens of English text and code, with a native context length of up to 8,192 tokens.
It incorporates architectural improvements such as Rotary Positional Embeddings (RoPE) for long-context support, alternating local-global attention for efficiency on long inputs, and unpadding with Flash Attention for efficient inference.

### FinGPT Sentiment Analysis Dataset
The dataset contains 76,772 rows (17,919,695 tokens).

## Sentiment Categories

The model classifies text into 9 fine-grained sentiment levels:

| Label ID | Sentiment Category | Description |
|----------|-------------------|-------------|
| 0 | Strong Negative | Very pessimistic |
| 1 | Moderately Negative | Somewhat pessimistic |
| 2 | Mildly Negative | Slightly pessimistic |
| 3 | Negative | General negative sentiment |
| 4 | Neutral | No clear positive or negative bias |
| 5 | Mildly Positive | Slightly optimistic |
| 6 | Moderately Positive | Somewhat optimistic |
| 7 | Positive | General positive sentiment |
| 8 | Strong Positive | Very optimistic |
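The table above can be written as a plain lookup for decoding predicted class IDs. This is a minimal sketch: the names `ID2LABEL`, `LABEL2ID`, and `decode` are illustrative, and whether the published model config already carries these label names is not stated in this card.

```python
# Label mapping copied from the table above.
ID2LABEL = {
    0: "Strong Negative",
    1: "Moderately Negative",
    2: "Mildly Negative",
    3: "Negative",
    4: "Neutral",
    5: "Mildly Positive",
    6: "Moderately Positive",
    7: "Positive",
    8: "Strong Positive",
}

# Reverse lookup, useful when encoding string labels for training.
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def decode(label_id: int) -> str:
    """Turn a predicted class ID into its sentiment category name."""
    return ID2LABEL[label_id]


print(decode(4))  # Neutral
```

Note that the IDs are not ordered strictly by intensity (ID 3 is the general "Negative" class), so treat them as categorical labels rather than an ordinal scale.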

## Model Configuration

### Parameters
- **Max Sequence Length**: 512 tokens
- **Batch Size**: 16
- **Learning Rate**: 2e-5 with warmup
- **Epochs**: 3 with early stopping
- **Optimizer**: AdamW with weight decay (0.01)
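To illustrate what "2e-5 with warmup" means in practice, here is a sketch of a linear warmup followed by linear decay, which is the default schedule shape in the Hugging Face `Trainer`. The number of warmup steps is not given in this card, so `warmup_steps=500` is a hypothetical value, and `total_steps=11_517` is an assumption (3 epochs of about 3,839 steps, estimated from the Epoch column of the training table).

```python
def linear_warmup_decay(step: int,
                        base_lr: float = 2e-5,
                        warmup_steps: int = 500,      # hypothetical, not from the card
                        total_steps: int = 11_517) -> float:  # assumed total
    """Learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr.
        return base_lr * (step / warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / (total_steps - warmup_steps))


for s in (0, 250, 500, 6_000, 11_517):
    print(s, f"{linear_warmup_decay(s):.2e}")
```

The peak rate is reached exactly at the end of warmup, and the rate falls to zero at the final step.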

### Features
- **Early Stopping**: Prevents overfitting (patience=3)
- **Best Model Loading**: Automatically loads best checkpoint
- **Mixed Precision**: FP16 training for speed optimization
- **Stratified Splitting**: 80/20 train/validation split
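The patience rule behind early stopping can be sketched in a few lines. The example below applies a generic patience counter (stop after `patience` consecutive evaluations without improvement) to the F1 and validation-loss values from the training table in this card. The card does not say which metric was monitored, so both are shown; the function name and the 500-step evaluation interval read off the table are the only assumptions.

```python
EVAL_EVERY = 500  # evaluation interval, read off the Step column

# Values copied from the training table (steps 500 to 11500).
F1 = [0.6623, 0.6952, 0.8083, 0.8363, 0.8486, 0.8661, 0.8631, 0.8737,
      0.8859, 0.8857, 0.8834, 0.8862, 0.8919, 0.8919, 0.8968, 0.8986,
      0.8968, 0.8990, 0.8991, 0.8999, 0.9016, 0.9019, 0.9026]
VAL_LOSS = [0.8504, 0.7921, 0.5066, 0.4247, 0.3884, 0.3472, 0.3463, 0.3556,
            0.3162, 0.3269, 0.3281, 0.3038, 0.3032, 0.3013, 0.2959, 0.3983,
            0.3982, 0.3809, 0.3944, 0.3899, 0.4006, 0.3828, 0.3741]


def early_stop_step(values, patience=3, higher_is_better=True):
    """Return the step at which the patience rule fires, or None."""
    best = None
    bad_evals = 0
    for i, v in enumerate(values):
        if best is None:
            improved = True
        elif higher_is_better:
            improved = v > best
        else:
            improved = v < best
        if improved:
            best, bad_evals = v, 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return (i + 1) * EVAL_EVERY
    return None


print(early_stop_step(F1, higher_is_better=True))         # None
print(early_stop_step(VAL_LOSS, higher_is_better=False))  # 9000
```

Under this rule, monitoring F1 never triggers a stop over the logged run, while monitoring validation loss would have stopped at step 9,000 (the lowest loss, 0.2959, occurs at step 7,500).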

## Evaluation Metrics
- **Accuracy**: Overall classification accuracy
- **F1-Score**: Weighted F1-score across all classes
- **Precision**: Weighted precision
- **Recall**: Weighted recall
- **Confusion Matrix**: Visual analysis of classification performance
- **Classification Report**: Detailed per-class metrics
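Support-weighted averaging explains why the Recall and Accuracy columns in the training table are identical in every row: weighting each class's recall by its share of the true labels sums to the overall fraction of correct predictions. The sketch below computes the weighted metrics in plain Python on made-up toy labels; it is not the evaluation code used for this model, and the function name is illustrative. scikit-learn's `precision_recall_fscore_support(..., average="weighted")` computes the same quantities.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)      # true examples per class
    predicted = Counter(y_pred)    # predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, n_c in support.items():
        tp = correct[label]
        p_c = tp / predicted[label] if predicted[label] else 0.0
        r_c = tp / n_c
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n_c / n           # class share of the true labels
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c

    accuracy = sum(correct.values()) / n
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (illustrative only), using IDs from the sentiment table.
y_true = [0, 0, 0, 4, 4, 8]
y_pred = [0, 0, 4, 4, 4, 0]
m = weighted_metrics(y_true, y_pred)
print(m)
```

On the toy data, `m["recall"]` and `m["accuracy"]` both equal 4/6, while weighted precision and F1 differ from them.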

## Performance

### Training Time (on T4 GPU)
- **Total Training**: ~30-45 minutes
- **Per Epoch**: ~10-15 minutes
- **Evaluation**: ~2-3 minutes

### Training Results (Actual)

Final evaluation on the validation split (step 11,500, end of epoch 3):

- Validation Loss: 0.3741
- Accuracy: 0.9043
- F1: 0.9026
- Precision: 0.9022
- Recall: 0.9043


| Training Loss | Epoch  | Step  | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:------:|:-----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 0.9551        | 0.1302 | 500   | 0.8504          | 0.6769   | 0.6623 | 0.6589    | 0.6769 |
| 0.6639        | 0.2605 | 1000  | 0.7921          | 0.7162   | 0.6952 | 0.7444    | 0.7162 |
| 0.5221        | 0.3907 | 1500  | 0.5066          | 0.8134   | 0.8083 | 0.8147    | 0.8134 |
| 0.4415        | 0.5210 | 2000  | 0.4247          | 0.8381   | 0.8363 | 0.8410    | 0.8381 |
| 0.4276        | 0.6512 | 2500  | 0.3884          | 0.8594   | 0.8486 | 0.8484    | 0.8594 |
| 0.3767        | 0.7815 | 3000  | 0.3472          | 0.8756   | 0.8661 | 0.8689    | 0.8756 |
| 0.3281        | 0.9117 | 3500  | 0.3463          | 0.8754   | 0.8631 | 0.8611    | 0.8754 |
| 0.2419        | 1.0419 | 4000  | 0.3556          | 0.8883   | 0.8737 | 0.8728    | 0.8883 |
| 0.2859        | 1.1722 | 4500  | 0.3162          | 0.8922   | 0.8859 | 0.8829    | 0.8922 |
| 0.2260        | 1.3024 | 5000  | 0.3269          | 0.8914   | 0.8857 | 0.8851    | 0.8914 |
| 0.2378        | 1.4327 | 5500  | 0.3281          | 0.8903   | 0.8834 | 0.8881    | 0.8903 |
| 0.2654        | 1.5629 | 6000  | 0.3038          | 0.8938   | 0.8862 | 0.8896    | 0.8938 |
| 0.2319        | 1.6931 | 6500  | 0.3032          | 0.8993   | 0.8919 | 0.8905    | 0.8993 |
| 0.2116        | 1.8234 | 7000  | 0.3013          | 0.9023   | 0.8919 | 0.8937    | 0.9023 |
| 0.1922        | 1.9536 | 7500  | 0.2959          | 0.9017   | 0.8968 | 0.8941    | 0.9017 |
| 0.1536        | 2.0839 | 8000  | 0.3983          | 0.9009   | 0.8986 | 0.9000    | 0.9009 |
| 0.1438        | 2.2141 | 8500  | 0.3982          | 0.8990   | 0.8968 | 0.8954    | 0.8990 |
| 0.1329        | 2.3444 | 9000  | 0.3809          | 0.9021   | 0.8990 | 0.8968    | 0.9021 |
| 0.1175        | 2.4746 | 9500  | 0.3944          | 0.9019   | 0.8991 | 0.8977    | 0.9019 |
| 0.1634        | 2.6048 | 10000 | 0.3899          | 0.9043   | 0.8999 | 0.8989    | 0.9043 |
| 0.1049        | 2.7351 | 10500 | 0.4006          | 0.9037   | 0.9016 | 0.9009    | 0.9037 |
| 0.1247        | 2.8653 | 11000 | 0.3828          | 0.9053   | 0.9019 | 0.9006    | 0.9053 |
| 0.1511        | 2.9956 | 11500 | 0.3741          | 0.9043   | 0.9026 | 0.9022    | 0.9043 |
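As a sanity check, the Epoch column can be reproduced from the dataset size, the 80/20 split, and the batch size of 16. This is a worked arithmetic sketch under stated assumptions: a single device, no gradient accumulation, a validation split rounded up to a whole row, and a partial final batch counted as a step. None of these details are confirmed by the card.

```python
import math

TOTAL_ROWS = 76_772    # dataset size stated in this card
VAL_FRACTION = 0.2     # 80/20 train/validation split
BATCH_SIZE = 16

n_val = math.ceil(TOTAL_ROWS * VAL_FRACTION)       # assumed rounding
n_train = TOTAL_ROWS - n_val
steps_per_epoch = math.ceil(n_train / BATCH_SIZE)  # last partial batch kept


def epoch_at(step: int) -> float:
    """Fractional epoch reached after a given optimizer step."""
    return round(step / steps_per_epoch, 4)


print(n_train, steps_per_epoch)         # 61417 3839
print(epoch_at(500), epoch_at(11_500))  # 0.1302 2.9956
```

The computed values match the first and last rows of the table, which suggests the logged run used roughly 61,400 training examples with an effective batch size of 16.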

## Deployment Options
- **API Deployment**: Create REST API using FastAPI
- **Batch Processing**: Set up automated sentiment analysis pipeline
- **Real-time Analysis**: Integrate with financial data streams

## References
- [ModernBERT Paper](https://arxiv.org/abs/2412.13663)
- [FinGPT Project](https://github.com/AI4Finance-Foundation/FinGPT)
- [Hugging Face Transformers](https://huggingface.co/docs/transformers)
- [Financial Sentiment Analysis Survey](https://arxiv.org/abs/2212.14197)