---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: Fin-ModernBERT
  results: []
datasets:
- clapAI/FinData-dedup
language:
- en
pipeline_tag: fill-mask
---
# Fin-ModernBERT

Fin-ModernBERT is a domain-adapted pretrained language model for the **financial domain**, obtained by continual pretraining of [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) with a **context length of 1024 tokens** on large-scale finance-related corpora.

---

## Model Description

- **Base model:** ModernBERT-base (context length = 1024)
- **Domain:** Finance, Stock Market, Cryptocurrency
- **Objective:** Improve representation and understanding of financial text for downstream NLP tasks (sentiment analysis, NER, classification, QA, retrieval, etc.)

---
## Training Data

We collected and combined multiple publicly available finance-related datasets, including:

- [danidanou/Bloomberg_Financial_News](https://huggingface.co/datasets/danidanou/Bloomberg_Financial_News)
- [juanberasategui/Crypto_Tweets](https://huggingface.co/datasets/juanberasategui/Crypto_Tweets)
- [StephanAkkerman/crypto-stock-tweets](https://huggingface.co/datasets/StephanAkkerman/crypto-stock-tweets)
- [SahandNZ/cryptonews-articles-with-price-momentum-labels](https://huggingface.co/datasets/SahandNZ/cryptonews-articles-with-price-momentum-labels)
- [edaschau/financial_news](https://huggingface.co/datasets/edaschau/financial_news)
- [sabareesh88/FNSPID_nasdaq](https://huggingface.co/datasets/sabareesh88/FNSPID_nasdaq)
- [BAAI/IndustryCorpus_finance](https://huggingface.co/datasets/BAAI/IndustryCorpus_finance)
- [mjw/stock_market_tweets](https://huggingface.co/datasets/mjw/stock_market_tweets)

After aggregation, we obtained **~50M financial records**. A deduplication pass reduced this to **~20M records**, available at:
👉 [clapAI/FinData-dedup](https://huggingface.co/datasets/clapAI/FinData-dedup)
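The deduplication procedure is not specified here; a minimal sketch of exact-match deduplication via normalized text hashing (an assumption — the released dataset may use fuzzier matching such as MinHash) could look like:

```python
import hashlib

def dedup_exact(records):
    """Drop records whose normalized text has been seen before.

    Normalization (lowercase + whitespace collapse) is an assumption;
    the actual FinData-dedup pipeline may differ.
    """
    seen = set()
    unique = []
    for text in records:
        normalized = " ".join(text.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique

corpus = [
    "Fed hints at rate cuts.",
    "Fed  hints at rate CUTS.",  # duplicate after normalization
    "Bitcoin rallies past resistance.",
]
print(len(dedup_exact(corpus)))  # -> 2
```

Hashing keys instead of storing raw strings keeps memory bounded when scanning tens of millions of records.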

---
## Training Hyperparameters

The following hyperparameters were used during training:

- **Learning rate:** 2e-4
- **Train batch size:** 24
- **Eval batch size:** 24
- **Seed:** 0
- **Gradient accumulation steps:** 128
- **Effective total train batch size:** 3072
- **Optimizer:** fused AdamW (`adamw_torch_fused`) with betas=(0.9, 0.999) and epsilon=1e-8
- **LR scheduler:** linear
- **Epochs:** 1
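The effective batch size follows directly from the per-device batch and gradient accumulation (assuming a single training device, which the card does not state):

```python
train_batch_size = 24
gradient_accumulation_steps = 128

# Effective batch = per-step batch x accumulation steps
effective_batch = train_batch_size * gradient_accumulation_steps
print(effective_batch)  # -> 3072
```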

---
## Evaluation Benchmarks

We benchmarked **Fin-ModernBERT** against two strong baselines:

- [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert)
- [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)

### Fine-tuning Setup

All models were fine-tuned under the same configuration:

- **Optimizer:** AdamW
- **Learning rate:** 5e-5
- **Batch size:** 16
- **Epochs:** 5
- **Scheduler:** linear
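In `transformers`, this shared configuration corresponds roughly to the following `TrainingArguments` sketch (`output_dir` is an assumption; it is not stated above):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="finetune-out",        # assumed; not specified in the card
    optim="adamw_torch",              # AdamW
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=5,
    lr_scheduler_type="linear",
)
```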

### Results

| Dataset | Metric | FinBERT (ProsusAI) | ModernBERT-base | Fin-ModernBERT |
|---------|--------|--------------------|-----------------|----------------|
| CIKM (datht/fin-cikm) | F1-score | 42.77 | 53.08 | **54.89** |
| PhraseBank (soumakchak/phrasebank) | F1-score | 86.33 | 85.03 | **88.09** |

> Further evaluations on additional datasets and tasks are ongoing to provide a more comprehensive view of its performance.

---
## Use Cases

Fin-ModernBERT can be used for various financial NLP applications, such as:

- **Financial Sentiment Analysis** (e.g., market mood detection from news/tweets)
- **Event-driven Stock Prediction**
- **Financial Named Entity Recognition (NER)** (companies, tickers, financial instruments)
- **Document Classification & Clustering**
- **Question Answering over financial reports and news**

---

## How to Use
```python
from transformers import AutoTokenizer, AutoModel, pipeline

model_name = "clapAI/Fin-ModernBERT"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Encode a financial sentence to get contextual embeddings
text = "Federal Reserve hints at possible interest rate cuts."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)  # outputs.last_hidden_state: (1, seq_len, hidden_size)

# Or use the model for its pretraining task (fill-mask)
fill_mask = pipeline("fill-mask", model=model_name)
print(fill_mask(f"Federal Reserve hints at possible interest rate {tokenizer.mask_token}."))
```

## Citation

If you use this model, please cite:

```bibtex
@misc{finmodernbert2025,
  title={Fin-ModernBERT: Continual Pretraining of ModernBERT for Financial Domain},
  author={ClapAI},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/clapAI/Fin-ModernBERT}}
}
```