---
library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: Fin-ModernBERT
results: []
datasets:
- clapAI/FinData-dedup
language:
- en
pipeline_tag: fill-mask
---
# Fin-ModernBERT
Fin-ModernBERT is a domain-adapted pretrained language model for the **financial domain**, obtained by continual pretraining of [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) with a **context length of 1024 tokens** on large-scale finance-related corpora.
---
## Model Description
- **Base model:** ModernBERT-base (context length = 1024)
- **Domain:** Finance, Stock Market, Cryptocurrency
- **Objective:** Improve representation and understanding of financial text for downstream NLP tasks (sentiment analysis, NER, classification, QA, retrieval, etc.)
---
## Training Data
We collected and combined multiple publicly available finance-related datasets, including:
- [danidanou/Bloomberg_Financial_News](https://huggingface.co/datasets/danidanou/Bloomberg_Financial_News)
- [juanberasategui/Crypto_Tweets](https://huggingface.co/datasets/juanberasategui/Crypto_Tweets)
- [StephanAkkerman/crypto-stock-tweets](https://huggingface.co/datasets/StephanAkkerman/crypto-stock-tweets)
- [SahandNZ/cryptonews-articles-with-price-momentum-labels](https://huggingface.co/datasets/SahandNZ/cryptonews-articles-with-price-momentum-labels)
- [edaschau/financial_news](https://huggingface.co/datasets/edaschau/financial_news)
- [sabareesh88/FNSPID_nasdaq](https://huggingface.co/datasets/sabareesh88/FNSPID_nasdaq)
- [BAAI/IndustryCorpus_finance](https://huggingface.co/datasets/BAAI/IndustryCorpus_finance)
- [mjw/stock_market_tweets](https://huggingface.co/datasets/mjw/stock_market_tweets)
After aggregation, we obtained **~50M financial records**.
A deduplication process reduced this to **~20M records**, available at:
👉 [clapAI/FinData-dedup](https://huggingface.co/datasets/clapAI/FinData-dedup)
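The deduplicated corpus can be loaded directly with the `datasets` library. Below is a minimal sketch; the `train` split name and record schema are assumptions, so check the dataset card for the actual fields:
```python
from datasets import load_dataset

# Stream the deduplicated corpus (~20M records) instead of downloading it all at once.
# The "train" split name is an assumption -- see the dataset card for the real splits.
findata = load_dataset("clapAI/FinData-dedup", split="train", streaming=True)

for record in findata.take(3):
    print(record)
```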
---
## Training Hyperparameters
The following hyperparameters were used during training:
- **Learning rate:** 2e-4
- **Train batch size:** 24
- **Eval batch size:** 24
- **Seed:** 0
- **Gradient accumulation steps:** 128
- **Effective total train batch size:** 3072
- **Optimizer:** fused AdamW (`adamw_torch_fused`) with betas=(0.9, 0.999), epsilon=1e-08
- **LR scheduler:** Linear
- **Epochs:** 1
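For illustration, these settings map roughly onto the following `transformers` `TrainingArguments` (a sketch only; anything not listed above, such as warmup or the masking probability, is left at library defaults):
```python
from transformers import TrainingArguments

# Sketch of the continual-pretraining configuration described above.
training_args = TrainingArguments(
    output_dir="fin-modernbert-cpt",
    learning_rate=2e-4,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    gradient_accumulation_steps=128,  # effective batch size 24 * 128 = 3072
    num_train_epochs=1,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=0,
)
```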
---
## Evaluation Benchmarks
We benchmarked **Fin-ModernBERT** against two strong baselines:
- [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert)
- [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
### Fine-tuning Setup
All models were fine-tuned under the same configuration:
- **Optimizer:** AdamW
- **Learning rate:** 5e-5
- **Batch size:** 16
- **Epochs:** 5
- **Scheduler:** Linear
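As a concrete reference, this setup corresponds roughly to the `Trainer` sketch below; the benchmark loading, column names, and label count are placeholders rather than the exact evaluation script:
```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "clapAI/Fin-ModernBERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

def tokenize(batch):
    # Truncate to the model's 1024-token context length.
    return tokenizer(batch["text"], truncation=True, max_length=1024)

args = TrainingArguments(
    output_dir="fin-modernbert-finetune",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    num_train_epochs=5,
    lr_scheduler_type="linear",
)

# `train_ds` and `eval_ds` stand in for the benchmark splits (e.g., PhraseBank),
# tokenized with the `tokenize` function above:
# trainer = Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
# trainer.train()
```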
### Results
| Dataset | Metric | FinBERT (ProsusAI) | ModernBERT-base | Fin-ModernBERT |
|---------|--------|---------------------|-----------------|----------------|
| CIKM (datht/fin-cikm) | F1-score | 42.77 | 53.08 | **54.89** |
| PhraseBank (soumakchak/phrasebank) | F1-score | 86.33 | 85.03 | **88.09** |
> Further evaluations on additional datasets and tasks are ongoing to provide a more comprehensive view of its performance.
---
## Use Cases
Fin-ModernBERT can be used for various financial NLP applications, such as:
- **Financial Sentiment Analysis** (e.g., market mood detection from news/tweets)
- **Event-driven Stock Prediction**
- **Financial Named Entity Recognition (NER)** (companies, tickers, financial instruments)
- **Document Classification & Clustering**
- **Question Answering over financial reports and news**
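Because the model is pretrained with a masked-language-modeling objective (`fill-mask`), you can also probe its financial knowledge directly. The example below is a quick sketch:
```python
from transformers import pipeline

# Ask the MLM head to fill in a masked financial term.
fill_mask = pipeline("fill-mask", model="clapAI/Fin-ModernBERT")
masked = f"The central bank raised interest {fill_mask.tokenizer.mask_token} to curb inflation."

for prediction in fill_mask(masked):
    print(prediction["token_str"], round(prediction["score"], 3))
```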
---
## How to Use
```python
from transformers import AutoTokenizer, AutoModel

model_name = "clapAI/Fin-ModernBERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Encode a financial sentence; inputs beyond 1024 tokens should be truncated.
text = "Federal Reserve hints at possible interest rate cuts."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
outputs = model(**inputs)  # outputs.last_hidden_state holds contextual token embeddings
```
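To turn the token-level hidden states above into a single sentence embedding for retrieval or clustering, one common recipe is attention-mask-aware mean pooling (a sketch continuing from the snippet above, not an official recommendation):
```python
# Mean-pool the token embeddings from the snippet above, ignoring padding positions.
mask = inputs["attention_mask"].unsqueeze(-1).float()      # [batch, seq_len, 1]
summed = (outputs.last_hidden_state * mask).sum(dim=1)     # [batch, hidden_size]
sentence_embedding = summed / mask.sum(dim=1).clamp(min=1e-9)
print(sentence_embedding.shape)  # e.g. torch.Size([1, 768]) for the base model
```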
## Citation
If you use this model, please cite:
```bibtex
@misc{finmodernbert2025,
  title={Fin-ModernBERT: Continual Pretraining of ModernBERT for Financial Domain},
  author={ClapAI},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/clapAI/Fin-ModernBERT}}
}
```